Jose Storopoli, PhD

Von Neumann

2024-06-22T05:54:46-03:00

Von Neumann's wartime Los Alamos ID badge photo

Warning: This post has KaTeX enabled, so if you want to view the rendered math formulas, you’ll have to unfortunately enable JavaScript.

John von Neumann was a Hungarian-American mathematician. But to define him as a simple mathematician would be an understatement. He is a fucking legend and one of my heroes! He single-handedly:

proposed an axiomatization of set theory along with a definition of cardinality that remains the standard one in mathematics today. Funny enough he started working on solving the paradoxes of set theory while he was only 11 years old!
laid the mathematical foundations of quantum mechanics.
made the atomic bomb possible by conceptualizing and designing the explosive lenses that were needed to compress the plutonium core of the Fat Man weapon that was later dropped on Nagasaki. He also invented the term “kiloton” of TNT as a unit of energy released in a nuclear explosion.
invented the modern computer. Yes, I know, Alan Turing came up with the idea while trying to solve the halting problem, but it was von Neumann who recognized the true potential of computing machines and designed the first computers. We even have the von Neumann architecture, which pretty much underlies every working universal Turing machine in the world.
created the field of game theory, along with game theory’s most important theorem: the minimax theorem.
co-invented the Monte Carlo method while trying to solve the problem of neutron diffusion.
created the field of cellular automata and paved the way for the discovery of DNA.

The list above is just a quick summary of his achievements. Everywhere you look in mathematics, physics, computer science, and even biology and economics, you’ll find von Neumann’s fingerprints.

The Sharpest Mind of the 20th Century

Von Neumann was recognized by his peers as one of the most intelligent people to have ever lived. Johnny, as he was known to his friends, was a prodigy since early childhood. Some sources suggest that he could multiply two 8-digit numbers together in his head when he was six. As a child, von Neumann absorbed Ancient Greek and Latin, and spoke French, German and English as well as his native Hungarian. He devoured a forty-five-volume history of the world and was able to recite whole chapters verbatim decades later.

Here are some quotes from his contemporaries:

Enrico Fermi (Nobel Prize winner) while talking to Herbert Anderson: “You know, Herb, Johnny can do calculations in his head ten times as fast as I can! And I can do them ten times as fast as you can, Herb, so you can see how impressive Johnny is!”
George Pólya (mathematician, and whose lectures von Neumann attended as a student): “Johnny was the only student I was ever afraid of. If in the course of a lecture I stated an unsolved problem, the chances were he’d come to me at the end of the lecture with the complete solution scribbled on a slip of paper.”
Edward Teller (father of the hydrogen bomb): “Von Neumann would carry on a conversation with my 3-year-old son, and the two of them would talk as equals, and I sometimes wondered if he used the same principle when he talked to the rest of us.”
Hans Bethe (Nobel Prize winner): “I have sometimes wondered whether a brain like von Neumann’s does not indicate a species superior to that of man.”
Stanislaw Ulam (co-inventor of the Monte Carlo method): “I have had a brain, and von Neumann had a computer.”
Claude Shannon (father of information theory): “the smartest person I’ve ever met.”
Marina von Neumann Whitman (his daughter): “Although he genuinely adored my mother, my father’s first love in life was thinking, a pursuit that occupied most of his waking hours.”

Von Neumann and Oppenheimer together in front of one of the computing machines used on the hydrogen bomb project. Von Neumann could do calculations in his head faster than these early computers and would sometimes face off against them in competitions meant to entertain the other people in the labs.

If you want to know more about von Neumann, I recommend his biography: The Man from the Future: The Visionary Life of John von Neumann by Ananyo Bhattacharya.

The Fly Puzzle

One of the most famous stories about von Neumann is the fly puzzle. This was reported by Eugene Wigner in the 1966 documentary on Von Neumann. Below is the exact part where Wigner tells the story:

The tale takes place in Los Alamos during the Manhattan Project. Max Born (Nobel Prize winner) told von Neumann the following puzzle:

Two bicycles begin 20 miles apart, and each travels toward the other at 10 miles per hour until they collide; meanwhile, a fly travels continuously back and forth between the bicycles at 15 miles per hour until it is squashed in the collision. How far does the fly travel in total?

One can solve this rather easily by not paying attention to the inherent infinite geometric series that the fly travels. Instead, focus on the fact that the fly is squashed when the bicycles collide, and that the two bicycles will collide in one hour. Thus, the fly travels 15 miles in total.

By the time Born had finished the question, von Neumann had already solved it. He said “Why? 15 miles, of course.” Born was surprised and said that Johnny was “one of my first scientist friends that saw the solution immediately.” Johnny then replied “I can’t understand that. It is a simple infinite geometrical series.”

Now, to understand how fast von Neumann’s mind was, let’s solve the problem the way he did. Note that the fly reaches the second bicycle when

$$ 15t = 20 - 10t,$$

where $t$ is the time in hours. Solving for the time of this first leg, $t_1$, gives

$$ t_1 = \frac{20}{25} = \frac{4}{5}.$$

This means that the fly takes $\frac{4}{5}$ hours to reach the second bicycle for the first time, and the distance it travels on this first leg is $d_1 = 15 \times \frac{4}{5} = 12$ miles. By then the first bicycle has advanced to mile $8$, so the fly turns around and reaches the first bicycle when

$$ 12 - 15t = 8 + 10t.$$

Solving for $t_2$ we get

$$ t_2 = \frac{4}{25}.$$

Continuing this way, each leg takes one fifth of the time of the previous one, i.e. $t_n = \frac{4}{5^n}$, so the total distance traveled by the fly is given by summing the series

$$ 15 \sum_{n=1}^{\infty} \frac{4}{5^n} = 15.$$

This is a classical geometric series. In general, a geometric series is written as $a + ar + ar^{2} + ar^{3} + \ldots$, where $a$ is the first term and $r$ is the common ratio between adjacent terms.

Here we have $a = 15 \times \frac{4}{5} = 12$ and $r = \frac{1}{5}$, and we know the series converges to $\frac{a}{1-r} = \frac{12}{4/5} = 15$ when $|r| < 1$.
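As a sanity check, the leg-by-leg reasoning can be replayed with exact fractions. This is only a minimal sketch: the function name `fly_legs` and its parameters are made up for illustration, and it assumes the fly starts on the first bicycle.

```python
from fractions import Fraction


def fly_legs(gap=20, bike_speed=10, fly_speed=15, legs=10):
    """Return the distance the fly covers on each of the first `legs` legs."""
    gap = Fraction(gap)
    distances = []
    for _ in range(legs):
        # The fly and the oncoming bicycle close the gap at their combined speed.
        t = gap / (fly_speed + bike_speed)
        distances.append(fly_speed * t)
        # Meanwhile both bicycles keep approaching each other.
        gap -= 2 * bike_speed * t
    return distances


legs = fly_legs()
ratio = legs[1] / legs[0]            # common ratio r of the series
closed_form = legs[0] / (1 - ratio)  # a / (1 - r)
print(legs[:3], ratio, closed_form, float(sum(legs)))
```

The first leg is 12 miles, every later leg is one fifth of the previous one, and the closed form gives exactly 15 miles, while the partial sums only creep towards it.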

That’s how von Neumann solved the problem in his head in a matter of seconds.

License

This post is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.

Zero-Knowledge Proofs

2024-06-08T15:48:33-03:00

Zero-Knowledge Proofs and the Meaning of Life

Warning: This post has KaTeX and mermaid.js enabled, so if you want to view the rendered math formulas, and diagrams, you’ll have to unfortunately enable JavaScript.

Lately, I’ve been diving a little into the world of Zero-Knowledge Proofs. The idea is to prove that you know something without revealing what you know. More specifically, a Zero-Knowledge Proof is a cryptographic protocol that allows a prover to convince a verifier that a statement is true without revealing any information beyond the validity of the statement. In essence, by the end of the protocol, the verifier is convinced that the prover knows the secret, and the verifier hasn’t learned anything (zero-knowledge) about the secret.

Zero-Knowledge Proofs (ZKPs) are kinda hot right now, since a lot of new Bitcoin innovations are being built on top of them. They allow for a higher level of privacy and potential scalability improvements in the Bitcoin network.

Zero-knowledge proofs are advantageous in a myriad of applications, including (refer to Petkus19):

Proving statements on private data:
- Person $A$ has more than $X$ in his bank account
- In the last year, a bank did not transact with an entity $Y$
- Matching DNA without revealing full DNA
- One has a credit score higher than $Z$
Anonymous authorization:
- Proving that requester $R$ has right to access web-site’s restricted area without revealing its identity (e.g., login, password)
- Prove that one is from the list of allowed countries/states without revealing from which one exactly
- Prove that one owns a monthly pass to a subway/metro without revealing card’s id
Anonymous payments:
- Payment with full detachment from any kind of identity
- Paying taxes without revealing one’s earnings
Outsourcing computation:
- Outsource an expensive computation and validate that the result is correct without redoing the execution; it opens up a category of trustless computing
- Changing a blockchain model from everyone computes the same to one party computes and everyone verifies

The idea behind this post is to give a general overview of Zero-Knowledge Proofs, while providing further resources, especially which papers to read, to dive deeper into the subject. As always, I’ll try to keep it simple and intuitive. However, as you might guess, the subject is quite complex, and I’ll try to simplify it as much as possible; but some mathematical background is necessary.

What are ZKPs?

Let’s formalize the concept of Zero-Knowledge Proofs. A formal definition of zero-knowledge has to use some computational model, and without loss of generality, we can use the Turing Machine model. So let’s create three Turing machines:

$P$ (the prover),
$V$ (the verifier),
and $S$ (the simulator).

Let’s also spice things up a bit and introduce an adversary $A$, and assume that it is also a Turing machine. The secret we want to prove knowledge of without revealing it is $x$.

The prover $P$ wants to prove to the verifier $V$ that it knows the secret $x$. They both share a common simulator $S$. The adversary $A$ is trying to fool the verifier $V$ into believing that it knows the secret $x$, without actually knowing it.

The prover $P$ generates a proof $\pi = P(S, x)$, and sends it to the verifier $V$. The verifier $V$ then checks the proof $\pi$, and decides whether to accept or reject it.

The tuple $(P, V, S)$ is a Zero-Knowledge Proof if the following properties hold:

Completeness: If the statement is true, the verifier will accept the proof.

$$ \Pr\big[V(S, \pi) = \text{accept} \big] = 1. $$

Here $\Pr\big[V(S, \pi) = \text{accept} \big]$ denotes the probability that the verifier accepts the proof given a simulator $S$ and a proof $\pi$.
Soundness: If the statement is false, no cheating prover can convince an honest verifier that it is true, except with some negligible probability ¹.

$$ \forall A, \forall x, \forall \pi: \Pr\big[V(A, S, \pi) = \text{accept} \big] < \text{negligible}. $$

Here $\Pr\big[V(A, S, \pi) = \text{accept} \big]$ denotes the probability that the verifier accepts the proof given an adversary $A$, a simulator $S$, and a proof $\pi$.
Zero-Knowledge: If the statement is true, the verifier learns nothing about the secret $x$. A proof is zero-knowledge if there exists a simulator $S$ that can simulate the verifier’s view without knowing the secret $x$.

$$ \forall x: \text{View}_V\big[P(x) \leftrightarrow V(\pi)\big] = S(x, \pi). $$

Here $\text{View}_V$ is the view of the verifier $V$, and $\leftrightarrow$ denotes the interaction between the prover and the verifier.

If you come up with a scheme that satisfies these properties, congratulations, you have a Zero-Knowledge Proof scheme and you can name it whatever you want, just like a Pokemon!

ZKPs Taxonomy

We can classify Zero-Knowledge Proofs into two broad categories:

Interactive Zero-Knowledge Proofs: In this case, the prover and the verifier interact multiple times. The prover sends a proof to the verifier, and the verifier sends a challenge to the prover, and this interaction continues until the verifier is convinced. The Fiat-Shamir Heuristic can transform an interactive ZKP into a non-interactive ZKP.
Non-Interactive Zero-Knowledge Proofs: In this case, the prover sends a proof to the verifier, and the verifier accepts or rejects the proof. No further interaction is needed.

Additionally, the setup of the simulator $S$ with respect to the data it uses can be further classified into three categories. Generally speaking, the data used by $S$ is some random bits. In trusted setups, if the data is compromised, the security of the proof is also compromised. In other words, anyone who gets hold of the data can prove anything to anyone. This is bad, and we want to avoid it.

Trusted Setup: $S$ uses data that must be kept secret.
Trusted but Universal Setup: $S$ uses data that must be kept private, but only uses it for the initial setup. Future proofs can be verified without the need for the initial data, and can be considered transparent.
Transparent Setup: $S$ uses no data at all. This is the best setup, as it doesn’t require any data to be used by $S$.

Some of the most popular Zero-Knowledge Proof systems are:

zk-SNARKs: Zero-Knowledge Succinct Non-Interactive Argument of Knowledge. This is a non-interactive ZKP system with a trusted setup.
Bulletproofs: A non-interactive ZKP system with a transparent setup.
zk-STARKs: Zero-Knowledge Scalable Transparent Argument of Knowledge. This is a non-interactive ZKP system with a transparent setup, with an additional property of being (plausibly) post-quantum secure.

zk-SNARKs

zk-SNARKs are the most popular Zero-Knowledge Proof system. They are used in the Zcash protocol, and the defunct Tornado Cash smart contract. Ethereum also uses zk-SNARKs in its Layer 2 scaling solution, the zk-Rollups. BitVM also uses a SNARK-based VM to run smart contracts on top of Bitcoin.

Let’s go over the concepts behind zk-SNARKs².

The first idea: Proving Knowledge of a Polynomial

First some polynomial primer. A polynomial $f(x)$ is a function that can be written as:

$$ f(x) = c_d x^d + \ldots + c_1 x^1 + c_0 x^0 $$

where $c_d, \ldots, c_1, c_0$ are the coefficients of the polynomial, and $d$ is the degree of the polynomial.

Now, the Fundamental Theorem of Algebra states that a polynomial of degree $d$ can have at most $d$ (real-valued-only) roots³.

This can be extended to the concept that two non-equal polynomials of degree $d$ can have at most $d$ points of intersection.

The idea of proving knowledge of a polynomial is to show that you know the polynomial, without revealing the polynomial itself.

This simple protocol can be done in four steps. Note that both the prover and the verifier have knowledge of the polynomial:

Verifier chooses a random value for $x$ and evaluates his polynomial locally
Verifier gives $x$ to the prover and asks to evaluate the polynomial in question
Prover evaluates his polynomial at $x$ and gives the result to the verifier
Verifier checks if the local result is equal to the prover’s result, and if so then the statement is proven with a high confidence

How much is “high confidence”? Suppose that the verifier chooses an $x$ at random from a set of $2^{256}$ values, that is a 256-bit number. According to Wolfram Alpha, the decimal approximation is $\approx 1.16 \times 10^{77}$. This is almost the number of atoms in the observable universe! The number of points where evaluations are different is $10^{77} - d$, where $d$ is the degree of the polynomial. Therefore, we can assume with overwhelming probability that the prover knows the polynomial. This is due to the fact that an adversary has $\frac{d}{10^{77}}$ chance of guessing the polynomial⁴, which we can safely consider negligible¹.
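The four steps can be sketched in a few lines. Everything named here (`evaluate`, `run_protocol`, the toy modulus, the cheating polynomial) is an assumption for illustration, not part of any real ZKP library. The cheating prover's polynomial is built to agree with the verifier's at exactly $d = 3$ points, which is the worst case for the verifier.

```python
import random


def evaluate(coeffs, x, modulus):
    """Horner evaluation mod `modulus`; coefficients run from highest to lowest degree."""
    result = 0
    for c in coeffs:
        result = (result * x + c) % modulus
    return result


MODULUS = 2**61 - 1                  # assumed toy domain size (a Mersenne prime)
verifier_poly = [1, -3, 2, 0]        # x^3 - 3x^2 + 2x
honest_prover = [1, -3, 2, 0]        # the same polynomial
# verifier_poly + (x - 1)(x - 2)(x - 3): agrees with it only at x = 1, 2, 3
cheating_prover = [2, -9, 13, -6]


def run_protocol(prover_poly, x):
    expected = evaluate(verifier_poly, x, MODULUS)  # verifier evaluates locally
    answer = evaluate(prover_poly, x, MODULUS)      # prover evaluates at x
    return expected == answer                       # verifier compares


x = random.randrange(MODULUS)        # verifier's random challenge
print(run_protocol(honest_prover, x))
# Points in a small window where the cheater would get away with it:
lucky_points = [x for x in range(1000) if run_protocol(cheating_prover, x)]
print(lucky_points)
```

With a domain of about $2^{61}$ values the cheater passes only if the challenge happens to be one of those three points; with $2^{256}$ values the chance is even more hopeless.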

The second idea: Proving Knowledge of a Polynomial without Revealing the Polynomial

The protocol above has some limitations, mainly that it works only for one particular polynomial, and that the verifier has to know that polynomial in advance. This is not practical at all, since we want to prove knowledge of a secret without revealing the secret itself.

We can do better. We can use the fact, also stated in the Fundamental Theorem of Algebra, that any polynomial can be factored into linear polynomials, i.e. degree-1 polynomials, each representing a line. We can represent any valid polynomial of degree $d$ as a product of its linear-polynomial factors:

$$ (x - a_1) (x - a_2) \ldots (x - a_d) = 0 $$

where $a_1, \ldots, a_{d}$ are the roots of the polynomial. If you wanna prove knowledge of a polynomial, it is just a matter of proving knowledge of its roots. But how do we do that without disclosing the polynomial itself? This can be accomplished by proving that a polynomial $p(x)$ is the multiplication of the factors $t(x) = (x - a_1) \ldots (x - a_d)$, called the target polynomial, and some arbitrary polynomial $h(x)$, called the residual polynomial:

$$ p(x) = t(x) \cdot h(x). $$

The prover can show that there exists some polynomial $h(x)$ such that $t(x) \cdot h(x)$ equals $p(x)$. You can find $h(x)$ by simply dividing $p(x)$ by $t(x)$, which leaves no remainder exactly when $t(x)$ is a factor of $p(x)$:

$$ h(x) = \frac{p(x)}{t(x)}. $$

Now we can create a protocol that can work for any polynomial $p(x)$ with only three steps:

Verifier samples a random value $r$, calculates $t = t(r)$ and gives $r$ to the prover
Prover calculates $h(x) = \frac{p(x)}{t(x)}$ and evaluates $p = p(r)$ and $h = h(r)$; the resulting values $p$, $h$ are provided to the verifier
Verifier then checks that $p = t \cdot h$. If so, the polynomials $p(x)$ and $t(x) \cdot h(x)$ are equal with overwhelming probability, meaning that $p(x)$ has $t(x)$ as a cofactor.

Note that the verifier has no clue about the polynomial $p(x)$, yet can still be convinced that the prover knows it.

For example, let’s consider a polynomial $p(x)$ of degree $3$ and a target polynomial $t(x)$ of degree $2$:

$p(x) = x^3 - 3x^2 + 2x$
$t(x) = (x - 1) (x - 2)$

An example protocol interaction in this case could be:

Verifier samples a random value $23$, calculates $t = t(23) = (23 - 1)(23 - 2) = 462$ and gives $23$ to the prover
Prover calculates $h(x) = \frac{p(x)}{t(x)} = x$, evaluates $p = p(23) = 10626$ and $h = h(23) = 23$ and provides $p$, $h$ to the verifier
Verifier then checks that $p = t \cdot h$, i.e. $10626 = 462 \cdot 23$, which is true, and therefore the statement is proven
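The interaction above can be replayed in code. This is a sketch with hypothetical helper names (`poly_eval`, `poly_divmod`); polynomials are plain coefficient lists, highest degree first, and the division is ordinary polynomial long division over the rationals.

```python
from fractions import Fraction


def poly_eval(coeffs, x):
    """Horner evaluation; coefficients run from highest to lowest degree."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


def poly_divmod(num, den):
    """Polynomial long division; returns (quotient, remainder) coefficient lists."""
    num = [Fraction(c) for c in num]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        # Subtract factor * den aligned with the leading term, then drop that term.
        for i, d in enumerate(den):
            num[i] -= factor * d
        num.pop(0)
    return quotient, num


p = [1, -3, 2, 0]   # p(x) = x^3 - 3x^2 + 2x
t = [1, -3, 2]      # t(x) = (x - 1)(x - 2) = x^2 - 3x + 2
h, remainder = poly_divmod(p, t)   # prover computes h(x) = p(x) / t(x)

r = 23                                       # verifier's random sample
t_r = poly_eval(t, r)                        # verifier computes t(r)
p_r, h_r = poly_eval(p, r), poly_eval(h, r)  # prover's answers
print(h, remainder, p_r, t_r, h_r, p_r == t_r * h_r)
```

If $t(x)$ were not a factor of $p(x)$, the division would leave a non-zero remainder and an honest prover would have no $h(x)$ to offer.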

Great! We can prove stuff without revealing the stuff itself! Noice! We now only need to find a trick to represent any sort of computation as a polynomial.

The third idea: Representing Computations as Polynomials

We can represent any computation as a polynomial by using Arithmetic Circuits. An arithmetic circuit is a directed acyclic graph (DAG) where:

Every indegree⁵-zero node is an input gate that represents a variable $x_i$
Every node with indegree $>1$ is either:
- an addition gate, $+$, that represents the sum of its children
- a multiplication gate, $\times$, that represents the product of its children

Here’s an example of an arithmetic circuit that represents the polynomial $p(x_1, x_2) = x_2^3 + x_1 x_2^2 + x_2^2 + x_1 x_2$:

---
title: Arithmetic Circuit for p(x)
---
graph BT
    X1(x₁) --> Plus1(+)
    X2(x₂) --> Plus1
    X2 --> Plus2(+)
    One(1) --> Plus2
    Plus1 --> Times(⨉)
    Plus2 --> Times
    X2 --> Times

In the circuit above, the input gates compute (from left to right) $x_{1},x_{2}$ and $1$, the sum gates compute $x_{1}+x_{2}$ and $x_{2}+1$, and the product gate computes $(x_{1}+x_{2})x_{2}(x_{2}+1)$ which evaluates to $x_{2}^{3}+x_{1}x_{2}^{2}+x_{2}^{2}+x_{1}x_{2}$.
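To make the circuit concrete, here is a small sketch that stores it as a DAG and evaluates it. The dictionary layout and the node names (`plus1`, `plus2`, `times`) are assumptions chosen to mirror the diagram, not a standard circuit format.

```python
# Each gate maps to (operation, child node names); input gates have no entry.
CIRCUIT = {
    "plus1": ("+", ["x1", "x2"]),
    "plus2": ("+", ["x2", "one"]),
    "times": ("*", ["plus1", "plus2", "x2"]),
}


def evaluate(node, inputs):
    """Recursively evaluate a node of the DAG given the input-gate values."""
    if node in inputs:
        return inputs[node]
    op, children = CIRCUIT[node]
    values = [evaluate(child, inputs) for child in children]
    if op == "+":
        return sum(values)
    product = 1
    for v in values:
        product *= v
    return product


def p(x1, x2):
    """The expanded polynomial the circuit is claimed to compute."""
    return x2**3 + x1 * x2**2 + x2**2 + x1 * x2


inputs = {"x1": 3, "x2": 5, "one": 1}
print(evaluate("times", inputs), p(3, 5))
```

Both the circuit and the expanded polynomial give the same value for these inputs, as expected from $(x_1 + x_2) x_2 (x_2 + 1)$.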

The idea is to prove that the polynomial computed by the circuit has some target polynomial $t(x)$ as a factor. This can be done by proving that the output of the circuit is equal to the target polynomial $t(x)$ multiplied by some arbitrary polynomial $h(x)$, as we did in the previous section.

Remarks

This is a very high-level overview of Zero-Knowledge Proofs. The subject is quite complex and requires a lot of mathematical background. I tried to simplify it as much as possible, to give a general intuition of how Zero-Knowledge Proofs work. Please check the resources below for more in-depth information.

Resources

We have tons of papers on the subject. Here are some selected few.

The whole idea of ZKPs as discussed above in three properties (Completeness, Soundness, and Zero-Knowledge) was first conceived by [SMR85]. Later [Kil92] showed that some of the properties’ assumptions can be relaxed, more specifically using computational soundness instead of statistical soundness. [Mic94] applied the Fiat-Shamir Heuristic to [Kil92]’s contributions to show that you can turn any interactive ZKP system into a non-interactive ZKP system using the Random Oracle Model.

Going to the zk-SNARKs side, the term was introduced by [Bit11] and the first protocol, the Pinocchio protocol, was introduced by [Gen12] and [Par13]. The Bulletproofs protocol was introduced by [Bunz18], followed by the Bulletproofs++ protocol by [Eagen24].

zk-STARKs were introduced by [Ben-Sasson19].

Finally, if you want an intuitive but very comprehensive explanation of zk-SNARKs, then you should read [Petkus19].

The following video from YouTube is from the Blockchain Web3 MOOC from Berkeley University. It provides a good introduction to Zero-Knowledge Proofs, while being quite accessible to beginners.

This video from YouTube explains the math behind the Arithmetic Circuits and how to encode them as polynomials. I can’t embed the video here, since the video owner has disabled embedding.

License

This post is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.

¹ A function $f$ is negligible if for every polynomial $p$, there exists an $N$ such that for all $n > N$, $$ f(n) < \frac{1}{p(n)}. $$ If you want to learn more about negligible functions, read Chapter 3, Section 3.1 of the book Introduction to Modern Cryptography by Katz & Lindell.
² Most of this section is based on Petkus19.
³ The “at most” is because we are talking about real-valued-only roots. If we consider complex roots (counted with multiplicity), then a polynomial of degree $d$ has exactly $d$ roots.
⁴ The birthday paradox says that, among $N$ possible values, a collision is expected with probability of about $\frac{1}{2}$ after roughly $\sqrt{N}$ samples, hence we take the square root of the number of possible values. So, the security of the polynomial proof is $\sqrt{10^{77}} = 10^{38.5}$, which is still a huge number.
⁵ The number of edges entering a node.

Shamir's Secret Sharing

2024-04-14T10:37:02-03:00

The Polynomial king and he can do anything!

Warning: This post has KaTeX enabled, so if you want to view the rendered math formulas, you’ll have to unfortunately enable JavaScript.

In this post, we’ll talk about Shamir’s Secret Sharing (SSS), a cryptographic algorithm that allows us to split a secret into multiple parts, called shares, in such a way that the secret can only be reconstructed if a certain number of shares are combined.

The idea is to give a visual intuition of how the algorithm works, and describe the mathematical details behind it.

The code for all the plots in this post can be found in storopoli/shamir-secret-sharing.

Polynomial Interpolation

If you have two points you can draw a unique line that passes through them. Suppose that you have the points $(3,3)$ and $(4,4)$. Hence, there is only one line that passes through these two points. See the plot below.

A line passing through two points

If you have three points you can draw a unique parabola that passes through them. Suppose that you have the points $(-4,16)$, $(1,1)$, and $(4,16)$. Hence, there is only one parabola that passes through these three points.

A parabola passing through three points

If you have four points you can draw a unique cubic polynomial that passes through them. Suppose that you have the points $(-2,8)$, $(-1,1)$, $(1,1)$, and $(2,8)$. Hence, there is only one cubic polynomial that passes through these four points.

A cubic polynomial passing through four points

As you might have guessed, if you have $n$ points with distinct $x$-coordinates you can draw a unique polynomial of degree at most $n-1$ that passes through them. This is called polynomial interpolation¹.

More formally, a polynomial $f(x)$ of degree at most $n-1$ has the form:

$$ f(x) = a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \ldots + a_1x + a_0. $$

Given $n$ points $(x_1, y_1)$, $(x_2, y_2)$, $\ldots$, $(x_n, y_n)$ with distinct $x_i$, there is a unique such polynomial $f(x)$ with $f(x_i) = y_i$ for $i = 1, 2, \ldots, n$.
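Here is a minimal sketch of that statement using Lagrange interpolation with exact fractions. The helper name `lagrange_eval` is made up for this example; it evaluates the interpolating polynomial at a point without ever writing down its coefficients.

```python
from fractions import Fraction


def lagrange_eval(points, x):
    """Evaluate at `x` the unique polynomial of degree < len(points) through `points`."""
    x = Fraction(x)
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                # Basis factor that is 1 at xi and 0 at xj.
                term *= (x - xj) / (Fraction(xi) - xj)
        total += term
    return total


parabola_points = [(-4, 16), (1, 1), (4, 16)]
values = [lagrange_eval(parabola_points, x) for x in (0, 2, 3)]
print(values)
```

The three points from the parabola example all lie on $y = x^2$, so evaluating the interpolated polynomial at $0$, $2$ and $3$ gives $0$, $4$ and $9$.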

Ok now let’s connect this idea to Shamir’s Secret Sharing. Suppose you encode a secret $k$ as a number. Let’s say a private key for a Bitcoin wallet. As you already know, a private key is just a very big number.

You want to split this secret into $N$ parts, called shares. You also want to specify a threshold $T$ such that the secret $k$ can only be reconstructed if at least $T$ shares are combined. Here’s how you can use polynomial interpolation to achieve this.

The idea is to use polynomial interpolation to generate a polynomial $f(x)$ of degree $T-1$ such that $f(0) = k$. In other words, the polynomial $f(x)$ when evaluated at $x = 0$ should give you the secret $k$. Then, you can generate $N$ shares by evaluating $f(x)$ at $N$ different points.

Here’s an example with $T = 4$ and $N = 5$. Our secret is $k = 5$ and since $T = 4$, we generate a polynomial of degree $T-1 = 3$. We’ve chosen the polynomial $f(x) = 2x^3 - 3x^2 + 2x + 5 $. Then, we evaluate $f(x)$ at $N = 5$ different points to generate the shares.

Shamir's Secret Sharing N=5 and T=4

Now this polynomial is guaranteed to pass through the point $(0, k)$. Hence if you evaluate $f(0)$ you get the secret $k$. To know the secret, you need to know the polynomial $f(x)$. And to know the polynomial $f(x)$, you need to know at least $T$ shares. Otherwise, you can’t reconstruct the polynomial and hence the secret.

In this setup we generate addresses from the extended public key (xpub) of a Bitcoin wallet that has the private key $k$. Then, we split the private key into shares and distribute them to different people. Only if at least $T$ people come together, they can reconstruct the private key and spend the funds.
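The example with $T = 4$ and $N = 5$ can be sketched as follows. The function names are hypothetical, and the arithmetic is done over the rationals to match the plots; real implementations typically work over a finite field instead, which this sketch does not attempt.

```python
from fractions import Fraction
from itertools import combinations


def f(x):
    """The example polynomial; f(0) is the secret k = 5."""
    return 2 * x**3 - 3 * x**2 + 2 * x + 5


def reconstruct_secret(shares):
    """Lagrange-interpolate the shares and evaluate the result at x = 0."""
    secret = Fraction(0)
    for i, (xi, yi) in enumerate(shares):
        term = yi
        for j, (xj, _) in enumerate(shares):
            if i != j:
                term *= (0 - xj) / (xi - xj)
        secret += term
    return secret


T = 4
xs = [Fraction(-2), Fraction(-1), Fraction(1, 2), Fraction(1), Fraction(2)]
shares = [(x, f(x)) for x in xs]   # N = 5 shares

# Any T = 4 of the 5 shares recover the secret...
recovered = {reconstruct_secret(subset) for subset in combinations(shares, T)}
# ...while T - 1 = 3 shares interpolate a different, lower-degree polynomial.
too_few = reconstruct_secret(shares[:3])
print(recovered, too_few)
```

Every subset of four shares lands on the same value $5$, whereas the first three shares alone interpolate a parabola whose value at $0$ is $7$, not the secret.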

Rotating Shares

Note that there’s nothing special about the points

$(-2, f(-2))$
$(-1, f(-1))$
$(\frac{1}{2}, f(\frac{1}{2}))$
$(1, f(1))$
$(2, f(2))$

that we’ve used in the previous example. You could have chosen any other $N$ points and the polynomial would still be the same.

Suppose now that your share buddy has lost his share. Then, the participants can get together and generate a new polynomial evaluation at any point $n \notin \{ -2, -1, \frac{1}{2}, 1, 2 \}$.

This is exactly what the image below shows:

Shamir's Secret Sharing N=5 and T=4

Here we’ve replaced the point $(-2, f(-2))$ with the point $(3, f(3))$. We also assume that the point $(-2, f(-2))$ is lost. The polynomial is still the same, and the secret can still be reconstructed if at least $T$ shares are combined.
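Replacing the lost share can be sketched like this. The helper `interpolate_at` is a made-up name, and the sketch simply assumes the four remaining participants pool their shares in one place, which a careful real-world protocol would try to avoid.

```python
from fractions import Fraction


def interpolate_at(shares, x):
    """Evaluate at `x` the polynomial that Lagrange-interpolates `shares`."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(shares):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(shares):
            if i != j:
                term *= Fraction(x - xj) / Fraction(xi - xj)
        total += term
    return total


# The four surviving shares of f(x) = 2x^3 - 3x^2 + 2x + 5; (-2, f(-2)) was lost.
surviving = [(-1, -2), (Fraction(1, 2), Fraction(11, 2)), (1, 6), (2, 13)]

# T = 4 shares pin down f, so the group can evaluate it at a fresh point.
new_share = (3, interpolate_at(surviving, 3))
print(new_share)

# The replacement share works together with any three of the old ones.
secret = interpolate_at(surviving[1:] + [new_share], 0)
print(secret)
```

The new share is $(3, 38)$, which is exactly $f(3)$, and combining it with three old shares still yields the secret $5$.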

We can also rotate all the shares. This is shown in the image below:

Shamir's Secret Sharing N=5 and T=4

Here all previous points have been replaced by new points.

The Polynomial King

I am the ~~Lizard~~ Polynomial King, I can do anything!

Jim Morrison

In the end, if you somehow know the polynomial $f(x)$, you can do anything. You can rug-pull all your share buddies and take all the funds.

There are several ways that a malicious actor could learn the polynomial. For example, if the shares are generated in a predictable way, an attacker could guess the polynomial. Or, during the reconstruction phase, an attacker could learn the polynomial by observing the shares. Additionally, during a distributed share generation, an attacker could disrupt the process and force the participants to reveal their shares².

Conclusion

In this post, we’ve seen how polynomial interpolation can be used to split a secret into multiple shares. We’ve also seen how the secret can be reconstructed if a certain number of shares are combined. This is the basic idea behind Shamir’s Secret Sharing (SSS).

Note that the devil is in the details. A lot of the complexities of SSS come from the details of how the shares are generated and how the secret is reconstructed. There are several types of attacks that can be done by a malicious actor, especially during the share generation and reconstruction phases.

The intent of this blog post is to show how elegant, simple and powerful the idea behind SSS is.

License

This post is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.

¹ This stems from Lagrange interpolation.
² Or force them to reuse nonces. Then, “poof”, private keys are gone.

Sherlock Holmes Final Letter: A Simple Dead Man's Switch in Rust

2024-03-23T14:00:16-03:00

Sherlock Holmes fights Moriarty at the Reichenbach Falls

Got state secrets? Or maybe 50 BTC? Don’t trust your government or lawyers? And you want to make sure that if you die, your secrets are passed on? Don’t worry, I got you covered. In this post, I’ll introduce you to a simple no-bullshit dead man’s switch written in Rust.

Dead Man’s Switch

According to Wikipedia:

A dead man’s switch is a switch that is designed to be activated or deactivated if the human operator becomes incapacitated, such as through death, loss of consciousness, or being bodily removed from control. Originally applied to switches on a vehicle or machine, it has since come to be used to describe other intangible uses, as in computer software.

A Dead Man’s Switch (DMS) can be handy in many situations. Common scenarios might be:

Password to your encrypted files: You gave a trusted person an encrypted USB drive and the DMS sends the password to decrypt it.
Bitcoin Multisig: Sending 1 of 3 keys to a trusted person. You hold 1 key, your friend holds another key, and the DMS holds the last key.
Instructions: Sending instructions on how to access something of value or importance.
Goodbye Note: Sending a goodbye note to loved ones.

A DMS is especially useful when you don’t trust the government or lawyers to handle your affairs after you die. It’s also useful when you don’t want to disclose your secrets while you are alive.

The idea is simple:

You set up a DMS.
You need to check in periodically.
If you don’t check in, the DMS is triggered.
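These three steps boil down to a tiny state machine. The sketch below is not taken from the dead-man-switch code base: the class, its field names and the timer values are assumptions, and it supposes that a warning period runs first and a final countdown follows it.

```python
from dataclasses import dataclass


@dataclass
class DeadManSwitch:
    timer_warning: int       # seconds without a check-in before the warning
    timer_dead_man: int      # further seconds before the final message
    last_check_in: float = 0.0

    def check_in(self, now: float) -> None:
        """Reset the countdown."""
        self.last_check_in = now

    def state(self, now: float) -> str:
        elapsed = now - self.last_check_in
        if elapsed < self.timer_warning:
            return "ok"
        if elapsed < self.timer_warning + self.timer_dead_man:
            return "warning"
        return "triggered"


DAY = 86_400
switch = DeadManSwitch(timer_warning=14 * DAY, timer_dead_man=7 * DAY)
timeline = [
    switch.state(now=10 * DAY),   # still within the warning timer
    switch.state(now=15 * DAY),   # warning timer expired
]
switch.check_in(now=15 * DAY)     # the owner checks in, countdown restarts
timeline.append(switch.state(now=20 * DAY))
timeline.append(switch.state(now=40 * DAY))   # 25 days of silence
print(timeline)
```

Time is passed in explicitly rather than read from the clock, which keeps the logic easy to test.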

This post’s opening picture depicts a scene from Conan Doyle’s story The Final Problem, where Sherlock Holmes fights Moriarty at the Reichenbach Falls. Eventually, both fall to their deaths. I am pretty confident that Sherlock Holmes had a DMS in place to send Watson a message.

I’ve tried finding nice implementations of DMS, but to no avail. They were all either unmaintained or plagued with spaghetti code. My inspiration to build one came from two sources. First, a friend of mine told me that he is using a bunch of badly-written shell scripts with some cron jobs to manage his DMS. Second, there might be a genuine need for a simple DMS in the privacy community. For example, finalmessage.io, despite being closed source and giving you no idea who’s behind it, has gathered enough users in a subscription model that it is not accepting new users anymore. If people are paying for this, they can pay for a Linux server somewhere. But they would need a simple DMS to run on it.

How to Use It

Disclaimer: Use at your own risk. Check the f****(as in friendly) code.

I invite you to check out the code on GitHub at storopoli/dead-man-switch. The license is AGPL-3.0, which means you can use it for free as long as you share your code. The package is also available on crates.io, Rust’s package manager.

DMS is very easy to use and deploy. There are several alternatives on how to deploy it. Here are the two easiest ways:

Building from Source:
1. In a fresh Debian/Ubuntu server install the following dependencies:
```
sudo apt-get install -y cargo pkg-config libssl-dev
```
2. Install the DMS:
```
cargo install dead-man-switch-tui
```
3. Run the app with:
```
dead-man-switch-tui
```
Using Nix: This is the easiest way, just run nix run github:storopoli/dead-man-switch.

Note: I’ve also released a Web Interface for the dead-man-switch. You can easily deploy it using Docker or Docker Compose. Check out the GitHub repository.

I’ve also launched a StartOS app with a simple interface for configuring and checking in with the Dead Man’s Switch. Check out the instructions on the dead-man-switch-startos repository.

Once you successfully run the app, you will see the following screen:

Initial Screen of Dead Man's Switch

If you read the instructions carefully, all you need to know is detailed in these 3 steps:

Edit the Config at /root/.config/deadman/config.toml and modify the settings.
Check-In with c within the warning time.
Otherwise the Dead Man’s Switch will be triggered and the message with optional attachment will be sent.

Upon the first run, the app will create a configuration file at an OS-agnostic config file location:

Linux: $XDG_CONFIG_HOME or $HOME/.config, i.e. /home/alice/.config
macOS: $HOME/Library/Application Support, i.e. /Users/Alice/Library/Application Support
Windows: {FOLDERID_RoamingAppData}, i.e. C:\Users\Alice\AppData\Roaming

In this example, I am running it from a Debian server as the root user. Hence, the configuration file is at /root/.config/deadman/config.toml.

If you open the configuration file, you will see the following content. I’ve added some default values for inspiration¹:

username = "me@example.com"
password = ""
smtp_server = "smtp.example.com"
smtp_port = 587
message = "I'm probably dead, go to Central Park NY under bench #137 you'll find an age-encrypted drive. Password is our favorite music in Pascal case."
message_warning = "Hey, you haven't checked in for a while. Are you okay?"
subject = "[URGENT] Something Happened to Me!"
subject_warning = "[URGENT] You need to check in!"
to = "someone@example.com"
from = "me@example.com"
timer_warning = 1209600
timer_dead_man = 604800

The configs are self-explanatory. You might need some help to set up and find a reliable SMTP server. One option is to use Gmail. Unfortunately, Proton or Tutanota are not supported because they don’t support SMTP. Just grab the support page of your email provider and search for SMTP settings. Plug the values in and you are good to go.

I want to bring your attention to the timer_warning and timer_dead_man configs. These are very important.

The way DMS works is by checking in periodically. If you don’t check in within the timer_warning time, the DMS will send a warning message to your own email, i.e. the from email declared in the config, with the message message_warning and subject subject_warning.

If you still don’t check in within the timer_dead_man time, the DMS will send the “Dead Man’s” message to the to email declared in the config, with the message message and subject subject.

The timers are in seconds, and the default values are:

Warning Timer: 2 weeks
Dead Man’s Timer: 1 week

Feel free to change these values to your liking.

You can also add an attachment to the Dead Man’s Message. Just add an attachment field to the config file with the absolute path to the file. For example:

attachment = "/root/important_file.txt"

A good idea is to make this file encrypted. Actually, it’s even better if you encrypt the whole fucking thing. You can use PGP or age. For example, this is a PGP-encrypted message:

In this message there’s a nice Easter Egg for you, my friend. The password is the name of the waterfall depicted in this post, all together and in PascalCase.

Upon checking in, the timer will be reset to the Warning Timer, even if you are already in the Dead Man’s Timer.

If both timers run out, the messages will be sent and DMS will exit.
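
To make the timer behavior concrete, here is a minimal Python sketch of the logic described above. It is not the project’s actual code (which is written in Rust); the function name `dms_state` and the state labels are made up for illustration, and it assumes that the Dead Man’s Timer only starts counting once the Warning Timer has run out.

```python
# Hypothetical sketch of the timer logic; names and labels are illustrative only.
TIMER_WARNING = 1_209_600   # seconds, 2 weeks (default from the config above)
TIMER_DEAD_MAN = 604_800    # seconds, 1 week (default from the config above)

DAY = 24 * 60 * 60


def dms_state(seconds_since_check_in: int) -> str:
    """Return which phase the switch is in, given the time since the last check-in."""
    if seconds_since_check_in < TIMER_WARNING:
        return "ok"          # still inside the Warning Timer
    if seconds_since_check_in < TIMER_WARNING + TIMER_DEAD_MAN:
        return "warning"     # warning email sent, Dead Man's Timer is running
    return "triggered"       # Dead Man's message sent to the `to` address


for days in (1, 15, 22):
    print(days, "days ->", dms_state(days * DAY))
```

In this sketch a check-in simply resets `seconds_since_check_in` to zero, which puts the switch back into the "ok" phase.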

The Implementation Details

For the stupid smelly nerds that want to go beyond the “JUST MAKE A FUCKING .EXE AND GIVE IT TO ME”.

Before we dive into the code, here are the dependencies that I am using. I’ve tried to keep them to a minimum, since I want this to be a dead-simple program. This also helps with reducing the incidence of bugs and narrowing the attack surface:

ratatui for the Terminal User Interface (TUI)
serde, toml, and directories-next for managing the TOML configuration file.
lettre to manage email sending, and mime_guess to robustly handle optional attachments.
chrono to handle timers and date/time formatting.

Note: the Dead Man’s Switch Web Interface uses axum, askama and tower.

The app is divided into a library and a binary. The library is contained in the lib.rs file and the binary in the main.rs, both under the src/ directory. Here’s a representation of the structure of src/:

src/
├── config.rs
├── email.rs
├── lib.rs
├── main.rs
├── timer.rs
└── tui.rs

As we can see, it is divided into 4 modules:

config.rs: Handles the configuration file.
email.rs: Handles the email sending.
timer.rs: Handles the timers and timer logic.
tui.rs: Handles the Terminal User Interface (TUI).

Feel free to dive into any of these files to understand the implementation details. I’ve made sure that the code is both well-tested and well-documented.

Contributions are Welcome

If you want to contribute to the project, feel free to open a pull request. I’ve marked a few issues as good first issue to help you get started. Check out the GitHub repository.

Conclusion

I’ve built a simple no-bullshit Dead Man’s Switch so that any person can use it. Feel free to use it and share it with your friends. Let’s hope that we don’t go to a dystopian future where everyone needs to use it. Although, I am pretty sure that Sherlock Holmes would have used it no matter what. Probably the way he would have used it is:

Set up a non-KYC email account that supports SMTP.
Sign up for a non-KYC VPS with Bitcoin or Monero.
Access the VPS via Tor using Tails.
Change the server’s default SSH port to a random one.
Disallow password authentication and only allow key-based authentication.
Encrypt everything in the case the server is seized.

Note: Sherlock could also use a coreboot non-KYC piece of hardware that runs StartOS and the newly launched Dead Man’s Switch StartOS app that already uses an onion service for handling the check-ins via Tor.

License

This post is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.

Please don’t go to bench 137 in Central Park, NY. That was just an example. ↩︎

Seed Phrases and Entropy

2024-02-11T15:59:02Z

In this post, let’s dive into a topic that is very important for anyone who uses the internet: passwords. We’ll cover what the hell Entropy is, good password practices, and how it all relates to Bitcoin “seed phrases”¹.

Entropy

Before we go into passwords, I’ll introduce the concept of Entropy.

Entropy is a measure of the amount of disorder in a system. It has its origins in Thermodynamics, where it’s used to measure the amount of energy in a system that is not available to do work.

The word “Entropy” comes from the Greek word for “transformation”.

It was given a proper statistical definition by Ludwig Boltzmann in the 1870s, while establishing the field of Statistical Mechanics, a field of physics that studies the behavior of large collections of particles.

Ludwig Boltzmann

In the context of Statistical Mechanics, Entropy is a measure of the number of ways a system can be arranged. The more ways a system can be arranged, the higher its Entropy. Specifically, Entropy is a logarithmic measure of the number of system states with significant probability of being occupied:

$$S = -k \cdot \sum_i p_i \ln p_i$$

Where:

$S$: Entropy.
$k$: Boltzmann’s constant, a physical constant that relates temperature to energy.
$p_i$: probability of the system being in state $i$.

In this formula, if all states are equally likely, i.e. $p_i = \frac{1}{N}$, where $N$ is the number of states, then the Entropy is maximized. In that case every term of the sum is $\frac{1}{N} \ln \frac{1}{N}$, so the formula collapses to $S = k \ln N$. You can see that as $N$ approaches infinity, $\ln N$ grows without bound, and so does the Entropy.

How the hell did Physics come to Passwords?

There was once a great man called Claude Shannon, who single-handedly founded the field of Information Theory, invented the concept of a Bit, and was the first to think about Boolean algebra in the context of electrical circuits. He laid the foundation for the Digital Revolution.

If you are happy using your smartphone, laptop, or any other digital device, on your high-speed fiber internet connection, through a wireless router to send cat pictures to your friends, then you should thank Claude Shannon.

Claude Shannon

He was trying to find a formula to quantify the amount of information in a message. He wanted three things:

The measure should be a function of the probability of the message. Messages that are more likely should have less information.
The measure should be additive. The information in a message should be the sum of the information in its parts.
The measure should be continuous. Small changes in the message should result in small changes in the measure.

He pretty much found that the formula for Entropy in Statistical Mechanics was a good measure of information. He called it Entropy to honor Boltzmann’s work. To differentiate it from the Statistical Mechanics Entropy, he changed the letter to $H$, in honor of Boltzmann’s $H$-theorem. So the formula for the Entropy of a message is:

$$H(X) = -\sum_{x_i \in X} P(x_i) \log P(x_i)$$

Where:

$X$: discrete random variable.
$H(X)$: Entropy of $X$
$P(x_i)$: probability of the random variable $X$ taking the value $x_i$. Also known as the probability mass function (PMF) of the discrete random variable $X$.
$\log$: base 2 logarithm, to measure the Entropy in bits.

In information theory, the Entropy of a random variable is the average level of “information”, “surprise”, or “uncertainty” inherent to the variable’s possible outcomes².

Let’s take the simple example of a fair coin. The Entropy of the random variable $X$ that represents the outcome of a fair coin flip is:

$$H(X) = -\sum_{x_i \in X} P(x_i) \log P(x_i) = -\left(\frac{1}{2} \log \frac{1}{2} + \frac{1}{2} \log \frac{1}{2}\right) = 1 \text{ bit}$$

So the outcome of a fair coin flip has 1 bit of Entropy. This means that the outcome of a fair coin flip has 1 bit of information, or 1 bit of uncertainty. Once the message is received, that the coin flip was heads or tails, the receiver has 1 bit of information about the outcome.

Alternatively, we only need 1 bit to encode the outcome of a fair coin flip. Hence, there’s a connection between Entropy, search space, and information.

Another good example is the outcome of a fair 6-sided die. The Entropy of the random variable $X$ that represents the outcome of a fair 6-sided die is:

$$H(X) = -\sum_{x_i \in X} P(x_i) \log P(x_i) = -\sum_{i=1}^6 \left(\frac{1}{6} \cdot \log \frac{1}{6}\right) \approx 2.58 \text{ bits}$$

This means that the outcome of a fair 6-sided die has 2.58 bits of Entropy. Since we can’t store a fraction of a bit, we need $\operatorname{ceil}(2.58) = 3$ bits to encode the outcome of a fair 6-sided die.
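
The two examples above are easy to check numerically. Here is a small Python sketch of Shannon’s formula; the function name `shannon_entropy` is just an illustrative choice, and it assumes the distribution is given as a list of probabilities that sum to 1.

```python
import math


def shannon_entropy(probabilities):
    """Entropy in bits of a discrete distribution given as a list of probabilities."""
    # Terms with p == 0 contribute nothing (the limit of p * log2(p) is 0).
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


coin = [1 / 2] * 2
die = [1 / 6] * 6

print(f"fair coin: {shannon_entropy(coin):.2f} bits")
print(f"fair die:  {shannon_entropy(die):.2f} bits")
print(f"bits needed to encode a die roll: {math.ceil(shannon_entropy(die))}")
```

Feeding it a biased coin, such as `[0.9, 0.1]`, gives less than 1 bit: the more predictable the outcome, the less information it carries.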

Entropy and Passwords

Ok now we come full circle. Let’s talk, finally, about passwords.

In the context of passwords, Entropy is a measure of how unpredictable a password is. The higher the Entropy, the harder it is to guess the password. The Entropy of a password is measured in bits, and it’s calculated using the formula:

$$H = L \cdot \log_2(N)$$

Where:

$H$: Entropy in bits
$N$: number of possible characters in the password
$L$: length of the password
$\log_2(N)$: how many bits are needed to represent each character from the set.

For example, if we have a password with 8 characters and each character can be any of the 26 lowercase letters of the standard English alphabet, the Entropy would be:

$$H = 8 \cdot \log_2(26) \approx 37.6 \text{ bits}$$

This means that an attacker would need to try $2^{37.6} \approx 26^8 \approx 2.09 \cdot 10^{11}$ combinations³ to guess the password.

If the password were to include uppercase letters, numbers, and symbols (let’s assume 95 possible characters in total), the Entropy for an 8-character password would be:

$$H = 8 \cdot \log_2(95) \approx 52.6 \text{ bits}$$

This means that an attacker would need to try $95^8 \approx 6.63 \cdot 10^{15}$ combinations (roughly $2^{52.6}$) to guess the password.

This sounds like a lot, but it’s not that much.

For the calculations below, we’ll assume that the attacker knows your dictionary set, i.e. the set of characters you use to create your password, and the password length.

If an attacker gets a hold of an NVIDIA RTX 4090, MSRP USD 1,599, which can do 300 GH/s (300,000,000,000 hashes/second), i.e. $3 \cdot 10^{11}$ hashes/second, it would take:

8-length lowercase-only password:

$$\frac{2.09 \cdot 10^{11}}{3 \cdot 10^{11}} \approx 0.70 \text{ seconds}$$

8-length password with uppercase letters, numbers, and symbols:

$$\frac{6.63 \cdot 10^{15}}{3 \cdot 10^{11}} \approx 22100 \text{ seconds} \approx 6.14 \text{ hours}$$

So, the first password would be cracked in less than a second, while the second would take a few hours. And that’s with just one 1.6k USD GPU.
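
If you want to redo the arithmetic for your own password policy, here is a short Python sketch. It works under the same assumptions as the text (characters drawn uniformly at random, the attacker knows the alphabet and the length, and the 300 GH/s figure); the function names are illustrative. Since it uses the exact number of combinations, expect small rounding differences against hand-rounded figures.

```python
import math

HASHES_PER_SECOND = 3e11  # the 300 GH/s figure assumed in the text


def password_entropy_bits(length: int, alphabet_size: int) -> float:
    """H = L * log2(N) for a password drawn uniformly at random."""
    return length * math.log2(alphabet_size)


def seconds_to_exhaust(length: int, alphabet_size: int) -> float:
    """Worst-case time to try every password of the given length."""
    return alphabet_size ** length / HASHES_PER_SECOND


for label, n in (("lowercase only", 26), ("95 printable characters", 95)):
    bits = password_entropy_bits(8, n)
    secs = seconds_to_exhaust(8, n)
    print(f"{label}: {bits:.1f} bits, {secs:.2f} s ({secs / 3600:.2f} h)")
```

Bumping the length from 8 to 12 in the loop is a quick way to see how much more each extra character buys you compared to a bigger alphabet.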

Bitcoin Seed Phrases

Now that we understand Entropy and how it relates to passwords, let’s talk about bitcoin seed phrases¹.

Remember that our private key is a big-fucking number? If not, check my post on cryptography basics.

BIP-39 specifies how to use easy-to-remember seed phrases to store and recover private keys. The wordlist adheres to the following principles:

smart selection of words: the wordlist is created in such a way that it’s enough to type the first four letters to unambiguously identify the word.
similar words avoided: word pairs like “build” and “built”, “woman” and “women”, or “quick” and “quickly” not only make remembering the sentence difficult but are also more error prone and more difficult to guess.

Here is a simple 7-word seed phrase: brave sadness grocery churn wet mammal tube. Surprisingly enough, this badboy here gives you $77$ bits of Entropy, while also being easy to remember. This is due to the fact that the wordlist has 2048 words, so each word gives you $\log_2(2048) = 11$ bits of Entropy⁴.

There’s a minor caveat to cover here. In BIP-39, not all bits of the seed phrase are random: the last few bits are a checksum, which is used to verify that the phrase is valid. A 12-word seed phrase carries a 4-bit checksum, and a 24-word seed phrase carries an 8-bit checksum.

So, if you have a 12-word seed phrase, you have $12 \cdot 11 - 4 = 128$ bits of Entropy. And for a 24-word seed phrase, you have $24 \cdot 11 - 8 = 256$ bits of Entropy.

The National Institute of Standards and Technology (NIST) recommends a minimum of 112 bits of Entropy for all things cryptographic. And Bitcoin has a minimum of 128 bits of Entropy.

Depending on your threat model, e.g. “Assume that your adversary is capable of a trillion guesses per second”, it can take a few years to crack a 128-bit Entropy seed phrase:

$$\frac{2^{128}}{10^{12}} \approx 3.40 \cdot 10^{26} \text{ seconds} \approx 3.94 \cdot 10^{21} \text{ days} \approx 1.08 \cdot 10^{19} \text{ years}$$

That’s a lot of years. Now for a 256-bit Entropy seed phrase:

$$\frac{2^{256}}{10^{12}} \approx 1.16 \cdot 10^{65} \text{ seconds} \approx 1.34 \cdot 10^{60} \text{ days} \approx 3.67 \cdot 10^{57} \text{ years}$$

That’s another huge number of years.

Seed Phrases and Passwords

You can also use a seed phrase as a password. The bonus point is that a password doesn’t need a checksum, so every word gives you its full 11 bits of Entropy.

Remember the 7-word badboy seed phrase we generated earlier? brave sadness grocery churn wet mammal tube.

It has $77$ bits of Entropy. This would take, assuming “that your adversary is capable of a trillion guesses per second”:

$$\frac{2^{77}}{10^{12}} \approx 1.51 \cdot 10^{11} \text{ seconds} \approx 1.75 \cdot 10^{6} \text{ days} \approx 4.79 \cdot 10^{3} \text{ years}$$

That’s why tons of people use seed phrases as passwords. Even if you know the dictionary set and the length of the password, i.e. the number of words in the seed phrase, it would take a lot of years to crack it.
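
To see how the cracking time scales with the number of words, here is a small Python sketch. It assumes words picked uniformly at random from a 2048-word list, no checksum, and the same trillion guesses per second; the function name `years_to_crack` is illustrative.

```python
import math

GUESSES_PER_SECOND = 1e12               # "a trillion guesses per second"
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def years_to_crack(words: int, wordlist_size: int = 2048) -> float:
    """Worst-case years to try every phrase of `words` words from the wordlist."""
    bits = words * math.log2(wordlist_size)
    return 2 ** bits / GUESSES_PER_SECOND / SECONDS_PER_YEAR


for words in (4, 5, 6, 7, 8):
    print(f"{words} words: {words * 11} bits, {years_to_crack(words):.3g} years")
```

Every extra word multiplies the time by 2048, which is why going from a handful of words to 7 makes such a dramatic difference.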

Conclusion

Entropy is a measure of the amount of disorder in a system. In the context of passwords, it’s a measure of how unpredictable a password is. The higher the Entropy, the harder it is to guess the password.

Bitcoin seed phrases are a great way to store and recover private keys. They are easy to remember and have a high amount of Entropy. You can even use a seed phrase as a password.

Even if your attacker is capable of a trillion guesses per second, like the NSA, it would take them a lot of years to crack even a 7-word seed phrase.

If you want to generate a seed phrase, you can use KeePassXC, which is a great open-source offline password manager that supports seed phrases⁵.

License

This post is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.

seed phrases are technically called “mnemonic phrases”, but I’ll use the term “seed phrases” for the rest of the post. ↩︎ ↩︎
there is a Bayesian argument about the use of priors that should adhere to the Principle of Maximal Entropy ↩︎
technically, we need to divide the number of combinations by 2, since we are assuming that the attacker is using a brute-force attack, which means that the attacker is trying all possible combinations, and the password could be at the beginning or at the end of the search space. On average, the attacker finds it after covering half of the search space, assuming that the password is uniformly distributed in it. ↩︎
remember that $2^{11} = 2048$. ↩︎
technically, KeePassXC uses the EFF wordlist, which has 7,776 words, so each word gives you $\log_2(7776) \approx 12.9$ bits of Entropy. They were created to be easy to use with 6-sided dice. ↩︎

Cryptography Basics

2024-02-05T18:53:28-03:00

Euclid’s one-way function

This is the companion post to the cryptography workshop that I gave at a local BitDevs. Let’s explore the basics of cryptography. We’ll go through the following topics:

One-way functions
Hash functions
Public-key cryptography
DSA
Schnorr
Why don’t we reuse nonces?
Why can we combine Schnorr Signatures and not DSA?

One-way functions

A one-way function is a function that is easy to compute on every input, but hard to invert given the image¹ of a random input. For example, imagine an omelet. It’s easy to make an omelet from eggs, but it’s hard to make eggs from an omelet. In a sense we can say that the function $\text{omelet}$ is a one-way function

$$\text{omelet}^{-1}(x) = \ldots$$

That is, we don’t know how to invert the function $\text{omelet}$ to get the original eggs back. Or, even better, the benefit we get from reverting the omelet to eggs is not worth the effort, either in time or money.

Not all functions are one-way functions. The exponential function, $f(x) = e^x$, is not a one-way function. It is easy to undo the exponential function by taking the natural logarithm,

$$f^{-1}(x) = \ln(x)$$

To showcase one-way functions, let’s take a look at the following example. Let’s play around with some numbers. Not any kind of numbers, but very special numbers called primes. A prime number is a natural number greater than 1 that has no positive divisors other than 1 and itself.

If I give you a big number $n$ and ask you to find its prime factors, and point a gun at your head, you’ll be pretty much screwed. There’s no known efficient algorithm² to factorize a big number into its prime factors. You’ll be forced to test all numbers from 2 to $\sqrt{n}$ to see if they divide $n$.

Here’s a number:

$$90809$$

What are its prime factors? They are $1279$ and $71$, since $1279 \cdot 71 = 90809$. Easy to check, right? Hard to find. That’s because prime factorization, if you choose a fucking big number, is a one-way function.
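
Here is what the “test all numbers from 2 to $\sqrt{n}$” approach looks like in Python. It is a naive sketch (the function name `trial_division` is my own label), good enough for a 5-digit number and hopeless for the sizes used in real cryptography.

```python
def trial_division(n: int) -> list[int]:
    """Return the prime factors of n (with multiplicity) by brute force."""
    factors = []
    d = 2
    while d * d <= n:          # only need to test divisors up to sqrt(n)
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                  # whatever is left is itself prime
        factors.append(n)
    return factors


print(trial_division(90809))
assert 71 * 1279 == 90809      # checking the answer is a single multiplication
```

Notice the asymmetry: the search loops over candidate divisors one by one, while the check at the end is one multiplication.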

Hash Functions

Let’s spice things up. There is a special class of one-way functions called hash functions.

A hash function is any function that can be used to map data of arbitrary size to fixed-size values.

But we are most interested in cryptographic hash functions, which are hash functions that have statistical properties desirable for cryptographic application:

One-way function: easy to compute $y = f(x)$, hard as fuck to do the opposite, $x = f^{-1}(y)$.
Deterministic: given a function that maps elements from set $X$ to set $Y$, $f: X \to Y$, for every $x \in X$ there’s exactly one $y \in Y$³. This means that if I give you a certain input, it will always map to the same output. It is deterministic.
Collision resistance: the possible values of $f: X \to Y$ follow a uniform distribution, that is, given the size of the set $Y$, it is hard to find two $x_1, x_2 \in X$ that have the same $y \in Y$ value⁴. This property is really important because if an attacker wants to brute-force the hash function, there’s no option other than searching uniformly across the whole space of possible values that the hash function outputs⁵.

These properties enable cryptographic hash functions to be used in a wide range of applications, including but not limited to:

Digital signatures: Hash functions are used to create a digest of the message to be signed. The digital signature is then generated using the hash, rather than the message itself, to ensure integrity and non-repudiation.
Password hashing: Storing passwords as hash values instead of plain text. Even if the hash values are exposed, the original passwords remain secure due to the pre-image resistance property.
Blockchain and cryptocurrency: Hash functions are used to maintain the integrity of the blockchain. Each block contains the hash of the previous block, creating a secure link. Cryptographic hashes also underpin various aspects of cryptocurrency transactions.
Data integrity verification: Hash functions are used to ensure that files, messages, or data blocks have not been altered. By comparing hash values computed before and after transmission or storage, any changes in the data can be detected.

We’ll cover just the digital signatures part in this post.

SHA-2 and its variants

The Secure Hash Algorithm 2 (SHA-2) is a set of cryptographic hash functions designed by the National Security Agency (NSA). It was first published in 2001.

It is composed of six hash functions with digests that are 224, 256, 384, 512, 512/224, and 512/256 bits long:

SHA-224
SHA-256
SHA-384
SHA-512
SHA-512/224
SHA-512/256

Amongst these, let’s focus on SHA-256, which is the most widely used while also being famously adopted by bitcoin.

SHA-256 does not have any known vulnerabilities and is considered secure. It works with 32-bit words and operates on 64-byte blocks. The algorithm does 64 rounds of the following operations:

AND: bitwise boolean AND
XOR: bitwise boolean XOR
OR: bitwise boolean OR
ROT: right rotation bit shift
ADD: addition modulo $2^{32}$

You can check SHA-256 Pseudocode on Wikipedia. It really scrambles the input message in a way that is very hard to reverse.

These operations are non-linear and very difficult to keep track of. In other words, you can’t reverse-engineer the hash to find the original message. There’s no “autodiff” for hash functions.

Since it is a cryptographic hash function, if we change just one bit of the input, the output will be completely different. Check this example:

$ echo "The quick brown fox jumps over the lazy dog" | shasum -a 256
c03905fcdab297513a620ec81ed46ca44ddb62d41cbbd83eb4a5a3592be26a69  -

$ echo "The quick brown fox jumps over the lazy dog." | shasum -a 256
b47cc0f104b62d4c7c30bcd68fd8e67613e287dc4ad8c310ef10cbadea9c4380  -

Here we are only adding a period at the end of the sentence, and the hash is completely different. This is the so-called avalanche effect, which goes hand in hand with the collision resistance property that we mentioned earlier.
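
To put a number on “completely different”, here is a Python sketch using the standard library’s `hashlib` that hashes the same two sentences and counts how many of the 256 output bits differ. The variable names are illustrative; for a good hash function you would expect roughly half of the bits to flip.

```python
import hashlib

# `echo` appends a newline, so we add "\n" to hash the same bytes as the shell commands.
a = b"The quick brown fox jumps over the lazy dog\n"
b = b"The quick brown fox jumps over the lazy dog.\n"

digest_a = hashlib.sha256(a).digest()
digest_b = hashlib.sha256(b).digest()

# Count how many of the 256 output bits differ between the two digests.
differing_bits = sum(bin(x ^ y).count("1") for x, y in zip(digest_a, digest_b))

print(digest_a.hex())
print(digest_b.hex())
print(f"{differing_bits} of 256 bits differ")
```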

Fields

Before we dive into public-key cryptography, we need a brief interlude on fields.

Fields are sets with two binary operations, called addition $+$ and multiplication $\times$. We write

$$F = (F, +, \times)$$

to denote a field, where $F$ is the set, $+$ is the addition operation, and $\times$ is the multiplication operation.

Addition and multiplication behave similarly to the addition and multiplication of real numbers. For example, addition is commutative and associative

$$a + b = b + a,$$

and multiplication is distributive over addition

$$a \times (b + c) = a \times b + a \times c.$$

Also, every element $a$ of the field has an additive inverse $-a$, and every nonzero element has a multiplicative inverse $a^{-1}$, such that

$$a + (-a) = 0,$$

and

$$a \times a^{-1} = 1,$$

where $0$ is the additive identity and $1$ is the multiplicative identity.

Note that this allows us to define subtraction

$$a - b = a + (-b),$$

and division

$$a \div b = a \times b^{-1}.$$

Finite Fields

Now we are ready for finite fields. A finite field, also called a Galois field (in honor of Évariste Galois), is a field with a finite number of elements. As with any field, a finite field is a set on which the operations of multiplication, addition, subtraction and division are defined and satisfy the rules above for fields.

Finite fields are a very rich topic in mathematics, and there are many ways to construct them. The easiest way to construct a finite field is to take the integers modulo a prime number $p$. For example $\mathbb{Z}_5$ is a finite field with 5 elements:

$$\mathbb{Z}_5 = \lbrace 0, 1, 2, 3, 4 \rbrace.$$

In general, for any prime number $p$, $\mathbb{Z}_p$ is a finite field with $p$ elements:

$$\mathbb{Z}_p = \lbrace 0, 1, 2, \ldots, p - 1 \rbrace.$$

The number of elements in a finite field is called the order of the field. The order of a finite field is always a prime number $p$ or a power of a prime; in this post we only need the prime case. The $\mathbb{Z}_5$ example above is a finite field of order 5. However, $\mathbb{Z}_4$ is not a finite field, because 4 is not a prime number, but rather a composite number.

$$4 = 2 \times 2.$$

And this factorization is exactly what breaks things in $\mathbb{Z}_4$:

$$2 \times 2 = 4 \mod 4 = 0.$$

This means that two nonzero elements multiply to zero, so $2$ cannot have a multiplicative inverse: there is no $b \in \mathbb{Z}_4$ such that

$$2 \times b = 1 \mod 4.$$

Hence, division by $2$ is not defined in $\mathbb{Z}_4$, and the rules above for fields are not satisfied.

In general if $n$ is a composite number, then $\mathbb{Z}_n$ is not a finite field: if $n = r \times s$ with $1 < r, s < n$, then $r \times s = 0 \mod n$, and neither $r$ nor $s$ has a multiplicative inverse.
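
One way to check whether the integers modulo $n$ form a field is to test whether every nonzero element has a multiplicative inverse. Here is a brute-force Python sketch for small $n$; the function names are my own, and the double loop is only meant for toy sizes.

```python
def invertible_elements(n: int) -> dict[int, int]:
    """Map each nonzero a in Z_n to its multiplicative inverse, if it has one."""
    inverses = {}
    for a in range(1, n):
        for b in range(1, n):
            if (a * b) % n == 1:
                inverses[a] = b
                break
    return inverses


def is_field(n: int) -> bool:
    """Z_n is a field exactly when every nonzero element has an inverse."""
    return len(invertible_elements(n)) == n - 1


print(5, invertible_elements(5), is_field(5))
print(4, invertible_elements(4), is_field(4))
```

For $n = 5$ every nonzero element shows up in the dictionary, while for $n = 4$ the element $2$ is missing.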

Operations in Finite Fields

Addition in finite fields is defined as the remainder of the sum of two elements modulo the order of the field.

For example, in $\mathbb{Z}_3$,

$$1 + 2 = 3 \mod 3 = 0.$$

We can also define subtraction in finite fields as the remainder of the difference of two elements modulo the order of the field.

For example, in $\mathbb{Z}_3$,

$$1 - 2 = -1 \mod 3 = 2.$$

Multiplication in finite fields can be written as multiple additions. For example, in $\mathbb{Z}_3$,

$$2 \times 2 = 2 + 2 = 4 \mod 3 = 1.$$

Exponentiation in finite fields can be written as multiple multiplications. For example, in $\mathbb{Z}_3$,

$$2^2 = 2 \times 2 = 4 \mod 3 = 1.$$
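
The four examples above translate directly into Python with the `%` operator. This is a minimal sketch for $\mathbb{Z}_3$; the helper names are just for illustration.

```python
P = 3  # order of the field Z_3


def add(a: int, b: int) -> int:
    return (a + b) % P


def sub(a: int, b: int) -> int:
    # Python's % always returns a value in [0, P) for a positive modulus.
    return (a - b) % P


def mul(a: int, b: int) -> int:
    return (a * b) % P


def power(a: int, e: int) -> int:
    # Three-argument pow does modular exponentiation efficiently.
    return pow(a, e, P)


print(add(1, 2), sub(1, 2), mul(2, 2), power(2, 2))
```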

As you can see, addition, subtraction, and multiplication become simple operations: do the usual arithmetic and take the remainder. This is trivial in any finite field.

However, for undoing exponentiation we are pretty much screwed. Division itself is fine: since

$$a \div b = a \times b^{-1} \mod p,$$

all we need is the multiplicative inverse $b^{-1}$, and that can be computed efficiently (for example, with the extended Euclidean algorithm). The hard part is the logarithm. Suppose that we have numbers $b, x$ in a very large finite field $\mathbb{Z}_p$, such that

$$c = b^x \mod p.$$

Now, given only $b$ and $c$, we need to find the exponent $x$. This is called the discrete logarithm problem. Because we need to find $x$ such that

$$x = \log_b c \mod p.$$

Since this number is a discrete number and not a real number, that’s why it’s called the discrete logarithm problem.

Good luck my friend, no efficient method is known for computing them in general. You can try brute force, but that’s not efficient.

Why the Discrete Logarithm Problem is Hard as Fuck

To get a feeling why the discrete logarithm problem is difficult, let’s add one more concept to our bag of knowledge. Every finite field has generators, also known as primitive roots, which are also members of the field, such that repeatedly multiplying one single generator by itself makes it possible to generate all the nonzero elements of the finite field.

Let’s illustrate this with an example. Below we have a table of all the results of the following operation

$$b^x \mod 7$$

for every possible value of $x$. As you’ve guessed right this is the $\mathbb{Z}_7$ finite field.

$b$	$b^1 \mod 7$	$b^2 \mod 7$	$b^3 \mod 7$	$b^4 \mod 7$	$b^5 \mod 7$	$b^6 \mod 7$
$1$	$1$	$1$	$1$	$1$	$1$	$1$
$2$	$2$	$4$	$1$	$2$	$4$	$1$
$3$	$3$	$2$	$6$	$4$	$5$	$1$
$4$	$4$	$2$	$1$	$4$	$2$	$1$
$5$	$5$	$4$	$6$	$2$	$3$	$1$
$6$	$6$	$1$	$6$	$1$	$6$	$1$

You see that something interesting is happening here. For specific values of $b$, such as $b = 3$ and $b = 5$, we are able to generate all the nonzero elements of the finite field. Hence, we say that $3$ and $5$ are generators or primitive roots of $\mathbb{Z}_7$.
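
The powers in the table can be recomputed with a few lines of Python, which also makes it easy to hunt for generators in other small fields. The function names `powers` and `generators` are illustrative.

```python
def powers(b: int, p: int) -> list[int]:
    """Return [b^1, b^2, ..., b^(p-1)] modulo p."""
    return [pow(b, x, p) for x in range(1, p)]


def generators(p: int) -> list[int]:
    """Elements whose powers hit every nonzero element of Z_p."""
    nonzero = set(range(1, p))
    return [b for b in range(1, p) if set(powers(b, p)) == nonzero]


for b in range(1, 7):
    print(b, powers(b, 7))
print("generators of Z_7:", generators(7))
```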

Now suppose I ask you to find $x$ in the following equation

$$3^x \mod p = 11$$

where $p$ is a very large prime number. Then you don’t have any other option than brute forcing it. You’ll need to try each exponent $x \in \mathbb{Z}_p$ until you find the one that satisfies the equation.

Notice that this operation is very asymmetric. It is very easy to compute $3^x \mod p$ for any $x$, but it is very hard to find $x$ given $3^x \mod p$.

Now we are ready to dive into public-key cryptography.

Numerical Example of the Discrete Logarithm Problem

Let’s illustrate the discrete logarithm problem with a numerical example.

Choose a prime number $p$. Let’s pick $p = 17$.
Choose a generator $g$ of the group. For $p = 17$, we can choose $g = 3$ because $3$ is a primitive root of $\mathbb{Z}_{17}$.
Choose an element $c$ of the field. Let’s pick $c = 15$.

The discrete logarithm problem is to find $x$ given $c = g^x \mod p$. So let’s plug in the numbers; find $x$ in

$$3^x = 15 \mod 17 $$

Try to find it. Good luck⁶.
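
If you’d rather let the computer do the guessing, here is a brute-force sketch in Python. The function name `discrete_log` is my own; it simply tries every exponent in turn, which takes up to $p - 1$ steps. That is fine for $p = 17$ and completely hopeless for a prime with thousands of bits.

```python
def discrete_log(g: int, target: int, p: int):
    """Find x with g^x mod p == target by trying every exponent, or None."""
    value = 1
    for x in range(p - 1):     # exponents 0 .. p-2 cover every power of g
        if value == target:
            return x
        value = (value * g) % p
    return None


x = discrete_log(3, 15, 17)
assert pow(3, x, 17) == 15     # verifying a candidate is cheap
```

The search is a loop; the verification is a single call to `pow`. That gap is the whole point.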

Public-key cryptography

Public-key cryptography, or asymmetric cryptography, is a cryptographic system that uses pairs of keys: private and public. The public key you can share with anyone, but the private key you must keep secret. The keys are related mathematically, but it is computationally infeasible to derive the private key from the public key. In other words, the public key is a one-way function of the private key.

Before we dive into the details of the public-key cryptography, and signing and verifying messages, let me introduce some notation:

$p$: big fucking huge prime number (4096 bits or more)
$\mathbb{Z}_p$: the finite field of order $p$
$g$: a generator of $\mathbb{Z}_p$
$S_k$: secret key, a random integer in the finite field $\mathbb{Z}_p$
$P_k$: public key derived by $P_k = g^{S_k}$

If you know $S_k$ and $g$ (which is almost always part of the spec), then it’s easy to derive the $P_k$. However, if you only know $g$ and $P_k$, good luck finding $S_k$. It’s the discrete log problem again. And as long as $p$ is HUGE you can be pretty confident that no one will find your secret key from your public key.
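
As a toy illustration of the notation above, here is how a key pair could be derived in Python. It reuses the tiny $p = 17$ and $g = 3$ from the numerical example, which is of course an assumption for readability only: real primes have thousands of bits. The function name `generate_keypair` is illustrative.

```python
import secrets

# Toy parameters reused from the numerical example; real primes have thousands of bits.
p = 17
g = 3


def generate_keypair():
    """Return (secret_key, public_key) with public_key = g^secret_key mod p."""
    secret_key = secrets.randbelow(p - 2) + 1   # random integer in [1, p-2]
    public_key = pow(g, secret_key, p)          # fast modular exponentiation
    return secret_key, public_key


secret_key, public_key = generate_keypair()
print("public key:", public_key)
```

Going from `secret_key` to `public_key` is one call to `pow`; going back is the discrete logarithm problem.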

Now what can we do with these keys and big prime numbers? Well, we can sign a message with our secret key and everyone can verify the authenticity of the message using our public key. The message in our case is commonly the hash of the “original message”. Due to the collision resistance property, we can definitely assert that:

the message has not been altered
the message was signed by the owner of the private key

Fun fact, I once gave a recommendation letter to a very bright student, that was only a plain text file signed with my private key. I could rest assured that the letter was not altered, and the student and other people could verify that I was the author of the letter.

Next, we’ll dive into the details of the Digital Signature Algorithm (DSA) and the Schnorr signature algorithm.

DSA

DSA stands for Digital Signature Algorithm. It was first proposed by the National Institute of Standards and Technology (NIST) in 1991. Note that OpenSSH announced that DSA is scheduled for removal in 2025.

Here’s how you can sign a message using DSA:

Choose two prime numbers $p, q$ such that $p - 1 \mod q = 0$ (e.g., 1279 and 71).
Choose your private key $S_k$ as a random integer $\in [1, q-1]$.
Choose a generator $g$ of the subgroup of order $q$ (e.g., $g = h^{(p-1)/q} \mod p$ for some $h$ such that $g \ne 1$).
Compute your public key $P_k$: $g^{S_k} \mod p$.
Choose your nonce $k$: as a random integer $\in [1, q-1]$.
Compute your “public nonce” $K$: $(g^k \mod p) \mod q$ (also known as $r$).
Get your message ($m$) through a cryptographic hash function $H$: $H(m)$.
Compute your signature $s$: $(k^{-1} (H(m) + S_k K)) \mod q$.
Send to your buddy $(p, q, g)$, $P_k$, and $(K, s)$.

And here’s how you can verify the signature:

Compute $w = s^{-1} \mod q$.
Compute $u_1 = H(m) \cdot w \mod q$.
Compute $u_2 = K \cdot w \mod q$.
Compute $K^* = (g^{u_1} P^{u_2}_k \mod p) \mod q$.
Assert $K = K^*$.
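The signing and verification steps above can be sketched in Python with the toy primes from the example. The function names are mine, using SHA-256 reduced $\mod q$ as $H$ is an assumption, and so is picking $g$ as $h^{(p-1)/q} \mod p$. Parameters this small are for illustration only.

```python
import hashlib
import secrets

P, Q = 1279, 71  # toy primes from the text: q divides p - 1


def find_generator(p, q):
    """Return an element of order q: h^((p-1)/q) mod p for the first h that is not 1."""
    for h in range(2, p - 1):
        g = pow(h, (p - 1) // q, p)
        if g != 1:
            return g
    raise ValueError("no generator found")


def hash_to_int(message, q):
    """H(m): SHA-256 digest read as an integer and reduced mod q (assumed choice of H)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % q


def dsa_sign(message, secret_key, g, p=P, q=Q):
    while True:
        k = secrets.randbelow(q - 1) + 1  # fresh nonce in [1, q - 1]
        big_k = pow(g, k, p) % q          # public nonce K (a.k.a. r)
        s = pow(k, -1, q) * (hash_to_int(message, q) + secret_key * big_k) % q
        if big_k != 0 and s != 0:         # retry on degenerate values
            return big_k, s


def dsa_verify(message, signature, public_key, g, p=P, q=Q):
    big_k, s = signature
    if not (0 < big_k < q and 0 < s < q):
        return False
    w = pow(s, -1, q)
    u1 = hash_to_int(message, q) * w % q
    u2 = big_k * w % q
    k_star = pow(g, u1, p) * pow(public_key, u2, p) % p % q
    return k_star == big_k


g = find_generator(P, Q)
secret_key = secrets.randbelow(Q - 1) + 1
public_key = pow(g, secret_key, P)
signature = dsa_sign(b"hello", secret_key, g)
assert dsa_verify(b"hello", signature, public_key, g)
```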

How does this work? Let’s go through a proof of correctness. I added some comments to every operation in parentheses to make it easier to follow.

$s = k^{-1} \cdot (H + S_k K) \mod q$ ($\mod p$ and $H(m)$ implicit).
$k = s^{-1} \cdot (H + S_k K) \mod q$ (swap $s$ and $k$).
$k = H \cdot s^{-1} + S_k K \cdot s^{-1} \mod q$ (distribute $s^{-1}$).
$k = H \cdot w + S_k K \cdot w \mod q$ ($w = s^{-1}$).
$g^k = g^{H \cdot w + S_k K \cdot w \mod q}$ (raise $g$ to both sides).
$g^k = g^{H \cdot w \mod q} \cdot g^{S_k K \cdot w \mod q}$ (a sum in the exponent is a product of powers).
$g^k = g^{H \cdot w \mod q} \cdot P^{K \cdot w \mod q}_k$ ($P_k = g^{S_k}$).
$g^k = g^{u_1} \cdot P^{u_2}_k$ (replace $u_1$ and $u_2$).
$K = K^*$ (reduce both sides $\mod p$ and then $\mod q$, which gives $K$ and $K^*$).

There you go. This attests that the signature is correct and the message was signed by the owner of the private key.

Schnorr

The Schnorr signature algorithm is very similar to DSA. It was proposed by Claus-Peter Schnorr in 1989. It is considered to be more secure than DSA and is also more efficient. The patent for Schnorr signatures expired in 2008, just in time for Satoshi to include it in Bitcoin. However, it was probably left out because there weren’t good battle-tested software implementations of it at the time. It was eventually added to Bitcoin in the Taproot upgrade⁷.

Schnorr is a marvelous algorithm. It is so much simpler than DSA. Here’s how you sign a message using Schnorr:

Choose a prime number $p$.
Choose your private key $S_k$ as a random integer $\in [1, p-1]$.
Choose a generator $g$.
Compute your public key $P_k$: $g^{S_k}$.
Choose your nonce $k$: as a random integer $\in [1, p-1]$.
Compute your “public nonce” $K$: $g^k \mod p$ (also known as $r$).
Get your message ($m$) through a cryptographic hash function $H$ concatenating with $K$: $e = H(K || m)$.
Compute your signature $s$: $k - S_k e$.
Send to your buddy $(p, g)$, $P_k$, and $(K, s)$.

And here’s how you can verify the signature:

Compute $e = H(K || m)$.
Compute $K^* = g^s P_k^e$.
Compute $e^* = H(K^* || m)$.
Assert $e = e^*$.
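Here is a minimal Python sketch of the Schnorr steps above. To keep the exponent arithmetic clean I assume, as in the DSA setup, that $g$ generates a subgroup of prime order $q$ and that $s$ is reduced $\mod q$; the text leaves this reduction implicit. The function names and the use of SHA-256 as $H$ are also my assumptions.

```python
import hashlib
import secrets

P, Q = 1279, 71              # toy primes, q divides p - 1
G = pow(2, (P - 1) // Q, P)  # element of order q (assumed setup, as in DSA)


def challenge(big_k, message, q=Q):
    """e = H(K || m), with SHA-256 as H and the result reduced mod q."""
    data = big_k.to_bytes(2, "big") + message  # K < 1279 fits in two bytes
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q


def schnorr_sign(message, secret_key):
    k = secrets.randbelow(Q - 1) + 1  # fresh nonce in [1, q - 1]
    big_k = pow(G, k, P)              # public nonce K = g^k mod p
    e = challenge(big_k, message)
    s = (k - secret_key * e) % Q      # s = k - S_k * e
    return big_k, s


def schnorr_verify(message, signature, public_key):
    big_k, s = signature
    e = challenge(big_k, message)
    k_star = pow(G, s, P) * pow(public_key, e, P) % P  # K* = g^s * P_k^e
    return challenge(k_star, message) == e             # assert e = e*


secret_key = secrets.randbelow(Q - 1) + 1
public_key = pow(G, secret_key, P)
assert schnorr_verify(b"hello", schnorr_sign(b"hello", secret_key), public_key)
```

Note that there is no modular inverse anywhere in the signing step, which is what makes the rest of this post work.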

How does this work? Let’s go through a proof of correctness. As before, I added some comments to every operation in parentheses to make it easier to follow.

$K^* = g^s P_k^e$ ($\mod p$ implicit).
$K^* = g^{k - S_k e} g^{S_k e}$ ($s = k - S_k e$ and $P_k = g^{S_k}$).
$K^* = g^k$ (cancel $S_k e$ in the exponent of $g$).
$K^* = K$ ($K = g^k$).
Hence $H(K^* || m) = H(K || m)$.

There you go. This attests that the signature is correct and the message was signed by the owner of the private key.

Why we don’t reuse nonces?

Never, ever, reuse a nonce. Why? First, because nonce is short for “number used once”: it is supposed to be used only once. Second, because if you reuse a nonce, you are pretty much screwed: an attacker can derive your private key from two signatures with the same nonce. This is called the “nonce reuse attack”.

Fun fact: this is what happened to the PlayStation 3.

Let’s see how we can derive the private key from two signatures with the same nonce. Here we are in a context where we have two signatures $s$ and $s^\prime$, both using the same nonce $k = k^\prime$.

First, let’s do the ~~ugly~~ DSA math:

$$ \begin{aligned} s^\prime - s &= (k^{\prime -1} (H(m_1) + S_k K^\prime)) - (k^{-1} (H(m_2) + S_k K)) \\ s^\prime - s &= k^{-1} (H(m_1) - H(m_2)) \\ k &= (H(m_1) - H(m_2)) (s^\prime - s)^{-1} \end{aligned} $$

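Now remember you know $s$, $s^\prime$, $H(m_1)$, $H(m_2)$, $K$, and $K^\prime$ (and $K = K^\prime$, since the nonce is the same). Let’s do the final step and solve for $S_k$, using $s^\prime$, the signature of $m_1$:

$$S_k = K^{-1} (k s^\prime - H(m_1)) \mod q$$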
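To see the DSA attack run end to end, here is a small Python sketch with the toy primes from the DSA section. Everything here is illustrative: the function names are mine, the hash values are stand-in integers instead of real digests, and the fixed nonce is exactly the mistake being demonstrated. In the code, `s1` plays the role of $s^\prime$ (the signature of $m_1$) and `s2` the role of $s$.

```python
P, Q = 1279, 71              # toy DSA primes from the text
G = pow(2, (P - 1) // Q, P)  # element of order q (assumed choice of generator)


def sign_with_nonce(h, secret_key, k):
    """DSA signing step with an explicit nonce k; h stands for H(m)."""
    big_k = pow(G, k, P) % Q
    s = pow(k, -1, Q) * (h + secret_key * big_k) % Q
    return big_k, s


def recover_dsa_key(h1, s1, h2, s2, big_k):
    """Recover S_k from two signatures that share the nonce (and hence K)."""
    # k = (H(m1) - H(m2)) * (s' - s)^-1 mod q
    k = (h1 - h2) * pow((s1 - s2) % Q, -1, Q) % Q
    # S_k = K^-1 * (k * s' - H(m1)) mod q
    return pow(big_k, -1, Q) * (k * s1 - h1) % Q


secret_key, reused_nonce = 42, 5
h1, h2 = 10, 20  # stand-ins for H(m1) and H(m2)
big_k, s1 = sign_with_nonce(h1, secret_key, reused_nonce)
_, s2 = sign_with_nonce(h2, secret_key, reused_nonce)
assert recover_dsa_key(h1, s1, h2, s2, big_k) == secret_key
```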

Now let’s do the Schnorr math. But in Schnorr, everything is simpler. Even nonce reuse attacks.

$$s^\prime - s = (k^\prime - k) - S_k (e^\prime - e)$$

If $k^\prime = k$ (nonce reuse) then you can easily isolate $S_k$ with simple algebra.

Remember: you know $s^\prime, s, e, e^\prime$ and $k^\prime - k = 0$.
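The “simple algebra” fits in one line of Python. A minimal sketch, assuming the exponents live $\mod q$ for a prime $q$ (so that $e^\prime - e$ has an inverse); the numbers and function names are made up for illustration.

```python
Q = 71  # toy prime group order (assumption: exponents are reduced mod q)


def schnorr_s(nonce, secret_key, e, q=Q):
    """Signature scalar s = k - S_k * e (mod q)."""
    return (nonce - secret_key * e) % q


def recover_schnorr_key(e1, s1, e2, s2, q=Q):
    """With k' = k: s' - s = -S_k (e' - e), so S_k = (s - s') (e' - e)^-1 (mod q)."""
    return (s1 - s2) * pow((e2 - e1) % q, -1, q) % q


secret_key, nonce = 42, 7
e1, e2 = 10, 20  # two different challenges H(K || m)
s1 = schnorr_s(nonce, secret_key, e1)  # plays the role of s
s2 = schnorr_s(nonce, secret_key, e2)  # plays the role of s'
assert recover_schnorr_key(e1, s1, e2, s2) == secret_key
```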

Why we can combine Schnorr Signatures and not DSA?

In Bitcoin, we can combine Schnorr signatures and not DSA. Why? Because Schnorr signatures are linear: you can add two Schnorr signatures and get a valid signature under the combination of the two public keys. This is called the “linearity property” of Schnorr signatures. This is not possible with DSA.

Remember that in $\mathbb{Z}_p$ addition, subtraction, and multiplication, i.e. anything with $+, -, \cdot$, are linear operations. However, division (modular inverse), i.e. anything that is $^{-1}$, is not linear. That is:

$$x^{-1} + y^{-1} \ne (x + y)^{-1}.$$

Here’s a trivial Python snippet that shows that modular inverse is not linear:

>>> p = 71; x = 13; y = 17
>>> (pow(x, -1, p) + pow(y, -1, p)) % p == pow(x + y, -1, p)
False

Let’s revisit the signature step of DSA and Schnorr:

DSA: $s = k^{-1} (H(m) + S_k K)$
Schnorr: $s = k - S_k H(K || m)$

So if two signers produce Schnorr signatures $s_1$ and $s_2$ over the same challenge $e$ (that is, the same message and a shared public nonce $K = K_1 \cdot K_2$), then you can easily compute a valid combined signature by adding them:

$$s = s_1 + s_2$$

Also note that we can combine Schnorr public keys. In the multiplicative notation used in this post, combining two keys is a product, which adds the secret keys in the exponent:

$$P^\prime_k \cdot P_k = g^{S^\prime_k} \cdot g^{S_k} = g^{S_k^\prime + S_k}$$

And the combined signature $s$ can be verified with the combined public key $P^\prime_k \cdot P_k$.

This is not possible with DSA.

That is because the signature step in DSA is not linear: it has a $k^{-1}$ in it.
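Here is a toy Python sketch of that linearity. It assumes both signers sign the same message and agree on a shared challenge $e$ computed from the product of their public nonces. The names, the fixed nonces, and SHA-256 as $H$ are illustrative assumptions; this naive aggregation is not a safe multi-signature protocol on its own.

```python
import hashlib

P, Q = 1279, 71              # toy primes, q divides p - 1
G = pow(2, (P - 1) // Q, P)  # element of order q (assumed setup)


def challenge(big_k, message, q=Q):
    """e = H(K || m), with SHA-256 as H and the result reduced mod q."""
    data = big_k.to_bytes(2, "big") + message
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q


def verify(message, big_k, s, public_key):
    """Check g^s * P_k^e == K, which is equivalent to checking e == e*."""
    e = challenge(big_k, message)
    return pow(G, s, P) * pow(public_key, e, P) % P == big_k


message = b"pay the pizza guy"
x1, x2 = 5, 9   # the two secret keys
k1, k2 = 3, 11  # the two nonces (fixed only to keep the example short)
pub1, pub2 = pow(G, x1, P), pow(G, x2, P)

big_k = pow(G, k1, P) * pow(G, k2, P) % P  # shared public nonce K1 * K2
e = challenge(big_k, message)              # both signers use the same e
s1 = (k1 - x1 * e) % Q
s2 = (k2 - x2 * e) % Q

s = (s1 + s2) % Q                # just add the two signatures
combined_key = pub1 * pub2 % P   # product of keys = sum of secrets in the exponent
assert verify(message, big_k, s, combined_key)
```

Try the same trick with the DSA signing equation and the $k^{-1}$ factors refuse to add up.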

Technical Interlude: Elliptic Curves

Technically speaking, Bitcoin uses the Elliptic Curve Digital Signature Algorithm (ECDSA), and the Schnorr signature algorithm is based on the same elliptic curve (EC) as ECDSA.

And trivially speaking EC public-key cryptography in the end is just a finite field on $\mathbb{Z}_p$. It has everything that we’ve seen so far:

Addition
Subtraction
Multiplication
Division
Exponentiation
Generators
Discrete Logarithm Problem

Conclusion

I hope you enjoyed this companion post to the cryptography workshop. Remember: don’t reuse nonces.

License

This post is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.

the image of a function $f$ is the set of all values that $f$ may produce. ↩︎
the problem of factoring a number into its prime factors is not known to be in the class of problems that can be solved in polynomial time, P. It is not known to be NP-complete either. Actually, finding out whether P equals NP is the hardest way to earn a million dollars: the P vs NP problem. ↩︎
this is called surjection. ↩︎
at least $\frac{1}{N}$ where $N$ is the size of $Y$. ↩︎
actually this is not true. Due to the birthday paradox, the probability of finding a collision is not $\frac{1}{N}$ but $\frac{1}{\sqrt{N}}$. Hence, for an $n$-bit hash, the search space is actually $2^{\frac{n}{2}}$ instead of the original $2^n$. ↩︎
The answer is $x = 6$. This means that $3^6 = 15 \mod 17$. ↩︎
Taproot is a Bitcoin protocol upgrade that was deployed as a forward-compatible soft fork. The validation of Taproot is based on Schnorr signatures. You can find more in BIPs 340, 341, and 342. ↩︎

Fullstack and Progressive Web Apps in Rust: A Tale of a Sudoku Spyware

2024-01-30T08:57:33Z

It all started when I had to accompany my mom to the hospital. It was just a routine checkup, but I had to wait for a few hours. I brought my laptop with me, since they have good WiFi and I could work on my projects. Then I realized that my mom was playing a Sudoku¹ game on her phone. I couldn’t help but notice that the game was full of ads and it was asking for a lot of permissions, like location and sensor data. So I decided to make a Sudoku game for her, without ads or any permissions. It wouldn’t even need to ask for the blessing of Google or Tim Apple since it was a Progressive Web App (PWA) and it would work offline.

You can play the game at storopoli.io/sudoku or check the source code at storopoli/sudoku.

Here’s a screenshot of the game:

Tools of Choice

So what would I use to build this game? Only one thing: Dioxus. Dioxus is a fullstack framework for Rust, that allows you to build web applications with Rust. You can benefit from the safety and performance of Rust, powerful type system and borrow checker, along with the low memory footprint.

That’s it. Just Rust and HTML with some raw CSS. No “YavaScript”. No Node.js. No npm. No webpack. No Tailwind CSS. Just cargo run --release and you’re done.

Package Management

Using Rust for fullstack development is an amazing thing. First, package management is a breeze with Cargo. Second, you don’t have to worry about “npm vulnerabilities”. Have you ever gone into your project and run npm audit?

This is solvable with Rust.

Runtime Errors

An additional advantage is that you don’t have to worry about common runtime errors like undefined is not a function or null is not an object. These are all picked up by Rust at compile time. So you can focus on the logic of your application knowing that it will work as expected.

A common workflow in Rust fullstack applications is to use Rust’s powerful type system to parse any user input into a type that you can trust, and then propagate that type throughout your application. This way you can be sure that you’re not going to have any runtime errors due to invalid input. This is not the case with “YavaScript”. You need to validate the input at every step of the way, and you can’t be sure that the input is valid at any point in time.

You can sleep soundly at night knowing that your application won’t crash and as long as the host machine has electricity and internet access, your app is working as expected².

Performance

Rust is known for its performance. This is due to the fact that Rust gives you control over deciding on which type you’ll use for a variable. This is not the case with “YavaScript”, where you can’t decide if a variable is a number or a string. Also you can use references and lifetimes to avoid copying data around.

So, if you make sane decisions, like u8 (unsigned 8-bit integer) instead of i32 (signed 32-bit integer) for a number that will never be greater than 255, you can have a very low memory footprint. Also you can use &str (string slice) instead of String to avoid copying strings around.

You just don’t have this level of control with “YavaScript”. You get either strings or numbers and you can’t decide on the size of the number. And all of your strings will be heap-allocated and copied around.

Progressive Web Apps

Progressive Web Apps (PWAs) are web applications that are regular web pages or websites, but can appear to the user like traditional applications or native mobile applications. Since they use the device’s browser, they don’t need to be installed through an app store. This is a great advantage, since you don’t have to ask Google or Tim Apple for permission.

In Dioxus making a PWA was really easy. There is a PWA template in the examples/ directory in their repository. You just have to follow the instructions in the README and you’re done. In my case, I only had to change the metadata in the manifest.json file and add what I wanted to cache in the service worker .js file. These were only the favicon icon and the CSS style file.

Sudoku Algorithm

I didn’t have to worry about the algorithm to generate the Sudoku board. This was already implemented in the sudoku crate. But I had to implement some Sudoku logic to make the user interface work.

Some things that I had to implement were:

find the related cells. Given a cell, find the cells in the same row, column and sub-grid.
find the conflicting cells. Given a cell, find the cells in the same row, column and sub-grid that have the same value.

This was a simple task, yet it was very fun to implement.

Find the Related Cells

To get the related cells, you need to find the row and column of the cell. Then you can find the start row and start column of the 3x3 sub-grid. After that, you can add the cells in the same row, column and sub-grid to a vector. Finally, you can remove the duplicates and the original cell from the vector.

Here’s the code:

pub fn get_related_cells(index: u8) -> Vec<u8> {
    let mut related_cells = Vec::new();
    let row = index / 9;
    let col = index % 9;
    let start_row = row / 3 * 3;
    let start_col = col / 3 * 3;

    // Add cells in the same row
    for i in 0..9 {
        related_cells.push(row * 9 + i);
    }

    // Add cells in the same column
    for i in 0..9 {
        related_cells.push(i * 9 + col);
    }

    // Add cells in the same 3x3 sub-grid
    for i in start_row..start_row + 3 {
        for j in start_col..start_col + 3 {
            related_cells.push(i * 9 + j);
        }
    }

    // Remove duplicates and the original cell
    related_cells.sort_unstable();
    related_cells.dedup();
    related_cells.retain(|&x| x != index);

    related_cells
}

Find the Conflicting Cells

To find the conflicting cells, you need to get the value of the target cell. Then you can get the related cells and filter the ones that have the same value as the target cell. Easy peasy.

Here’s the code:

pub fn get_conflicting_cells(board: &SudokuState, index: u8) -> Vec<u8> {
    // Get the value of the target cell
    let value = board[index as usize];

    // Ignore if the target cell is empty (value 0)
    if value == 0 {
        return Vec::new();
    }

    // Get related cells
    let related_cells = get_related_cells(index);

    // Find cells that have the same value as the target cell
    related_cells
        .into_iter()
        .filter(|&index| board[index as usize] == value)
        .collect()
}

Note that I am using 0 to represent empty cells.

But if the user ignores the conflicting cells and adds a number to the board, there will be more conflicting cells than the ones related to the target cell. This can be done with another helper function.

Here’s the code, and I took the liberty of adding the docstrings (the /// comments that render as documentation):

/// Get all the conflicting cells for all filled cells in a Sudoku board
///
/// ## Parameters
///
/// - `current_sudoku: SudokuState` - A reference to the current [`SudokuState`]
///
/// ## Returns
///
/// Returns a `Vec` representing all cell indices that are conflicting
/// with the current Sudoku board.
pub fn get_all_conflicting_cells(current_sudoku: &SudokuState) -> Vec<u8> {
    let filled: Vec<u8> = current_sudoku
        .iter()
        .enumerate()
        .filter_map(|(idx, &value)| {
            if value != 0 {
                u8::try_from(idx).ok()
            } else {
                None // Filter out the item if the value is 0
            }
        })
        .collect();

    // Get all conflicting cells for the filled cells
    let mut conflicting: Vec<u8> = filled
        .iter()
        .flat_map(|&v| get_conflicting_cells(current_sudoku, v))
        .collect::<Vec<u8>>();

    // Retain unique
    conflicting.sort_unstable();
    conflicting.dedup();

    conflicting
}

The trick here is that we are using a flat_map since a naive map would return a nested Vec<Vec<u8>>, and we don’t want that. We want a flat Vec<u8> of all conflicting cells. Recursion is always tricky, go ask Alan Turing.

Sudoku App State

As you can see, I used a SudokuState type to represent the state of the game. This is just a type alias for a [u8; 81] array. This is a very simple and efficient way to represent the state of the game.

Here’s the code:

pub type SudokuState = [u8; 81];

The Sudoku app also has an undo button. This is implemented by using a Vec to store the history of the game. Every time the user adds a number to the board, the newly updated state is pushed to the history vector. When the user clicks the undo button, the last state is popped from the history vector and the board is updated.

There’s one additional problem with the undo button. It needs to switch the clicked cell to the one that was clicked before. Yet another simple, but fun, task. First you need to find the index at which two given SudokuStates, the current and the last, differ by exactly one item.

Again I’ll add the docstrings since they incorporate some good practices that are worth mentioning:

/// Finds the index at which two given [`SudokuState`]
/// differ by exactly one item.
///
/// This function iterates over both arrays in lockstep and checks for a
/// pair of elements that are not equal.
/// It assumes that there is exactly one such pair and returns its index.
///
/// ## Parameters
///
/// * `previous: SudokuState` - A reference to the first [`SudokuState`] to compare.
/// * `current: SudokuState` - A reference to the second [`SudokuState`] to compare.
///
/// ## Returns
///
/// Returns `Some(u8)` with the index of the differing element if found,
/// otherwise returns `None` if the arrays are identical (which should not
/// happen given the problem constraints).
///
/// ## Panics
///
/// The function will panic if it cannot convert any of the Sudoku board's cell
/// indexes from `usize` into a `u8`
///
/// ## Examples
///
/// ```
/// let old_board: SudokuState = [0; 81];
/// let mut new_board: SudokuState = [0; 81];
/// new_board[42] = 1; // Introduce a change
///
/// let index = find_changed_cell(&old_board, &new_board);
/// assert_eq!(index, Some(42));
/// ```
pub fn find_changed_cell(previous: &SudokuState, current: &SudokuState) -> Option<u8> {
    for (index, (&cell1, &cell2)) in previous.iter().zip(current.iter()).enumerate() {
        if cell1 != cell2 {
            return Some(u8::try_from(index).expect("cannot convert index from usize to u8"));
        }
    }
    None // Return None if no change is found (which should not happen in your case)
}

The function find_changed_cell can panic if it cannot convert any of the Sudoku’s board cells indexes from usize into a u8. Hence, we add a ## Panics section to the docstring to inform the user of this possibility. Additionally, we add an ## Examples section to show how to use the function. These are good practices that are worth mentioning³ and I highly encourage you to use them in your Rust code.

Tests

Another advantage of using Rust is that you can write tests for your code without needing to use a third-party library. It is baked into the language and you can run your tests with cargo test.

Here’s an example of a test for the get_conflicting_cells function:

#[test]
    fn test_conflicts_multiple() {
        let board = [
            1, 0, 0, 0, 0, 0, 0, 0, 1, // Row 1 with conflict
            0, 1, 0, 0, 0, 0, 0, 0, 0, // Row 2 with conflict
            0, 0, 0, 0, 0, 0, 0, 0, 0, // Row 3
            0, 0, 0, 0, 0, 0, 0, 0, 0, // Row 4
            0, 0, 0, 0, 0, 0, 0, 0, 0, // Row 5
            0, 0, 0, 0, 0, 0, 0, 0, 0, // Row 6
            0, 0, 0, 0, 0, 0, 0, 0, 0, // Row 7
            0, 0, 0, 0, 0, 0, 0, 0, 0, // Row 8
            1, 0, 0, 0, 0, 0, 0, 0, 0, // Row 9 with conflict
        ];
        assert_eq!(get_conflicting_cells(&board, 0), vec![8, 10, 72]);
    }

And also two tests for the find_changed_cell function:

#[test]
    fn test_find_changed_cell_single_difference() {
        let old_board: SudokuState = [0; 81];
        let mut new_board: SudokuState = [0; 81];
        new_board[42] = 1; // Introduce a change

        assert_eq!(find_changed_cell(&old_board, &new_board), Some(42));
    }

    #[test]
    fn test_find_changed_cell_no_difference() {
        let old_board: SudokuState = [0; 81];

        // This should return None since there is no difference
        assert_eq!(find_changed_cell(&old_board, &old_board), None);
    }

Conclusion

I had a lot of fun building this game. I gave my mother an amazing gift that she’ll treasure forever. Her smartphone has one less spyware now. I deployed a fullstack web app with Rust that is fast, safe and efficient; with the bonus that I didn’t touch any “YavaScript” or complex build tools.

I hope you enjoyed this post and that you’ll give Rust a try in your next fullstack project.

License

This post is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.

According to Wikipedia, Sudoku is a logic-based, combinatorial number-placement puzzle. The objective is to fill a 9×9 grid with digits so that each column, each row, and each of the nine 3×3 subgrids that compose the grid contain all of the digits from 1 to 9. ↩︎
in my case I am sending the bill to Bill Gates, since it is using the GitHub Pages to host the app. ↩︎
The clippy linter can warn you if you don’t add these sections to your docstrings. Just add pedantic = "deny" inside your Cargo.toml file in the [lints.clippy] section and you’re good to go. ↩︎

htmx: an Oasis in a Desert of Soy

2024-01-14T06:13:19-03:00

Warning: This post has mermaid.js enabled, so if you want to view the rendered diagrams, you’ll have to unfortunately enable JavaScript.

I love to learn new things and I’m passionate about Stoic philosophy. So, when I acquired the domain stoicquotes.io¹, I decided to give htmx a try.

What is `htmx`?

htmx is a small JavaScript library that allows you to enhance your HTML with attributes to perform AJAX (Asynchronous JavaScript and XML) without writing JavaScript². It focuses on extending HTML by adding custom attributes that describe how to perform common dynamic web page behaviors like partial page updates, form submission, etc. htmx is designed to be easy to use, requiring minimal JavaScript knowledge, so that you can add interactivity³ to web pages with just HTML.

Let’s contrast this with the Soy stuff like the notorious React framework. React, on the other hand, is a JavaScript library for building user interfaces, primarily through a component-based architecture. It manages the creation of user interface elements, updates the UI efficiently when data changes, and helps keep your UI in sync with the state of your application. React requires a deeper knowledge of JavaScript and understanding of its principles, such as components, state, and props.

In simple terms:

htmx enhances plain HTML by letting you add attributes for dynamic behaviors, so you can make webpages interactive with no JavaScript coding; you can think of it as boosting your HTML to do more.
React is more like building a complex machine from customizable parts that you program with JavaScript, giving you full control over how your application looks and behaves but also requiring more from you in terms of code complexity and architecture.

Additionally, React can be slower and less performant than htmx. This is due to htmx manipulating the actual DOM itself, while React updates objects in the Virtual DOM. Afterward, React compares the new Virtual DOM with a pre-update version and calculates the most efficient way to apply these changes to the real DOM. So React has to make this whole round trip of diffing the Virtual DOM before touching the actual DOM for every fucking change.

Finally, htmx receives pure HTML from the server. React needs to do the JSON busboy thing: the server sends JSON, React parses the JSON into JavaScript objects, then it renders them into HTML for the browser.

Here are some mermaid.js diagrams to illustrate what is going on under the hood:

---
title: htmx
---
flowchart LR
    HTML --> DOM

---
title: React
---
flowchart LR
    JSON --> JavaScript --> HTML --> VDOM[Virtual DOM] --> DOM

A consequence of these different paradigms is that htmx doesn’t care about what the server sends back and will happily include it in the DOM. Hence, front-end and back-end are decoupled and less complex. Whereas in Reactland, we need to have a tight synchronicity between front-end and back-end. If the JSON that the server sends doesn’t conform to the exact specifications of the front-end, the application ~~becomes a dumpster fire~~ breaks.

Hypermedia

When the web was created it was based on the concept of Hypermedia. Hypermedia refers to a system of interconnected multimedia elements, which can include text, graphics, audio, video, and hyperlinks. It allows users to navigate between related pieces of content across the web or within applications, creating a non-linear way of accessing information.

HTML follows the Hypermedia protocol. HTML is the native language of browsers⁴. That’s why all the React-like frameworks have to convert JavaScript into HTML. So it’s only natural to rely primarily on HTML to deliver content and sprinkle JavaScript sparingly when you need something that HTML cannot offer.

Unfortunately, HTML has stopped in time. Despite all the richness of HTTP with the diverse request methods: GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, TRACE, PATCH; HTML only has two elements that interact with the server:

`<a>`: sends a GET request to fetch new data.

`<form>`: sends a POST request to create new data.

That’s the main purpose of htmx: allowing HTML elements to leverage all the capabilities of HTTP.

`htmx` in Practice

OK, enough of abstract and theoretical concepts. Let’s see how htmx works in practice.

First, the only thing you need to do to enable htmx is to insert this

$b$	$b^1 \mod 7$	$b^2 \mod 7$	$b^3 \mod 7$	$b^4 \mod 7$	$b^5 \mod 7$	$b^6 \mod 7$
$1$	$1$	$1$	$1$	$1$	$1$	$1$
$2$	$2$	$4$	$1$	$2$	$4$	$1$
$3$	$3$	$2$	$6$	$4$	$5$	$1$
$4$	$4$	$2$	$1$	$4$	$2$	$1$
$5$	$5$	$4$	$6$	$2$	$3$	$1$
$6$	$6$	$1$	$6$	$1$	$6$	$1$
