Introduction

This book is an ongoing effort to gather some notes about cryptography with a focus on schemes which are relevant to the decentralized web such as multiparty signatures, zero-knowledge proofs, etc.

For now it consists of the following parts:

  • Mathematics:
  • Cryptography:
  • Proof Systems:
  • ZK Hack Puzzles Walk-through:

We assume that the reader has some basic knowledge of arithmetic and algebra and of common concepts from cryptography (hash functions, signatures, ...).

Here are a number of freely available textbooks to learn more (we will point to specific sections of them when needed):

Mathematical Notation

  • Given a set $S$, we let $S^n$ denote the set of strings of length $n$ over $S$ and $S^*$ the set of all strings, i.e., $S^* = \bigcup_{n \ge 0} S^n$, where $S^0$ denotes the singleton consisting of the empty string; the length of a string $s$ is denoted $|s|$.
  • Given a non-empty finite set $S$, the sampling of a variable $x$ according to the uniform distribution over $S$ is denoted $x \gets_{\$} S$.
  • Unless specified otherwise, groups are denoted additively.
  • Main algebraic structures:
| notation | algebraic structure |
|---|---|
| $G$ | arbitrary group |
| $R$ | arbitrary ring |
| $D$ | integral domain |
| $F$ | arbitrary field |
| $\mathbb{N}$ | natural numbers |
| $\mathbb{Z}$ | integers |
| $\mathbb{Q}$ | rational numbers |
| $\mathbb{R}$ | real numbers |
| $\mathbb{C}$ | complex numbers |

Note that all proofs throughout the book are collapsible:

Proof

You can choose to display it or leave it hidden forever.

Acknowledgments

This book is built with mdBook using the following preprocessors:

If you spot anything off, I'd be happy to get your feedback and acknowledge it here.

Chapter status:   🚧   draft   🚧

Basic Arithmetic

In this chapter, we cover a number of fundamental notions and results about arithmetic over integers, such as divisibility, greatest common divisors, and unique factorization. Most of them will be generalized in the chapter about rings. We will return to the axiomatic definition of integers in a later chapter (to wit, the set of integers is, up to isomorphism, the unique non-trivial, ordered, unitary commutative ring whose positive elements are well-ordered).

Contents

Basic Definitions

In all the following, we adopt the following notation and conventions:

  • we let $\mathbb{N}$ denote the set of non-negative integers (also called natural numbers): $\mathbb{N} = \{0, 1, 2, \dots\}$;
  • we let $\mathbb{N}^*$ denote the set of positive integers: $\mathbb{N}^* = \{1, 2, 3, \dots\}$;
  • we let $\mathbb{Z}$ denote the set of integers: $\mathbb{Z} = \{\dots, -2, -1, 0, 1, 2, \dots\}$.

We will not explicitly formulate here all standard properties of addition, multiplication, and of the order relation $\le$, and defer a more formal treatment to the chapter about rings.

We only state two salient properties of the integers: the cancellation law (a consequence of $\mathbb{Z}$ being a so-called integral domain) and the well-ordering principle (or well-ordering axiom), which is one of the axioms governing integers.

The absolute value of an integer $a$, denoted $|a|$, is defined as $|a| = a$ if $a \ge 0$ and $|a| = -a$ otherwise.

Proposition 3.1 (cancellation law for integers). Let $a$, $b$, $c$ be integers such that $c \neq 0$. Then $ac = bc$ implies $a = b$.

Proposition 3.2 (well-ordering axiom). Let $S$ be a non-empty subset of $\mathbb{N}$. Then $S$ contains a smallest element, i.e., there exists $m \in S$ such that $m \le s$ for any $s \in S$.

It is immediate that the smallest element is necessarily unique.

Proposition 3.3. Let $S$ be a non-empty subset of $\mathbb{N}$. Then it has a unique smallest element.

Proof

Assume that $S$ has two smallest elements $m$ and $m'$. Then $m \le m'$ (since $m$ is a smallest element) and $m' \le m$ (since $m'$ is a smallest element). Hence, $m = m'$.

It is also the case that a finite non-empty subset of $\mathbb{Z}$ (or $\mathbb{N}$) has a largest element.

Proposition 3.4. Let $S$ be a finite non-empty subset of $\mathbb{Z}$. Then $S$ contains a unique largest element, i.e., there exists a unique $m \in S$ such that $s \le m$ for any $s \in S$.

Divisibility

Let $a, b \in \mathbb{Z}$. We say that $a$ divides $b$, or that $a$ is a divisor of $b$, or that $b$ is a multiple of $a$, denoted $a \mid b$, if there exists $c \in \mathbb{Z}$ such that $b = ac$.

Let us list a number of basic properties of divisibility.

Proposition 3.5. For every $a \in \mathbb{Z}$, one has $a \mid a$, $1 \mid a$, $-1 \mid a$, $a \mid 0$, and $0 \mid a$ if and only if $a = 0$.

Proof

Assume that $0 \mid a$. Then $a = 0 \cdot c$ for some $c \in \mathbb{Z}$. This implies that $a = 0$, hence $0 \mid a$ if and only if $a = 0$. All other implications can be proven in a similar way.

Proposition 3.6. The only integers dividing $1$ are $1$ and $-1$.

Proof

Let $a$ be a positive divisor of $1$. By [??], $a \ge 1$. On the other hand, since $a \mid 1$, there exists $b \in \mathbb{Z}$ such that $ab = 1$. Clearly, $b$ must be positive and cannot be $0$. Hence, by [??], $b \ge 1$. Multiplying both sides by $a$, we obtain $ab \ge a$, i.e., $1 \ge a$. Hence, we have both $a \ge 1$ and $a \le 1$, which implies that $a = 1$. The case of a negative divisor $a$ follows by applying the above to $-a$.

Jumping ahead

In a general ring with a multiplicative identity $1$, divisors of $1$ are called units. The previous proposition states that the units of $\mathbb{Z}$ are $1$ and $-1$. See Rings, Units.

Proposition 3.7. For every $a, b, c \in \mathbb{Z}$ with $c \neq 0$, $a \mid b$ if and only if $ac \mid bc$.

Proof

If $a \mid b$, then $b = ad$ for some $d \in \mathbb{Z}$. This implies that $bc = (ac)d$ and hence $ac \mid bc$.

Conversely, assume that $ac \mid bc$. Then $bc = (ac)d$ for some $d \in \mathbb{Z}$. Since $c \neq 0$, by the cancellation law for integers, $b = ad$ and hence $a \mid b$.

Proposition 3.8. For every $a, b \in \mathbb{Z}$, $a \mid b$ and $b \mid a$ if and only if $a = \pm b$.

Proof

If $a = \pm b$ then clearly $a \mid b$ and $b \mid a$.

Conversely, assume that $a \mid b$ and $b \mid a$. Then $b = ac$ and $a = bd$ for some integers $c$ and $d$. If $a = 0$ then $b = ac = 0$ and the conclusion holds. Assume now that either $a$ or $b$ is not zero. Let us consider the case $a \neq 0$ (the case $b \neq 0$ is similar). Multiplying $b = ac$ by $d$, we get $bd = acd$, hence $a = acd$. Since $a \neq 0$, by the cancellation law for integers, this implies that $cd = 1$, i.e., $c$ divides $1$. By Proposition 3.6, this implies $c = \pm 1$, hence $a = \pm b$.

Jumping ahead

In a general commutative ring, two elements $a$ and $b$ such that $a \mid b$ and $b \mid a$ are called associates. The previous proposition shows that associates in $\mathbb{Z}$ are equal up to sign. See Rings, Divisibility.

Proposition 3.9. For every $a, b, c \in \mathbb{Z}$, if $a \mid b$ and $b \mid c$ then $a \mid c$.

Proof

If $a \mid b$, then $b = ad$ for some $d \in \mathbb{Z}$. Similarly, if $b \mid c$, then $c = be$ for some $e \in \mathbb{Z}$. This implies that $c = a(de)$, i.e., $a \mid c$.

Divisibility defines an order relation over $\mathbb{N}$:

  • it is reflexive since $a \mid a$;
  • it is anti-symmetric since, by Proposition 3.8, $a \mid b$ and $b \mid a$ imply $a = \pm b$, but for $a, b \in \mathbb{N}$ this implies $a = b$;
  • it is transitive by Proposition 3.9.

Unlike $\le$, which is total, this order relation is only partial since there are integers $a$ and $b$ such that neither $a \mid b$ nor $b \mid a$ holds. These two order relations are "consistent" in the sense of the following proposition.

Proposition 3.10. For every $a, b \in \mathbb{N}^*$, $a \mid b$ implies $a \le b$.

Proof

Let $a, b \in \mathbb{N}^*$ such that $a \mid b$. Then $b = ac$ for some $c \in \mathbb{Z}$. We cannot have $c = 0$ as this would imply $b = 0$. Moreover, $c$ cannot be negative as this would imply $b < 0$. Hence, $c \ge 1$, which implies $a \le ac = b$.

Euclidean Division

Proposition 3.11 (Euclid's division lemma). Let $a$ and $b$ be integers with $b > 0$. Then there exist unique integers $q$ and $r$ such that $a = bq + r$ and $0 \le r < b$. Moreover, $r = 0$ if and only if $b \mid a$.

Proof

Let us show existence first. Consider the set $S$ defined as $S = \{a - bk : k \in \mathbb{Z}\} \cap \mathbb{N}$. This set is non-empty: if $a \ge 0$ then $a = a - b \cdot 0 \in S$, whereas if $a < 0$ then $a - ba = a(1 - b) \in S$. By the well-ordering axiom, $S$ has a minimal element $r$. Since $r \in S$, there exists $q \in \mathbb{Z}$ such that $r = a - bq$. Moreover, $r \ge 0$ by definition of $S$. Assume that $r \ge b$. Then $0 \le r - b = a - b(q + 1) \in S$, contradicting the minimality of $r$. Hence, $a = bq + r$ with $0 \le r < b$, as claimed.

Let us now show uniqueness. Assume that $a = bq + r = bq' + r'$ with $0 \le r < b$ and $0 \le r' < b$. Then $b(q - q') = r' - r$, meaning $r' - r$ is a multiple of $b$. But note that $-b < r' - r < b$. This implies that $r' - r = 0$, since the only multiple of $b$ in $(-b, b)$ is $0$. Hence, $r = r'$, which by the cancellation law for integers implies that $q = q'$.

Finally, let us show that $r = 0$ if and only if $b \mid a$. If $r = 0$ then $a = bq$ and hence $b \mid a$. Conversely, if $b \mid a$ then $a = bq'$ for some $q' \in \mathbb{Z}$. But then uniqueness of $q$ and $r$ implies that $q = q'$ and $r = 0$.

The four integers $a$, $b$, $q$, and $r$ each have a name:

  • $a$ is called the dividend,
  • $b$ is called the divisor,
  • $q$ is called the quotient,
  • $r$ is called the remainder.

Example

  • Taking $a = 23$ and $b = 5$, we have $q = 4$ and $r = 3$ since $23 = 5 \cdot 4 + 3$.
  • This also works with a negative dividend: taking $a = -23$ and $b = 5$, we have $q = -5$ and $r = 2$ since $-23 = 5 \cdot (-5) + 2$.
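As a concrete illustration (ours, not part of the book), Euclid's division lemma maps directly onto Python's built-in `divmod`, whose floor division already yields a remainder in $[0, b)$ when the divisor is positive:

```python
def euclidean_division(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == b * q + r and 0 <= r < b (requires b > 0)."""
    if b <= 0:
        raise ValueError("divisor must be positive")
    q, r = divmod(a, b)  # Python floors the quotient, so 0 <= r < b even for a < 0
    return q, r

assert euclidean_division(23, 5) == (4, 3)     # 23 = 5*4 + 3
assert euclidean_division(-23, 5) == (-5, 2)   # -23 = 5*(-5) + 2
```

Note that languages whose `%` operator truncates toward zero (C, for instance) return a negative remainder for a negative dividend, so the normalization must be done by hand there.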

Greatest Common Divisor

Let $a$ and $b$ be integers such that $a$ and $b$ are not both zero. A common divisor of $a$ and $b$ is an integer $c$ such that $c$ divides both $a$ and $b$. A greatest common divisor (GCD) of $a$ and $b$ is a common divisor $d$ of $a$ and $b$ such that $d \ge 0$ and any other common divisor of $a$ and $b$ divides $d$.

Note that this is different from the definition one usually encounters in high school, which is as follows. Consider the set $D$ of all common divisors of $a$ and $b$. This set is non-empty since $1 \in D$. It is also finite since any common divisor $c$ of $a$ and $b$ satisfies $|c| \le |a|$ if $a \neq 0$ and $|c| \le |b|$ if $b \neq 0$, and $a$ and $b$ are not both zero. Then the GCD of $a$ and $b$ is defined as the largest element of $D$, which by Proposition 3.4 exists and is unique.

The reason we prefer the former definition is that it is closer to the definition of a GCD in a commutative ring (the only difference is that condition $d \ge 0$ is dropped since, in general, there might be no order relation over an arbitrary ring). Working with this definition, it is not immediately clear though that a greatest common divisor always exists and is unique (in fact, there exist commutative rings where two elements might not have a GCD), nor that it is equal to the GCD according to the latter definition. The following proposition proves this, and more.


Consider for example $a = 12$ and $b = 18$. We can quickly establish (using any of the two definitions above) that the greatest common divisor of $12$ and $18$ is $6$.

Proposition 3.12 (Bézout's lemma). Let $a$ and $b$ be integers such that $a$ and $b$ are not both zero. Then there exists a unique greatest common divisor $d$ of $a$ and $b$. Moreover, $d$ is the smallest positive integer which can be written as a linear combination of $a$ and $b$. In particular, there exist integers $u$ and $v$ such that $d = au + bv$.

Proof

Let us show uniqueness first. Assume that $a$ and $b$ have two GCDs $d$ and $d'$. Then $d \mid d'$ and $d' \mid d$, and hence by Proposition 3.8, one has $d = \pm d'$. Since a GCD must be non-negative, it follows that $d = d'$.

Let us show existence by proving that the smallest positive integer which can be written as a linear combination of $a$ and $b$ is indeed their GCD. Consider the set $S$ of all positive linear combinations of $a$ and $b$, namely $S = \{au + bv : u, v \in \mathbb{Z}\} \cap \mathbb{N}^*$. This set is non-empty. Indeed, assume that $a \neq 0$ (the case $b \neq 0$ is similar). Then either $a$ or $-a$ is in $S$. By the well-ordering axiom, $S$ has a minimal element $d$. We claim that $d$ is the GCD of $a$ and $b$.

First, we clearly have $d > 0$. Second, let us show that $d$ is a common divisor of $a$ and $b$. For this, we will show the following stronger statement: every element in $S$ is divisible by $d$. Since $a$ or $-a$ is in $S$ when $a \neq 0$ (and similarly for $b$), this will imply in particular that $d$ divides both of them. Let $s$ be any element in $S$. Since $s \in S$, let us write $s = au + bv$, and let us also write $d = au_0 + bv_0$. By Euclid's division lemma, there exist $q$ and $r$ with $0 \le r < d$ such that $s = dq + r$. Then $r = s - dq = a(u - qu_0) + b(v - qv_0)$. Hence, $r$ is a linear combination of $a$ and $b$. But since $r < d$ and $d$ is the minimal element of $S$, one must have $r = 0$. Hence, $s = dq$, i.e., $d$ divides $s$.

It remains to show that every common divisor of $a$ and $b$ divides $d$. This is clear: any common divisor of $a$ and $b$ divides the linear combination $au_0 + bv_0 = d$.

The greatest common divisor of $a$ and $b$ is denoted $\gcd(a, b)$ (or sometimes $(a, b)$, but this notation is confusing and will not be used here). By convention, although the set of common divisors of $0$ and $0$ is $\mathbb{Z}$, $\gcd(0, 0)$ is defined as $0$ (we will see why this makes sense when relating GCDs and ideals).
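Incidentally, Python's built-in `math.gcd` follows the same conventions (non-negative result, and $\gcd(0, 0) = 0$), which makes it convenient for quick checks:

```python
from math import gcd

assert gcd(12, 18) == 6
assert gcd(0, 0) == 0      # the convention gcd(0, 0) = 0
assert gcd(-12, 18) == 6   # the result is always non-negative
```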

Expressing the GCD of $a$ and $b$ as a linear combination of $a$ and $b$, i.e., writing $\gcd(a, b) = au + bv$ with explicit values $u$ and $v$, is often called a Bézout relation.

Note

In a general commutative ring, greatest common divisors are defined as above, except condition $d \ge 0$ is dropped (indeed, in a general ring, an order relation might not exist). Under this more general definition, two integers $a$ and $b$ have two greatest common divisors, $d$ and $-d$. The fact that $\mathbb{Z}$ is ordered allows us to single out the non-negative one as the greatest common divisor of $a$ and $b$.

Two integers $a$ and $b$ are said to be coprime or relatively prime if $\gcd(a, b) = 1$, which is equivalent to $a$ and $b$ having only $1$ and $-1$ as common divisors.

Proposition 3.13. Let $a, b \in \mathbb{Z}$. Then $a$ and $b$ are coprime if and only if there exist $u, v \in \mathbb{Z}$ such that $au + bv = 1$.

Proof

If $a$ and $b$ are coprime, then existence of $u, v \in \mathbb{Z}$ such that $au + bv = 1$ is established by Proposition 3.12. Conversely, assume that there exist $u, v \in \mathbb{Z}$ such that $au + bv = 1$. Then any common divisor of $a$ and $b$ divides $au + bv = 1$, hence $\gcd(a, b) = 1$.

Computing GCDs and Bézout Relations

The GCD can be computed using Euclid's algorithm. It relies on the following lemma.

Proposition 3.14. Let $a, b \in \mathbb{Z}$ such that $a$ and $b$ are not both zero. Then, for any $q \in \mathbb{Z}$, $\gcd(a, b) = \gcd(b, a - qb)$.

Proof
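To make this concrete, here is a sketch (our own, not the book's) of the extended Euclidean algorithm in Python: it repeatedly applies Euclidean division as in the lemma above, while tracking coefficients so that a Bézout relation comes out for free:

```python
def ext_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, u, v) such that g = gcd(a, b) = a*u + b*v."""
    r0, r1 = a, b
    u0, u1 = 1, 0   # invariant: r0 == a*u0 + b*v0
    v0, v1 = 0, 1   # invariant: r1 == a*u1 + b*v1
    while r1 != 0:
        q = r0 // r1                    # one step of Euclidean division
        r0, r1 = r1, r0 - q * r1        # gcd(r0, r1) = gcd(r1, r0 - q*r1)
        u0, u1 = u1, u0 - q * u1
        v0, v1 = v1, v0 - q * v1
    if r0 < 0:                          # normalize so the GCD is non-negative
        r0, u0, v0 = -r0, -u0, -v0
    return r0, u0, v0

g, u, v = ext_gcd(240, 46)
assert g == 2 and 240 * u + 46 * v == 2   # a Bézout relation for gcd(240, 46)
```

Each loop iteration preserves the invariants noted in the comments, so when the remainder reaches $0$ the pair $(u_0, v_0)$ is exactly a Bézout relation for the GCD.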

GCDs and Ideals

In this section, we give another interpretation of GCDs in terms of ideals of $\mathbb{Z}$. As with pretty much any notion we encounter in this chapter, ideals can be defined for any ring.

A subset $I$ of the integers is called an ideal of $\mathbb{Z}$ if it satisfies the following properties:

  • $0 \in I$;
  • for every $x, y \in I$, $x + y \in I$;
  • for every $x \in I$ and every $k \in \mathbb{Z}$, $kx \in I$.

Proposition 3.15. Let $a, b \in \mathbb{Z}$. Then $a\mathbb{Z} + b\mathbb{Z} = \gcd(a, b)\mathbb{Z}$.

Prime Numbers

Chapter status:   ✅   in good shape   ✅

TODO: some proofs missing

Groups

We cover the elementary theory of groups, one of the most fundamental algebraic structures, consisting of a set with a binary operation satisfying associativity, the existence of an identity element, and the existence of an inverse for every element of the set. We focus in particular on abelian (commutative) groups and cyclic groups, which are of particular relevance for cryptography.

Contents

Basic Definitions

A binary operation over a set $S$ is a function $\ast \colon S \times S \to S$, usually denoted in infix notation, meaning the image of $(x, y)$ is written $x \ast y$.

A group is a non-empty set $G$ equipped with a binary operation $\ast$ satisfying the following properties:

  • associativity: $(x \ast y) \ast z = x \ast (y \ast z)$ for every $x, y, z \in G$;
  • identity element: there exists $e \in G$ such that $e \ast x = x \ast e = x$ for every $x \in G$; such an $e$ is called the identity element of $G$;
  • inverse element: for every $x \in G$, there exists $y \in G$ such that $x \ast y = y \ast x = e$; such a $y$ is called the inverse of $x$.

The use of the determiner the for the identity element and the inverse of an element is justified by the following proposition.

Proposition 5.1. Let $G$ be a group. Then $G$ has a unique identity element and every element of $G$ has a unique inverse.

Proof

Assume that $G$ has two identity elements $e$ and $e'$. Then $e \ast e' = e'$ (because $e$ is an identity element) and $e \ast e' = e$ (because $e'$ is an identity element), hence $e = e'$.

Assume that some group element $x$ has two inverses $y$ and $y'$. Then $y = y \ast e = y \ast (x \ast y') = (y \ast x) \ast y' = e \ast y' = y'$.

The group consisting of a single element $e$ such that $e \ast e = e$ is called the trivial group.

If the binary operation is commutative, i.e., $x \ast y = y \ast x$ for every $x, y \in G$, then $G$ is said to be abelian.

If $G$ is finite, the number of elements of $G$ is called the order of $G$ and denoted $|G|$. If $G$ is infinite, $G$ is said to have infinite order.

Note

A group consists of a set and a binary operation. Hence, strictly speaking, it is a pair $(G, \ast)$, although one commonly speaks of "the group $G$", the binary operation being left implicit.

Example

  • $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ equipped with addition are abelian groups of infinite order. The identity element is $0$ and the inverse of $x$ is $-x$.
  • $\mathbb{N}$ is not a group for addition since non-zero elements don't have an inverse in $\mathbb{N}$.
  • $\mathbb{Q}^*$, $\mathbb{R}^*$, and $\mathbb{C}^*$ equipped with multiplication are abelian groups of infinite order. The identity element is $1$ and the inverse of $x$ is $1/x$.
  • Neither $\mathbb{N}$ nor $\mathbb{Z}$ are groups for multiplication since no elements except $1$ and $-1$ have an inverse.
  • Given a set $S$ and two functions $f$ and $g$ from $S$ to $S$, define the composition of $f$ and $g$, denoted $g \circ f$, as $(g \circ f)(x) = g(f(x))$ for every $x \in S$. Then the set of all permutations (bijections) of a set of size $n$ is a group for the operation $\circ$ of finite order $n!$. The identity element is the identity function and the inverse of $f$ is the inverse function $f^{-1}$ mapping $y$ to the unique $x$ such that $f(x) = y$. This group is non-abelian when $n \ge 3$.

Additive/Multiplicative Notation

There are two standard notation types for the group operation:

  • additive: the group operation is denoted $+$, the identity element is denoted $0$ (or $0_G$ if one wants to avoid confusion with integer $0$), and the inverse of $x$ is denoted $-x$;
  • multiplicative: the group operation is denoted $\cdot$ or $\times$, the identity element is denoted $1$ (or $1_G$ if one wants to avoid confusion with integer $1$), and the inverse of $x$ is denoted $x^{-1}$.

In the case of multiplicative notation, the operation symbol might be omitted and the group law simply denoted by juxtaposition. By convention, for an "abstract" group, additive notation is restricted to abelian groups (meaning multiplicative notation is used either when the group is known to be non-abelian, or when the abelian/non-abelian character of the group is unspecified).

For the rest of this chapter, unless specified otherwise, the group operation will be denoted multiplicatively, but the identity element will be denoted $e$ for clarity.

Repeated Group Operation

Given a group element $x$, it is possible to apply the group operation repeatedly to itself, i.e., compute $x \ast x \ast \cdots \ast x$.

When the group operation is denoted additively, for $n \in \mathbb{N}^*$ we define $nx = \underbrace{x + \cdots + x}_{n \text{ times}}$, $0x = 0$, and $(-n)x = n(-x)$. This is usually called the scalar multiplication of $x$ by $n$.

When the group operation is denoted multiplicatively, for $n \in \mathbb{N}^*$ we define $x^n = \underbrace{x \cdots x}_{n \text{ times}}$, $x^0 = e$, and $x^{-n} = (x^{-1})^n$. This is usually called the exponentiation of $x$ to the power $n$.

Proposition 5.2. Let $G$ be a group denoted multiplicatively. Then for every $x \in G$ and every $m, n \in \mathbb{Z}$, one has $x^m x^n = x^{m+n}$ and $(x^m)^n = x^{mn}$.

Moreover, if $G$ is abelian, then for every $x, y \in G$ and every $n \in \mathbb{Z}$, $(xy)^n = x^n y^n$.
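As a side note (our own illustration, not from the book), exponentiation in any group can be computed with $O(\log n)$ group operations by square-and-multiply, using the identities of Proposition 5.2. A minimal generic Python sketch, where `op`, `identity`, and `inv` are whatever realizes the group:

```python
def group_pow(x, n, op, identity, inv=None):
    """Compute x^n with O(log n) group operations (square-and-multiply).

    op: the binary group operation; identity: the identity element;
    inv: the inverse function (only needed when n < 0).
    """
    if n < 0:
        x, n = inv(x), -n   # x^(-n) = (x^(-1))^n
    result = identity
    while n > 0:
        if n & 1:                # multiply in x when the current bit of n is 1
            result = op(result, x)
        x = op(x, x)             # square
        n >>= 1
    return result

# Example: exponentiation in the multiplicative group of non-zero residues modulo 7
assert group_pow(3, 20, lambda a, b: (a * b) % 7, 1) == pow(3, 20, 7)
```

This logarithmic-cost exponentiation is what makes discrete-logarithm-based cryptography usable in practice: computing $g^n$ is cheap even for $n$ with hundreds of bits.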

Direct Product

Let $G_1$ and $G_2$ be two groups. The direct product of $G_1$ and $G_2$ is the Cartesian product $G_1 \times G_2$ equipped with the binary operation defined component-wise: $(x_1, x_2)(y_1, y_2) = (x_1 y_1, x_2 y_2)$.

Proposition 5.3. The direct product $G_1 \times G_2$ as defined above is a group. Its identity element is $(e_1, e_2)$, where $e_1$ and $e_2$ are respectively the identity elements of $G_1$ and $G_2$. The inverse of $(x_1, x_2)$ is $(x_1^{-1}, x_2^{-1})$.

For abelian groups, the direct product is sometimes called the direct sum and denoted $G_1 \oplus G_2$.1

Subgroups

Let $G$ be a group and $H$ be a non-empty subset of $G$. The subset $H$ is a subgroup of $G$ if $H$ equipped with the binary operation of $G$ is a group.

Proposition 5.4. Let $G$ be a group and $H$ be a subset of $G$. Then $H$ is a subgroup of $G$ if and only if the following three properties are satisfied:

  • $e \in H$;
  • $xy \in H$ for every $x, y \in H$;
  • $x^{-1} \in H$ for every $x \in H$.

Note

The condition $e \in H$ can be replaced by the condition that $H$ is non-empty. Clearly, $e \in H$ implies that $H$ is non-empty. Conversely, if the second and third conditions are met, then $H$ being non-empty implies that there is some $x \in H$, which by the third condition implies that $x^{-1} \in H$, which by the second condition implies that $e = x x^{-1} \in H$.

The following proposition gives a slightly more compact subgroup criterion.

Proposition 5.5. Let $G$ be a group and $H$ be a subset of $G$. Then $H$ is a subgroup of $G$ if and only if $e \in H$ and $x y^{-1} \in H$ for every $x, y \in H$.

A subgroup $H$ of $G$ is said to be proper if it is different from $G$. Any non-trivial group $G$ has at least one proper subgroup, namely $\{e\}$, called the trivial subgroup.

Proposition 5.6. The intersection of a (finite or infinite) set of subgroups is a subgroup.

Proof

Let $H$ be the intersection of a collection $(H_i)_{i \in I}$ of subgroups of $G$. Then, by Proposition 5.5, $e \in H_i$ for every $i \in I$, and hence $e \in H$. Let $x, y \in H$. Then, for every $i \in I$, $x, y \in H_i$ and hence $x y^{-1} \in H_i$, again by Proposition 5.5. Thus, $x y^{-1} \in H$. It follows that $H$ is a subgroup.

Cosets and Lagrange Theorem

Let $H$ be a subgroup of a group $G$. Consider the relation $\sim$ defined by $x \sim y$ if and only if $x y^{-1} \in H$, and the dual one defined by $x \sim y$ if and only if $x^{-1} y \in H$. The following proposition shows that these are equivalence relations.

Proposition 5.7. Let $G$ be a group. Let $H$ be a subgroup of $G$. Then the relation $\sim$ defined by $x \sim y$ if and only if $x y^{-1} \in H$ is an equivalence relation. The proposition also holds by replacing $x y^{-1} \in H$ by $x^{-1} y \in H$.

Proof

Let $H$ be a subgroup of $G$ and $\sim$ be the relation defined by $x \sim y$ if and only if $x y^{-1} \in H$. Let us show that $\sim$ is reflexive, symmetric, and transitive.

  • reflexivity: $e \in H$ implies that $x x^{-1} = e \in H$, and hence $x \sim x$, for every $x \in G$;
  • symmetry: $H$ being closed under inverses implies that $x y^{-1} \in H \Rightarrow y x^{-1} = (x y^{-1})^{-1} \in H$ for every $x, y \in G$;
  • transitivity: $H$ being closed under the binary operation implies that $x y^{-1} \in H$ and $y z^{-1} \in H$ $\Rightarrow x z^{-1} = (x y^{-1})(y z^{-1}) \in H$ for every $x, y, z \in G$.

The proof is similar for the relation defined by $x^{-1} y \in H$.

Let $H$ be a subgroup of $G$ and $\sim$ be the equivalence relation defined by $x y^{-1} \in H$. An equivalence class for $\sim$ is called a right coset of $H$. Being equivalence classes, right cosets form a partition of $G$. For $x \in G$, the right coset to which $x$ belongs is easily seen to be $Hx = \{hx : h \in H\}$. Similarly, an equivalence class for the relation defined by $x^{-1} y \in H$ is called a left coset of $H$. Left cosets form a partition of $G$, and for $x \in G$, the left coset to which $x$ belongs is $xH = \{xh : h \in H\}$.

When $G$ is abelian, the set of right cosets and the set of left cosets are the same, but when $G$ is non-abelian this is not necessarily the case. Note that $H$ itself is both a right and a left coset.

A cornerstone of group theory is Lagrange's theorem, which essentially follows from the fact that right cosets (as well as left cosets) all have the same size.

Theorem 5.1 (Lagrange's Theorem). Let $G$ be a finite group. Then the order of any subgroup $H$ of $G$ divides the order of $G$.

Proof

Let $H$ be a subgroup of $G$. For every $x \in G$, the mapping $h \mapsto hx$ is a bijection from $H$ to the right coset $Hx$: it is obviously surjective, and it is injective since $hx = h'x$ implies $h = h'$. Hence, all right cosets have $|H|$ elements. Since right cosets form a partition of $G$, we have $|G| = k |H|$, where $k$ is the number of right cosets.

A similar reasoning with left cosets shows that the number of left cosets is equal to the number of right cosets.

The number of right (or left) cosets of $H$ is called the index of $H$ in $G$ and denoted $[G : H]$. Hence, Lagrange's theorem states that $|G| = [G : H] \, |H|$.

Normal Subgroups

Having defined an equivalence relation associated with a subgroup, one may ask whether the set of right (or left) cosets can be equipped with a group structure. This is where the notion of normal subgroup comes into play.

Let $G$ be a group. A subgroup $H$ of $G$ is said to be normal if $xH = Hx$ for every $x \in G$ (i.e., left and right cosets are equal).

Normality can be characterized by a number of other equivalent conditions. The easiest to check is often the following one.

Proposition 5.8. A subgroup $H$ of $G$ is normal if and only if $x h x^{-1} \in H$ for every $x \in G$ and every $h \in H$.

For abelian groups, the situation is pretty simple.

Proposition 5.9. Every subgroup of an abelian group is normal.

Proof

If $G$ is abelian and $H$ is a subgroup of $G$, then for any $x \in G$ and $h \in H$, $x h x^{-1} = h x x^{-1} = h \in H$. By Proposition 5.8, this implies that $H$ is normal.

Let us see now how normal subgroups allow us to construct quotient groups.

Quotient Groups

Let $G$ be a group and let $\sim$ be an equivalence relation on $G$. We say that $\sim$ is compatible with the group structure of $G$ if $x \sim x'$ and $y \sim y'$ implies $xy \sim x'y'$. If $\sim$ is compatible with the group structure of $G$, then one can equip the quotient set $G/\!\!\sim$ (the set of all equivalence classes) with a binary operation defined as $\bar{x} \, \bar{y} = \overline{xy}$, where $\bar{x}$ denotes the equivalence class of $x$. This is well defined as compatibility of $\sim$ with the group structure ensures that this binary operation does not depend on the specific representatives $x$ and $y$ of each equivalence class. The following proposition states that normal subgroups completely characterize the equivalence relations which are compatible with the group structure of $G$.

Proposition 5.10. Let $G$ be a group and $H$ be a normal subgroup of $G$. Then the equivalence relation $\sim$ defined by $x \sim y \iff x y^{-1} \in H$ is compatible with the group structure of $G$. Conversely, let $\sim$ be an equivalence relation compatible with the group structure of $G$. Then $H := \bar{e}$ is a normal subgroup of $G$ and $x \sim y \iff x y^{-1} \in H$.

Proof

Let $H$ be a normal subgroup of $G$. Let us show that $\sim$ defined by $x \sim y \iff x y^{-1} \in H$ (which is an equivalence relation by Proposition 5.7) is compatible with the group structure of $G$. Let $x, x', y, y' \in G$ such that $x \sim x'$ and $y \sim y'$. We want to show that $xy \sim x'y'$, i.e., $xy(x'y')^{-1} \in H$. Note that $xy(x'y')^{-1} = x y y'^{-1} x'^{-1} = \big(x (y y'^{-1}) x^{-1}\big)\,(x x'^{-1})$. We have $x (y y'^{-1}) x^{-1} \in H$ because $y y'^{-1} \in H$, which implies that $x (y y'^{-1}) x^{-1} \in H$ because $H$ is normal. We also have $x x'^{-1} \in H$ because $x \sim x'$. Hence, $xy(x'y')^{-1} \in H$ and $xy \sim x'y'$.

Conversely, assume that $\sim$ is an equivalence relation which is compatible with the group structure of $G$. Define $H := \bar{e}$ as the equivalence class of the identity element. Let us first show that $H$ is a subgroup. Clearly, $e \in H$. Let $x, y \in H$, i.e., $x \sim e$ and $y \sim e$. Then, by compatibility of $\sim$ with the group structure, $y \sim e$ implies $y y^{-1} \sim e y^{-1}$, i.e., $e \sim y^{-1}$, and in turn $x y^{-1} \sim e e = e$. Hence $x y^{-1} \in H$, and by Proposition 5.5, $H$ is a subgroup.

To show that $H$ is normal, let us show that $x h x^{-1} \in H$ for every $x \in G$ and $h \in H$. Let $x \in G$ and $h \in H$, i.e., $h \sim e$. Then $x h x^{-1} \sim x e x^{-1} = e$. Hence $x h x^{-1} \in H$ and $H$ is normal.

It remains to show that $x \sim y \iff x y^{-1} \in H$. By compatibility of $\sim$ with the group structure, we have $x \sim y \Rightarrow x y^{-1} \sim y y^{-1} = e$ and $x y^{-1} \sim e \Rightarrow x = (x y^{-1}) y \sim e y = y$. Hence, $x \sim y \iff x y^{-1} \in H$, which concludes the proof.

Let $H$ be a normal subgroup of $G$ and let $\sim$ be the equivalence relation defined by $x \sim y \iff x y^{-1} \in H$. Then the quotient set $G/\!\!\sim$ equipped with the binary operation defined by $\bar{x} \, \bar{y} = \overline{xy}$ is a group (as shown in the proposition below) called the quotient group associated with $G$ and $H$ and denoted $G/H$. Note that the order of $G/H$ is the index $[G : H]$ of $H$.

Proposition 5.11. Let $G$ be a group and $H$ be a normal subgroup of $G$. Then $G/H$ is a group. Its identity element is $\bar{e} = H$ and the inverse of $\bar{x}$ is $\overline{x^{-1}}$. If $G$ is abelian then so is $G/H$.

Proof

This follows straightforwardly from the definition of the binary operation on $G/H$.

Homomorphisms and Isomorphisms

Let $G_1$ and $G_2$ be two groups.

  • A group homomorphism is a function $f$ from $G_1$ to $G_2$ such that $f(xy) = f(x) f(y)$ for every $x, y \in G_1$.
  • If $G_1 = G_2$, then $f$ is called a group endomorphism.
  • If $f$ is bijective, then $f$ is called a group isomorphism and groups $G_1$ and $G_2$ are said to be isomorphic, denoted $G_1 \cong G_2$.
  • If $G_1 = G_2$ and $f$ is bijective, then $f$ is called a group automorphism (hence, a group automorphism is both a group endomorphism and a group isomorphism).

The following proposition gives a number of properties of group homomorphisms.

Proposition 5.12. Let $G_1$ and $G_2$ be two groups and $f \colon G_1 \to G_2$ be a group homomorphism. Then:

  • $f(e_1) = e_2$ and $f(x^{-1}) = f(x)^{-1}$ for every $x \in G_1$;
  • for every subgroup $H_1$ of $G_1$, $f(H_1)$ is a subgroup of $G_2$;
  • for every subgroup $H_2$ of $G_2$, $f^{-1}(H_2)$ is a subgroup of $G_1$;
  • if $f$ is a group isomorphism, then the inverse function $f^{-1}$ is also a group isomorphism;
  • if $G_3$ is another group and $g \colon G_2 \to G_3$ is a group homomorphism, then $g \circ f$ is a group homomorphism from $G_1$ to $G_3$.

Let $f \colon G_1 \to G_2$ be a group homomorphism. Two sets related to $f$ are particularly important:

  • The kernel of $f$ is the subset of $G_1$ defined as $\ker(f) = \{x \in G_1 : f(x) = e_2\}$.
  • The image of $f$ is the subset of $G_2$ defined as $\mathrm{im}(f) = \{f(x) : x \in G_1\}$.

By Proposition 5.12, $\ker(f)$ is a subgroup of $G_1$ since it is equal to $f^{-1}(\{e_2\})$, and $\mathrm{im}(f)$ is a subgroup of $G_2$ since it is equal to $f(G_1)$.

Theorem 5.2 (First Isomorphism Theorem). Let $f \colon G_1 \to G_2$ be a group homomorphism. Then $\ker(f)$ is a normal subgroup of $G_1$ and $G_1/\ker(f) \cong \mathrm{im}(f)$.

Proof

Let us first show that $\ker(f)$ is normal. Let $x \in G_1$ and $h \in \ker(f)$. Then $f(x h x^{-1}) = f(x) f(h) f(x)^{-1} = f(x) \, e_2 \, f(x)^{-1} = e_2$. Hence $x h x^{-1} \in \ker(f)$ and hence $\ker(f)$ is normal.

Consider now the mapping $\bar{f} \colon G_1/\ker(f) \to \mathrm{im}(f)$ defined by $\bar{f}(\bar{x}) = f(x)$. It is well-defined since $x y^{-1} \in \ker(f) \iff f(x) = f(y)$. In other words, equivalence classes are just subsets of elements of $G_1$ with the same image under $f$. Consequently, the definition of $\bar{f}$ does not depend on the representative $x$ of the equivalence class.

It is a group homomorphism since $\bar{f}(\bar{x} \bar{y}) = \bar{f}(\overline{xy}) = f(xy) = f(x) f(y) = \bar{f}(\bar{x}) \bar{f}(\bar{y})$. Moreover, it is injective since $\bar{f}(\bar{x}) = e_2 \iff f(x) = e_2 \iff \bar{x} = \bar{e}$. It is also surjective since any element in $\mathrm{im}(f)$ is of the form $f(x)$ for some $x \in G_1$ and hence equal to $\bar{f}(\bar{x})$. Hence, $\bar{f}$ is a group isomorphism.

There are three other isomorphism theorems but they are not as useful as the first one.

Group Generation

Let $G$ be a group and $S$ be a subset of $G$. The subgroup generated by $S$, denoted $\langle S \rangle$, is the intersection of all subgroups of $G$ containing $S$. (Recall that by Proposition 5.6, an intersection of subgroups is a subgroup.) Informally, it is the "smallest" (for inclusion) subgroup of $G$ which contains $S$: any subgroup containing $S$ contains $\langle S \rangle$.

The following proposition gives a more explicit characterization.

Proposition 5.13. Let $G$ be a group and $S$ be a subset of $G$. Then $\langle S \rangle$ is the subgroup of all elements of $G$ that can be expressed as a finite product of elements of $S$ and inverses of elements of $S$.

Note in particular that $\langle \emptyset \rangle = \{e\}$ (in which case Proposition 5.13 still holds with the convention that an empty product of group elements is equal to $e$).

If $S = \{x_1, \dots, x_n\}$ is finite, the subgroup generated by $S$ is also denoted $\langle x_1, \dots, x_n \rangle$. When $G$ is abelian, then $\langle x_1, \dots, x_n \rangle = \{x_1^{a_1} \cdots x_n^{a_n} : a_1, \dots, a_n \in \mathbb{Z}\}$.

A group $G$ is said to be finitely generated if there exists a finite number of elements $x_1, \dots, x_n \in G$ such that $G = \langle x_1, \dots, x_n \rangle$, in which case $\{x_1, \dots, x_n\}$ is called a generating set of $G$.

A group $G$ is said to be cyclic (or monogenous2) if there exists $g \in G$ such that $G = \langle g \rangle$, in which case $g$ is called a generator of $G$.

The order of an element $x \in G$ is the order of the subgroup $\langle x \rangle$. If $G$ has infinite order, the order of an element can be finite or infinite.

Below we list a number of properties of the order of an element.

Proposition 5.14. Let $G$ be a group and $x \in G$ be a group element. Then $x$ has finite order if and only if there exists $n \in \mathbb{N}^*$ such that $x^n = e$. In that case, $x$'s order is the smallest integer $n \in \mathbb{N}^*$ such that $x^n = e$, and one has $\langle x \rangle = \{e, x, x^2, \dots, x^{n-1}\}$.

Proposition 5.15. If $G$ has finite order $n$, then the order of any element divides $n$. In particular, $x^n = e$ for any $x \in G$.

Proof

The first part is a direct consequence of Lagrange's Theorem. For the second part, let $d$ be the order of $x$ and write $n = dk$. Then $x^n = (x^d)^k = e^k = e$.

Proposition 5.16. Let $G$ be a group and $x \in G$ be an element of order $n$. Then for every $m \in \mathbb{Z}$, $x^m = e$ if and only if $n$ divides $m$.

Proof

If $n$ divides $m$, then $m = nk$ for some integer $k$, which implies $x^m = (x^n)^k = e$. Conversely, assume that $x^m = e$. By Euclid's division lemma, there exist $q, r \in \mathbb{Z}$ such that $m = nq + r$ and $0 \le r < n$. Then $e = x^m = (x^n)^q x^r = x^r$, and consequently $x^r = e$. This implies that $r = 0$, as otherwise $x$ would have order less than $n$. Hence, $m = nq$ and $n$ divides $m$.

Proposition 5.17. Let $G$ be a group, $x \in G$ be an element of order $n$, and $m \in \mathbb{Z}$. Then the order of $x^m$ is $n/\gcd(m, n)$.

Proof

Let $d = \gcd(m, n)$ and let $k$ be the order of $x^m$. Then $x^{mk} = (x^m)^k = e$ and hence, by Proposition 5.16, $n \mid mk$, which implies $(n/d) \mid (m/d)k$. Since $\gcd(n/d, m/d) = 1$, this implies $(n/d) \mid k$.

On the other hand, $(x^m)^{n/d} = (x^n)^{m/d} = e$, and hence, by Proposition 5.16, $k \mid n/d$. We conclude that $k = n/d$.
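As a quick numerical illustration of Proposition 5.17 (our own, not from the book), one can brute-force element orders in the multiplicative group of non-zero residues modulo 7:

```python
from math import gcd

def order_mod(x: int, n: int) -> int:
    """Multiplicative order of x modulo n: smallest k >= 1 with x^k = 1 (mod n).

    Assumes gcd(x, n) == 1, so such a k exists.
    """
    k, y = 1, x % n
    while y != 1:
        y = (y * x) % n
        k += 1
    return k

n = order_mod(3, 7)                                     # 3 has order 6 modulo 7
assert order_mod(pow(3, 2, 7), 7) == n // gcd(n, 2)     # ord(3^2) = 6/gcd(6,2) = 3
```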

Properties of Cyclic Groups

Proposition 5.18. Any cyclic group is abelian.

Proof

Let $G$ be a cyclic group, let $g$ be a generator of $G$, and let $x, y$ be two group elements. Then there exist $m, n \in \mathbb{Z}$ such that $x = g^m$ and $y = g^n$, which implies that $xy = g^m g^n = g^{m+n} = g^n g^m = yx$.

Proposition 5.19. Let $G$ be a cyclic group and $H$ be a subgroup of $G$. Then $H$ and $G/H$ are cyclic.

Proof

Let $G$ be a cyclic group, $g$ be a generator of $G$, and $H$ be a subgroup of $G$. Let us first show that $H$ is cyclic. If $H = \{e\}$ then $H$ is clearly cyclic. Otherwise, let $d$ be the smallest positive integer such that $g^d \in H$ (which necessarily exists since $H$ contains at least one element different from $e$, and either this element or its inverse can be written $g^k$ for some $k > 0$). We will prove that $H = \langle g^d \rangle$. Clearly, $\langle g^d \rangle \subseteq H$ since $H$ is a subgroup. Conversely, let $x \in H$. Then $x = g^m$ for some $m \in \mathbb{Z}$. By Euclid's division lemma, there exist $q, r \in \mathbb{Z}$ such that $m = dq + r$ and $0 \le r < d$. Then $g^r = x (g^d)^{-q} \in H$, and consequently $r = 0$, as otherwise this would contradict the minimality of $d$. Hence $m = dq$ and $x = (g^d)^q$. Thus, $H = \langle g^d \rangle$ and hence $H$ is cyclic.

Let us now show that $G/H$ is cyclic. More precisely, let us prove that $G/H = \langle \bar{g} \rangle$. Let $\bar{x}$ be an element of the quotient group $G/H$, specified by an arbitrary representative $x \in G$. Then there exists $m \in \mathbb{Z}$ such that $x = g^m$. Thus, $\bar{x} = \overline{g^m} = \bar{g}^m$, and hence $\bar{g}$ is a generator of $G/H$.

Proposition 5.20. Any group $G$ with prime order is cyclic and any element different from the identity element is a generator of $G$.

Proof

Let $G$ be a group of prime order $p$. Let $x$ be an element different from the identity element and let $d$ be the order of $x$. Since the order of an element divides the order of the group by Proposition 5.15, one has either $d = 1$ or $d = p$. Since $x \neq e$, one cannot have $d = 1$, hence $d = p$ and $x$ generates $G$.

Proposition 5.21. Let $G$ be a cyclic group of order $n$, $g$ be a generator of $G$, and $m \in \mathbb{Z}$. Then $G = \langle g^m \rangle$ if and only if $\gcd(m, n) = 1$. In particular, $G$ has $\varphi(n)$ generators, where $\varphi$ is Euler's function.

Proof

We have $G = \langle g^m \rangle$ if and only if the order of $g^m$ is $n$, which by Proposition 5.17 is equivalent to $\gcd(m, n) = 1$.

For the second part of the proposition, write $G = \{g^0, g^1, \dots, g^{n-1}\}$. Then generators of $G$ are exactly elements of the form $g^m$ with $0 \le m \le n - 1$ and $\gcd(m, n) = 1$, and hence there are $\varphi(n)$ such elements.
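Specializing to the additive cyclic group of order $n$ generated by $1$, the proposition says the generators are exactly the residues coprime with $n$. A brute-force Python check (our own sketch, comparing the definition of a generator against the coprimality criterion):

```python
from math import gcd

def generates(m: int, n: int) -> bool:
    """Does m generate the additive group of residues modulo n,
    i.e., do its multiples cover all n residues?"""
    return len({(m * k) % n for k in range(n)}) == n

def generators_of_zn(n: int) -> list[int]:
    return [m for m in range(n) if generates(m, n)]

# Proposition 5.21: the generators are exactly the residues coprime with n
n = 12
assert generators_of_zn(n) == [m for m in range(n) if gcd(m, n) == 1]  # [1, 5, 7, 11]
```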

Proposition 5.22. Let $G_1$ and $G_2$ be two cyclic groups of order $n_1$ and $n_2$ respectively. Then the direct product $G_1 \times G_2$ is cyclic if and only if $\gcd(n_1, n_2) = 1$. Moreover, in that case, $(g_1, g_2)$ is a generator of $G_1 \times G_2$ if and only if $g_1$ is a generator of $G_1$ and $g_2$ is a generator of $G_2$.

Proof

Let us first show the following lemma: Let $x_1 \in G_1$ be an element of order $d_1$ and $x_2 \in G_2$ be an element of order $d_2$. Then $(x_1, x_2)$ has order $\mathrm{lcm}(d_1, d_2)$.

Let $e_1$ be the identity element of $G_1$ and $e_2$ be the identity element of $G_2$. Then $(x_1, x_2)^k = (e_1, e_2) \iff x_1^k = e_1 \text{ and } x_2^k = e_2 \iff d_1 \mid k \text{ and } d_2 \mid k$. The smallest such positive integer $k$ is by definition $\mathrm{lcm}(d_1, d_2)$, which proves the lemma.

Let us now prove the proposition. Assume that $G_1 \times G_2$ is cyclic and let $(g_1, g_2)$ be a generator of $G_1 \times G_2$. Then clearly $g_1$ must be a generator of $G_1$ and $g_2$ a generator of $G_2$. By the previous lemma, $(g_1, g_2)$ has order $\mathrm{lcm}(n_1, n_2)$. On the other hand, $(g_1, g_2)$ has order $n_1 n_2$ as it generates $G_1 \times G_2$. Hence, $\mathrm{lcm}(n_1, n_2) = n_1 n_2$, which implies $\gcd(n_1, n_2) = 1$.

Conversely, assume that $\gcd(n_1, n_2) = 1$ and let $g_1$ be a generator of $G_1$ and $g_2$ be a generator of $G_2$. By the lemma, $(g_1, g_2)$ has order $\mathrm{lcm}(n_1, n_2) = n_1 n_2$ and hence generates $G_1 \times G_2$.
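As a small empirical check (our own, not part of the book, and requiring Python 3.9+ for `math.lcm`), one can brute-force the maximal order of an element of a product of two additive cyclic groups, using the lemma that the order of $(a, b)$ is the lcm of the orders:

```python
from math import gcd, lcm

def max_element_order(n1: int, n2: int) -> int:
    """Largest order of an element in the product of additive cyclic groups
    of orders n1 and n2 (the order of (a, b) is lcm(ord(a), ord(b)))."""
    orders = set()
    for a in range(n1):
        for b in range(n2):
            # in an additive cyclic group of order n, the element a has order n/gcd(a, n)
            orders.add(lcm(n1 // gcd(a, n1), n2 // gcd(b, n2)))
    return max(orders)

# gcd(2, 3) = 1: the product of cyclic groups of orders 2 and 3 is cyclic (an element of order 6)
assert max_element_order(2, 3) == 6
# gcd(2, 2) = 2: the product of two groups of order 2 is not cyclic (no element of order 4)
assert max_element_order(2, 2) == 2
```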

Classification of Cyclic Groups

The set of integers $\mathbb{Z}$ equipped with addition is a cyclic group of infinite order with identity element $0$ and generators $1$ and $-1$.

What are its subgroups? In general, a simple way to construct subgroups of any group is given by the following proposition.

Proposition 5.23. Let $G$ be a group denoted additively and $n$ be an integer. Then $nG = \{nx : x \in G\}$ is a subgroup of $G$.

Proof

First, one has $0 \in nG$ since $0 = n \cdot 0$. Second, let $x, y \in nG$. Then $x = na$ and $y = nb$ for some $a, b \in G$. Hence, $x - y = n(a - b) \in nG$. Thus, by Proposition 5.5, $nG$ is a subgroup of $G$.

Hence, for any integer $n$, the set $n\mathbb{Z} = \{nk : k \in \mathbb{Z}\}$ is a subgroup of $\mathbb{Z}$. By Proposition 5.19, it is cyclic. One can easily check that for $n \neq 0$ it has infinite order and that it is generated by $n$ and $-n$. These are in fact the only subgroups of $\mathbb{Z}$.

Proposition 5.24. Let $H$ be a subgroup of $\mathbb{Z}$. Then there exists a unique $n \in \mathbb{N}$ such that $H = n\mathbb{Z}$.

Proof

Let us show existence first. If $H = \{0\}$ then $H = 0\mathbb{Z}$. Assume now that $H \neq \{0\}$, implying that $H$ contains at least one positive integer (it contains at least one non-zero integer, and either this integer or its inverse is positive). Let $n$ be the smallest positive integer in $H$. Let us show that $H = n\mathbb{Z}$. Clearly, $n\mathbb{Z} \subseteq H$ since $n \in H$. Conversely, let $x \in H$. By Euclid's division lemma, there exist $q, r \in \mathbb{Z}$ such that $x = nq + r$ and $0 \le r < n$. Since $x$ and $nq$ are in $H$, $r = x - nq$ is also in $H$, which implies $r = 0$, as otherwise this would contradict the minimality of $n$. Thus, $x = nq$ and $x \in n\mathbb{Z}$. Hence, $H = n\mathbb{Z}$, which proves existence.

For uniqueness, assume that $n\mathbb{Z} = n'\mathbb{Z}$ for $n, n' \in \mathbb{N}$. Then $n \in n'\mathbb{Z}$ implies $n' \mid n$, and $n' \in n\mathbb{Z}$ implies $n \mid n'$, and consequently $n = n'$ since $n$ and $n'$ are in $\mathbb{N}$.

For any $n \in \mathbb{N}^*$, we can consider the quotient group $\mathbb{Z}/n\mathbb{Z}$. The equivalence class $\bar{a}$ of $a \in \mathbb{Z}$ is also called the residue class of $a$ modulo $n$. There are $n$ distinct classes, namely $\bar{0}, \bar{1}, \dots, \overline{n-1}$. Hence, the index of $n\mathbb{Z}$ is $n$ and $\mathbb{Z}/n\mathbb{Z}$ has order $n$. It is cyclic with generator $\bar{1}$ (and $\overline{-1} = \overline{n-1}$). More generally, by Proposition 5.21, the generators of $\mathbb{Z}/n\mathbb{Z}$ are the classes $\bar{m}$ for $m$ such that $\gcd(m, n) = 1$.

Another way to think of $\mathbb{Z}/n\mathbb{Z}$ is as the set $\{0, 1, \dots, n-1\}$ equipped with "modulo $n$" addition and inverses. More precisely, consider the set of symbols $\{0, 1, \dots, n-1\}$ equipped with the binary operation $a \ast b = (a + b) \bmod n$. This is a group with identity $0$ and inverse $(n - a) \bmod n$. One can easily see that $\mathbb{Z}/n\mathbb{Z}$ and this group are isomorphic, the isomorphism being $\bar{a} \mapsto a \bmod n$. From now on, we will work with the latter formalism.
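This "residues with modular addition" formalism is exactly what one implements in code. A minimal Python sketch (our own illustration):

```python
class Zn:
    """The additive group of residues modulo n, represented by {0, 1, ..., n-1}."""

    def __init__(self, n: int):
        self.n = n
        self.identity = 0

    def op(self, a: int, b: int) -> int:
        """Group operation: addition modulo n."""
        return (a + b) % self.n

    def inv(self, a: int) -> int:
        """Inverse: the element b such that a + b = 0 (mod n)."""
        return (-a) % self.n

# In the group of residues modulo 5: 3 + 4 = 2, and the inverse of 3 is 2
g = Zn(5)
assert g.op(3, 4) == 2
assert g.op(3, g.inv(3)) == g.identity
```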

As for $\mathbb{Z}$, let us characterize the subgroups of $\mathbb{Z}/n\mathbb{Z}$. By Proposition 5.23, for any integer $m$, $m(\mathbb{Z}/n\mathbb{Z})$ is a subgroup of $\mathbb{Z}/n\mathbb{Z}$. What does this subgroup look like? One has $m(\mathbb{Z}/n\mathbb{Z}) = \{ma \bmod n : a \in \mathbb{Z}\} = \{b \in \{0, \dots, n-1\} : \gcd(m, n) \mid b\}$, where the last equality follows from Bézout's lemma.

Hence, letting $d = \gcd(m, n)$, we see that $m(\mathbb{Z}/n\mathbb{Z})$ is the subset of $\{0, \dots, n-1\}$ containing the multiples of $d$, i.e., $\{0, d, 2d, \dots, n-d\}$. In particular, $m(\mathbb{Z}/n\mathbb{Z})$ has order $n/d$ and index $d$. (As we will see shortly, being cyclic, one has $m(\mathbb{Z}/n\mathbb{Z}) \cong \mathbb{Z}/(n/d)\mathbb{Z}$.)

Again, we can show that these are in fact the only subgroups of $\mathbb{Z}/n\mathbb{Z}$.

Proposition 5.25. Let $n \in \mathbb{N}^*$ and $H$ be a subgroup of $\mathbb{Z}/n\mathbb{Z}$. Then there is a unique $d \in \mathbb{N}$ such that $d \mid n$ and $H = d(\mathbb{Z}/n\mathbb{Z})$.

Proof

Let $H$ be a subgroup of $\mathbb{Z}/n\mathbb{Z}$. Let us prove existence first. Let $H' = \{a \in \mathbb{Z} : a \bmod n \in H\}$. Then $H'$ is a subgroup of $\mathbb{Z}$: it is clearly non-empty, and for $x \in H'$ and $y \in H'$, one has $(x - y) \bmod n \in H$, hence $x - y \in H'$. Thus, there exists $d \in \mathbb{N}$ such that $H' = d\mathbb{Z}$. Moreover, $n \in H'$ since $n \bmod n = 0 \in H$, and hence $d \mid n$. Since $H = \{a \bmod n : a \in H'\}$, it follows that $H = d(\mathbb{Z}/n\mathbb{Z})$.

Let us now prove uniqueness. Assume that $d(\mathbb{Z}/n\mathbb{Z}) = d'(\mathbb{Z}/n\mathbb{Z})$ for $d, d' \in \mathbb{N}$ with $d$ and $d'$ dividing $n$. Since $d \in d'(\mathbb{Z}/n\mathbb{Z})$, this implies $d' \mid d$. Conversely, $d \mid d'$, and hence $d = d'$ since $d$ and $d'$ are in $\mathbb{N}$.

We can now prove the following two "structure theorems" stating that, up to isomorphism, $\mathbb{Z}$ and $\mathbb{Z}/n\mathbb{Z}$ are the only cyclic groups of infinite, resp. finite, order.

Theorem 5.3 (Fundamental Theorem of Cyclic Groups). Let $G$ be a cyclic group with generator $g$. Then:

  • If $G$ has infinite order, then it is isomorphic to $\mathbb{Z}$, and the subgroups of $G$ are exactly the subsets $\langle g^n \rangle$ for $n \in \mathbb{N}$.
  • If $G$ has finite order $n$, then it is isomorphic to $\mathbb{Z}/n\mathbb{Z}$, and the subgroups of $G$ are exactly the subsets $\langle g^d \rangle$ for $d$ such that $d \mid n$. In particular, $G$ has exactly one subgroup of order $n/d$ for each divisor $d$ of $n$, namely $\langle g^d \rangle$.

Proof

Let $G$ be a cyclic group and $g$ be a generator of $G$. Consider the mapping $f \colon \mathbb{Z} \to G$ defined by $f(k) = g^k$. Then $f$ is a group homomorphism since $f(k + l) = g^{k+l} = g^k g^l = f(k) f(l)$. It is clearly surjective since $g$ is a generator of $G$.

Consider first the case where $G$ has infinite order. Let us show that $f$ is injective. Assume that $f(k) = f(l)$ for distinct integers $k$ and $l$ with $k > l$. Then $g^{k-l} = e$ with $k - l > 0$, contradicting the fact that $g$ has infinite order. Hence, $f$ is an isomorphism from $\mathbb{Z}$ to $G$. Since the subgroups of $\mathbb{Z}$ are exactly $n\mathbb{Z}$ for $n \in \mathbb{N}$, the subgroups of $G$ are exactly $f(n\mathbb{Z}) = \{g^{nk} : k \in \mathbb{Z}\} = \langle g^n \rangle$.

Consider now the case where $G$ has finite order $n$. Let us show that $\ker(f) = n\mathbb{Z}$. Since $g$ has order $n$, by Proposition 5.16, $g^k = e$ if and only if $n \mid k$, which exactly means that $\ker(f) = n\mathbb{Z}$. Hence, by the First Isomorphism Theorem, $G \cong \mathbb{Z}/n\mathbb{Z}$.

Let $\bar{f} \colon \mathbb{Z}/n\mathbb{Z} \to G$ be the isomorphism defined by $\bar{f}(\bar{k}) = g^k$. Since the subgroups of $\mathbb{Z}/n\mathbb{Z}$ are exactly $d(\mathbb{Z}/n\mathbb{Z})$ for $d$ such that $d \mid n$, the subgroups of $G$ are exactly $\bar{f}(d(\mathbb{Z}/n\mathbb{Z})) = \langle g^d \rangle$.

Cauchy's Theorem for Abelian Groups

Lagrange's theorem states that the order of a subgroup of a finite group divides the order of the group. Conversely, given a divisor of the order of the group, does there always exist a subgroup of order ? The answer is no in general. A group where this property holds is called a converse Lagrange theorem (CLT) group. However, there are specific cases where the existence of a subgroup is guaranteed. Cauchy's theorem states that for every group (not necessarily abelian) of finite order and every prime divisor of has a subgroup of order Cauchy's theorem is a special case of Sylow's (first) theorem.

Here, we will only prove it in the easier case where is abelian. Actually, a more general result (that will follow easily from the fundamental theorem of finite abelian groups) is that any finite abelian group is CLT. In other words, for a finite abelian group of order the existence of a subgroup of order is guaranteed for any divisor of not only for prime divisors.

Theorem 5.4 (Cauchy's Theorem, Abelian Case). Let be an abelian group of finite order Then, for every prime divisor of there exists a subgroup of of order (or, equivalently, there exists an element of of order ).

Proof

The equivalence between the two conclusions follows from the fact that a group of prime order is necessarily cyclic.

Let be a generating set of and for let denote the order of Consider the mapping defined by Since is abelian, it is a group homomorphism. Moreover, since is a generating set, is surjective (indeed, by definition of a generating set of an abelian group, any element can be written as for some integers and for each By the First Isomorphism Theorem, is isomorphic to which implies that and hence divides

Let be a prime divisor of Then divides and hence divides for some Then we can write and by Proposition 5.17, has order which concludes the proof.
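Cauchy's theorem is easy to illustrate numerically. The Python sketch below (helper name ours) searches Z/4Z x Z/6Z, an abelian group of order 24, for elements of order 2 and of order 3, using the fact that the order of a tuple in a direct product is the lcm of the orders of its components:

```python
from itertools import product
from math import gcd, lcm

def order_in_product(moduli, g):
    """Order of g = (g_1, ..., g_k) in Z/n_1Z x ... x Z/n_kZ:
    the lcm of the component orders n_i / gcd(g_i, n_i)."""
    return lcm(*(n // gcd(x, n) for n, x in zip(moduli, g)))

moduli = (4, 6)                  # G = Z/4Z x Z/6Z, |G| = 24 = 2^3 * 3
for p in (2, 3):                 # the prime divisors of |G|
    witnesses = [g for g in product(*(range(n) for n in moduli))
                 if order_in_product(moduli, g) == p]
    assert witnesses             # Cauchy: an element of order p exists
```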

Exponent of a Group

Let be a group. Consider the subset of defined as One can easily check that this is a subgroup of Hence, by Proposition 5.24, there is a unique integer such that this subgroup is equal to This integer is called the exponent of Equivalently, it is defined as the smallest positive integer such that If no such integer exists, depending on the convention, is said to have exponent 0 or infinite exponent. A finite group of order necessarily has finite exponent satisfying (since by Proposition 5.15, for every and hence Moreover, the order of any group element divides by Proposition 5.16. Conversely, a group with infinite exponent necessarily has infinite order. However, a group with finite exponent is not necessarily finite.

In the following, we prove that an abelian group with finite exponent always contains an element of order This will be a key lemma for proving the fundamental theorem of finite abelian groups. Note that none of the three following propositions holds for a non-abelian group.

Proposition 5.26. Let be an abelian group and and be two elements of respective order and such that Then the order of is More generally, if are group elements of respective orders such that then the order of is

Proof

Let be the order of Since by Proposition 5.16, we have

On the other hand, since one has and hence This implies that hence But since Symmetrically, one also has Since this implies and hence The generalization can be proved by induction on

Proposition 5.27. Let be an abelian group and and be two elements of respective order and Then there exists an element of of order

Proof

Let be the prime factor decomposition of For each divides either or Say it divides (the reasoning is similar if it divides Then, by Proposition 5.17, has order Hence, for each there exists an element of order By Proposition 5.26, has order

Proposition 5.28. Let be an abelian group of finite exponent Then there exists an element of of order

Proof

By Proposition 5.16, the order of any group element divides In particular, all group elements have order at most and hence there exists a group element of maximal order Assume towards a contradiction that If the order of every group element divides then for every contradicting the minimality of Otherwise, assume that there is a group element of order which does not divide Then, by Proposition 5.27, there exists an element of order contradicting the maximality of Hence, it must be that which concludes the proof.

As a direct corollary, we have that a finite abelian group is cyclic if and only if its order is equal to its exponent.
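This corollary can be checked by brute force on small direct products of cyclic groups, whose exponent is the lcm of all element orders. A Python sketch (helper name ours):

```python
from itertools import product
from math import gcd, lcm

def exponent(moduli):
    """Exponent of Z/n_1Z x ... x Z/n_kZ, computed as the lcm of the orders
    of all its elements (it equals lcm(n_1, ..., n_k))."""
    orders = (lcm(*(n // gcd(x, n) for n, x in zip(moduli, g)))
              for g in product(*(range(n) for n in moduli)))
    return lcm(*orders)

# Z/2Z x Z/3Z has order 6 and exponent 6: it is cyclic (isomorphic to Z/6Z).
assert exponent((2, 3)) == 6
# Z/2Z x Z/2Z (the Klein group) has order 4 but exponent 2: not cyclic.
assert exponent((2, 2)) == 2
```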

Structure Theorem for Finite Abelian Groups

This section presents the fundamental theorem of finite abelian groups, sometimes called Kronecker theorem. As we will see, finite abelian groups can be "decomposed" in two ways. The equivalence between these two decompositions relies on the following theorem.

Theorem 5.5 (Chinese Remainder Theorem for Groups). Let and be two positive integers. Then

Proof

Assume that Consider the mapping defined by One can easily check that is a group homomorphism. Moreover, where the last equivalence follows from Hence, By the First Isomorphism Theorem, is isomorphic to In particular, hence Thus,

Conversely, assume that By Proposition 5.22, is not cyclic. As is cyclic, these two groups cannot be isomorphic.
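One can verify the Chinese remainder theorem for groups exhaustively on small moduli: the reduction map is a bijection exactly when the moduli are coprime. A Python sketch (helper name ours):

```python
from math import gcd

def crt_map_is_bijection(m, n):
    """Is x -> (x mod m, x mod n) a bijection from Z/mnZ to Z/mZ x Z/nZ?"""
    images = {(x % m, x % n) for x in range(m * n)}
    return len(images) == m * n

# Bijective exactly when gcd(m, n) = 1, as the Chinese remainder theorem states:
assert crt_map_is_bijection(3, 4)       # Z/12Z is isomorphic to Z/3Z x Z/4Z
assert not crt_map_is_bijection(2, 4)   # Z/8Z is cyclic but Z/2Z x Z/4Z is not
for m in range(1, 8):
    for n in range(1, 8):
        assert crt_map_is_bijection(m, n) == (gcd(m, n) == 1)
```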

Theorem 5.6 (Fundamental Theorem of Finite Abelian Groups). Let be a non-trivial finite abelian group. Then:

  • (primary decomposition): is isomorphic to a direct product of cyclic groups where the 's are (not necessarily distinct) primes and the 's are positive integers. This decomposition is unique up to the order of factors. The prime powers are called the elementary divisors of
  • (invariant factor decomposition): is isomorphic to a direct product of cyclic groups where the 's are positive integers such that for This decomposition is unique and is the exponent of the group. The integers are called the invariant factors of

Proof

TODO

Consider for example What are the primary and invariant factor decompositions of this group? By the Chinese remainder theorem, we have The penultimate form is the primary decomposition, while the last form is the invariant factor decomposition.

The smallest non-cyclic abelian group has order 4; it is usually called the Klein group.

Proposition 5.29. A finite abelian group is cyclic if and only if in its invariant factor decomposition.

For an integer one may ask how many different abelian groups of order there are, up to isomorphism. Let denote the partition function defined as follows: for an integer is the number of distinct ways of writing as a sum of positive integers, where the order of these integers does not matter. For example, since has partitions: and

Proposition 5.30 (Number of Finite Abelian Groups of Fixed Order). Let be an integer and let its decomposition in prime factors be Then the number of abelian groups of order up to isomorphism, is where is the partition function. In particular, there is a unique abelian group of order up to isomorphism (namely if and only if is square-free, i.e.,

For example, there is a unique (up to isomorphism) abelian group of order namely (primary/invariant factor decomposition). On the other hand, there are two (up to isomorphism) abelian groups of order namely and
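Proposition 5.30 is straightforward to implement. The Python sketch below (helper names ours) computes the partition function by recursion on the largest allowed part and multiplies the values over the prime factorization:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def partitions(n, largest=None):
    """Partition function P(n): the number of ways to write n as a sum of
    positive integers, order irrelevant (with P(0) = 1 by convention)."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    return sum(partitions(n - k, k) for k in range(1, min(n, largest) + 1))

def num_abelian_groups(n):
    """Number of abelian groups of order n up to isomorphism: the product of
    P(a_i) over the prime factorization n = p_1^a_1 * ... * p_k^a_k."""
    count, p = 1, 2
    while n > 1:
        if n % p == 0:
            a = 0
            while n % p == 0:
                n //= p
                a += 1
            count *= partitions(a)
        p += 1
    return count

assert partitions(4) == 5            # 4, 3+1, 2+2, 2+1+1, 1+1+1+1
assert num_abelian_groups(8) == 3    # Z/8Z, Z/2Z x Z/4Z, Z/2Z x Z/2Z x Z/2Z
assert num_abelian_groups(30) == 1   # 30 = 2*3*5 is square-free
```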

Structure Theorem for Finitely Generated Abelian Groups

We now consider abelian groups which are finitely generated (but not necessarily of finite order).

Let be an abelian group. For an element is said to be an -torsion element if (or equivalently, if has finite order dividing An element is said to be a torsion element if it has finite order.

Proposition 5.31. Let be an abelian group. Then the set of all -torsion elements of denoted is a subgroup called the -torsion subgroup of and the set of all torsion elements of denoted is a subgroup called the torsion subgroup of

Proof

First, is clearly an -torsion element. Let and be two -torsion elements. Then hence is an -torsion element. Hence is a subgroup of by Proposition 5.5.

Similarly, is a torsion element. Let and be two torsion elements of order respectively and Then Hence has finite order, i.e., it is a torsion element. Hence is a subgroup of

If then is called a torsion group (or periodic group). If then is said to be torsion-free.

Theorem 5.7 (Fundamental Theorem of Finitely Generated Abelian Groups). Let be a finitely generated abelian group. Let be the torsion subgroup of Then is finite and abelian and there exists a free abelian subgroup such that In particular, there exists an integer (called the free rank or simply rank of and integers with such that The integers and are unique.

Proof

TODO


1: While the direct sum is the same as the direct product for a finite number of groups, this is not the case for an infinite number of groups. See this StackExchange question.

2: Sometimes cyclic is used for groups which are both monogenous and finite, which makes sense since for an infinite monogenous group such as one never "cycles back" when computing

Chapter status:   👷   in progress   👷

TODO:

Polynomials

In this chapter, we cover general results about polynomials with coefficients in a ring.

We recall some abbreviations from the chapter about rings:

  • UCR stands for unitary commutative ring,
  • PID stands for principal ideal domain,
  • UFD stands for unique factorization domain.

Contents

Generalities

Let be a ring. A (univariate) polynomial with coefficients in is an infinite sequence such that for all but a finite number of indices Polynomials are traditionally denoted where is a symbol called indeterminate.

The set of all univariate polynomials over is denoted

Polynomials can be added:

Polynomials can also be multiplied:

Proposition 7.1. Let be a ring. Then equipped with operations and as defined above is a ring. If is commutative, then so is and if has a unity 1, then the constant polynomial is the unity of

The set is called the polynomial ring in over If we embed into by identifying with the constant polynomial then is a subring of

The degree of a polynomial is the largest power of occurring in with a non-zero coefficient, with the convention that polynomial has degree Hence, has degree provided We let denote the degree of a polynomial and denote the set of polynomials over of degree at most The leading term of is its highest degree term and the leading coefficient is If the leading coefficient is 1 then the polynomial is said to be monic.

We will often drop the indeterminate from the notation, simply writing "polynomial ", keeping it mostly when writing the polynomial coefficients explicitly as in

Proposition 7.2. Let be a ring and let and be two polynomials in Then Moreover, if the leading coefficient of either or is not a zero divisor, then

Proof

The first two inequalities follow straightforwardly from the definition of addition and multiplication in For the last part of the proposition, assume that has leading term and has leading term with and Then has leading term with as otherwise and would be zero divisors. Hence,

On the other hand, when has zero divisors, then one might have if the leading terms of and are and with
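This degree drop is easy to observe concretely. The Python sketch below (helper names ours, coefficient lists stored low-to-high) multiplies 2X + 1 and 3X + 1 over Z/6Z, where the leading coefficients 2 and 3 are zero divisors:

```python
def poly_mul(a, b, n):
    """Multiply polynomials with coefficients in Z/nZ (a[i] = coefficient of X^i)."""
    res = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            res[i + j] = (res[i + j] + ai * bj) % n
    while len(res) > 1 and res[-1] == 0:   # drop leading zeros
        res.pop()
    return res

def degree(p):
    """Degree of a coefficient list; None stands in for the zero polynomial."""
    return len(p) - 1 if p != [0] else None

# Over Z/6Z, (2X + 1)(3X + 1) = 6X^2 + 5X + 1 = 5X + 1: the degree drops
# because 2 * 3 = 0 mod 6.
assert poly_mul([1, 2], [1, 3], 6) == [1, 5]
# Over Z/5Z (a field, hence no zero divisors) the degrees add up:
assert degree(poly_mul([1, 2], [1, 3], 5)) == 2
```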

When the coefficients are in an integral domain, we have additional properties.

Proposition 7.3. Let be an integral domain. Then

  • is an integral domain;
  • for any polynomials
  • the units of are the constant polynomials for

Proof

Let and be two non-zero polynomials in If has leading term and has leading term with and then has leading term with since is an integral domain. Hence, is not the zero polynomial. This also shows that Clearly, for any unit the constant polynomial is a unit with inverse the constant polynomial Conversely, if polynomials and are such that then (by the second point) necessarily i.e., and are constant polynomials, and by definition these constants are units of

Divisibility in Polynomial Rings

All definitions regarding divisibility that we gave for general rings apply to polynomial rings. We restate these definitions here for convenience.

Let be a UCR. Given two polynomials and in we say that divides or that is a factor of or that is a multiple of denoted if there exists a polynomial such that

Proposition 7.4. Let be a UCR and be a non-zero polynomial. Then for every such that the leading coefficient of is not a zero divisor, In particular, this always holds when is an integral domain.

Proof

Let be a polynomial dividing By definition, there exists such that Since the leading coefficient of is not a zero divisor, by Proposition 7.2, Note that cannot be the zero polynomial as this would imply hence and thus

It is easy to see that this proposition does not hold when the leading coefficient of is a zero divisor: for example, over and hence

Two polynomials and are said to be associates if and By [??], over an integral domain are associates if and only if there exists such that

In general, might not be Euclidean (as we will see shortly, this holds if and only if is a field). However, one can perform division with remainder for polynomials as soon as the leading coefficient of the divisor is a unit.

Proposition 7.5. Let be a UCR. Then for every polynomials such that the leading coefficient of is in (i.e., a unit), there exist unique polynomials and such that and

Proof

Consider the set of all polynomials of the form for Let be a polynomial of minimal degree in and be such that Let us show that Indeed, assume that this does not hold. Let and be the leading terms of and respectively, with and Consider the polynomial defined as Then Since the leading terms of and are both they cancel and the leading term of has degree at most so that contradicting the assumption that has minimal degree in Hence, which proves existence of a suitable pair

Let us show uniqueness. Assume that there exist two pairs of polynomials such that and Then Assume that Note that the leading coefficient of is a unit and hence, by [??], is not a zero divisor. Hence, by Proposition 7.2, where the last inequality holds because On the other hand, by Proposition 7.2, This is a contradiction and hence we must have and hence proving uniqueness.
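The division procedure used in the existence proof can be written out directly. Below is a Python sketch (helper name ours; coefficient lists low-to-high) that divides by a polynomial whose leading coefficient is a unit, over a ring that is not a field:

```python
def poly_divmod(a, b, n):
    """Division with remainder in (Z/nZ)[X]: returns (q, r) with a = q*b + r
    and deg r < deg b. The leading coefficient of b must be a unit modulo n."""
    lead_inv = pow(b[-1], -1, n)          # inverse of the (unit) leading coefficient
    r = [c % n for c in a]
    q = [0] * max(len(a) - len(b) + 1, 1)
    while len(r) >= len(b) and any(r):
        d = len(r) - len(b)               # degree gap
        c = r[-1] * lead_inv % n          # coefficient cancelling the leading term
        q[d] = c
        for i, bi in enumerate(b):
            r[i + d] = (r[i + d] - c * bi) % n
        while len(r) > 1 and r[-1] == 0:  # drop leading zeros
            r.pop()
    return q, r

# Over Z/6Z (not a field), 5X + 1 has unit leading coefficient (5 * 5 = 1 mod 6):
q, r = poly_divmod([1, 2, 3], [1, 5], 6)  # divide 3X^2 + 2X + 1 by 5X + 1
assert q == [1, 3] and r == [0]           # 3X^2 + 2X + 1 = (3X + 1)(5X + 1) mod 6
```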

Ring vs. Polynomial Ring: Summary

| | | impl./equ. | see |
|---|---|---|---|
| integral domain | integral domain | | Proposition 7.3 |
| UFD | UFD | | |
| PID/Euclidean | UFD | | |
| field | PID/Euclidean | | |

Polynomial Evaluation and Roots

Let be a commutative ring. Given a polynomial in and an element the evaluation of at written is The function from to mapping to is called the polynomial function associated with

In general, there is not a one-to-one correspondence between polynomials and polynomial functions. For example, over a finite commutative ring the polynomial evaluates to 0 at every element but is clearly different from the constant polynomial 0. Hence, this gives an example where two different polynomials yield the same polynomial function.

As we will see below, if is an infinite integral domain, though, there is a one-to-one correspondence between polynomials and polynomial functions.

We say that is a root of if The following result gives an important necessary and sufficient condition for a ring element to be a root of a polynomial.

Theorem 7.1 (factor theorem). Let be a UCR, be a polynomial and be a ring element. Then is a root of if and only if divides

The factor theorem is actually a special case of the following result.

Theorem 7.2 (polynomial remainder theorem). Let be a UCR, be a polynomial and be ring elements. Then if and only if divides

Proof

Assume that divides Then there exists such that Evaluating the two sides of this equality at yields i.e.,

Conversely, assume that Since the leading coefficient of is a unit, by Proposition 7.5, there exist polynomials and such that where Hence, the polynomial must be a constant. Evaluating the polynomial equality at we obtain that Hence, which exactly means that divides

Note that the statement " divides " is equivalent to the statement " is the remainder of the division of by ", hence the name of the theorem.
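The polynomial remainder theorem can be checked concretely via synthetic division, which divides by a linear factor in a single pass. A Python sketch (helper names ours; coefficient lists low-to-high):

```python
def synthetic_div(p, a, n):
    """Divide p by (X - a) over Z/nZ via synthetic division: returns (q, r)."""
    q, acc = [], 0
    for c in reversed(p):
        acc = (acc * a + c) % n
        q.append(acc)
    r = q.pop()                   # the final accumulator value is the remainder
    return list(reversed(q)), r

def poly_eval(p, a, n):
    """Evaluate p at a over Z/nZ by Horner's rule."""
    acc = 0
    for c in reversed(p):
        acc = (acc * a + c) % n
    return acc

# The remainder of p by (X - a) is exactly p(a):
p, n = [6, 0, 1], 7               # p = X^2 + 6 = X^2 - 1 over Z/7Z
assert synthetic_div(p, 1, n) == ([1, 1], 0)   # X - 1 divides p: p = (X + 1)(X - 1)
for a in range(n):
    assert synthetic_div(p, a, n)[1] == poly_eval(p, a, n)
```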

The factor theorem (which holds over any UCR) generalizes to multiple roots naturally, however only over integral domains. (We will see how the polynomial remainder theorem generalizes to multiple evaluations in a moment.)

Theorem 7.3 (generalized factor theorem). Let be an integral domain, be a polynomial, and be distinct ring elements. Then are roots of if and only if divides

Proof

Assume that divides Then there exists such that which implies that

We will prove the converse by induction on The case is simply the factor theorem. Assume that the implication holds for and let us prove that it holds for Let and be distinct roots of Since is in particular a root of by the factor theorem, there exists such that Moreover, for every which implies that since and are distinct and has no zero divisors. Hence, are roots of which by the induction hypothesis implies that divides Since divides

This has an important consequence regarding the maximal number of roots of a polynomial.

Proposition 7.6. Let be an integral domain and let be a non-zero polynomial of degree Then has at most distinct roots in

Proof

This follows easily from the generalized factor theorem. Indeed, assume that has degree and has roots. Let denote the roots of Then divides a contradiction with Proposition 7.4 as a polynomial of degree cannot divide a non-zero polynomial of degree

Let us prove the proposition directly by induction on The result clearly holds for Let and assume that the result holds for degree Let be a polynomial of degree If has no root then the result holds again. Otherwise, assume that has a root Then, by the factor theorem, for some polynomial where, by Proposition 7.3, has degree If is a root of distinct from then which implies that since and has no zero divisors. Since has at most distinct roots by the induction hypothesis, has at most distinct roots.

This proposition allows us to reconsider the relation between polynomials and polynomial functions.

Proposition 7.7. Let be an integral domain and be a polynomial such that for every If is infinite, then is the zero polynomial.

Proof

Assume that is not the zero polynomial and let be the degree of Then, by Proposition 7.6, has at most distinct roots in a contradiction with the assumption that for every since is infinite. Hence, must be the zero polynomial.

Hence, over an infinite integral domain, if and are two polynomials such that for every then In other words, there is a one-to-one mapping between polynomials and polynomial functions. However, not every function from to is a polynomial function: for example, the function such that and for is not a polynomial function, since it has infinitely many roots yet is not identically zero, whereas a non-zero polynomial has only finitely many roots.

We can in fact be more precise with the following proposition.

Proposition 7.8. Let be an integral domain. Let be the ring homomorphism mapping a polynomial to the corresponding polynomial function. Then:

  1. If is finite, then is surjective but not injective.
  2. If is infinite, then is injective but not surjective.

Remark 7.1. Note that being infinite and being an integral domain are two necessary conditions. Over infinite rings with zero divisors, a non-zero polynomial may evaluate to zero over the entire ring. See here.

Lagrange Interpolation

In all the following, a set of distinct field elements will be called an evaluation domain (or simply domain) of size

Theorem 7.4 (Lagrange interpolation theorem). Let be a finite field and let be a set of pairs of field elements such that for Then there is a unique polynomial called the Lagrange interpolation polynomial for such that for every

Proof

Uniqueness is proved as follows: assume there exists two polynomials and interpolating the points. Then the polynomial has roots but has degree at most hence by Proposition 7.6 it must be the zero polynomial, which implies that

To establish existence, one introduces the Lagrange basis associated with the domain This is the tuple of polynomials of degree defined as One can easily check that Then the Lagrange interpolating polynomial for is given by This polynomial has degree at most and it is easy to see that for every

Note that the Lagrange basis associated with any domain is indeed a basis for the -vector space in the linear algebra sense. The coordinates of a polynomial in this basis are

A polynomial specified by a tuple such that is sometimes said to be in coefficients form, while when it is specified by the values it takes over some domain it is said to be in evaluation form. This is merely a change of basis.

Another way to look at Lagrange interpolation is as follows. Considering the coefficients of a polynomial of degree as unknowns, each evaluation yields a linear equation In matrix form, this yields The matrix on the left-hand side is called a Vandermonde matrix. It is invertible if and only if the 's are distinct, which gives another way to see that there is a unique polynomial of degree at most such that for every
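The existence proof via the Lagrange basis translates directly into code. Below is a Python sketch (helper names ours; coefficient lists low-to-high) that builds each basis polynomial by multiplying out the linear factors and scaling by the inverse of the denominator, over a small prime field:

```python
def lagrange_interpolate(points, p):
    """Coefficients (low to high) of the Lagrange interpolation polynomial
    over Z/pZ (p prime) for a list of pairs (x_i, y_i) with distinct x_i."""
    n = len(points)
    coeffs = [0] * n
    for i, (xi, yi) in enumerate(points):
        basis, denom = [1], 1                 # basis = prod_{j != i} (X - x_j)
        for j, (xj, _) in enumerate(points):
            if j == i:
                continue
            new = [0] * (len(basis) + 1)
            for k, c in enumerate(basis):     # multiply basis by (X - x_j)
                new[k] = (new[k] - xj * c) % p
                new[k + 1] = (new[k + 1] + c) % p
            basis = new
            denom = (denom * (xi - xj)) % p
        scale = (yi * pow(denom, -1, p)) % p  # y_i / prod_{j != i} (x_i - x_j)
        for k, c in enumerate(basis):
            coeffs[k] = (coeffs[k] + scale * c) % p
    return coeffs

def poly_eval(c, x, p):
    """Horner evaluation of a coefficient list over Z/pZ."""
    acc = 0
    for a in reversed(c):
        acc = (acc * x + a) % p
    return acc

pts = [(1, 4), (2, 1), (3, 2)]
f = lagrange_interpolate(pts, 13)
assert all(poly_eval(f, x, 13) == y for x, y in pts)
```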

Generalized Polynomial Remainder Theorem

Lagrange interpolation allows us to formulate a generalization of the polynomial remainder theorem. Given an evaluation domain the vanishing polynomial over denoted is the polynomial defined by It is such that for every but it is not the Lagrange interpolation polynomial for since it has degree (the Lagrange interpolation polynomial for is actually the zero polynomial).

Theorem 7.5 (generalized polynomial remainder theorem). Let be a polynomial, be an integer, be distinct field elements, and be field elements (not necessarily distinct). Let be the vanishing polynomial for and be the Lagrange interpolation polynomial for Then for every if and only if divides or equivalently if and only if is the remainder of the division of by

Proof

Assume that divides i.e., there exists such that Evaluating this equality at and using and implies that for every

Conversely, assume that for every Since is Euclidean, there exists polynomials and such that with Evaluating this equality at yields Since is necessarily the Lagrange interpolation polynomial for

For one exactly recovers the polynomial remainder theorem since for a single point the vanishing polynomial is and the Lagrange interpolation polynomial is simply the constant polynomial

Computational Aspects

Classical interpolation algorithms such as Newton's method and Neville's algorithm have quadratic complexity.

The Barycentric Formula

Assume we are given a set of evaluations over domain One can consider various tasks related to the Lagrange interpolation polynomial for One is to compute the coefficients of this polynomial. Another is to evaluate on a field element

For the first task, one may be tempted to solve Eq. (7.1). However, this requires inverting a square matrix of size which takes field operations.

A very useful form for Lagrange interpolation is the so-called barycentric formula [BT04]. Let be the vanishing polynomial for and for let denote the barycentric weights defined as Note that the formal derivative of is hence one also has Then the -th polynomial of the Lagrange basis is from which it follows that the Lagrange interpolation polynomial for can be written as This is the barycentric Lagrange interpolation formula.

Based on this, here is how one can compute the coefficients of and evaluate on a point outside in quasilinear time:

  • compute the coefficients of using a divide-and-conquer approach (see here)
  • compute the coefficients of from the ones of (this requires multiplications)
  • compute using multipoint evaluation

See also this.

Once this is done, computing for takes linear time using

All in all, this yields an algorithm with complexity field operations.
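The barycentric weights and the evaluation formula can be sketched as follows (Python; helper names ours; weights computed naively here in quadratic time rather than via the quasilinear approach above):

```python
def barycentric_weights(xs, p):
    """Barycentric weights w_i = 1 / prod_{j != i} (x_i - x_j), over Z/pZ (p prime)."""
    ws = []
    for i, xi in enumerate(xs):
        d = 1
        for j, xj in enumerate(xs):
            if j != i:
                d = (d * (xi - xj)) % p
        ws.append(pow(d, -1, p))
    return ws

def barycentric_eval(xs, ys, x, p):
    """Evaluate the interpolation polynomial of the points (x_i, y_i) at a point x
    outside the domain, as Z(x) * sum_i w_i * y_i / (x - x_i)."""
    ws = barycentric_weights(xs, p)
    z = 1
    for xi in xs:
        z = (z * (x - xi)) % p           # Z(x), the vanishing polynomial at x
    acc = 0
    for xi, yi, wi in zip(xs, ys, ws):
        acc = (acc + yi * wi * pow(x - xi, -1, p)) % p
    return (z * acc) % p

# Domain {1, 2, 3} with the values of f = X^2; evaluating at 5 recovers 25 mod 13:
assert barycentric_eval([1, 2, 3], [1, 4, 9], 5, 13) == (5 * 5) % 13
```

Once the weights are known, each additional evaluation costs only a linear number of field operations, which is the point of the barycentric form.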

The Special Case of Roots of Unity

The most favorable case from a computational point of view is when is the subgroup of consisting of -th roots of unity. Then the vanishing polynomial takes a very simple form, namely Its formal derivative is simply
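One can check this on a small example. In the sketch below (Python; assumptions: 2 generates the multiplicative group of F_13, which the code relies on to build a primitive 4th root of unity), the vanishing polynomial of the domain of 4th roots of unity is X^4 - 1, and the barycentric weights reduce to a single multiplication each:

```python
p, n = 13, 4                     # n divides p - 1 = 12, so F_13 contains n-th roots of unity
omega = pow(2, (p - 1) // n, p)  # 2 generates F_13^*, so omega = 2^3 = 8 has order exactly 4
roots = [pow(omega, i, p) for i in range(n)]
assert len(set(roots)) == n
# The vanishing polynomial of this domain is simply X^n - 1:
assert all((pow(x, n, p) - 1) % p == 0 for x in roots)
# The barycentric weights take the simple form w_i = omega^i / n
# (since Z'(X) = n X^{n-1} and x^{n-1} = x^{-1} on the domain):
for x in roots:
    direct = 1
    for y in roots:
        if y != x:
            direct = (direct * (x - y)) % p
    assert pow(direct, -1, p) == (x * pow(n, -1, p)) % p
```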

Applications

An important application of Lagrange interpolation is Shamir's secret sharing.

Chapter status: 👷 in progress 👷

TODO:

Elliptic Curves

Contents

Generalities

Let be a field of characteristic 1 Let be two field elements such that An elliptic curve in short Weierstrass form over denoted is the set of all pairs that satisfy the equation together with a distinguished element denoted called the point at infinity: This is the so-called affine representation of points of the elliptic curve. There is another family of representations called projective (more attractive from an algorithmic point of view) that we will discuss shortly.

Hasse's bound establishes that the number of points of called the order of is where is an integer (called the Frobenius trace or simply trace of the curve) satisfying

It is possible to equip with a commutative group law for which the point at infinity is the identity element with the so-called chord-and-tangent rule. This group operation is denoted additively and called addition law. The inverse of a point is the point

From the addition law we can define scalar multiplication: for and the scalar multiplication of by denoted or if there is no ambiguity, is given by

The group structure of is either cyclic or "almost" cyclic. Namely, a general theorem establishes that has at most two invariant factors (see Theorem 5.6 for the definition of an invariant factor). In other words, is isomorphic to either the cyclic group or a direct product of cyclic groups with
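The chord-and-tangent rule in affine coordinates can be sketched in a few lines of Python (helper names ours; `None` plays the role of the point at infinity; the toy curve below, y^2 = x^3 + x + 6 over F_11, has 13 points and is a common textbook example):

```python
def ec_add(P, Q, a, p):
    """Chord-and-tangent addition on y^2 = x^3 + a*x + b over Z/pZ (p > 3 prime)."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                       # P + (-P) = O
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p  # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p         # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def ec_mul(k, P, a, p):
    """Scalar multiplication [k]P by double-and-add."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P, a, p)
        P = ec_add(P, P, a, p)
        k >>= 1
    return R

a, b, p = 1, 6, 11
P = (2, 7)
assert (P[1] ** 2 - (P[0] ** 3 + a * P[0] + b)) % p == 0  # P lies on the curve
assert ec_mul(13, P, a, p) is None                        # the group has order 13
Q = ec_mul(5, P, a, p)
assert (Q[1] ** 2 - (Q[0] ** 3 + a * Q[0] + b)) % p == 0  # multiples stay on the curve
```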

Affine versus Projective Coordinates

Earlier we saw how to define an elliptic curve using affine coordinates. This is not how elliptic curves are usually defined by mathematicians, who generally prefer to use projective geometry.

Projective Plane

Let be a finite field of order and let The projective plane over denoted is the set of equivalence classes of where two tuples and are equivalent, denoted if there is a scalar such that Such an equivalence class is called a projective point. A projective point contains tuples The convention is to denote projective points with capital letters and colon separators, i.e., (or sometimes will denote the equivalence class

Equivalently, projective points can be seen as 1-dimensional subspaces ("lines") of the 3-dimensional vector space over How many projective points are there? There are non-zero vectors in but each of the non-zero vectors in a subspace generates this subspace, hence the total number of 1-dimensional subspaces is

Another way to count the number of projective points is as follows:

  • there are projective points of the form
  • there are projective points of the form
  • there is one projective point of the form

It is customary to identify the ordinary "affine" plane with the first type of projective points, meaning there is an injective map from to given by The inverse of this map is

The points of the second and third types (sharing the property that are called "points at infinity" and form the so-called "line at infinity".

It is possible to define projective lines in a similar way to projective points: projective lines are 2-dimensional subspaces of (with vector removed). There are also projective lines in which is quite natural since any 1-dimensional subspace of defines a unique 2-dimensional subspace via its orthogonal complement. A projective point "lies on" a projective line if it is included (in the set-theoretical sense) in the projective line. From this definition, it follows that (i) given any two projective points, there is exactly one projective line containing both of them, and (ii) given any two projective lines, there is exactly one projective point lying on both of them (meaning there are no parallel lines). Properties (i) and (ii) are in fact the axiomatic definition of a projective plane. Each projective line contains vectors and is the disjoint union of projective points. In particular, the projective points of the second and third type are indeed on the same projective line corresponding to the 2-dimensional subspace orthogonal to vector
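The point counts above can be verified by brute force for small prime fields. The Python sketch below (helper name ours) canonicalizes each non-zero triple by scaling its first non-zero coordinate to 1:

```python
from itertools import product

def projective_points(p):
    """The projective plane over F_p = Z/pZ: non-zero triples (x, y, z) up to
    multiplication by a non-zero scalar, each represented canonically by
    scaling the first non-zero coordinate to 1."""
    pts = set()
    for v in product(range(p), repeat=3):
        if v == (0, 0, 0):
            continue
        first = next(c for c in v if c != 0)
        inv = pow(first, -1, p)
        pts.add(tuple((inv * c) % p for c in v))
    return pts

for p in (2, 3, 5, 7):
    assert len(projective_points(p)) == p * p + p + 1
# For p = 5: 6 points "at infinity" (last coordinate 0) and 25 affine points.
assert sum(1 for pt in projective_points(5) if pt[2] == 0) == 6
```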

Elliptic Curves in Projective Coordinates

To obtain the equation defining an elliptic curve in projective coordinates, we substitute to and to in the affine short Weierstrass equation and multiply by to clear the denominators. This way, we obtain the projective short Weierstrass equation:

It is easy to see that a projective point with satisfies (11.2) if and only if the corresponding affine point with and satisfies (11.1). Moreover, a projective point on the line at infinity ( satisfies (11.2) if and only if meaning the only one of the projective points at infinity satisfying (11.2) is This is the "curve point at infinity", the identity element of the group law, that we denoted when we defined the elliptic curve in affine coordinates.

Hence, one of the main advantages of projective coordinates over affine ones is that it unifies ordinary points and the point at infinity, which now has a projective representation like any other point, namely

Another advantage is that computing the group law is more efficient because it does not require modular inversion (which is then only needed for projective-to-affine conversion). A ballpark estimation is that a modular inversion is 20 to 100 times more costly than a modular multiplication depending on the platform and the implementation.

The projective coordinates obtained with the substitution are just one possibility among others, called homogeneous projective coordinates because the resulting projective equation (11.2) for the curve is homogeneous, meaning all terms have the same total degree, 3 here. A very common alternative is Jacobian coordinates, defined by the substitution The resulting projective equation is Projective points in Jacobian coordinates are defined by the equivalence relation The point at infinity ( is the equivalence class

See http://www.hyperelliptic.org/EFD/ for a list of various other possible coordinate systems.


1: It is possible to define elliptic curves over fields of characteristic 2 or 3 but the equations are more complicated.

Chapter status: 👷 in progress 👷

TODO:

Pairings

Pairings have allowed a great number of new applications in cryptography. They were first used by Menezes, Okamoto, and Vanstone [MOV91] to break the discrete logarithm problem in supersingular elliptic curves (this is the so-called MOV attack). Later, they were used constructively by Joux [Jou00] to obtain a tripartite Diffie-Hellman protocol and by Boneh and Franklin [BF03] to obtain an identity-based encryption scheme. This got the three of them the Gödel prize in 2013.

We will start with the abstract definition of pairing groups and then we will see how to construct such groups from specific elliptic curves.

Contents

Abstract Definition

Let and be three cyclic groups of prime order For reasons that will become clear later, we will denote and additively and multiplicatively (in particular, and 's identity elements are denoted while 's identity element is denoted A pairing is an efficiently computable function which satisfies two properties:

  • non-degeneracy: and implies
  • bilinearity: for every and and for every and

Note that bilinearity implies the (more useful in practice) property that for every and

Non-degeneracy has many other definitions (all equivalent assuming bilinearity), such as:

  • is not the constant function
  • if and are generators of respectively and then is a generator of

Constructing groups admitting a pairing is not very hard. For example, take $\mathbb{G}_1 = \mathbb{G}_2 = \mathbb{Z}_p$ (the group of integers mod $p$ equipped with addition), take for $\mathbb{G}_T$ any cyclic group of order $p$, let $g$ be a generator of $\mathbb{G}_T$, and define $e(a, b) = g^{ab}$.

However, for $e$ to be useful from a cryptographic point of view, we need the discrete logarithm problem to be hard in the three groups $\mathbb{G}_1$, $\mathbb{G}_2$, and $\mathbb{G}_T$. (In the example above, while it may be hard in $\mathbb{G}_T$, it is certainly not in $\mathbb{G}_1$ and $\mathbb{G}_2$.) This is where elliptic curves come to the rescue.
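The toy construction above can be written out concretely. The following Python sketch (small hypothetical parameters) builds the map $e(a, b) = g^{ab}$ and checks bilinearity; as the text notes, this pairing is cryptographically useless because discrete logarithms are trivial in the additive groups:

```python
# Toy pairing e(a, b) = g^(a*b) mod q with G1 = G2 = Z_p (additive):
# bilinear, but DL in G1/G2 is trivially easy, so cryptographically useless.
q = 1019                      # hypothetical prime with p | q - 1
p = 509                       # prime order of the groups (1018 = 2 * 509)
g = pow(3, (q - 1) // p, q)   # element of order p in Z_q^*
assert pow(g, p, q) == 1 and g != 1

def e(a, b):                  # a, b in Z_p
    return pow(g, (a * b) % p, q)

# bilinearity: e(a + a', b) == e(a, b) * e(a', b)
a, a2, b = 123, 456, 77
assert e((a + a2) % p, b) == (e(a, b) * e(a2, b)) % q
```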

Some more vocabulary: A pairing is said to be symmetric when $\mathbb{G}_1 = \mathbb{G}_2$ and asymmetric when $\mathbb{G}_1 \neq \mathbb{G}_2$. As we will see, both types can be constructed from elliptic curves. Historically, the first proposed constructions were symmetric, but nowadays the asymmetric type prevails for efficiency reasons. It is an open problem to construct a pairing such that $\mathbb{G}_1 = \mathbb{G}_2 = \mathbb{G}_T$ and the discrete logarithm problem is (conjectured) hard in this group.

Further Resources

Here are some good resources about pairing-friendly elliptic curves:

Chapter status: in progress

TODO:

  • modify advantage notation to display the scheme

Games, Models, and Assumptions

This section presents how security assumptions and security properties of cryptographic schemes are formalized. It also gives a list of cryptographic assumptions relevant for this book (see also the ECRYPT II MAYA report).

Contents

Big-O Notation and Negligible Functions

In all the following, we consider functions from $\mathbb{N}$ to $\mathbb{R}_{\geq 0}$ and we let $n$ denote the variable.

A function $f$ is said to be negligible if $f(n) \leq n^{-c}$ for every constant $c > 0$ and every sufficiently large $n$, or equivalently if $\lim_{n \to \infty} n^c f(n) = 0$ for every constant $c > 0$.

In words, is negligible if it approaches zero faster than the inverse of any polynomial.
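This can be sanity-checked numerically: the following spot check verifies with exact integer arithmetic that $2^{-n}$ is already below $n^{-c}$ at $n = 1000$ for a range of constants $c$:

```python
# Spot check: 2^(-n) < n^(-c), i.e. 2^n > n^c, already at n = 1000
# for a range of constants c (exact integer arithmetic, no rounding).
n = 1000
for c in (1, 2, 10, 50):
    assert 2 ** n > n ** c   # hence 2**(-n) < n**(-c)
```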

The set of all negligible functions is denoted We often write in place of

Cryptographic Games

Game: Tentative Definition

A game consists of a main algorithm and a (potentially empty) finite tuple of oracle algorithms. The main algorithm (to which we simply refer as the game from here on) takes as input $1^\lambda$, where $\lambda$ is an integer called the security parameter, and runs in three phases:

  • initialization: the game initializes variables and generates some input
  • attack: the game invokes an algorithm called adversary on input the adversary has oracle access to
  • finalization: when the adversary halts and returns some output the game evaluates a predicate of and all variables and returns the truth value of this predicate.

To make this definition rigorous, the programming language used to write the game should be completely specified. Here, we will content ourselves with specifying games in pseudocode.

A very simple game called ADD drawing two random -bit integers and returning if the adversary successfully adds them would look like this:
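Such a game can be rendered concretely. The following Python sketch (names and interfaces illustrative, not the book's formal pseudocode) implements the ADD game's three phases:

```python
# Hedged sketch of the ADD game: the game draws two random n-bit
# integers and the adversary wins if it returns their sum.
import secrets

def ADD_game(n, adversary):
    # initialization: draw two random n-bit integers
    a = secrets.randbits(n)
    b = secrets.randbits(n)
    # attack: run the adversary on input (a, b); no oracles here
    s = adversary(a, b)
    # finalization: the winning predicate is s == a + b
    return s == a + b

# an adversary winning with probability 1
assert ADD_game(128, lambda a, b: a + b)
```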

Notation and Conventions for Writing Games

  • Given a predicate we use as a shorthand for
  • Games return false by default, meaning if the end of the game code is reached and the game has not returned yet, then it returns false. Finalization often consists in returning the truth value of Under this convention, the following finalization code is equivalent to

Advantage

For each security parameter a game together with an adversary define a finite probability space specified by all random choices made by the game and the random coins of the adversary. This allows us to define the probability of various events related to the execution of the game with a specific adversary.

In particular, given a game and an adversary we write or simply when the context is clear, for the event that an execution of with adversary for security parameter returns When the game returns we also say that the adversary "wins".

A game can be computational or decisional depending on how a quantity called advantage, measuring how well an adversary performs at winning the game, is defined.

The advantage of an adversary against a game is a function of the security parameter into defined as if the game is computational and if the game is decisional. (We write the name of the game in small caps in the advantage superscript to lighten notation.)

We say that a game is computationally hard if for every probabilistic polynomial-time (ppt) algorithm We say that a game is statistically hard (or information-theoretically hard or unconditionally hard) if for every algorithm (not necessarily polynomial-time), In the special case where the advantage of any algorithm is zero, one says that the game is perfectly hard (for example, a commitment scheme can be perfectly hiding or perfectly binding).

When we simply say that a game is hard, it usually means computationally hard (but this should always be clear from the context).

Hence, for a computationally, resp. statistically, hard game, every ppt, resp. unbounded, adversary "wins" with probability negligibly close to 0 for a computational game and with probability negligibly close to $1/2$ for a decisional game.

Reductions

Given two games and we say that reduces to denoted if there exists a probabilistic polynomial-time algorithm (called reduction) with access to an oracle such that for every algorithm solving with non-negligible advantage, solves with non-negligible advantage.

If reduces to and reduces to we say that and are equivalent, denoted

Proposition 13.1. Assume that reduces to Then being hard implies being hard.

Proof

Contraposing, assume that is not hard, which by definition means that there exists a ppt algorithm such that Consider the reduction from to Since and both run in polynomial time, runs in polynomial time as well. Moreover, by definition of a reduction, Hence, is not hard.

Thus, can be read as " is not harder than " or " is at least as hard as ".

In cryptography, we are constantly making assumptions of the form "X is hard". Proposition 13.1 can be used to compare the strength of various assumptions. Indeed, assuming we proved that then the assumption that is hard implies that is hard too. If, in addition, there are some indications that does not hold, then the assumption that is hard is stronger than the assumption that is hard. Indeed, if but is not known to hold, then it might be that is hard yet is easy.

For example, consider the discrete logarithm (DL) problem on one hand (given a group and group elements and compute and the Computational Diffie-Hellman (CDH) problem on the other hand (given a group and group elements and compute One can easily prove that CDH DL (CDH reduces to DL) by constructing a reduction from CDH to DL: given an algorithm solving DL, one can solve CDH by first computing the discrete logarithm of and then computing However, there is no proof that DL CDH (except in very specific situations). Hence, the assumption that CDH is hard is (for most groups) stronger than the assumption that DL is hard.
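The CDH-to-DL reduction just described can be made concrete. The following Python sketch (toy multiplicative group, hypothetical parameters) uses a brute-force DL solver as the oracle to solve CDH:

```python
# Sketch of the reduction CDH <= DL in a toy multiplicative group:
# given an (inefficient, brute-force) DL solver, solve CDH.
q = 1019                       # small prime; Z_q^* has order 1018
g = 2                          # generator of Z_q^*, for illustration

def dl_solver(h):              # brute-force discrete log: find x with g^x = h
    x, acc = 0, 1
    while acc != h:
        acc = (acc * g) % q
        x += 1
    return x

def cdh_from_dl(ga, gb):       # given g^a and g^b, compute g^(a*b)
    a = dl_solver(ga)          # one call to the DL oracle
    return pow(gb, a, q)       # g^(a*b) = (g^b)^a

a, b = 345, 678
assert cdh_from_dl(pow(g, a, q), pow(g, b, q)) == pow(g, a * b, q)
```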

Another way Proposition 13.1 is often used in cryptography is for security proofs. Here, is some hardness assumption such as DL and is a security game, say unforgeability (in the precise sense of EUF-CMA security in the random oracle model) of Schnorr signatures. Then, a security proof for Schnorr signatures consists in proving that reduces to i.e., the DL problem reduces to the EUF-CMA security of Schnorr signatures in the ROM. By Proposition 13.1, if DL is hard, then Schnorr signatures are EUF-CMA secure in the ROM.

Idealized Models

The Random Oracle Model

The Generic Group Model

The Algebraic Group Model

Assumptions

Group Setup Algorithms

A standard group setup algorithm is an algorithm which on input the security parameter returns a pair where is a -bit prime and is a cyclic group of order

A pairing group setup algorithm is an algorithm which on input the security parameter returns a tuple where is a -bit prime, and are cyclic groups of order and is an efficiently computable pairing.

We adopt the convention that group/pairing setup algorithms do not return generators of the groups. They will be explicitly sampled in the games.

One usually distinguishes three types of pairing group setup algorithms [GPS08]:

  • a type-1 pairing group setup algorithm (also called symmetric pairing setup algorithm) is such that
  • a type-2 pairing group setup algorithm is such that and there exists an efficiently computable isomorphism
  • a type-3 pairing group setup algorithm is such that and no efficiently computable isomorphism is known.

Type-2 and type-3 pairing group setup algorithms are called asymmetric.

In all the following, we simply talk about "type-1/2/3 pairings" rather than "type-1/2/3 pairing group setup algorithms".

Assumptions in Standard Groups

Discrete Logarithm (DL)

  • type: computational
  • interactive: no
  • falsifiable: yes
  • references:
  • notes:

Computational Diffie-Hellman (CDH)

  • type: computational
  • interactive: no
  • falsifiable: yes
  • references:
  • notes: CDH DL

Decisional Diffie-Hellman (DDH)

  • type: decisional
  • interactive: no
  • falsifiable: yes
  • references:
  • notes: DDH CDH

-Discrete Logarithm (-DL)

  • type: computational
  • interactive: no
  • falsifiable: yes
  • references: [Che10, Lip10, FKL18, Rot22]
  • notes:
    • sometimes called -strong DL or DL with auxiliary inputs
    • 1-DL = DL
    • -DL -DL

Assumptions in Product Groups

The assumptions listed in this section are defined for a pair of groups of equal order They are usually applied to groups returned by a pairing group setup algorithm but make use of neither the group nor the pairing. For this reason, we simply write when parsing the parameters returned by

-co-Discrete Logarithm (-co-DL)

  • type: computational
  • interactive: no
  • falsifiable: yes
  • references: [Che10, Lip10, FKL18, Rot22]
  • notes:

Computational co-Diffie-Hellman (co-CDH)

  • type: computational
  • interactive: no
  • falsifiable: yes
  • references: [BLS04]
  • notes:
    • co-CDH DL in
    • co-CDH CDH in for type-1 pairings
    • co-CDH CDH in for type-2 pairings (see Proposition 17.3)

Computational co-Diffie-Hellman* (co-CDH*)

  • type: computational
  • interactive: no
  • falsifiable: yes
  • references: [CHKM10]
  • notes:
    • sometimes simply called (confusingly) co-CDH (e.g. [BDN18])
    • co-CDH CDH in
    • co-CDH DL in
    • co-CDH co-CDH
    • co-CDH CDH in for type-1 pairings (see Proposition 17.1)
    • co-CDH co-CDH for type-2 pairings (see Proposition 17.2)

Computational -co-Diffie-Hellman (-co-CDH)

  • type: computational
  • interactive: yes
  • falsifiable: no (for type-3 pairings)
  • references: [SV07, BDN18]
  • notes:
    • -co-CDH CDH in
    • -co-CDH DL in
    • -co-CDH co-CDH co-CDH
    • -co-CDH co-CDH for type-2 pairings (the isomorphism enables to compute efficiently as assuming

-Strong Diffie-Hellman (-SDH)

  • type: computational
  • interactive: no
  • falsifiable: yes
  • references: [BB08]
  • notes:
    • -SDH is usually simply called -SDH
    • not to be confused with another assumption named SDH introduced in [ABR01]
    • -SDH -SDH
    • -SDH -SDH

Assumptions in Pairing Groups

-Bilinear Diffie-Hellman Inversion (-BDHI)

  • type: computational
  • interactive: no
  • falsifiable: yes
  • references: [BB04]
  • notes:
    • -BDHI -BDHI
    • -BDHI -BDHI

-Bilinear Strong Diffie-Hellman (-BSDH)

  • type: computational
  • interactive: no
  • falsifiable: yes
  • references: [KZG10a]
  • notes:
    • -BSDH is usually simply called -BSDH
    • -BSDH -BSDH
    • -BSDH -BSDH
    • -BSDH -SDH
    • -BSDH -BDHI

Signatures: Generalities

In this chapter, we define the syntax of a signature scheme and explore security definitions, asking ourselves: what makes a signature scheme secure?

Syntax

A signature scheme consists of four algorithms:

  • a setup algorithm which takes as input the security parameter and returns public parameters
  • a key generation algorithm which takes as input parameters and returns a secret key and a public key
  • a signature algorithm which takes as input parameters a secret key and a message and returns a signature
  • a verification algorithm which takes as input parameters a public key a message and a signature and returns if the signature is valid for the pair and otherwise.

The scheme is correct if for every security parameter and every message the following game capturing the nominal execution of algorithms returns true with probability one:

This syntax considers that the message space is meaning that the signing algorithm takes any string as input message. In practice, this is never the case: quite often, the signing algorithm starts with hashing the input message with a hash function that has a finite message space. More generally, the message space could depend on the public parameters returned by the setup algorithm (e.g., the public parameters could specify a group and admissible messages could be restricted to group elements). This can be accommodated by adapting the syntax as follows: on input the algorithm returns either a signature or a distinguished error symbol indicating the message is invalid (i.e., cannot be signed). Correctness must be modified to require that only if Other variables ( and usually have a specific format depending on the security parameter. Similarly, we assume that and return an error symbol in case any input does not abide by the expected format.
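The four-algorithm syntax and the correctness game can be sketched as Python code. The concrete scheme below is only a placeholder to exercise the interface (a keyed hash with pk = sk, which is not a real signature scheme):

```python
# Hedged sketch of the four-algorithm signature syntax and the
# correctness game from the text; the "scheme" is a placeholder
# (a keyed hash with pk = sk), NOT a real signature scheme.
import hashlib, secrets

class ToyScheme:
    def Setup(self, security_parameter):
        return {"lambda": security_parameter}        # public parameters
    def KeyGen(self, par):
        sk = secrets.token_bytes(par["lambda"] // 8)
        pk = sk                                      # placeholder only!
        return sk, pk
    def Sign(self, par, sk, msg):
        return hashlib.sha256(sk + msg).digest()
    def Verify(self, par, pk, msg, sig):
        return hashlib.sha256(pk + msg).digest() == sig

def correctness_game(scheme, security_parameter, msg):
    # nominal execution: Setup -> KeyGen -> Sign -> Verify
    par = scheme.Setup(security_parameter)
    sk, pk = scheme.KeyGen(par)
    sig = scheme.Sign(par, sk, msg)
    return scheme.Verify(par, pk, msg, sig)

assert correctness_game(ToyScheme(), 128, b"hello")
```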

EUF-CMA, the Standard Security Notion

The standard security notion for a signature scheme is existential unforgeability against chosen message attacks (EUF-CMA): no polynomial-time adversary, being given a target public key and having access to a signature oracle for the corresponding secret key should be able to compute a valid signature for a message it has not queried to the signature oracle, except with negligible probability. This is formally captured by the following security game:
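A hedged Python sketch of how the EUF-CMA game could be formalized follows (all names illustrative); the deliberately broken scheme with pk = sk shows what it means for an adversary to win:

```python
# Hedged pseudocode sketch of the EUF-CMA game (names illustrative).
import hashlib, secrets

def EUF_CMA_game(scheme, security_parameter, adversary):
    par = scheme.Setup(security_parameter)
    sk, pk = scheme.KeyGen(par)
    queried = set()                        # messages sent to the signing oracle

    def sign_oracle(msg):
        queried.add(msg)
        return scheme.Sign(par, sk, msg)

    msg, sig = adversary(par, pk, sign_oracle)
    # the adversary wins iff the forgery verifies and msg was never queried
    return msg not in queried and scheme.Verify(par, pk, msg, sig)

# A deliberately broken "scheme" (pk = sk) to exercise the game:
class BrokenScheme:
    def Setup(self, lam): return None
    def KeyGen(self, par):
        sk = secrets.token_bytes(16)
        return sk, sk                      # leaking sk as the public key!
    def Sign(self, par, sk, msg): return hashlib.sha256(sk + msg).digest()
    def Verify(self, par, pk, msg, sig):
        return hashlib.sha256(pk + msg).digest() == sig

# since pk = sk, the adversary forges without any oracle query
forger = lambda par, pk, oracle: (b"fresh msg", hashlib.sha256(pk + b"fresh msg").digest())
assert EUF_CMA_game(BrokenScheme(), 128, forger)      # adversary wins
```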

Here, chosen message attack means that the adversary can query the signature oracle on messages of its choice, while existential unforgeability means that the adversary wins if it returns a forgery on any message that has not been queried to the signature oracle (there exists some message for which the adversary can forge a signature).

One can weaken this security definition in two directions. On the one hand, one can restrict how the adversary can access the signature oracle. For example, in a no message attack, the adversary does not have access to the signature oracle. In a known message attack, the adversary cannot choose the message when querying the signature oracle (the message is randomly drawn according to some specific probability distribution). On the other hand, one can modify how the adversary can choose the message on which it forges. For example, selective unforgeability requires the adversary to commit to the message for which it will forge at the beginning of the game, before receiving the target public key and making any query to the signature oracle. For universal unforgeability, the message is imposed by the game rather than chosen by the adversary (typically, it is drawn at random by the game according to some specific distribution).

All these weaker notions are mostly of theoretical interest (for example, there exist generic conversion methods to construct signature schemes that are EUF-CMA-secure from schemes that only meet selective unforgeability against chosen message attacks). In practice, a signature scheme must be EUF-CMA-secure to be deployed (more precisely, conjectured EUF-CMA-secure, preferably supported by a security proof).

Strong EUF-CMA and Non-malleability

Binding

Chapter status: ✅ in good shape ✅

Related puzzles: Puzzle 1

TODO:

  • add details about hashing into groups
  • write a more rigorous game-based proof for BLS
  • write the security proof for aggregate BLS signatures

BLS Signatures

The BLS signature scheme was proposed by Boneh, Lynn, and Shacham [BLS01] (see also the corresponding journal paper [BLS04] and the IETF draft) and was one of the first applications of pairings to cryptography (shortly after tripartite Diffie-Hellman and identity-based encryption). Note that the BLS initialism can refer to two different things in cryptography: Boneh-Lynn-Shacham signatures discussed in this chapter and Barreto-Lynn-Scott curves, a family of pairing-friendly elliptic curves. It is possible to implement BLS (signatures) on BLS (curves).

Contents

Description of BLS

For the general syntax and the EUF-CMA security definition, refer to the corresponding chapter.

Let be a pairing group setup algorithm. The BLS signature scheme is defined as follows:

  • The setup algorithm, on input the security parameter $\lambda$, runs $(p, \mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T, e) \leftarrow \mathcal{G}(1^\lambda)$ and draws a random generator $g_2$ of $\mathbb{G}_2$. It also specifies a hash function $H \colon \{0,1\}^* \to \mathbb{G}_1$ mapping bit strings to group elements in $\mathbb{G}_1$. The public parameters consist of the tuple $(p, \mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T, e, g_2, H)$.

  • The key generation algorithm draws a random scalar $x \leftarrow \mathbb{Z}_p$ and computes the point $X = x g_2$ on curve $\mathbb{G}_2$; the secret key is $x$ and the public key is $X$.

  • The signature algorithm, on input a secret key $x$ and a message $m$, computes the point $\sigma = x H(m)$ on curve $\mathbb{G}_1$; the signature is $\sigma$.

  • The verification algorithm, on input a public key $X$, a message $m$, and a signature $\sigma$, returns 1 (i.e., declares the signature valid) if $$e(\sigma, g_2) = e(H(m), X) \qquad (17.1)$$ otherwise it returns 0 (invalid signature).

A good way to think about BLS signatures is as follows: given a message $m$ with corresponding point $H(m)$ in $\mathbb{G}_1$, the signature of $m$ is the point $\sigma \in \mathbb{G}_1$ such that the discrete logarithm of $\sigma$ in base $H(m)$ is equal to the discrete logarithm of $X$ in base $g_2$ (that is, the secret key $x$). This is reminiscent of the Chaum-van Antwerpen undeniable signature scheme [CA89]. Checking discrete logarithm equality is exactly what the pairing enables one to do efficiently.

Note that the signature algorithm is deterministic and hence always returns the same signature for a given secret key/message pair; this is called a unique signature scheme.

The roles of $\mathbb{G}_1$ and $\mathbb{G}_2$ can be swapped, in which case public keys are in $\mathbb{G}_1$ and signatures in $\mathbb{G}_2$ (recall that bit string representations of elements are larger and arithmetic is less efficient for $\mathbb{G}_2$ than for $\mathbb{G}_1$, so the choice depends on what is more important to optimize depending on the use case). In that case, the hash function must have outputs in $\mathbb{G}_2$, and hence this is not possible with a type-2 pairing.

The scheme is correct, meaning that for every key pair possibly output by the key generation algorithm and every message the signature computed by the signature algorithm is declared valid by the verification algorithm. Indeed, $e(\sigma, g_2) = e(x H(m), g_2) = e(H(m), g_2)^x = e(H(m), x g_2) = e(H(m), X)$, hence Eq. (17.1) is satisfied.
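To see the verification equation in action, here is a toy, insecure instantiation of BLS over the "exponent pairing" from the pairings chapter (discrete logarithms are easy in these groups, so this illustrates only the algebra and must never be used):

```python
# Toy, INSECURE instantiation of BLS over the exponent pairing
# e(a, b) = g^(a*b) mod q with G1 = G2 = Z_p; illustrative only.
import hashlib, secrets

q, p = 1019, 509               # small primes with p | q - 1, for illustration
g = pow(3, (q - 1) // p, q)    # element of order p in Z_q^*

def e(a, b):                   # toy pairing on Z_p x Z_p
    return pow(g, (a * b) % p, q)

g2 = 1                                          # "generator" of G2 = Z_p
def H(msg):                                     # hash to G1 = Z_p
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % p

def keygen():
    x = secrets.randbelow(p - 1) + 1
    return x, (x * g2) % p                      # sk = x, pk = x*g2

def sign(x, msg):
    return (x * H(msg)) % p                     # sigma = x*H(m)

def verify(pk, msg, sig):
    return e(sig, g2) == e(H(msg), pk)          # the BLS verification equation

sk, pk = keygen()
assert verify(pk, b"hello", sign(sk, b"hello"))
```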

Security of BLS

Security Proof

The BLS scheme is provably EUF-CMA-secure assuming the so-called co-CDH problem is hard for and modeling the hash function as a random oracle. The co-CDH problem for two groups and of order is defined as follows: given random generators and of and respectively, and for and a random group element compute This might be thought of as CDH in with additional "hint"

Theorem 17.1. Assume that the co-CDH problem is hard for Then the BLS signature scheme is EUF-CMA-secure in the random oracle model for More precisely, for any adversary against the EUF-CMA security of BLS making at most random oracle queries and signature queries, there exists an adversary for the co-CDH problem running in time similar to the time of and such that

Proof

Let be an adversary against the EUF-CMA security of BLS making at most random oracle queries and signature queries. From one can easily construct an adversary having the same advantage as and such that (i) always makes the random oracle query before returning its forgery and (ii) makes exactly random oracle queries. From now on, we assume that satisfies properties (i) and (ii).

We construct an adversary for the co-CDH problem as follows. takes as input pairing group parameters and a co-CDH instance with and It runs on input BLS parameters where is the interface to the random oracle (that will simulate) and public key At the beginning of the simulation, draws a random integer and initializes a table for storing answers of the simulated random oracle When makes a random oracle query such that has not been defined yet, either draws and returns if this is the -th query, or returns if this is the -th query. When makes a signing query then:

  • if is undefined, then draws sets and returns the signature note that this signature is valid since
  • if has already been defined, this was necessarily during the query made by if this was the -th query, then returns the signature again, this signature is valid since otherwise, if was defined by the -th query, then aborts the simulation of the game and returns

Eventually, when halts and returns its forgery three cases can occur (recall that by assumption made the random oracle query hence is necessarily defined):

  • if is not a valid forgery (either because was queried to the sign oracle or because the signature is invalid), then returns
  • if is a valid forgery and was defined during the -th query to the random oracle, then returns
  • if is a valid forgery and was defined during the -th query to the random oracle, then returns as solution to the co-CDH instance.

In the third case, one can easily check that is the correct solution since was defined as so that Moreover, conditioned on being successful, the view of is independent from and hence the third case occurs with probability Hence,

This reduction loses a factor roughly in its success probability. One can obtain a better reduction losing a factor by using a slightly more careful strategy for "embedding" the challenge in the random oracle answers, see Section 15.5.1 of the Boneh-Shoup textbook.

Discussion

The various statements and proofs of the security of BLS signatures in the literature can be confusing. The security theorem that we just proved holds for any pairing group setup algorithm. However, depending on the pairing type, it can be rephrased slightly differently.

The original conference paper [BLS01] was only considering symmetric (a.k.a. type-1) pairings (i.e., and proved EUF-CMA security assuming the CDH problem in is hard. Theorem 17.1 implies this specific case since for type-1 pairings, hardness of co-CDH is equivalent to hardness of CDH in as we prove now.

Proposition 17.1. Let be a type-1 pairing group setup algorithm. Then co-CDH CDH.

Proof

On the one hand, co-CDH CDH is trivial. On the other hand, CDH co-CDH can be proved as follows. Let be an algorithm for the co-CDH problem. We construct an algorithm for solving CDH as follows. On input draws a random value and computes and Then and is distributed uniformly in and independently from Hence, is a correctly distributed instance of co-CDH that can pass as input to The solution of this co-CDH instance is also the solution to the original CDH instance. Hence, the advantage of is equal to the advantage of

Later, the journal paper [BLS04] considered asymmetric pairings ( for which an efficiently computable group isomorphism from to is available (i.e., a type-2 pairing) and proved security assuming the co-CDH problem is hard.1 The co-CDH problem is similar to the co-CDH problem except the adversary is only given and and must compute Again, Theorem 17.1 implies this specific case since for type-2 pairings, co-CDH is equivalent to co-CDH.

Proposition 17.2. Let be a type-2 pairing group setup algorithm. Then co-CDH co-CDH.

Proof

On the one hand, co-CDH co-CDH is straightforward. Conversely, one can prove that co-CDH co-CDH as follows. Let be an adversary for the co-CDH problem. We construct an adversary for the co-CDH problem as follows. On input draws a random value and computes and Then and is distributed uniformly in and independently from Hence, is a correctly distributed instance of co-CDH that can pass as input to The solution of this co-CDH instance is also the solution to the original co-CDH instance. Hence, the advantage of is equal to the advantage of

Interestingly, [BLS04] gives an example of a type-3 pairing group setup algorithm (i.e., such that no efficiently computable isomorphism from to is known) for which co-CDH is conjectured to be hard but BLS over this pairing group setup algorithm is insecure. Let be a subgroup of order of the multiplicative group let be the additive group and let be defined as Note that DL is easy in which implies that DL in reduces to co-CDH (an algorithm solving co-CDH returns which allows to solve for by computing the discrete log of in base in As DL in is conjectured hard for sufficiently large and co-CDH is conjectured hard for such parameters as well. Hence, maybe counter-intuitively, hardness of co-CDH does not necessarily imply hardness of DL in ! On the other hand, DL being easy in implies that BLS is not EUF-CMA-secure. This shows the necessity of the stronger co-CDH assumption for proving the security of BLS for type-3 pairings. For type-2 pairings, hardness of co-CDH does imply hardness of DL in (obviously) but also in as shown by the following proposition.

Proposition 17.3. Let be a type-2 pairing group setup algorithm. Then

Proof

That CDH DL is clear. Let us show that co-CDH CDH Let be an algorithm for solving CDH in We construct an algorithm for the co-CDH problem as follows. On input computes and Then is a correctly distributed CDH instance that can pass as input to The solution of this CDH instance is also a solution of the original co-CDH instance. Hence, the advantage of is equal to the advantage of

To finish, we note that hardness of co-CDH is actually equivalent to EUF-CMA-security of BLS (even for a type-3 pairing where no efficiently computable isomorphism is known). Indeed, given an algorithm solving co-CDH one can construct an adversary breaking BLS as follows. Given and a public key makes queries and for an arbitrary message Note that Then it chooses an arbitrary message and queries It runs on input which is a correctly distributed co-CDH challenge, unless which happens with negligible probability. If returns the correct answer to this co-CDH challenge, then is a valid forgery.

This equivalence holds because we considered the "random generator" variant of co-CDH where and are drawn at random. As noted in [CHKM10], for type-3 pairings, it is not known whether breaking EUF-CMA-security of BLS reduces to the "fixed generator" variant of co-CDH where and are fixed (the previous reduction does not work anymore since there is only a negligible chance to "hit" the fixed generator when making queries

The following figure summarizes the above results (a double-arrow between two problems meaning equivalence):

```mermaid
graph TD;
    A(CDH) <-- type-1 --> C(co-CDH*);
    B(co-CDH) <-- type-2 --> C;
    C <-- type-1/2/3 --> D(EUF-CMA-sec of BLS);
```

Note that the notion of type-1/2/3 pairing was only introduced in 2008 [GPS08], a few years after the journal version of the BLS paper [BLS04]. For more discussion about type-2/type-3 pairings and the consequences of an efficiently computable isomorphism in security proofs of pairing-based schemes, see [SV07], [CHKM10], and [CM09].

Signature Aggregation

BLS signatures have a handy property: they can be aggregated non-interactively [BGLS03]. This means that given signatures for respective public key/message pairs it is possible to compute a compact aggregate signature (of size similar to the one of a single signature) that proves that message was signed by the owner of public key for every (exactly as the tuple would).

To turn BLS into an aggregate signature scheme, the aggregate signature for a tuple of signatures is simply defined as the sum of all signatures Note that aggregation can be performed "publicly" by an untrusted party different from all signers. It can also be performed "incrementally", meaning new signatures can be aggregated to an aggregate signature. One can even aggregate aggregate signatures (sic). The only requirement is to keep track of all public key/message pairs the signatures of which have been aggregated.

Given public key/message pairs an aggregate signature is valid for these pairs if Even though the size of the aggregate signature is constant, the complexity of verification still scales linearly with the number of signatures. Correctness of the scheme can be verified similarly to standard BLS signatures.
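The aggregation algebra can be checked in the same toy, insecure exponent-pairing setting as before (illustrative parameters only; aggregation is the sum of signatures, verification a product of pairings):

```python
# Toy, INSECURE sketch of BLS aggregation over the exponent pairing
# e(a, b) = g^(a*b) mod q; illustrates the algebra only.
import hashlib, secrets

q, p = 1019, 509
g = pow(3, (q - 1) // p, q)
def e(a, b): return pow(g, (a * b) % p, q)
g2 = 1
def H(msg): return int.from_bytes(hashlib.sha256(msg).digest(), "big") % p

def keygen():
    x = secrets.randbelow(p - 1) + 1
    return x, (x * g2) % p

def sign(x, msg): return (x * H(msg)) % p

def aggregate(sigs):                       # aggregate = sum of signatures
    return sum(sigs) % p

def verify_agg(pairs, agg):                # pairs = [(pk_i, msg_i)]
    rhs = 1
    for pk, msg in pairs:                  # product of pairings
        rhs = (rhs * e(H(msg), pk)) % q
    return e(agg, g2) == rhs

keys = [keygen() for _ in range(3)]
msgs = [b"m1", b"m2", b"m3"]
agg = aggregate([sign(x, m) for (x, _), m in zip(keys, msgs)])
assert verify_agg([(pk, m) for (_, pk), m in zip(keys, msgs)], agg)
```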

The EUF-CMA security notion can easily be adapted to aggregate signature schemes: no polynomial-time adversary, being given a target public key and having access to a standard signature oracle for the corresponding secret key (i.e., returning non-aggregate signatures), should be able to return an -tuple of public key/message pairs and an aggregate signature such that:

  • for some
  • was not queried to the signature oracle,
  • is valid for the tuple

Note that the adversary can choose extra public keys as it wishes, in particular as a function of the target public key

BLS aggregate signatures are provably EUF-CMA-secure assuming hardness of the -co-CDH problem and modeling the hash function as a random oracle. There is one caveat though: messages must be distinct, otherwise a so-called rogue key attack is possible. To see why, say the adversary is given a target public key Then it can draw and compute a second public key Now, for any message the adversary can compute a forgery for the pair as This forgery satisfies the verification equation as

Note that the adversary does not know the secret key corresponding to public key hence the name "rogue key attack".
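The rogue key attack can be replayed concretely in the same toy, insecure exponent-pairing setting (illustrative only): the adversary publishes a second key that cancels the target key inside the verification product, then forges an aggregate without ever touching the honest secret key.

```python
# Toy demonstration of the rogue-key attack in the INSECURE
# exponent-pairing setting (same-message aggregate verification).
import hashlib, secrets

q, p = 1019, 509
g = pow(3, (q - 1) // p, q)
def e(a, b): return pow(g, (a * b) % p, q)
g2 = 1
def H(msg): return int.from_bytes(hashlib.sha256(msg).digest(), "big") % p

# honest target key
x1 = secrets.randbelow(p - 1) + 1
pk1 = (x1 * g2) % p

# rogue key: pk2 = t*g2 - pk1; its secret key is unknown to the adversary
t = secrets.randbelow(p - 1) + 1
pk2 = (t * g2 - pk1) % p

msg = b"pay the adversary"
forged_agg = (t * H(msg)) % p          # forged aggregate, no x1 needed!

def verify_agg(pairs, agg):
    rhs = 1
    for pk, m in pairs:
        rhs = (rhs * e(H(m), pk)) % q
    return e(agg, g2) == rhs

# the target keys cancel out: pk1 + pk2 = t*g2
assert verify_agg([(pk1, msg), (pk2, msg)], forged_agg)
```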

A simple fix against this attack is simply to prepend the public key to the message in the hash function, meaning a signature on message for secret/public key pair is redefined as Then, the condition that messages should be different can safely be lifted [BNN07].

This, however, precludes an interesting optimization of the verification in case all messages are equal.2 Indeed, when Eq. (17.2) simplifies to Hence, on the right-hand side, instead of a product of pairings, we now have to compute a single pairing applied to a sum of elliptic curve points, which can be computed more efficiently.

In order to be able to use (17.3) for verification when messages are equal, a different fix can be used to thwart rogue key attacks. Namely, each user must provide a proof of possession (PoP) of the secret key associated with his public key. The notion of PoP is somewhat informal as, to the best of my knowledge, there is no rigorous definition of the security properties it should have. The idea is that it should demonstrate that the user has access to the functionality associated with his public key, namely signing. Hence, a simple PoP consists of the user signing its own public key, Intuitively, this makes sense since in a rogue key attack, the adversary cannot compute the secret key associated with the rogue public key (which it computes as a function of the target public key). This solution was proved secure assuming the hash function used for the PoP is different from the hash function used for actual message signing, which can be achieved from a single hash function by enforcing domain separation [RY07].

A Note about Non-Repudiation

Non-repudiation is often presented as a logical consequence of existential unforgeability (if no one except Alice can produce a valid signature on message and a valid signature on message is presented to a judge, the judge should conclude that Alice actually did sign message But what if Alice was able to choose her secret/public key pair in a way that allows her to find two messages and and a signature which is valid for both messages, i.e., Then, confronted by the judge with a valid signature for message Alice could claim that she has in fact signed instead. Hence, non-repudiation should rather be seen as the combination of existential unforgeability with a second security notion, often called message binding in the literature, which demands that no malicious user be able to generate a public key two messages and and a signature such that and (17.4) holds.

It is easy to see that standard BLS signatures are message binding if is collision-resistant. Things are more subtle though when considering aggregate signatures [Qua21]. In particular, if users collude (or if a single malicious user controls public keys), they can arrange to choose their public keys such that Then, in the PoP-based variant of the scheme where Eq. (17.3) is used for verification, is a valid signature for any message the scheme is not message binding. Note that checking that in the verification algorithm does not thwart the attack as can be further aggregated with other valid signatures for arbitrary public key/message pairs resulting in an aggregate signature which still does not bind signers to a specific message.

Further Topics to Cover

  • BLS multi-signatures [BDN18]
  • BLS threshold signatures [BL22]
  • use of BLS signatures in the Ethereum beacon chain


1: Somehow confusingly, recent papers tend to use the name co-CDH for what we call co-CDH here.

2: The setting where all users sign the exact same message requires a primitive slightly different from an aggregate signature scheme called a multisignature scheme.

Chapter status: ✅ in good shape ✅

Related puzzles: Puzzle 1

TODO:

Commitment Schemes

This section contains a brief introduction to commitment schemes, focusing on Pedersen commitments.

Generalities

A commitment scheme involves two parties, a committer (or prover) and a verifier. It allows the committer to send to the verifier a commitment to some secret value and later on to open this commitment to reveal The commitment should not reveal any information about the message (hiding property) and the committer should not be able to open a commitment in two distinct ways (binding property).

For a physical analogy, imagine the committer as writing the message on a piece of paper and placing it in an opaque, unbreakable, locked box which he sends to the verifier. At this point, the verifier cannot learn anything about the message and the committer cannot modify the message. Later, when the committer wants to open the commitment, he sends the key opening the box to the verifier to reveal the message.

Syntax

More formally, a commitment scheme consists of three algorithms (the exact syntax can vary slightly in the literature):

  • a probabilistic setup algorithm which on input the security parameter returns public parameters1 which in particular specify a message space (in the following, we simply denote the message space leaving the dependency on implicit);
  • a probabilistic commitment algorithm which on input parameters and a message returns a commitment and a decommitment2
  • a deterministic verification algorithm which on input parameters a commitment a message and a decommitment returns 1 if the decommitment is valid for and 0 otherwise.

Quite often, the decommitment simply consists of the random coins used by the commitment algorithm, and the verification algorithm simply recomputes the commitment given and and compares with When this is the case, we say that the commitment scheme has canonical verification and, overloading the notation, we let denote the function explicitly taking the random coins of the commitment algorithm as input and returning the commitment (leaving the decommitment implicit in that case).

Note that what we just defined here is the syntax for a non-interactive commitment scheme, where the algorithm is run once and for all and committing consists of a single message sent by the prover to the verifier. There exist more complex commitment schemes where committing requires some interaction between the prover and the verifier.

Correctness requires that for every security parameter the following game capturing the nominal execution of algorithms returns true with probability 1:

Security

A commitment scheme should satisfy two security properties informally defined as follows:

  • hiding: the commitment should not reveal any information about the secret value to the verifier,
  • binding: the committer should not be able to open the commitment in two different ways.

Let us formalize these two properties more precisely, starting with hiding, defined by the following game:

In some cases, it might be necessary to check additional conditions on messages and queried to the oracle (e.g., when consists of bit strings of various lengths and does not hide the message length, and should have the same length).

Binding is defined by the following game:

For some commitment schemes, one of these two properties holds statistically (i.e., cannot be broken with non-negligible advantage even by a computationally unbounded adversary) or even perfectly. However, a commitment scheme cannot be both statistically hiding and statistically binding at the same time. Hence, at best, a commitment scheme can be either statistically hiding and computationally binding or computationally hiding and statistically binding.

Homomorphic Commitments

Informally, a commitment scheme is homomorphic if the message space equipped with some binary operation forms a group and given two commitments and to respectively and anyone can compute a commitment that the committer can open to

More formally, a commitment scheme is homomorphic (with respect to group operation if there exist two algorithms and such that

  • takes parameters and two commitments and and returns a commitment
  • takes parameters and two decommitments and and returns a decommitment
  • for any security parameter the following game returns true with probability 1:

Algorithms and are often quite simple (e.g., when the commitment and decommitment spaces also have a group structure, they simply consist of applying the corresponding group operation to and or and respectively).

Pedersen Commitments

Description and Security

The Pedersen commitment scheme, initially introduced in [Ped91], is widely used, in particular to build zero-knowledge proof systems. It is specified as follows. Let be a group setup algorithm. Then:

  • the setup algorithm on input runs draws two random generators and of and returns parameters the message space is

  • the commitment algorithm on input parameters and a message draws and returns a commitment and a decommitment

  • the verification algorithm on input parameters a commitment a message and a decommitment returns 1 if and 0 otherwise.

Theorem 18.1. The Pedersen commitment scheme is perfectly hiding, computationally binding under the discrete logarithm assumption, and homomorphic with respect to addition over

Proof

Let us sketch the proof of each property:

  • perfectly hiding: as is uniformly random in for any message is uniformly random in and hence does not reveal any information about
  • computationally binding: assume an adversary can output two message/decommitment pairs and with for the same commitment then which yields the discrete logarithm of in base (note that implies as and are generators of
  • additively homomorphic: given two commitments and anyone can compute and the committer can compute which is a valid decommitment for and message
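The scheme and the homomorphism checked in the proof above can be sketched in a few lines of Python. All concrete parameters below (the 11-bit group, the generators, the function names) are toy illustrative choices of mine, written multiplicatively even though the book uses additive notation; a real instantiation uses an elliptic-curve group of roughly 256-bit prime order, with the second generator derived by hash-to-group so that its discrete logarithm is unknown to everyone:

```python
import secrets

# Toy parameters: p = 2q + 1 with p and q prime; the quadratic residues
# mod p form a subgroup of prime order q.  ILLUSTRATION ONLY.
p, q = 2039, 1019
g, h = 4, 9   # generators of the order-q subgroup; in practice h must be
              # derived (e.g., via hash-to-group) so nobody knows log_g(h)

def commit(m: int) -> tuple[int, int]:
    """Return (commitment, decommitment) for m in Z_q: c = g^m * h^r."""
    r = secrets.randbelow(q)          # random coins serve as decommitment
    return pow(g, m, p) * pow(h, r, p) % p, r

def verify(c: int, m: int, r: int) -> bool:
    """Canonical verification: recompute the commitment and compare."""
    return c == pow(g, m, p) * pow(h, r, p) % p

# Additive homomorphism: the product of two commitments opens to the sum
# of the messages, using the sum of the randomness as decommitment.
c1, r1 = commit(100)
c2, r2 = commit(200)
assert verify(c1 * c2 % p, (100 + 200) % q, (r1 + r2) % q)
```

Note that the commitment is perfectly hiding exactly because r is drawn fresh and uniform for each call: two commitments to the same message are independent uniform group elements.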

A Note About the Setup

Importantly, the setup algorithm should ensure that nobody knows the discrete logarithm of in base In particular, it is not safe to allow the committer to choose the public parameters: if the committer knows the discrete logarithm of in base then the scheme is not binding anymore. Say the committer knows such that Then it can send as commitment; later, it can open this commitment to any value it wants by computing and sending decommitment it satisfies
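This equivocation attack is easy to run in code. Below is a minimal sketch in a toy multiplicative group (all parameters and the function name `equivocate` are my illustrative choices); t plays the role of the trapdoor, the discrete logarithm of h in base g, which the malicious committer knows because it chose h itself:

```python
import secrets

# Toy group: p = 2q + 1, quadratic-residue subgroup of prime order q.
p, q, g = 2039, 1019, 4

# The malicious committer picks h itself, so it knows t = log_g(h).
t = secrets.randbelow(q - 1) + 1
h = pow(g, t, p)

# Commit without fixing any message: C = h^s for random s.
s = secrets.randbelow(q)
C = pow(h, s, p)

def equivocate(m: int) -> int:
    """Opening C to m requires m + t*r = t*s (mod q); solve for r."""
    return (s - m * pow(t, -1, q)) % q

for m in (7, 123, 1000):              # open the same C to three messages
    r = equivocate(m)
    assert C == pow(g, m, p) * pow(h, r, p) % p
```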

For this reason, Pedersen's scheme is sometimes referred to as a trapdoor (or equivocal) commitment scheme, which can be useful in security proofs but should also make us wary. However, there are secure ways to select the commitment key without a trusted third party, such as using a hash-to-group function (a.k.a. hash-to-curve function in case is based on an elliptic curve) applied to some NUMS (nothing-up-my-sleeve) input. Hence, even though Pedersen commitments do not require a trusted setup, one should always verify that parameters were correctly generated. For a real-world example where this trapdoor property could have been used, see Section VI of [HLPT20] about the Scytl/SwissPost e-voting system.

Variants

The Pedersen commitment scheme can be generalized to messages which are vectors the parameters are extended to where and are uniformly random and independent generators of and the commitment for message with randomness is This is usually called the generalized Pedersen commitment scheme, or sometimes the Pedersen vector commitment scheme, although this is somehow a misnomer as it does not have all the properties required from a vector commitment scheme [CF13]. As for the basic variant, it can be shown to be perfectly hiding, computationally binding under the DL assumption, and homomorphic with respect to addition over

The "random" part of the commitment is sometimes omitted, in which case the commitment algorithm becomes deterministic and the commitment is simply In that case, the scheme is still computationally binding under the DL assumption (and even perfectly binding for as the commitment function is bijective), but it is not hiding anymore (given two messages and and a commitment to for some random bit one can recover by simply computing the commitments corresponding to respectively and and comparing with For this reason, it is sometimes referred to as the non-hiding Pedersen commitment scheme. It is however preimage-resistant under the DL assumption, meaning that given a random commitment it is hard to compute a message such that

ElGamal Commitments

The Pedersen commitment scheme has a relative known as the ElGamal commitment scheme where the commitment key is as for Pedersen and the commitment for message with randomness is the pair (Note the similarity with ElGamal encryption w.r.t. public key ) This scheme is perfectly binding, computationally hiding under the DDH assumption, and additively homomorphic.

If the message is encoded as a group element and the commitment computed as the scheme has a trapdoor property allowing anyone with knowledge of the discrete logarithm of in base to extract the message (by "decrypting" the commitment as in ElGamal encryption, i.e., computing

Commitments and Hash Functions

There is a strong connection between commitment schemes and collision-resistant hash functions.

First, let us consider the following strengthening of the binding property: We say that a commitment scheme is strongly binding if it is hard to find a commitment and two distinct message-decommitment pairs and such that and That is, the adversary also wins when the messages and are equal but the decommitments and are different.

Given a hash function family indexed by some parameter one can define a simple commitment scheme with where It can be shown to be (computationally) strongly binding assuming the family is collision-resistant. On the other hand, there is no reason for this scheme to be hiding in general ( could for example reveal the first bit of the message, allowing one to distinguish commitments to two messages with distinct first bits). It is however easily seen to be (computationally) hiding in the random oracle model.
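A minimal sketch of this hash-based construction, instantiated with SHA-256 (the 32-byte randomness length is an illustrative choice of mine; as noted above, hiding holds only heuristically, i.e., when the hash is modeled as a random oracle):

```python
import hashlib
import secrets

def commit(m: bytes) -> tuple[bytes, bytes]:
    """Commit to m as H(r || m); the random coins r are the decommitment."""
    r = secrets.token_bytes(32)
    return hashlib.sha256(r + m).digest(), r

def verify(c: bytes, m: bytes, r: bytes) -> bool:
    """Canonical verification: recompute the hash and compare."""
    return c == hashlib.sha256(r + m).digest()

c, r = commit(b"attack at dawn")
assert verify(c, b"attack at dawn", r)
assert not verify(c, b"attack at dusk", r)
```

Strong binding here is immediate from collision resistance: two valid openings of the same commitment yield a SHA-256 collision.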

Reciprocally, it is straightforward to derive a collision-resistant hash function family from a strongly binding commitment scheme.

Proposition 18.1. Consider a commitment scheme with a function taking parameters a message and explicit random coins If is strongly binding, then the function family is collision-resistant.

Proof

Assume that is not collision-resistant and that there is an adversary which on input returns such that Let Then hence this adversary can be used to break strong binding of

Note that the assumption that is binding is not sufficient: it could be easy to find such that but with which would break collision-resistance but not binding.

It is not hard to see that the Pedersen commitment scheme is actually strongly binding, which directly gives an algebraic family of collision-resistant hash functions usually called Pedersen hashing. A specific instance of the family is specified by a tuple of parameters where is a cyclic group of order and are generators chosen in a way such that nobody knows any discrete logarithm relation between them. Then has domain range and is defined by (Note that in the context of hashing, there is no distinction between the "message" and the "randomness" as in the context of commitment schemes.)

This family of hash functions is collision-resistant assuming the discrete logarithm problem is hard.

Variants are possible: for example, if inputs are bit strings of length exactly one can split the input into chunks of consecutive bits with and convert the -th chunk into an integer and let
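The chunked variant can be sketched as follows. The group, chunk size, and generators below are toy choices of mine, and deliberately insecure: the mutual discrete logarithms of these generators are known, which would break collision resistance, whereas a real instantiation derives the generators via hash-to-group:

```python
# Toy group: p = 2q + 1, quadratic-residue subgroup of prime order q.
p, q = 2039, 1019
CHUNK = 4                       # bits per chunk; needs 2^CHUNK - 1 < q

# Illustrative generators of the order-q subgroup.  INSECURE: their mutual
# discrete-log relations are known here; use hash-to-group in practice.
gens = [pow(4, 3 ** i, p) for i in range(8)]

def pedersen_hash(bits: str) -> int:
    """Hash a bit string of length CHUNK * len(gens) chunk by chunk."""
    assert len(bits) == CHUNK * len(gens)
    out = 1
    for i, gi in enumerate(gens):
        mi = int(bits[CHUNK * i: CHUNK * (i + 1)], 2)  # chunk -> integer
        out = out * pow(gi, mi, p) % p
    return out

assert pedersen_hash("0" * 32) == 1            # all-zero chunks
assert pedersen_hash("0001" + "0" * 28) == 4   # first chunk is 1, g_1 = 4
```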

Further Resources

For more background on commitment schemes, see for example this article by Damgård and Nielsen and this lecture by Dodis.


1: The name can vary; these parameters are sometimes called a commitment key or a public key.

2: Again, the name can vary and it is sometimes called an opening or a hint.

Chapter status: ✅ in good shape ✅

Related puzzles: Puzzle 2 and Puzzle 4

TODO:

Polynomial Commitment Schemes

Polynomial commitment schemes play a central role in the design of zk-SNARKs. In this section, we introduce them and present the two KZG schemes, DL-KZG and Ped-KZG.

Make sure to read the chapters about polynomials and standard commitments first.

We recall that given a field we let denote the set of all univariate polynomials over of degree at most

Generalities

As with standard commitment schemes, a polynomial commitment scheme (PC scheme for short) involves two parties, a committer (or prover) and a verifier. It allows the committer to send to the verifier a commitment to some secret polynomial and later on prove that evaluates to some specific value at input (usually of the verifier's choosing), potentially for multiple inputs.

Recall that a polynomial of degree at most is specified by the tuple of coefficients such that From a broader perspective, a PC scheme can be seen as the combination of a standard commitment scheme over together with various proof systems for proving assertions about the committed vector In particular, proving that is equivalent to proving that i.e., proving that the inner product of with the vector is equal to Other assertions one may want to prove when designing SNARKs are, for example, that has degree at most which is equivalent to or more complex relations about evaluations of such as for some subset
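The identification of evaluating at a point with taking an inner product with the vector of its powers is immediate to check in code (coefficients stored low degree first; the field size and polynomial below are illustrative):

```python
def eval_poly(coeffs: list[int], u: int, p: int) -> int:
    """Horner evaluation of sum_i coeffs[i] * X^i at X = u over F_p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * u + c) % p
    return acc

def inner_product(a: list[int], b: list[int], p: int) -> int:
    return sum(x * y for x, y in zip(a, b)) % p

p = 101
coeffs = [3, 1, 4, 1, 5]            # 3 + X + 4X^2 + X^3 + 5X^4
u = 7
powers = [pow(u, i, p) for i in range(len(coeffs))]
assert eval_poly(coeffs, u, p) == inner_product(coeffs, powers, p)
```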

Syntax

More formally, a PC scheme is parameterized by a maximal degree (one can think of as being given as input to all algorithms) and consists of the five following algorithms (the exact syntax can vary slightly in the literature, here we adhere to the syntax of standard commitment schemes):

  • a setup algorithm which on input the security parameter returns public parameters1 these parameters implicitly specify some finite field

  • a commitment algorithm which on input parameters and a polynomial returns a commitment and a decommitment

  • a "polynomial" verification algorithm which on input parameters a commitment a polynomial and a decommitment returns 1 if is a valid decommitment for and 0 otherwise;

  • a proving algorithm which on input parameters a polynomial a decommitment and a value returns an evaluation and a proof

  • an "evaluation" verification algorithm which on input parameters a commitment a pair and a proof returns 1 if is a valid proof that the polynomial committed to by evaluates to at input and 0 otherwise.

In some cases (in particular for KZG), it might be possible to split the public parameters into a commitment key and a verification key where typically only is needed for algorithms and and only is needed for

As already hinted, the three algorithms and can be regarded together as a standard commitment scheme with message space (with specified by the tuple of its coefficients) while and together form a proof system for statements of the form

As for standard commitment schemes, what we just defined here is the syntax for a non-interactive PC scheme, where the algorithm is run once and for all and then committing and proving an evaluation of the committed polynomial is non-interactive. More generally, committing and evaluation proving could be interactive.

As always, the scheme must be correct, meaning two things: first, must be correct as defined for a standard commitment scheme with message space second, for every security parameter every every and every the following game capturing the nominal execution of algorithms for evaluation proving must return true with probability 1:

Security

Defining security properties for PC schemes is rather subtle. Almost every paper about PC schemes defines a slightly different set of security properties depending on the specific application being targeted. Here, we focus on the security properties proposed in the seminal paper about PC schemes [KZG10a], which are also the simplest ones.

First, a PC scheme should be hiding and binding in the standard sense when seen as a commitment to the tuple of coefficients defining the polynomial Let us recall the corresponding games, which we call POLY-HIDING and POLY-BINDING for clarity:

It turns out that some PC schemes (such as the DL-KZG scheme) do not satisfy the poly-hiding notion (in general, when used to construct SNARKs, poly-hiding matters only if one cares about the SNARK being zero-knowledge). However, they satisfy what we call here evaluation hiding,2 which informally means that for a random polynomial of degree at most given a commitment to and at most evaluations of together with the corresponding proofs, no adversary should be able to guess the value of for a new input This is formalized by the following game:

To be completely explicit, the line means

The condition that the adversary makes at most queries to the oracle is of course necessary: once the commitment has been opened at distinct points the committed polynomial has been completely revealed by virtue of Lagrange interpolation.
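Lagrange interpolation from d + 1 points takes only a few lines over a prime field, which makes the bound on oracle queries concrete: with one more evaluation, even the value at a fresh point is fully determined. (The field, the degree-2 polynomial, and the evaluation points below are illustrative choices of mine.)

```python
def lagrange_eval(points: list[tuple[int, int]], x: int, p: int) -> int:
    """Evaluate at x the unique polynomial of degree <= len(points) - 1
    passing through the given (u, v) points over F_p (p prime)."""
    total = 0
    for i, (ui, vi) in enumerate(points):
        num, den = 1, 1
        for j, (uj, _) in enumerate(points):
            if i != j:
                num = num * (x - uj) % p
                den = den * (ui - uj) % p
        total = (total + vi * num * pow(den, -1, p)) % p
    return total

p = 101
poly = lambda x: (3 + 2 * x + 5 * x * x) % p     # secret degree-2 polynomial
pts = [(u, poly(u)) for u in (1, 2, 3)]          # d + 1 = 3 evaluations
assert lagrange_eval(pts, 10, p) == poly(10)     # value anywhere is determined
```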

Regarding the binding property of evaluation proving, a PC scheme should be evaluation binding, meaning no efficient adversary can produce a commitment and two valid proofs that the committed polynomial evaluates to two different values at the same input More formally, this is captured by the following game:

As for standard commitments, all these properties can hold statistically or computationally, but poly-hiding and poly-binding cannot hold both statistically for a PC scheme.

Informal Description of the KZG Schemes

Two closely related and very efficient PC schemes based on pairings were proposed by Kate, Zaverucha, and Goldberg in 2010 [KZG10a] (see also [KZG10b] for the full paper with security proofs). We will call them (for reasons that will become clear soon) DL-KZG and Ped-KZG. What is usually simply called KZG corresponds to the DL-KZG scheme.

Let us start with a high-level view of DL-KZG. For a maximal degree the commitment and opening part is very similar to the (non-hiding version of the) generalized Pedersen commitment scheme. The public parameters consist of generators of some group of prime order A polynomial is seen as a vector and the corresponding commitment is There is a big difference with generalized Pedersen commitments, though: the generators are not independent. They are computed from a single generator and a secret random scalar as These parameters have a precise structure. For this reason, they are also called a structured reference string (SRS). This also implies that they cannot be sampled obliviously of and require a trusted setup (more on this later).

As a result, the commitment is actually in disguise:

Evaluation proving relies on the polynomial remainder theorem: a polynomial satisfies if and only if is divisible by i.e., there exists a polynomial such that The proving algorithm therefore consists in computing the polynomial explicitly and the proof which consists in in disguise, as

Evaluating this polynomial equality at we see that The verification algorithm consists in checking this equality "in the exponent" (or rather "in the scalar multiplication" here as we use additive notation). This is where pairings come in: is actually a pairing-friendly group coming with related groups and and a pairing The public parameters include a generator of and the group element The verifier can compute and and also knows Then holds iff the following pairing equality does:
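The whole commit/prove/verify flow can be simulated end to end without a pairing library by working with the scalars themselves: the sketch below knows the setup secret (call it tau) and checks the verification equation directly on scalars, whereas a real implementation only publishes the powers of tau "in the exponent" and checks Eq. (19.1) with the pairing. The small field, the polynomial, and the function names are my illustrative choices:

```python
import secrets

p = 101                               # toy field; real schemes use ~256-bit order
d = 4                                 # maximal degree

def poly_eval(coeffs, x):
    """Horner evaluation of sum_i coeffs[i] * X^i at x over F_p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def quotient(coeffs, u):
    """Synthetic division: return q with p(X) - p(u) = q(X) * (X - u)."""
    qs, acc = [], 0
    for c in reversed(coeffs):
        acc = (acc * u + c) % p
        qs.append(acc)
    qs.pop()           # the last value is the remainder, i.e., p(u)
    return qs[::-1]    # quotient coefficients, low degree first

tau = secrets.randbelow(p)            # trusted-setup secret ("toxic waste")

f = [3, 1, 4, 1, 5]                   # committed polynomial, degree <= d
assert len(f) == d + 1
C = poly_eval(f, tau)                 # commitment: f(tau) "in the exponent"

u = 7
v = poly_eval(f, u)                   # claimed evaluation
pi = poly_eval(quotient(f, u), tau)   # proof: q(tau) "in the exponent"

# Pairing check e(C - v*G1, H) == e(pi*G1, (tau - u)*H), here on scalars:
assert (C - v) % p == pi * (tau - u) % p
```

Note that the check also passes when tau happens to equal u, since both sides are then zero; soundness in the real scheme comes from tau being secret and high-entropy.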

The DL-KZG commitment scheme is obviously not hiding because the commitment algorithm is deterministic. The Ped-KZG scheme remedies this problem by adding a commitment to a random polynomial with respect to another tuple of points Evaluation proving is adapted accordingly. The form of the commitment is reminiscent of the hiding version of Pedersen commitments, explaining the naming convention.

We now give a detailed description and analysis of the properties of the DL-KZG and Ped-KZG schemes.

The DL-KZG Scheme

Description

Let be an asymmetric pairing group setup algorithm.3 The DL-KZG scheme for a maximal degree is defined as follows:

  • The algorithm, on input the security parameter runs draws random generators and of respectively and draws and returns public parameters Here we assume that pairing parameters are implicitly specified in 4 The field over which polynomials are defined is The public parameters can be split into a commitment key and a verification key

  • The algorithm, on input a commitment key and a polynomial where returns the commitment and an empty decommitment

  • The algorithm, on input a commitment key a commitment and a polynomial where returns 1 if and 0 otherwise.

  • The algorithm, on input a commitment key a polynomial and computes the polynomial5 the group element and returns and the proof

  • The algorithm, on input a verification key a commitment a pair and a proof returns 1 if and 0 otherwise.

It is straightforward to verify that is correct as a standard commitment scheme. Let us check that the scheme is correct with respect to evaluation proving, i.e., if the commitment and the proof have been honestly computed, then the verification passes. If and where and are such that then hence Eq. (19.1) is satisfied and returns 1.

Hiding Security

The DL-KZG scheme as described above is not poly-hiding: the commitment algorithm is deterministic, hence, given two polynomials and and a commitment it is trivial to distinguish whether commits to and by computing the corresponding commitments and comparing with

Regarding the eval-hiding property, note that DL-KZG cannot be statistically eval-hiding. Indeed, an unbounded adversary can compute from the parameters and from the commitment and return to win the EVAL-HIDING game without making any query to the oracle. However, it is eval-hiding under the discrete logarithm assumption (in group

Let us give the intuition before the full-fledged proof. As the committed polynomial is uniformly random in Lagrange interpolation ensures that given at most evaluations of at the value of on any other point is uniformly random so that even an unbounded adversary can guess it with probability at most Hence, the only way an adversary can guess with non-negligible probability is to compute from Together with queries to the oracle, this yields evaluations of allowing to compute with Lagrange interpolation. But computing requires solving the discrete logarithm problem for challenge

Theorem 19.1. Assume that the DL problem is hard in for Then the DL-KZG scheme is (computationally) eval-hiding. More precisely, for every adversary against the EVAL-HIDING game, there exists an adversary for the DL problem running in time where is the running time of and such that

Proof

Let be an adversary against the eval-hiding property of DL-KZG. Without loss of generality, we assume that makes exactly queries to the oracle. We simply denote by EH the EVAL-HIDING game. Let also denote the event that queries the oracle on By definition of the advantage, we have Let us first show that the first term is negligible. Just before returns its answer, has been evaluated on at most points: (when computing the commitment) and the queries of to the oracle. Conditioned on querying on (i.e., has in fact been evaluated on points before returns its output Since is a random polynomial of degree the value of conditioned on these at most evaluations (for any is uniformly random. Hence, even a computationally unbounded adversary can guess with probability at most i.e., Let us now upper bound the second term with a reduction. We construct an adversary that solves the DL problem by simulating game EH to as follows. Let be the DL instance that must solve. draws computes and runs on input Note that implicitly commits to a polynomial such that simulates the oracle as follows: when queries the oracle on some field element draws computes a proof and returns The proof is valid because and hence the verification equation (19.1) is satisfied. Note that cannot answer this way if queries to since is exactly the solution to its DL challenge. In such a case (i.e., when happens), simply aborts. Conditioned on not happening, is sampled through the commitment evaluation and the evaluations corresponding to queries made by with and uniformly random and independent. By Lagrange interpolation, this is equivalent to drawing the coefficients of uniformly at random and hence the EVAL-HIDING game is perfectly simulated. If successfully returns a pair such that then can interpolate the evaluations corresponding to queries together with to recover polynomial and compute which yields the solution to the DL challenge.

Let DL be the discrete logarithm game played with Then where for the last equality we used that conditioned on games DL and EH are identical. Hence, runs in time (where is the running time of plus the time to interpolate which requires at most operations.

Binding Security

The only thing that a commitment commits to, information-theoretically speaking, is the value Hence, the DL-KZG scheme is certainly not statistically poly-binding: an adversary able to compute from the public parameters can very easily decommit any commitment to any polynomial such that However, for an adversary unable to compute from the public parameters, which is an instance of what we call the -co-DL problem, there is only a negligible chance that it can find two polynomials and such that More formally, we have the following result.

Theorem 19.2. Assume that the -co-DL problem is hard for Then the DL-KZG scheme for maximal degree is poly-binding. More precisely, for any adversary against the poly-binding security of DL-KZG for maximal degree there exists an adversary for the -co-DL problem running in time where is the running time of and such that

Proof

Let be an adversary against the poly-binding security of DL-KZG for maximal degree We construct an algorithm for the -co-DL problem as follows. gets pairing group parameters and an instance of the -co-DL problem. The goal of is to compute It runs on public parameters Assume that is successful and returns a commitment and two distinct polynomials and of degree at most such that This implies that hence and is a root of the non-zero polynomial This polynomial can be factored in time with the Cantor–Zassenhaus algorithm, which allows one to compute all its roots and find The success probability of is the same as the one of and the running time of is

Eval-binding security relies on a stronger assumption, namely that the so-called -strong Diffie-Hellman (-SDH) problem is hard. This problem is as follows: given compute a pair such that The -SDH problem is usually simply called the -SDH problem.

Theorem 19.3. Assume that the -SDH problem is hard for Then the DL-KZG scheme for maximal degree is eval-binding. More precisely, for any adversary against the eval-binding security of DL-KZG for maximal degree there exists an adversary for the -SDH problem running in time similar to the time of and such that

Proof

Let be an adversary against the eval-binding security of the DL-KZG scheme for maximal degree We construct an adversary for the -SDH problem. gets pairing group parameters and an instance of the -SDH problem. The goal of is to return a pair such that It runs on public parameters Assume that is successful and returns a commitment a field element and two valid value/proof pairs and with Then proceeds as follows. First, it checks whether (e.g., by checking whether is equal to the second group element of the parameters If this is the case, then simply picks an arbitrary element and returns From now on, we assume The validity of the two proofs implies that Taking the inverse of the second equation and multiplying with the first equation, we get successively where for the last implication we used that and which allows us to multiply by The last equation implies that Hence, returns which is a valid solution of the SDH instance. The success probability of is the same as the one of and the running time of is close to the one of

The Ped-KZG Scheme

Description

As discussed in the previous section, the DL-KZG scheme is not poly-hiding because the algorithm is deterministic. It is possible to make the scheme poly-hiding by randomizing the algorithm. Below we present the Ped-KZG scheme. The idea is to add to the basic DL-KZG commitment a commitment to another random and independent polynomial with respect to another generator of The commitment becomes which is very similar to a Pedersen commitment, hence the name. This requires expanding the size of the public parameters and the evaluation proofs. The formal description follows.

  • The algorithm, on input the security parameter runs draws random generators and of and of draws and returns public parameters Here we assume that pairing parameters are implicitly specified in The field over which polynomials are defined is The public parameters can be split into a commitment key and a verification key

  • The algorithm, on input a commitment key and a polynomial where draws a random polynomial with and returns the commitment and the decommitment

  • The algorithm, on input a commitment key a commitment a polynomial where and a decommitment where returns 1 if and 0 otherwise.

  • The algorithm, on input a commitment key a polynomial a decommitment and computes the polynomials the group element and returns and the proof

  • The algorithm, on input a verification key a commitment a pair and a proof returns 1 if and 0 otherwise.

Correctness can be verified in a similar way to DL-KZG.

Hiding Security

Thanks to the commitment randomization, poly-hiding and eval-hiding security both hold statistically for Ped-KZG.

Theorem 19.4. The Ped-KZG scheme is perfectly poly-hiding.

Proof

For any and any polynomial the commitment returned by the algorithm is uniformly random in due to the addition of the term Hence, the oracle in the POLY-HIDING game does not reveal any information about the hidden bit

Theorem 19.5. The Ped-KZG scheme is statistically eval-hiding. More precisely, for any adversary one has

Proof

Let be a (computationally unbounded) adversary against the eval-hiding property of Ped-KZG. We can assume without loss of generality that is given the discrete logarithm of in base and the discrete logarithm of the challenge commitment Let be the queries of to the oracle and be the corresponding answers. Note that does not bring any additional information to as it can be computed from the other quantities, namely Hence, all in all, the adversary is given evaluations of and at the same points together with the value Note that since is a generator of Hence, conditioned on for and the value of is uniformly random and only has evaluations of This implies that the probability that guesses correctly for is

Binding Security

The poly-binding and eval-binding security properties hold under the same assumptions as for DL-KZG. The proofs are slightly more complex and must account for the possibility that the adversary solves the discrete logarithm problem for in base

Theorem 19.6. Assume that the -co-DL problem is hard for Then the Ped-KZG scheme for maximal degree is poly-binding. More precisely, for any adversary against the poly-binding security of Ped-KZG for maximal degree there exists an adversary for the -co-DL problem running in time where is the running time of and such that

Proof

Let be an adversary against the poly-binding security of the Ped-KZG scheme for maximal degree We construct an algorithm for the -co-DL problem as follows. gets pairing group parameters and an instance of the -co-DL problem. The goal of is to compute

Adversary randomly chooses between two indistinguishable ways to embed its instance into the parameters. Namely, it draws and proceeds as follows depending on

  • If then draws and runs on public parameters This implicitly sets and
  • If then draws and runs on public parameters This implicitly sets

Assume that is successful and returns a commitment and two distinct polynomials and of degree at most together with corresponding decommitments and such that This implies that and hence We can distinguish two cases:

  • case If then aborts. Otherwise, since we have Hence, is a root of the non-zero polynomial This polynomial can be factored in time with the Cantor–Zassenhaus algorithm, which allows one to compute all its roots and find
  • case If then aborts. Otherwise, since then This implies that Then necessarily as otherwise this would contradict Hence, can compute

The view of is independent of and hence aborts with probability so that The running time of is at most which concludes the proof.

Theorem 19.7. Assume that the -SDH problem is hard for Then the Ped-KZG scheme for maximal degree is eval-binding. More precisely, for any adversary against the eval-binding security of Ped-KZG for maximal degree there exists an adversary for the -SDH problem running in time similar to the time of and such that

Proof

Let be an adversary against the eval-binding security of the Ped-KZG scheme for maximal degree We construct an adversary for the -SDH problem. gets pairing group parameters and an instance of the -SDH problem. The goal of is to return a pair such that

Adversary randomly chooses between two indistinguishable ways to embed its instance into the parameters. Namely, it draws and proceeds as follows depending on

  • If then draws and runs on public parameters This implicitly sets and
  • If then draws and runs on public parameters This implicitly sets

Assume that is successful and returns a commitment a field element and two valid value/proof pairs and with Then proceeds as follows. First, it checks whether (e.g., by checking whether is equal to the second group element of the parameters). If this is the case, then simply picks an arbitrary element and returns From now on, we assume The validity of the two proofs implies that Combining these two equations, we get where we used that

We can now distinguish two cases:

  • case If then aborts. Otherwise, since we have and knows the value such that The equation above yields This implies in particular that as otherwise this would imply Hence, which implies that Thus, can return as solution to the -SDH instance.
  • case If then aborts. Otherwise, since we have and the equation above yields which implies We cannot have as this would imply whereas when is successful. Hence, can compute choose an arbitrary and return as solution to the SDH instance.

The view of is independent of and hence aborts with probability so that The running time of is similar to the running time of which concludes the proof.

Discussion

Efficiency

DL-KZG commitments are extremely succinct and rather cheap to verify: a commitment and a proof take one elliptic curve point each (e.g., 48 bytes when using BLS12-381) and verifying an opening essentially takes two pairings. In case one has to verify many openings for the same commitment, the verification equation (19.1) can be equivalently written where and can be computed once and stored for verifying multiple openings, allowing one to trade one pairing for one exponentiation in On the other hand, the size of the commitment key and the complexity of algorithms and are linear in the maximal degree of committed polynomials (which when building SNARKs can be quite large).

Trusted Setup

The secret value drawn by the algorithm must be securely deleted once the commitment key has been set up as it allows one to break the evaluation binding property of the scheme. Indeed, knowing given an arbitrary commitment one can open this commitment at any point to any value by computing the proof as Then the verification equation (19.1) is satisfied as
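To see the attack concretely, here is a minimal scalar sketch (group elements are replaced by their discrete logarithms modulo a small prime, so no real pairing is involved; the prime, polynomial, and all numbers below are illustrative, not part of the actual scheme):

```rust
// Toy model of the trapdoor attack: group elements are represented by their
// discrete logs mod a small prime R, so a commitment to f is C = f(tau) and
// the verification equation C - v = pi * (tau - u) can be checked directly.
const R: u64 = 97; // toy group order (illustrative)

fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1;
    while e > 0 {
        if e & 1 == 1 { acc = acc * b % R; }
        b = b * b % R;
        e >>= 1;
    }
    acc
}

fn inv(a: u64) -> u64 { pow_mod(a, R - 2) } // Fermat inversion, R prime

fn main() {
    let tau = 13; // the trapdoor that should have been securely deleted
    let c = (2 + 3 * tau + tau * tau) % R; // commitment to f(x) = 2 + 3x + x^2
    let (u, v) = (5, 42); // open at u to an arbitrary value v
    // forged proof: pi = (f(tau) - v) / (tau - u) mod R
    let pi = (c + R - v) % R * inv((tau + R - u) % R) % R;
    // the verification equation C - v == pi * (tau - u) is satisfied
    assert_eq!((c + R - v) % R, pi * ((tau + R - u) % R) % R);
    println!("forged proof pi = {pi} accepted");
}
```

The same computation, carried out in the exponent on actual curve points, yields a proof accepted by the real verification equation, which is why the trapdoor must never survive the setup.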

This is quite different from the procedure of the Pedersen commitment scheme, for which it is possible to proceed without ever generating any trapdoor explicitly. There is no (efficient) way known to implement the procedure for KZG without explicitly sampling To the best of my knowledge, there is also no proof that this is impossible. The assumption that this is impossible looks quite similar to many "knowledge of exponent" assumptions, hence the claim that running the KZG setup obliviously of is impossible is presumably true but not provable with known techniques. It is, however, possible to run the setup in a decentralized fashion, ensuring that the process is secure as long as a single party behaves honestly (see for example [NRBB22]).

Note that it is possible to check that the trusted setup yielded public parameters having the correct form, namely that there indeed exists such that Say we are given Let be defined as the discrete log of in base i.e., and set Then has the correct form if and only if for every Indeed, one has the following equivalences:
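In a toy scalar model (pairings replaced by products of discrete logarithms mod a small prime; the prime and the value of the secret below are purely illustrative), the check boils down to verifying that each parameter is the secret times the previous one:

```rust
// Toy scalar model of the parameter well-formedness check: the i-th
// parameter should be tau^i, and each pairing equality becomes
// params[i] == params[i-1] * tau (mod R).
const R: u64 = 101; // toy group order (illustrative)

fn main() {
    let tau = 29u64;
    // well-formed parameters: 1, tau, tau^2, ..., tau^7 mod R
    let mut params = vec![1u64];
    for i in 1..8 {
        params.push(params[i - 1] * tau % R);
    }
    // the chain check: each element must be tau times the previous one
    for i in 1..params.len() {
        assert_eq!(params[i], params[i - 1] * tau % R);
    }
    // a single corrupted parameter breaks the chain somewhere
    let mut bad = params.clone();
    bad[3] = (bad[3] + 1) % R;
    let ok = (1..bad.len()).all(|i| bad[i] == bad[i - 1] * tau % R);
    assert!(!ok);
}
```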

Summary of KZG Properties

| | DL-KZG | Ped-KZG |
|---|---|---|
| param. size | + | + |
| comt. size | | |
| proof size | | + |
| poly-hiding | --- | perfect |
| eval-hiding | DL in | perfect |
| poly-binding | -co-DL | -co-DL |
| eval-binding | -SDH | -SDH |

Multi-evaluation Proofs

We will see that the DL-KZG scheme can be generalized to allow proving multiple evaluations with a single proof consisting of a single group element. This technique can also be applied to Ped-KZG, but it is much less interesting there since the size of the proof for evaluations is one group element plus field elements, hence it grows linearly with

Recall that for a polynomial is equivalent to being divisible by How does this generalize to multiple evaluations?
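Before generalizing, here is a quick numerical check of the single-point fact over the toy field F_101 (the polynomial and evaluation point are arbitrary choices): dividing f by (x - u) with synthetic division leaves exactly the remainder f(u).

```rust
// Synthetic division of f by (x - u) over F_P: returns (quotient, remainder),
// with remainder == f(u), so f - f(u) is divisible by (x - u).
const P: u64 = 101; // small illustrative prime

// coefficients in little-endian order: f[0] + f[1]*x + f[2]*x^2 + ...
fn synthetic_div(f: &[u64], u: u64) -> (Vec<u64>, u64) {
    let mut q = vec![0u64; f.len() - 1];
    let mut acc = 0u64;
    for i in (0..f.len()).rev() {
        acc = (acc * u + f[i]) % P;
        if i > 0 { q[i - 1] = acc; }
    }
    (q, acc)
}

fn eval(f: &[u64], x: u64) -> u64 {
    // Horner's rule
    f.iter().rev().fold(0, |acc, &c| (acc * x + c) % P)
}

fn main() {
    let f = [3, 1, 4, 1]; // f(x) = 3 + x + 4x^2 + x^3
    let u = 5;
    let (q, r) = synthetic_div(&f, u);
    assert_eq!(r, eval(&f, u)); // the remainder is exactly f(u)
    // spot-check f(x) = q(x) * (x - u) + f(u) at another point
    let x = 7;
    assert_eq!(eval(&f, x), (eval(&q, x) * ((x + P - u) % P) + r) % P);
}
```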

First, let us recall some vocabulary from the section about Lagrange interpolation. An evaluation domain (or simply domain) of size is a subset of size The vanishing polynomial over is the polynomial defined as A multi-evaluation of size is a subset such that for The evaluation domain associated with is We say that a polynomial satisfies a multi-evaluation if for every

The idea of multi-evaluation proofs relies on the generalized polynomial remainder theorem that we restate here. Let be a multi-evaluation of size and Let be the vanishing polynomial for domain and be the Lagrange interpolation polynomial for i.e., the unique polynomial of degree at most such that for every Then satisfies if and only if divides

For one recovers the standard polynomial remainder theorem since for a single point the vanishing polynomial is and the Lagrange interpolation polynomial is the constant polynomial hence satisfies if and only if divides
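The generalized statement can also be checked numerically over F_101 (the polynomial and two-point domain below are arbitrary choices): reducing f modulo the vanishing polynomial of the domain leaves a remainder of degree less than the domain size that agrees with f on the whole domain, i.e., the Lagrange interpolation polynomial.

```rust
// Reduce f modulo the monic vanishing polynomial Z_S over F_P and check that
// the remainder interpolates f on the evaluation domain S.
const P: u64 = 101; // small illustrative prime

// coefficients little-endian; z must be monic
fn poly_rem(f: &[u64], z: &[u64]) -> Vec<u64> {
    let mut r: Vec<u64> = f.to_vec();
    let dz = z.len() - 1;
    while r.len() > dz {
        let lead = *r.last().unwrap();
        let shift = r.len() - 1 - dz;
        // subtract lead * x^shift * z(x); this zeroes the top coefficient
        for i in 0..z.len() {
            r[shift + i] = (r[shift + i] + P - lead * z[i] % P) % P;
        }
        r.pop();
    }
    r
}

fn eval(f: &[u64], x: u64) -> u64 {
    f.iter().rev().fold(0, |acc, &c| (acc * x + c) % P)
}

fn main() {
    let f = [1, 1, 0, 1]; // f(x) = 1 + x + x^3
    let domain = [2u64, 5u64];
    // Z_S(x) = (x - 2)(x - 5) = x^2 - 7x + 10, i.e., [10, -7, 1] mod P
    let z = [10, P - 7, 1];
    let r = poly_rem(&f, &z);
    // deg r < |S| and r(u) = f(u) for every u in S: r is the interpolant
    assert!(r.len() <= domain.len());
    for &u in &domain {
        assert_eq!(eval(&r, u), eval(&f, u));
    }
}
```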

Syntax and Security Definition

Let us now see how to adapt the syntax of a PC scheme to accommodate multi-evaluation proofs. Concretely, a PC scheme with multi-evaluation proofs consists of five algorithms: and have the same syntax as for a standard PC scheme, while and are replaced respectively by the following two algorithms:

  • a algorithm which on input parameters a polynomial a decommitment and a tuple of distinct field elements, returns a tuple and a proof

  • a algorithm which on input parameters a commitment a multi-evaluation and a proof returns 1 if is a valid proof that the polynomial committed to by satisfies and 0 otherwise.

The correctness property can be straightforwardly adapted: for every security parameter every every every and every subset the following game capturing the nominal execution of algorithms for multi-evaluation proving must return true with probability 1:

We must modify the security definitions accordingly. The poly-hiding and poly-binding notions are identical to the ones defined for a standard PC scheme. The eval-hiding notion is very similar: one only needs to adapt the oracle so that it may be queried on domains of size larger than 1.

The eval-binding notion requires more care: we still want that no adversary can prove that a polynomial evaluates to two different values and at the same input point however, now the adversary has the freedom to prove this for two different multi-evaluations and with the constraint that and To emphasize the difference, we call this adapted security notion multi-binding. It is defined via the following game.

KZG with Multi-evaluation Proofs: Description

The DL-KZG multi-evaluation PC scheme works as follows:

  • The algorithm, on input the security parameter runs draws random generators and of respectively and draws and returns public parameters

  • The and algorithms are defined exactly as for DL-KZG.

  • The algorithm, on input a commitment key a polynomial and a subset of size computes the polynomials and the group element and returns and the proof

  • The algorithm, on input a verification key a commitment a multi-evaluation of size and a proof computes the polynomials and returns 1 if and 0 otherwise.

Observe that the algorithm must compute and For a multi-evaluation of size has degree at most and has degree Hence, if proofs for multi-evaluations of size at most are to be supported, one can restrict the public parameters to and derive a commitment key and a verification key

Security Proof

The proof of Theorem 19.3 can be adapted to show that DL-KZG is multi-binding under a slightly different assumption called -bilinear strong Diffie-Hellman (-BSDH). This problem is as follows: given compute a pair such that Note that Indeed, given a solution for some SDH instance, one can compute a solution for the corresponding BSDH instance. The converse, though, is not known to hold, so that BSDH is presumably a stronger assumption than SDH.

Theorem 19.8. Assume that the -BSDH problem is hard for Then the DL-KZG multi-evaluation scheme for maximal degree and multi-evaluations of size at most is multi-binding. More precisely, for any adversary against the multi-binding security of DL-KZG, there exists an adversary for the -BSDH problem running in time similar to the time of and such that

Proof

Let be an adversary against the multi-binding security of the DL-KZG scheme for maximal degree We construct an adversary for the -BSDH problem. gets pairing group parameters and an instance of the -BSDH problem. The goal of is to return a pair such that Adversary runs on input parameters Assume that returns a commitment a tuple and two valid multi-evaluations/proof pairs and such that and If (which can verify by checking whether is equal to the second group element of the parameters), then simply picks an arbitrary element and returns as solution to the -BSDH instance. From now on, we assume that

Let resp. be the evaluation domain corresponding to resp. Let also resp. be the vanishing polynomial for resp. and resp. be the Lagrange interpolation polynomial for resp. The validity of the two proofs implies that Combining these two equations, we obtain We know that is a root of both and Hence, there are polynomials and such that and We also know that polynomial evaluates to at Hence, by the polynomial remainder theorem, there is a polynomial such that Note that can explicitly compute and Injecting this into the previous equation, we get where for the last equality we used that and Hence, can return where is the right-hand side of the last equation, as solution to the -BSDH instance.

As a sanity check, observe that for a single evaluation ( one has and in which case the last equation simplifies to which allows one to solve the -SDH problem and recover Theorem 19.3.

A Practical Use Case

Ethereum is planning to use the KZG polynomial commitment scheme for proto-danksharding. Its properties make it a convenient solution to the data availability problem. A distributed trusted setup is being run at the time of writing.

Additional Resources

There are many resources explaining KZG out there; here are a few:


1: As for standard commitment schemes, the name can vary and this is sometimes called a common reference string (crs) or structured reference string (srs) when it does not consist of random bits and has a specific "shape", as is the case for KZG.

2: In the seminal paper introducing polynomial commitment schemes [KZG10a], evaluation hiding is simply called hiding.

3: KZG polynomial commitments are often described with a symmetric pairing (i.e., but we define them for an asymmetric pairing as this is the preferred option in practice.

4: Quite often, generators and are standard and specified in public parameters alongside and

5: The polynomial is well-defined by the polynomial remainder theorem.

Before Starting

    ______ _   __  _   _            _
    |___  /| | / / | | | |          | |
       / / | |/ /  | |_| | __ _  ___| | __
      / /  |    \  |  _  |/ _` |/ __| |/ /
    ./ /___| |\  \ | | | | (_| | (__|   <
    \_____/\_| \_/ \_| |_/\__,_|\___|_|\_\

Over the last months, ZK Hack published a series of cryptographic puzzles. This is a fun way to learn about advanced cryptographic schemes such as BLS signatures, KZG polynomial commitments, proof systems and more, to improve one's skills in Rust and Sage, and to delve into the arkworks suite of libraries.

There are many great write-ups already available for every puzzle. The goal of this walk-through is to give an in-depth analysis of the cryptography underlying each puzzle.

We encourage the reader to regularly pause and try to come up with their own solution!

The full code of the solutions is available at https://github.com/yannickseurin/crypto-book/tree/main/puzzles.

Prerequisites

All puzzles use Rust and require some familiarity with this language. Visit this page for installation instructions and go through the first sections of the Rust Book if you're new to Rust. You can use any text editor you like, but it is recommended to use Visual Studio Code together with the rust-analyzer plugin which provides very helpful features such as inlay hints. See here and there for more advice about using VS Code for Rust development.

We will also occasionally rely on the Sage mathematics software system to solve some of the puzzles. See here for installation instructions.

The solutions will be given for Linux but should be easily adaptable to other operating systems.

Getting Started

Each puzzle consists of a Rust package hosted on GitHub. To get started, one first needs to clone the project and run it, which displays the puzzle instructions. E.g., for the first puzzle, one proceeds as follows:

$ git clone https://github.com/kobigurk/zkhack-bls-pedersen
$ cd zkhack-bls-pedersen
$ cargo run --release

This displays the puzzle description. Understanding the organization of the project's code requires some basic knowledge of Rust concepts of packages, crates, and modules. Section 7 of the Rust book contains all you need to know.

Rust Conventions and Tips

  • Paths to files are given relative to the puzzle directory.
  • Most Rust snippets have an eyeball icon which will toggle the visibility of hidden lines.
  • Puzzles 1 to 11 are based on version 0.3 of the arkworks libraries, but version 0.4 has since been released with a handful of breaking changes (puzzle 12 and beyond use version 0.4); we will strive to indicate those affecting the relevant part of the crates.
  • We often switch between mathematical notation and Rust variables. We write var or var to identify the mathematical variable and the Rust variable var.

Puzzle 1: Let's Hash it Out

Alice designed an authentication system in which users gain access by presenting
it a signature on a username, which Alice provided.
One day, Alice discovered 256 of these signatures were leaked publicly, but the
secret key wasn't. Phew.
The next day, she found out someone accessed her system with a username she
doesn't know! This shouldn't be possible due to existential unforgeability, as
she never signed such a message.

Can you find out how it happend and produce a signature on your username?

From the puzzle's instructions, it seems that we have to mount a universal forgery attack against the signature scheme used by Alice (universal because one should be able to forge a signature for any username). Let's find out what signature scheme Alice is using exactly by taking a look at the code (the name of the package gives us a good hint already).

Initial Inspection

As this is the first puzzle, we will go over the code in detail.

The package directory is structured as follows:

zkhack-bls-pedersen
├── Cargo.toml
└── src
    ├── bin
    │   └── verify-bls-pedersen.rs
    ├── bls.rs
    ├── data.rs
    ├── hash.rs
    └── lib.rs

It has two crates:

  • a library crate with root file src/lib.rs,
  • a binary crate with root file src/bin/verify-bls-pedersen.rs.

Let's take a look at the code inside the file src/lib.rs:

pub mod bls;
pub mod data;
pub mod hash;

pub const PUZZLE_DESCRIPTION: &str = r#"Alice designed an authentication system in which users gain access by presenting it a signature on a username, which Alice provided.
One day, Alice discovered 256 of these signatures were leaked publicly, but the secret key wasn't. Phew.
The next day, she found out someone accessed her system with a username she doesn't know! This shouldn't be possible due to existential unforgeability, as she never signed such a message.

Can you find out how it happend and produce a signature on your username?"#;

It simply declares three public modules named bls, data, and hash and a string slice named PUZZLE_DESCRIPTION with the text which is displayed when running the project.

Let's now have a look at the code in the binary crate's source file src/bin/verify-bls-pedersen.rs:

use bls_pedersen::bls::verify;
use bls_pedersen::data::puzzle_data;
use bls_pedersen::PUZZLE_DESCRIPTION;
use prompt::{puzzle, welcome};

fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);
    let (pk, ms, sigs) = puzzle_data();
    for (m, sig) in ms.iter().zip(sigs.iter()) {
        verify(pk, m, *sig);
    }

    /* Your solution here! */
    /*
      let sig = ...;
      let m = your username;
      verify(pk, m, sig);
    */
}

It first brings some items from the library crate into scope with the use keyword. Note that when a package contains both a library and a binary crate, any public item from the library crate can be used in the binary crate by starting paths with the name of the package (which is specified in the Cargo.toml file; in this case, it is bls_pedersen). It also brings into scope two functions called welcome and puzzle defined in the prompt external crate.

The main function first calls welcome and puzzle which respectively display some nice ASCII art and the puzzle description. It then calls the puzzle_data function. From the path used to bring puzzle_data into scope, we know that this function is defined in the data module of the library. Hence, we look into the file src/data.rs for its code, which looks like this:

pub fn puzzle_data() -> (G2Affine, Vec<Vec<u8>>, Vec<G1Affine>) {
    // ...
    (pk, ms, sigs)
}

This function returns a tuple made up of a public key pk of type G2Affine, a vector ms of messages of type Vec<u8>, and a vector sigs of signatures of type G1Affine. We'll come back to types G1Affine and G2Affine shortly. The main function then runs a loop which checks the validity of all message/signature pairs with respect to public key pk:

    for (m, sig) in ms.iter().zip(sigs.iter()) {
        verify(pk, m, *sig);
    }

If you don't understand the syntax of the loop, read about iterators and the zip method as this will come up quite often.
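For reference, here is a minimal standalone example of the iterator pattern used by the loop (the messages and signature values are made up):

```rust
fn main() {
    let ms = vec!["alice", "bob"];
    let sigs = vec![101u32, 202];
    // zip pairs up the i-th elements of both iterators and stops at the
    // shorter one; each iteration yields a tuple of references
    for (m, sig) in ms.iter().zip(sigs.iter()) {
        println!("verifying signature {sig} on message {m}");
    }
    // zip never yields more items than the shorter iterator
    assert_eq!(ms.iter().zip(sigs.iter()).count(), 2);
}
```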

The verify function is defined in the bls module, as indicated by the path used to bring it into scope. In order to understand what this function does, make sure to read the chapters about pairings and BLS signatures. Here is the code:

use ark_bls12_381::{Bls12_381, G1Affine, G2Affine};
use ark_ec::{AffineCurve, PairingEngine};
use std::ops::Neg;

use crate::hash::hash_to_curve;
use ark_ff::One;

// --snip--

pub fn verify(pk: G2Affine, msg: &[u8], sig: G1Affine) {
    let (_, h) = hash_to_curve(msg);
    assert!(Bls12_381::product_of_pairings(&[
        (
            sig.into(),
            G2Affine::prime_subgroup_generator().neg().into()
        ),
        (h.into(), pk.into()),
    ])
    .is_one());
}

Just from the names of the functions, we can guess that it first hashes the message msg by calling hash_to_curve, getting a point h on and then asserts that the product of two pairings is equal to the identity element of group Ignoring the into method for now, we can see that the arguments for the first pairing are the signature sig and the opposite of the generator of group The arguments for the second pairing are the hash h and the public key pk.

Hence, what verify is asserting is whether (using our notation from the BLS signatures chapter) which is equivalent to the verification equation (17.1) we gave when describing BLS since

So verify is indeed checking a BLS signature. Computing a product of pairings can be done more efficiently than computing the pairings one by one (see here), which explains why performing verification using Eq. (25.1.1) is often preferable.
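As a sanity check of the rearranged equation, here is a toy scalar model (each pairing of two group elements is replaced by the product of their discrete logarithms mod a small prime; the secret key and hash value below are arbitrary): the product of the two pairings is the identity exactly when the signature is the secret key times the hash.

```rust
// Toy scalar model of the BLS product-of-pairings check: writing every group
// element by its discrete log, e(sig, -g2) * e(h, pk) == 1 becomes
// -sig + h * sk == 0 (mod R).
const R: u64 = 7919; // toy group order (illustrative)

fn main() {
    let sk = 1234 % R;    // signer's secret key
    let h = 4321 % R;     // dlog of the hash H(m)
    let pk = sk;          // pk = sk * g2, represented by its dlog
    let sig = h * sk % R; // BLS signature: sig = sk * H(m)
    // sum of "exponents": (-1) * sig + h * pk must vanish
    assert_eq!(((R - sig) + h * pk % R) % R, 0);
    // a tampered signature fails the check
    let bad = (sig + 1) % R;
    assert_ne!(((R - bad) + h * pk % R) % R, 0);
}
```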

What pairing-friendly curve does the signature scheme use? The arkworks libraries implement many such curves (see the list here). From the Cargo.toml file listing dependencies of the package, we can see that it includes the ark-bls12-381 library crate, where the Bls12_381 type prefixing the call to product_of_pairings is defined. Hence, the puzzle uses the BLS12-381 curve.

Our next task is to understand what the hash_to_curve function does exactly. Before that, we take a moment to explore some of the arkworks crates used in the puzzle.

Exploring the arkworks Libraries

As explained in the introduction, the ZK Hack puzzles are a great opportunity to explore the arkworks libraries. Although we have a reasonable understanding of what verify does, let us pause a moment and explain how to find one's way among all the crates arkworks provides. Say we want to understand what the function Bls12_381::product_of_pairings does exactly. First, we check the path used to bring Bls12_381 into scope in the src/bls.rs module:

use ark_bls12_381::{Bls12_381, G1Affine, G2Affine};

This tells us that we need to look into the ark-bls12-381 crate.1

The first thing to know is that there are two possible places where to look for information: Docs.rs, the documentation host for Rust crates hosted at crates.io, and the arkworks GitHub repositories.2

Second, one has to be careful about which version of the crate the puzzle requires. For this, we must inspect the Cargo.toml file which lists the package dependencies:

[dependencies]
ark-std = "0.3"
ark-ff = "0.3"
ark-ec = "0.3"
ark-serialize = "0.3"
ark-bls12-381 = "0.3"
ark-crypto-primitives = "0.3"
rand = "0.8"
rand_chacha = "0.3"
hex = "0.4"
prompt = { git = "https://github.com/kobigurk/zkhack-prompt" }
blake2s_simd = "0.5.11"

We can see that the puzzle requires version 0.3 of the ark-bls12-381 crate. Hence, we select the correct version of the ark-bls12-381 crate on Docs.rs as the starting point of our exploration.

If you prefer to browse libraries on GitHub or locally, be careful to check out the correct commit: the ark-ec crate is part of the algebra repository, the releases of which are listed here.

When entering product_of_pairings in the search bar on top of the documentation page of the ark-bls12-381 crate, we don't get any hit. This probably means that this function is part of a trait that the type Bls12_381 implements using the default implementation. Hence, we search for this type instead, which leads us to its definition:

type Bls12_381 = Bls12<Parameters>;

Following the link to the definition of the Bls12 type, we see that it is defined in the models::bls12 submodule of the ark-ec crate, which contains all the generic code for curves of the BLS family with embedding degree 12:

pub struct Bls12<P: Bls12Parameters>(_);

This illustrates how arkworks uses traits to abstract common behaviour of various curves. Bls12 is an empty struct parameterized by a generic type P that must satisfy the trait bound Bls12Parameters. A specific curve of the BLS-12 family, such as BLS12-381, is then instantiated by defining an empty struct Parameters implementing the Bls12Parameters trait in a specific way.
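A stripped-down sketch of this pattern (all names here are hypothetical and unrelated to the actual arkworks traits):

```rust
use std::marker::PhantomData;

// a family of curves is abstracted by a parameters trait...
trait ToyCurveParams {
    const NAME: &'static str;
}

// ...and an empty struct generic over those parameters
struct ToyBls12<P: ToyCurveParams>(PhantomData<P>);

// behaviour shared by the whole family, with a default implementation
// (playing the role of product_of_pairings in PairingEngine)
trait ToyPairing {
    fn curve_name() -> String;
    fn describe() -> String {
        format!("pairing over {}", Self::curve_name())
    }
}

// instantiating one specific curve of the family
struct ToyParams381;
impl ToyCurveParams for ToyParams381 {
    const NAME: &'static str = "toy-381";
}
impl<P: ToyCurveParams> ToyPairing for ToyBls12<P> {
    fn curve_name() -> String {
        P::NAME.to_string()
    }
}

fn main() {
    // describe is never written for ToyBls12<ToyParams381>: it comes from
    // the trait's default implementation, just like product_of_pairings
    assert_eq!(ToyBls12::<ToyParams381>::describe(), "pairing over toy-381");
}
```

This also explains why searching for product_of_pairings on the concrete type turned up nothing: default trait methods live with the trait, not with the implementing type.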

We can now search for product_of_pairings in the ark-ec crate. We have more luck this time, as we find that it is part of the PairingEngine trait. Let's take a look at the code:

pub trait PairingEngine: Sized + 'static + Copy + Debug + Sync + Send + Eq + PartialEq {
    // ...

    /// Computes a product of pairings.
    #[must_use]
    fn product_of_pairings<'a, I>(i: I) -> Self::Fqk
    where
        I: IntoIterator<Item = &'a (Self::G1Prepared, Self::G2Prepared)>,
    {
        Self::final_exponentiation(&Self::miller_loop(i)).unwrap()
    }

    // ...
}

We can see that a default implementation is indeed provided, computing a product of Miller loops followed by a single final exponentiation. You can keep digging from here and inspect how miller_loop and final_exponentiation are implemented for the BLS-12 family.

Note that the product_of_pairings function has been replaced in version 0.4.0 of the ark-ec crate by the multi_pairing function.

Next, we will see what the into method applied to curve points does.

Affine versus Projective Coordinates

The ark-ec library allows you to work with both affine and projective coordinates and to easily switch between them. For short Weierstrass curves, the affine representation corresponds to the GroupAffine struct implementing the AffineCurve trait:

pub struct GroupAffine<P: Parameters> {
    pub x: P::BaseField,
    pub y: P::BaseField,
    pub infinity: bool,
    // some fields omitted
}

The projective representation uses Jacobian (not homogeneous) coordinates and corresponds to the GroupProjective struct implementing the ProjectiveCurve trait:

pub struct GroupProjective<P: Parameters> {
    pub x: P::BaseField,
    pub y: P::BaseField,
    pub z: P::BaseField,
    // some fields omitted
}

Note in particular how the GroupAffine struct needs to hold a boolean field infinity indicating whether an instance is the point at infinity or not, whereas GroupProjective does not.

The trait Parameters, an alias for ark_ec::models::SWModelParameters, contains all parameters specifying a prime-order subgroup of an elliptic curve in short Weierstrass form:

pub trait SWModelParameters: ModelParameters {
    const COEFF_A: Self::BaseField;
    const COEFF_B: Self::BaseField;
    const COFACTOR: &'static [u64];
    const COFACTOR_INV: Self::ScalarField;
    const AFFINE_GENERATOR_COEFFS: (Self::BaseField, Self::BaseField);
    fn mul_by_a(elem: &Self::BaseField) -> Self::BaseField { ... }
    fn add_b(elem: &Self::BaseField) -> Self::BaseField { ... }
}

What about types G1Affine and G2Affine? Recall that in order to define a pairing one needs two prime-order subgroups and of elliptic curves defined on respectively and some field extension Types G1Affine and G2Affine correspond to the affine representation of these two subgroups and are defined for curves of the BLS-12 family respectively here and there as:

type G1Affine<P> = GroupAffine<<P as Bls12Parameters>::G1Parameters>;
type G2Affine<P> = GroupAffine<<P as Bls12Parameters>::G2Parameters>;

Type P must implement the Bls12Parameters trait:

pub trait Bls12Parameters: 'static {
    type Fp: PrimeField + SquareRootField + Into<<Self::Fp as PrimeField>::BigInt>;
    type Fp2Params: Fp2Parameters<Fp = Self::Fp>;
    type Fp6Params: Fp6Parameters<Fp2Params = Self::Fp2Params>;
    type Fp12Params: Fp12Parameters<Fp6Params = Self::Fp6Params>;
    type G1Parameters: SWModelParameters<BaseField = Self::Fp>;
    type G2Parameters: SWModelParameters<BaseField = Fp2<Self::Fp2Params>, ScalarField = <Self::G1Parameters as ModelParameters>::ScalarField>;

    const X: &'static [u64];
    const X_IS_NEGATIVE: bool;
    const TWIST_TYPE: TwistType;
}

As expected, types G1Parameters and G2Parameters must both implement the SWModelParameters trait with a prime base field for G1Parameters and a quadratic extension field for G2Parameters.

Conversion

Conversion between affine and projective coordinates is handled by the From and Into traits. These are very general and useful traits that you can read about here and here. They provide respectively an associated function from and a method into, the latter one being generally derived from the former.
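Here is a self-contained example of the From/Into pair on made-up types (nothing to do with curve points):

```rust
#[derive(Debug, PartialEq)]
struct Meters(f64);
#[derive(Debug, PartialEq)]
struct Centimeters(f64);

// implementing From automatically provides into() in the other direction
impl From<Meters> for Centimeters {
    fn from(m: Meters) -> Self {
        Centimeters(m.0 * 100.0)
    }
}

fn main() {
    let a = Centimeters::from(Meters(1.5)); // explicit associated function
    let b: Centimeters = Meters(1.5).into(); // derived method, needs the type annotation
    assert_eq!(a, Centimeters(150.0));
    assert_eq!(a, b);
}
```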

The function from for creating a point in projective coordinates from a point in affine coordinates is implemented here:

impl<P: Parameters> From<GroupAffine<P>> for GroupProjective<P> {
    #[inline]
    fn from(p: GroupAffine<P>) -> GroupProjective<P> {
        if p.is_zero() {
            Self::zero()
        } else {
            Self::new(p.x, p.y, P::BaseField::one())
        }
    }
}

The converse function, creating a point in affine coordinates from a point in projective coordinates, is implemented here:

impl<P: Parameters> From<GroupProjective<P>> for GroupAffine<P> {
    #[inline]
    fn from(p: GroupProjective<P>) -> GroupAffine<P> {
        if p.is_zero() {
            GroupAffine::zero()
        } else if p.z.is_one() {
            // If Z is one, the point is already normalized.
            GroupAffine::new(p.x, p.y, false)
        } else {
            // Z is nonzero, so it must have an inverse in a field.
            let zinv = p.z.inverse().unwrap();
            let zinv_squared = zinv.square();

            // X/Z^2
            let x = p.x * &zinv_squared;

            // Y/Z^3
            let y = p.y * &(zinv_squared * &zinv);

            GroupAffine::new(x, y, false)
        }
    }
}

Note that there are also more explicit into_affine and into_projective methods which simply call into.
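The coordinate arithmetic in from can be checked in isolation over a toy prime field (the field and the point below are arbitrary, not on a real curve): a Jacobian triple (X, Y, Z) represents the affine point (X/Z², Y/Z³), so scaling an affine point into Jacobian form and converting back must round-trip.

```rust
// Jacobian-to-affine conversion over a toy prime field F_19:
// (X, Y, Z) maps to (X / Z^2, Y / Z^3).
const P: u64 = 19; // small illustrative prime

fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1;
    while e > 0 {
        if e & 1 == 1 { acc = acc * b % P; }
        b = b * b % P;
        e >>= 1;
    }
    acc
}

fn inv(a: u64) -> u64 { pow_mod(a, P - 2) } // Fermat inversion, P prime

fn to_affine(x: u64, y: u64, z: u64) -> (u64, u64) {
    let zinv = inv(z);
    let zinv2 = zinv * zinv % P;          // 1/Z^2
    (x * zinv2 % P, y * (zinv2 * zinv % P) % P) // (X/Z^2, Y/Z^3)
}

fn main() {
    // scaling the affine point (5, 7) by z = 3 gives the Jacobian triple
    // (5 * 3^2, 7 * 3^3, 3), which must map back to (5, 7)
    let (x, y, z) = (5u64, 7u64, 3u64);
    let jac = (x * z % P * z % P, y * z % P * z % P * z % P, z);
    assert_eq!(to_affine(jac.0, jac.1, jac.2), (5, 7));
}
```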

Everything we just said applies to version 0.3 of the crate. The structs and traits have been renamed in version 0.4 as follows:

  • struct GroupAffine → Affine
  • trait AffineCurve → AffineRepr
  • struct GroupProjective → Projective
  • trait ProjectiveCurve → CurveGroup: Group.

Hopefully the code of the verify function should now make complete sense. Note in particular how elliptic curve points are converted from affine coordinates to projective coordinates using method into before being passed to product_of_pairings.

It's now time to see what the hash_to_curve function does exactly.


1: Note that hyphens are not valid characters in Rust identifiers, however it is possible (and idiomatic) to use them in package and crate names. Cargo automatically converts them to underscores. See here.

2: If you're curious about what guarantees we have that the source code on github.com and crates.io is really the same, I recommend this interesting blog post by Eric Seppanen.

Understanding the hash-to-curve Function

It remains to take a look at what the hash_to_curve function defined in src/hash.rs is doing exactly:

use ark_bls12_381::{G1Affine, G1Projective};
use ark_crypto_primitives::crh::{
    pedersen::{Window, CRH},
    CRH as CRHScheme,
};
use rand::SeedableRng;
use rand_chacha::ChaCha20Rng;

// --snip--

#[derive(Clone)]
struct ZkHackPedersenWindow {}

impl Window for ZkHackPedersenWindow {
    const WINDOW_SIZE: usize = 1;
    const NUM_WINDOWS: usize = 256;
}

pub fn hash_to_curve(msg: &[u8]) -> (Vec<u8>, G1Affine) {
    let rng_pedersen = &mut ChaCha20Rng::from_seed([
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1,
    ]);
    let parameters = CRH::<G1Projective, ZkHackPedersenWindow>::setup(rng_pedersen).unwrap();
    let b2hash = blake2s_simd::blake2s(msg);
    (
        b2hash.as_bytes().to_vec(),
        CRH::<G1Projective, ZkHackPedersenWindow>::evaluate(&parameters, b2hash.as_bytes())
            .unwrap(),
    )
}

This function first initializes the pseudorandom number generator ChaCha20 with a 32-byte seed and feeds this RNG to the setup function. We look up the setup function in the crypto_primitives::crh::pedersen submodule (How do we know where to look? We check the use statement which brings CRH into scope at the beginning of the src/hash.rs file) and arrive here. Documentation is nonexistent so we jump to the code. Here are the relevant lines:

pub struct Parameters<C: ProjectiveCurve> {
    pub generators: Vec<Vec<C>>,
}

pub struct CRH<C: ProjectiveCurve, W: Window> {
    group: PhantomData<C>,
    window: PhantomData<W>,
}

impl<C: ProjectiveCurve, W: Window> CRHTrait for CRH<C, W> {
    const INPUT_SIZE_BITS: usize = W::WINDOW_SIZE * W::NUM_WINDOWS;
    type Output = C::Affine;
    type Parameters = Parameters<C>;

    fn setup<R: Rng>(rng: &mut R) -> Result<Self::Parameters, Error> {
        // ...
        let generators = Self::create_generators(rng);
        // ...
        Ok(Self::Parameters { generators })
    }

    // ...
}

impl<C: ProjectiveCurve, W: Window> CRH<C, W> {
    pub fn create_generators<R: Rng>(rng: &mut R) -> Vec<Vec<C>> {
        let mut generators_powers = Vec::new();
        for _ in 0..W::NUM_WINDOWS {
            generators_powers.push(Self::generator_powers(W::WINDOW_SIZE, rng));
        }
        generators_powers
    }

    pub fn generator_powers<R: Rng>(num_powers: usize, rng: &mut R) -> Vec<C> {
        let mut cur_gen_powers = Vec::with_capacity(num_powers);
        let mut base = C::rand(rng);
        for _ in 0..num_powers {
            cur_gen_powers.push(base);
            base.double_in_place();
        }
        cur_gen_powers
    }
}

Each invocation of generator_powers draws a random group element $P$ and returns the vector $(P, 2P, 4P, \dots, 2^{n-1}P)$ where $n$ is W::WINDOW_SIZE. This function is called NUM_WINDOWS times by create_generators, which then returns a vector $((P_1, 2P_1, \dots), \dots, (P_k, 2P_k, \dots))$ where $P_1, \dots, P_k$ are random group elements and $k$ is W::NUM_WINDOWS. In hash_to_curve, this function is called with constants WINDOW_SIZE = 1 and NUM_WINDOWS = 256 as defined in the implementation of trait Window for struct ZkHackPedersenWindow. Hence, the line

let parameters = CRH::<G1Projective, ZkHackPedersenWindow>::setup(rng_pedersen).unwrap();

defines a Parameters<G1Projective> struct whose field generators holds 256 inner vectors, each containing a single random group element of type G1Projective (since WINDOW_SIZE = 1).

Then, the message is hashed with hash function BLAKE2s and the result is passed to the evaluate function, whose core is as follows:

    fn evaluate(parameters: &Self::Parameters, input: &[u8]) -> Result<Self::Output, Error> {
        // ...

        // Compute sum of h_i^{m_i} for all i.
        let bits = bytes_to_bits(input);
        let result = cfg_chunks!(bits, W::WINDOW_SIZE)
            .zip(&parameters.generators)
            .map(|(bits, generator_powers)| {
                let mut encoded = C::zero();
                for (bit, base) in bits.iter().zip(generator_powers.iter()) {
                    if *bit {
                        encoded += base;
                    }
                }
                encoded
            })
            .sum::<C>();

        // ...

        Ok(result.into())
    }

First, the input is converted into a vector of booleans using the bytes_to_bits function from the pedersen module. Then, it is split into NUM_WINDOWS chunks of size WINDOW_SIZE and zipped with parameters.generators, which contains the points returned by setup. The closure inside map takes a chunk of bits $(b_0, \dots, b_{n-1})$ and a vector of points $(P, 2P, \dots, 2^{n-1}P)$ and returns $[m]P$ where $m$ is the integer whose bit representation (least significant bit first) is $(b_0, \dots, b_{n-1})$. The final value of result is the sum over the windows of the output of this closure, i.e., $\sum_i [m_i]P_i$ where $m_i$ is the integer corresponding to the $i$-th chunk of bits of the input.

In the specific case of hash_to_curve, we have WINDOW_SIZE $= 1$ and NUM_WINDOWS $= 256$. Hence, if we let $P_1, \dots, P_{256}$ denote the 256 group elements returned by setup and $(b_1, \dots, b_{256})$ denote the output of the BLAKE2s hash function applied to message $m$, seen as a vector of bits, then the hash_to_curve function applied to $m$ returns the point of $\mathbb{G}_1$ defined by $$H(m) = \sum_{i=1}^{256} b_i P_i.$$

Hence, hash_to_curve can be seen as the composition of BLAKE2s and an instance of Pedersen hashing. Since both BLAKE2s and Pedersen hashing are collision-resistant (assuming hardness of the discrete logarithm problem for Pedersen hashing), their composition is collision-resistant as well. Is it sufficient to make BLS signatures secure, though?
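
To make the structure concrete, here is a toy model of this hash (not the arkworks implementation): the group $\mathbb{G}_1$ is replaced by the additive group of integers modulo a prime, so a "point" is just an integer and scalar multiplication is integer multiplication. The names q, gens, and pedersen_hash are made up for this sketch, and the LSB-first bit order inside each byte is an assumption.

```python
# Toy model of hash_to_curve: BLAKE2s followed by a Pedersen-style sum
# sum_i b_i * P_i, with integers mod a prime q standing in for G1 points.
import hashlib
import random

q = 2**31 - 1  # stand-in for the group order (a Mersenne prime)
random.seed(0)
gens = [random.randrange(1, q) for _ in range(256)]  # the 256 "points" P_i

def bits_of(digest: bytes):
    # LSB-first bit order within each byte (an assumption of this sketch)
    return [(byte >> i) & 1 for byte in digest for i in range(8)]

def pedersen_hash(msg: bytes) -> int:
    digest = hashlib.blake2s(msg).digest()  # 32 bytes -> 256 bits
    return sum(b * P for b, P in zip(bits_of(digest), gens)) % q
```

The key point is visible here already: the output is a linear function of the bits of the BLAKE2s digest, which is exactly what the attack below exploits.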

Now that we understand all parts of the code, we can get down to solving the puzzle.

Gathering the Pieces

Recall that the BLS signature of message $m$ is the point of $\mathbb{G}_1$ defined as $\sigma = [\mathit{sk}] H(m)$, where $\mathit{sk}$ is the secret key and $H$ is the hash-to-curve function. We also established in the previous section that $H(m)$ is given by $\sum_{i=1}^{256} b_i P_i$, where the $b_i$'s are the bits of the BLAKE2s digest of $m$ and the $P_i$'s are the Pedersen generators. Hence, the signature of $m$ can be expressed as $\sigma = \sum_{i=1}^{256} b_i \, [\mathit{sk}]P_i$. In other words, $\sigma$ is a formal linear combination (with known coefficients) of secret points $[\mathit{sk}]P_1, \dots, [\mathit{sk}]P_{256}$. As we are given 256 signatures, we should be able to forge a new signature with some linear algebra.

In the following, we let $m^*$ be the message for which we want to forge a signature, and $h^* = (h^*_1, \dots, h^*_{256})$ denote the output of the BLAKE2s hash function applied to $m^*$, seen as a vector of bits. Hence, the signature that we want to forge is $\sigma^* = \sum_{i=1}^{256} h^*_i \, [\mathit{sk}]P_i$. Similarly, let $m_1, \dots, m_{256}$ denote the 256 messages whose signature is given in the puzzle data and let $h_j = (h_{j,1}, \dots, h_{j,256})$ denote the hash of $m_j$ with BLAKE2s. Then the signature of message $m_j$ is $\sigma_j = \sum_{i=1}^{256} h_{j,i} \, [\mathit{sk}]P_i$.

Assume that we can write $h^*$ as a linear combination of vectors $h_1, \dots, h_{256}$, i.e., we can find a vector of scalars $c = (c_1, \dots, c_{256})$ such that $h^* = \sum_{j=1}^{256} c_j h_j$. Then we can compute $\sigma^*$ as $\sigma^* = \sum_{j=1}^{256} c_j \sigma_j$. How do we compute $c$? Letting $M$ denote the matrix whose rows are $h_1, \dots, h_{256}$, the equation above is equivalent to $c M = h^*$. Hence, we need to solve a linear system.
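
The whole attack can be rehearsed on a toy version of the scheme: "points" are again integers modulo a small prime $q$ standing in for $\mathbb{G}_1$, the dimension is shrunk from 256 to 8, and the matrix of hash bits is chosen unit upper triangular so that it is guaranteed invertible. Everything here (sign, solve_mod, the parameters) is illustrative, not the puzzle's actual code.

```python
# Toy forgery: sig(b) = sum_i b_i * S_i mod q with secret points S_i = sk*P_i.
# Given signatures on the rows of M, forge sig(target) by solving c*M = target.
import random

q = 101  # tiny prime stand-in for the group order
n = 8    # stand-in for 256
random.seed(1)
sk = 42
S = [random.randrange(1, q) for _ in range(n)]  # secret points [sk]P_i

def sign(bits):
    # signature of a bit vector is a linear combination of the secret points
    return sum(b * s for b, s in zip(bits, S)) % q

def solve_mod(M, v, p):
    """Solve x*M = v over GF(p) by Gauss-Jordan elimination on M^T."""
    n = len(M)
    A = [[M[j][i] % p for j in range(n)] + [v[i] % p] for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col])
        A[col], A[piv] = A[piv], A[col]
        inv = pow(A[col][col], -1, p)
        A[col] = [a * inv % p for a in A[col]]
        for r in range(n):
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [(a - f * b) % p for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

# known "hash" bit vectors: unit upper triangular, hence invertible over GF(q)
M = [[1 if i == j else (random.randrange(2) if j > i else 0) for j in range(n)]
     for i in range(n)]
sigs = [sign(row) for row in M]                   # the given signatures
target = [random.randrange(2) for _ in range(n)]  # "hash" of the target message
c = solve_mod(M, target, q)                       # coefficients with c*M = target
forged = sum(ci * si for ci, si in zip(c, sigs)) % q
assert forged == sign(target)  # matches the honest signature of the target
```

Note that the coefficients $c_j$ live in the full scalar field, not in $\{0,1\}$, which is why the real attack solves the system over $\mathbb{F}_r$.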

Implementing the Attack

To perform the linear algebra and compute $c$, we will use Sage. For this, we first write $M$ and $h^*$ as arrays in a file sage/data.sage that we will load in Sage later on. Note that we must write the individual bits of the outputs of BLAKE2s. For this, we use the bytes_to_bits function from the pedersen module, which returns a vector of booleans. Trying to write this vector directly yields an array of true and false strings, which is not what we want, so we first need to convert these booleans into bytes.

use bls_pedersen::bls::verify;
use bls_pedersen::data::puzzle_data;
use bls_pedersen::PUZZLE_DESCRIPTION;
use prompt::{puzzle, welcome};

// additional items brought into scope for puzzle solving
use bls_pedersen::hash::hash_to_curve;
use ark_crypto_primitives::crh::pedersen::bytes_to_bits;
use ark_ff::{PrimeField, Zero};
use ark_ec::{AffineCurve, ProjectiveCurve, msm::VariableBaseMSM};
use ark_bls12_381::{Fr, G1Affine, G1Projective};
use std::fs;
use std::fs::File;
use std::io::Write;

fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);
    let (pk, ms, sigs) = puzzle_data();
    for (m, sig) in ms.iter().zip(sigs.iter()) {
        verify(pk, m, *sig);
    }

    // --snip--

    let m = b"your_username";

    let mut data_file = File::create("sage/data.sage").unwrap();

    // write matrix M to be passed to SAGE
    data_file.write_all("M = [".as_bytes()).unwrap();
    for m in ms {
        let (hash, _) = hash_to_curve(&m);
        let bits = bytes_to_bits(&hash);
        let bits: Vec<u8> = bits.into_iter().map(|x| x as u8).collect();
        let line = format!("{:?}, ", bits);
        data_file.write_all(line.as_bytes()).unwrap();
    }
    data_file.write_all("]\n".as_bytes()).unwrap();

    // write vector h to be passed to SAGE
    data_file.write_all("h = ".as_bytes()).unwrap();
    let (hash, _) = hash_to_curve(m);
    let bits = bytes_to_bits(&hash);
    let bits: Vec<u8> = bits.into_iter().map(|x| x as u8).collect();
    let line = format!("{:?}", bits);
    data_file.write_all(line.as_bytes()).unwrap();

    // --snip--

    // read solution in coeffs.txt and cast these strings (one per line) into scalar field Fr elements
    let mut coeffs = Vec::new();
    for line in fs::read_to_string("sage/coeffs.txt").unwrap().lines() {
        // let c = Fr::from_le_bytes_mod_order(line.as_bytes()); // doesn't work
        let c: Fr = line.parse().unwrap();
        coeffs.push(c);
    }

    // --snip--

    // compute forgery using affine coordinates
    let mut aff_forge = G1Affine::zero();
    for (c, sig) in coeffs.iter().zip(sigs.iter()) {
        aff_forge = aff_forge + sig.mul(*c).into();
    }

    // compute forgery using projective coordinates
    let mut proj_forge = G1Projective::zero();
    for (c, sig) in coeffs.iter().zip(sigs.iter()) {
        proj_forge += sig.mul(*c);
    }

    // compute forgery using multi-scalar multiplication
    let coeffs: Vec<<Fr as PrimeField>::BigInt> = coeffs.iter().map(|c| (*c).into_repr()).collect();
    let msm_forge = VariableBaseMSM::multi_scalar_mul(&sigs, &coeffs);

    /* Your solution here! */

    verify(pk, m, aff_forge);
    verify(pk, m, proj_forge.into_affine());
    verify(pk, m, msm_forge.into_affine());
    println!("Puzzle solved!");
}

Then we move to Sage. We write a script sage/lin_algebra.sage that reads matrix $M$ and vector $h^*$ (named M and h in the scripts) from file sage/data.sage, and then we use the solve_left method to solve the linear system. Note that we must work over the scalar field $\mathbb{F}_r$ of BLS12-381 and explicitly declare that $M$ and $h^*$ are defined over $\mathbb{F}_r$. After that, we write the coefficients of the solution $c$ returned by Sage in file sage/coeffs.txt, one coefficient per line. Note that $M$ is invertible, hence the system has a unique solution (which was not a given). Here is the content of the sage/lin_algebra.sage file (the size of the scalar field of BLS12-381 can be found for example here):

r = 0x73eda753299d7d483339d80809a1d80553bda402fffe5bfeffffffff00000001
Fr = FiniteField(r)
load('sage/data.sage')
M = Matrix(Fr, M)
h = vector(Fr, h)
c = M.solve_left(h)
file = open('sage/coeffs.txt', 'w')
for coeff in c:
    file.write(str(coeff) + '\n')
file.close()

We simply run it with

$ sage ./sage/lin_algebra.sage

It remains to import the coefficients in our Rust function and compute the forgery $\sigma^* = \sum_{j=1}^{256} c_j \sigma_j$. For this, we read file sage/coeffs.txt line by line, obtaining strings that we must convert into elements of the scalar field. Hence, we bring the BLS12-381 scalar field into scope with use ark_bls12_381::Fr and look for how to do the conversion. Initially, I tried to use the functions from_be_bytes_mod_order and from_le_bytes_mod_order of the PrimeField trait (after converting strings into slices of bytes using as_bytes): the code compiles but the forgery does not verify... A simpler solution uses the fact that the PrimeField trait has supertrait FromStr, meaning one can directly convert strings into the Fr type using the parse method.

We are now ready to compute the forgery. Scalar multiplication is performed using the mul method, and one can choose to work with affine or projective coordinates. One thing to note is that mul applied to a point in affine coordinates returns a point in projective coordinates; in order to add it to an affine point afterwards, it must be converted back into affine form using the into method. There is also the option to use the multi_scalar_mul function (replaced by msm in version 0.4.0 of the library), which implements multi-scalar multiplication directly. However, the scalars in vector coeffs must first be cast into their "big integer" representation using the into_repr method. The full listing of main given earlier uses all three possibilities. Note that the += operator does not work for the affine representation.

The attack requires running the Rust binary to write the file sage/data.sage, then the Sage script, and then the Rust binary again to read sage/coeffs.txt. Not great, but I have no idea how to call the Sage script from the Rust main function. Feel free to improve this, for example with a bash script.

Conclusion

The key takeaway of this puzzle is that Pedersen hashing does not behave as a random oracle. Although it is provably collision-resistant assuming the discrete logarithm problem is hard, it has a rich algebraic structure which makes it unsuitable for cases where a hash function behaving as a random oracle is required, such as BLS signatures.

Which hash-to-curve function should be used to make BLS secure then? The easy solution is to hash the message together with a counter into the base field of the elliptic curve until the result is the x-coordinate of a point on the curve.1 The drawbacks are that this cannot be implemented in constant time and that this construction is not known to be secure in the strong sense of being indifferentiable from a random oracle [BCI+10]. For the specific case of BLS12-381, an efficient solution based on isogenies was recently proposed by Wahby and Boneh [WB19]. See also RFC 9380, which specifies various hash-to-curve constructions.
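
The try-and-increment approach can be sketched as follows, on a toy curve $y^2 = x^3 + 7$ over a small prime field (illustrative parameters, nothing to do with BLS12-381):

```python
# Naive "try-and-increment" hash-to-curve: hash (msg, counter) to a field
# element until it is a valid x-coordinate, i.e., x^3 + 7 is a square mod p.
import hashlib

p = 10007  # toy prime with p % 4 == 3, so square roots are a single pow

def hash_to_curve_naive(msg: bytes):
    ctr = 0
    while True:
        h = hashlib.blake2s(msg + ctr.to_bytes(4, "little")).digest()
        x = int.from_bytes(h, "little") % p
        rhs = (x**3 + 7) % p
        y = pow(rhs, (p + 1) // 4, p)  # candidate sqrt since p ≡ 3 (mod 4)
        if y * y % p == rhs:
            return (x, y)
        ctr += 1  # rhs is not a square: try the next counter

x, y = hash_to_curve_naive(b"hello")
assert y * y % p == (x**3 + 7) % p  # the output is on the curve
```

Roughly half of the field elements are valid x-coordinates, so the loop terminates quickly on average; but the number of iterations depends on the message, which is exactly why this cannot run in constant time.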


1: Note that hashing into the scalar field to get $h = H'(m)$ and letting the hash point be $[h]G_1$ (for the standard generator $G_1$) is completely insecure: a single signature $\sigma = [\mathit{sk}][h]G_1$ on some message $m$ reveals $[\mathit{sk}]G_1 = [h^{-1}]\sigma$, which allows one to forge a signature on any other message $m'$ by computing $[H'(m')]\,[\mathit{sk}]G_1$.

Puzzle 2: Group Dynamics

Alice has computed a trusted setup for a Groth16 proof scheme.
She decided to use a 128-bit long secret, and she swears that she does not know
the secret s needed to get this setup.
The trusted setup is constructed as follows using two additional scalars α and β:
* [s^i] G1 for 0 ⩽ i ⩽ 62,
* [α s^i] G1 for 0 ⩽ i ⩽ 31,
* [β s^i] G1 for 0 ⩽ i ⩽ 31,
* [s^i] G2 for 0 ⩽ i ⩽ 31.

Can you recover the secret anyway?

Heads-up: although the puzzle description refers to the Groth16 zk-SNARK [Gro16], there's no need to know anything about Groth16 to solve the puzzle. Suffice it to say that Groth16 uses a so-called structured reference string (or common reference string) generated during a trusted setup, which has a form similar to the puzzle data: it consists of points on a pairing-friendly pair of curves computed in a way similar to what is described in the puzzle instructions. Anyone able to retrieve the secret values (the simulation trapdoor, sometimes referred to as "toxic waste" as it must absolutely be discarded after the trusted setup) would be able to break the soundness of the proof system, meaning they could produce valid proofs for false statements.

Another important scheme where such structured parameters show up is the KZG polynomial commitment scheme.

Let's take a look at the code to see what this is about.

Initial Inspection

On the surface, this puzzle looks like a discrete logarithm problem: we must recover some secret value $s$ given $G_1$, $[s]G_1$, etc. Actually, this variant where additional points $[s^2]G_1, \dots, [s^q]G_1$ are given is sometimes called the $q$-discrete logarithm (or $q$-strong discrete logarithm) problem. In some cases, this auxiliary information enables one to speed up the computation of $s$ using Cheon's algorithm [Che10]. Note also that the assumption that the $q$-discrete log problem is hard implies the soundness of the Groth16 proof system in a restricted model of computation called the algebraic group model [FKL18].

Let's take a look at the code. The package directory is organized as follows:

zkhack-trusted-setup
├── Cargo.toml
└── src
    ├── bin
    │   └── verify-trusted-setup.rs
    ├── data.rs
    └── lib.rs

As with the previous puzzle, the package has two crates: a library crate with root file src/lib.rs which simply declares a module data and a string slice with the puzzle description and a binary crate with root file src/bin/verify-trusted-setup.rs which contains the following code:

use ark_bls12_381::Fr;
use ark_ec::AffineCurve;
use prompt::{puzzle, welcome};
use std::str::FromStr;
use trusted_setup::data::puzzle_data;
use trusted_setup::PUZZLE_DESCRIPTION;

fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);
    let (_ts1, _ts2) = puzzle_data();

    /* Your solution here! (s in decimal)*/
    let s = Fr::from_str("0").unwrap();

    assert_eq!(_ts1[0].mul(s), _ts1[1]);
    assert_eq!(_ts2[0].mul(s), _ts2[1]);
}

We can see that the main function simply loads the puzzle data and expects us to replace "0" with the correct value of $s$ so that the two final assertions evaluate to true and the program no longer panics. Note that ark_bls12_381::Fr, the scalar field of the BLS12-381 pairing-friendly elliptic curve, is brought into scope.

The code for the data module in src/data.rs, where the puzzle_data function is defined, looks like this:

use ark_bls12_381::{G1Affine, G2Affine};
use ark_serialize::CanonicalDeserialize;
use std::io::Cursor;

pub fn puzzle_data() -> ([G1Affine; 2 * 32 - 1 + 32 + 32], [G2Affine; 32]) {
    // ...
}

The puzzle_data function returns two arrays: _ts1, which holds 127 elements of type G1Affine (the type representing elements of group $\mathbb{G}_1$ of BLS12-381 in affine representation), and _ts2, which holds 32 elements of type G2Affine (the type representing elements of group $\mathbb{G}_2$ of BLS12-381 in affine representation). From this and the puzzle description we can infer that _ts1 holds $[s^i]G_1$ for $0 \le i \le 62$, $[\alpha s^i]G_1$ for $0 \le i \le 31$, and $[\beta s^i]G_1$ for $0 \le i \le 31$, while _ts2 holds $[s^i]G_2$ for $0 \le i \le 31$.

Importantly, $G_1$ and $G_2$ above do not refer to the commonly agreed subgroup generators of BLS12-381 but to the points _ts1[0] and _ts2[0] provided by Alice.

Taking a closer look at how these values are defined, we can see that puzzle_data calls an associated function named deserialize_unchecked. What is it that this function does not check?

We head over to the ark-ec-0.3.0 crate documentation and search for deserialize_unchecked, which leads us here. This function is part of the ark_serialize::CanonicalDeserialize trait. The documentation simply says:

fn deserialize_unchecked<R: Read>(reader: R) -> Result<Self, SerializationError>

Reads self from reader without compression, and without performing validity checks. Should be used only when the input is trusted.

Not very informative, so we jump to the source code:

fn deserialize_unchecked<R: Read>(mut reader: R) -> Result<Self, SerializationError> {
    let x: P::BaseField = CanonicalDeserialize::deserialize(&mut reader)?;
    let (y, flags): (P::BaseField, SWFlags) =
        CanonicalDeserializeWithFlags::deserialize_with_flags(&mut reader)?;
    let p = GroupAffine::<P>::new(x, y, flags.is_infinity());
    Ok(p)
}

This function simply reads two base field elements $x$ and $y$ from the buffer and creates a new point in affine coordinates. What could go wrong? If there's a deserialize_unchecked function, maybe there's a sibling which actually checks something? Indeed, the function just above in the source code reads:

fn deserialize_uncompressed<R: Read>(
    reader: R,
) -> Result<Self, ark_serialize::SerializationError> {
    let p = Self::deserialize_unchecked(reader)?;

    if !p.is_in_correct_subgroup_assuming_on_curve() {
        return Err(SerializationError::InvalidData);
    }
    Ok(p)
}

(There is also a deserialize function which does something similar for points in compressed form, meaning they are encoded with their $x$-coordinate and the sign of $y$.) Here we are: the crucial property which is not checked by deserialize_unchecked is whether the curve point it returns is in the correct subgroup. What does that mean exactly?

Small Subgroup Attacks

Actually, the puzzle webpage was giving us a hint by advising us to read a paper by Cremers and Jackson [CJ19] about so-called small subgroup attacks. Small subgroup attacks (more specifically, small subgroup key recovery attacks, as there are other flavors such as small subgroup confinement attacks that we won't touch here) have been proposed by Lim and Lee in 1997 [LL97] and vulnerable implementations have been found in the wild [VAS+17].

In a nutshell, small subgroup attacks occur when some party, holding some secret scalar $k$, is tricked into computing $[k]P$, thinking that $P$ generates some subgroup $G$ (of some larger group $G'$) where the discrete logarithm problem is hard, whereas in fact $P$ generates another subgroup of $G'$ where the discrete logarithm problem is much easier. This can happen for example in Diffie-Hellman key exchange or in some blind signing protocols where the party computing the scalar multiplication gets the point $P$ from another (potentially malicious) party rather than from trusted parameters. Before going into details, we need to recall some basic facts about groups and their subgroups.

Computing Discrete Logarithms in Groups of Composite Order

Let $G$ be a finite group of order $n$. Then, by Lagrange's theorem, the order of any subgroup of $G$ divides $n$. Moreover, if $G$ is cyclic, then every subgroup of $G$ is also cyclic (Proposition 5.19) and, by the Fundamental Theorem of Cyclic Groups, for each divisor $d$ of $n$, there exists a unique subgroup of $G$ of order $d$.
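
For instance, in the additive cyclic group $\mathbb{Z}_{12}$, the unique subgroup of order $d$ for each divisor $d$ of 12 is generated by $12/d$. A few lines of Python make this explicit (the helper subgroup is ad hoc):

```python
# Subgroups of the additive cyclic group Z_12: one per divisor of 12.
n = 12

def subgroup(d):
    # the unique subgroup of order d is generated by n // d
    g = n // d
    return sorted({k * g % n for k in range(d)})

assert subgroup(1) == [0]              # trivial subgroup
assert subgroup(4) == [0, 3, 6, 9]     # the order-4 subgroup, generated by 3
assert subgroup(12) == list(range(12)) # the whole group
```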

Small subgroup attacks derive from a simple observation about the discrete logarithm problem in groups of composite order. Let $G$ be a cyclic group of composite order $n = n_1 n_2$ where $n_1$ and $n_2$ are coprime and let $P$ be a generator of $G$. Say we want to solve the discrete logarithm problem for a group element $Q$ in base $P$, i.e., we want to find the unique integer $k \in \{0, \dots, n-1\}$ such that $Q = [k]P$. Then, we can take advantage of the structure of group $G$ as follows:

  • First, we compute $P_1 = [n_2]P$ and $Q_1 = [n_2]Q$. Then $P_1$ has order $n_1$ (why?) and $Q_1 = [k]P_1$. If we let $k_1$ denote the discrete logarithm of $Q_1$ in base $P_1$, which can be computed in $O(\sqrt{n_1})$ group operations with generic algorithms, then $Q_1 = [k_1]P_1$ implies that $k \equiv k_1 \pmod{n_1}$.
  • Similarly, one can compute $P_2 = [n_1]P$ and $Q_2 = [n_1]Q$. Then the discrete logarithm $k_2$ of $Q_2$ in base $P_2$ can be computed in $O(\sqrt{n_2})$ group operations and satisfies $k \equiv k_2 \pmod{n_2}$.
  • Finally, one can combine the two congruences $k \equiv k_1 \pmod{n_1}$ and $k \equiv k_2 \pmod{n_2}$ using the Chinese remainder theorem to obtain $k$ (modulo $n = n_1 n_2$).

All in all, $k$ has been computed in $O(\sqrt{n_1} + \sqrt{n_2})$ rather than $O(\sqrt{n})$ group operations.
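
The three steps above can be replayed with small numbers in the additive group $\mathbb{Z}_n$, where "scalar multiplication" is just integer multiplication modulo $n$; the brute-force dlog below stands in for a generic $O(\sqrt{\cdot})$ algorithm such as baby-step giant-step:

```python
# Composite-order discrete log: recover k mod n1 and k mod n2 separately,
# then combine with the Chinese remainder theorem.
n1, n2 = 101, 103          # coprime factors, n = n1 * n2
n = n1 * n2
P = 1                      # generator of the additive group Z_n
k = 4242                   # the unknown discrete logarithm
Q = k * P % n

def dlog(base, elem, order):
    # brute force stands in for a generic O(sqrt(order)) algorithm
    return next(x for x in range(order) if x * base % n == elem)

# project onto the subgroups of order n1 and n2
P1, Q1 = n2 * P % n, n2 * Q % n   # P1 has order n1
P2, Q2 = n1 * P % n, n1 * Q % n   # P2 has order n2
k1 = dlog(P1, Q1, n1)             # k1 = k mod n1
k2 = dlog(P2, Q2, n2)             # k2 = k mod n2

# Chinese remainder theorem: recover k mod n from (k1, k2)
k_rec = (k1 * n2 * pow(n2, -1, n1) + k2 * n1 * pow(n1, -1, n2)) % n
assert k_rec == k
```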

The procedure above is the basis of the Pohlig-Hellman algorithm, which computes discrete logarithms in a group of order $n = p_1^{e_1} \cdots p_t^{e_t}$ in $O\big(\sum_{i=1}^{t} e_i (\log n + \sqrt{p_i})\big)$ group operations, assuming the factorization of $n$ is known. Hence, if one wants 128-bit security for the discrete logarithm problem, a necessary condition is that the order of the group has a prime factor of size at least 256 bits.

This explains why cryptographers are so obsessed with groups of prime order: by Lagrange's theorem, groups of prime order don't have any subgroups other than the trivial ones (the group itself and the subgroup of order 1 consisting of the identity element) and hence the discrete logarithm problem cannot be "broken" into smaller pieces as above (which, of course, does not imply that the DL problem cannot be broken by other, non-generic means).

Small Subgroup Attacks

Groups used in cryptography often have composite order. For example, the multiplicative group of integers modulo some prime number $p$ has order $p-1$, which is always even. For elliptic curves, although it is possible to construct secure curves with a prime number of points, such as secp256k1, many curves that are attractive for efficiency reasons (such as twisted Edwards curves) have order $hq$ where $q$ is prime and $h$ is small (usually 4 or 8).

When using such a group of composite order $hq$, a "base" point $P$ of prime order $q$ is usually specified. As long as group elements used in a protocol are computed as multiples of $P$, one never "gets out" of the prime-order subgroup $\langle P \rangle$. The index of subgroup $\langle P \rangle$, i.e., the ratio $h$, is often called the cofactor in a cryptographic context.

However, what happens if Alice, holding some secret value $k$, is tricked by an attacker into computing $[k]P'$ where $P'$ is not in subgroup $\langle P \rangle$? If $[k]P'$ is made available to the attacker, then it can use the Pohlig-Hellman algorithm to compute $k \bmod t$, where $t$ is the "smooth" part of $P'$'s order (meaning, informally, the product of all "small" prime factors of $P'$'s order), which might be anywhere between a few bits and enough to retrieve $k$ entirely. Note that $P'$ does not have to actually be in a small subgroup for the attack to work; the only condition is that some multiples of $P'$ be in small subgroups (equivalently, that $\langle P' \rangle$ has small subgroups). If, for example, $P'$ generates the entire ambient group of order $hq$, then one can "project" the discrete logarithm onto the subgroup of order $h$ by computing $[q]P'$ and $[q]([k]P')$ and work with the pair $([q]P', [k][q]P')$ instead.
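
Here is the leak in miniature, again in the additive toy group $\mathbb{Z}_n$ with $n = hq$ (all parameters illustrative): the attacker hands Alice an element generating the whole group, then projects onto the small subgroup.

```python
# Toy small subgroup attack in Z_n, n = h*q: projecting Alice's answer
# onto the subgroup of order h leaks k mod h.
h, q = 8, 1019                          # small cofactor, prime order (toy sizes)
n = h * q
k = 777                                 # Alice's secret scalar
P_bad = 1                               # generates all of Z_n, not just the order-q subgroup
R = k * P_bad % n                       # value Alice hands back to the attacker

proj_base = q * P_bad % n               # [q]P': has order h
proj = q * R % n                        # [q][k]P' = [k] proj_base
leak = next(x for x in range(h) if x * proj_base % n == proj)
assert leak == k % h                    # the attacker learned k mod h
```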

Observe that the maximal amount of information that can leak about secret $k$ in a small subgroup attack is $\log_2 h$ bits, hence having a "small" cofactor such as 4 or 8 might seem benign. However, there are actually plenty of other ways a small cofactor can mess with your protocol, an interesting example being Monero's multiple-spend bug.

What about pairing-based cryptography? While there exist families of pairing-friendly curve pairs where the first "small" curve has prime order, such as Barreto-Naehrig (BN) curves [BN05], the second "large" curve always has composite order with a very large cofactor. For members of the BLS family, even the first small curve has composite order. Hence, small subgroup attacks are especially relevant when using pairing-based cryptographic primitives. For more information, see [BCM+15].

Subgroup Membership Tests

How can we prevent small subgroup attacks? By performing a subgroup membership test. Given a group $G$ of composite order $n$ and a prime factor $q$ of $n$, an element $P \in G$ is in the subgroup of order $q$ if and only if $[q]P = 0$. This test is simple yet rather costly since $q$ is large. However, there are a number of tricks to make subgroup membership testing more efficient [HGP22]. For curves with small cofactors (4 or 8), some techniques such as Decaf [Ham15] or Ristretto allow one to "eliminate" the cofactor and construct a prime-order group.
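
In an additive toy group $\mathbb{Z}_n$ with $n = hq$, the membership test amounts to one scalar multiplication and a comparison (toy parameters):

```python
# Membership test: P is in the subgroup of prime order q iff [q]P == 0.
h, q = 8, 1019          # toy cofactor and prime order
n = h * q

def in_prime_order_subgroup(P):
    return q * P % n == 0

assert in_prime_order_subgroup(h)          # multiples of h have order dividing q
assert not in_prime_order_subgroup(1)      # 1 generates all of Z_n
```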

Invalid Curve Attacks

Finally, assuming we work with a prime-order curve such as secp256k1, is it safe to use any point $P$ received from an untrusted source without verification? If the curve has prime order $q$, any point other than the point at infinity has order $q$, right? Well, not exactly: if $P$ was received in so-called "uncompressed" form (meaning both coordinates $x$ and $y$ were explicitly given), it might not be on the curve at all! It might be on another curve with a different equation but where the same addition formulas apply. If this "ghost" curve has a smooth order, computing $[k]P$ might end up leaking information about the secret scalar $k$, exactly as described above. This is called an invalid curve attack and has affected for example some implementations of TLS [JSS15] and the Bluetooth protocol [BN19].
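
The key algebraic fact behind invalid curve attacks is that the short Weierstrass addition and doubling formulas do not involve the coefficient $b$: a point $(x, y)$ off the intended curve lies on the "ghost" curve with $b' = y^2 - x^3 - ax$, and computations with it stay on that ghost curve. A minimal sketch (toy parameters, not a real curve):

```python
# Doubling on y^2 = x^3 + a*x + b uses only a, never b, so a point off the
# intended curve silently lives on a "ghost" curve with a different b.
p, a = 10007, 0                         # toy field; intended curve y^2 = x^3 + 7

def ec_double(P):
    x, y = P
    lam = (3 * x * x + a) * pow(2 * y, -1, p) % p  # tangent slope: b unused
    x3 = (lam * lam - 2 * x) % p
    y3 = (lam * (x - x3) - y) % p
    return (x3, y3)

Q = (2, 5)                              # NOT on y^2 = x^3 + 7 (25 != 15 mod p)
b_ghost = (Q[1]**2 - Q[0]**3 - a * Q[0]) % p       # ghost curve coefficient
R = ec_double(Q)
# the result stays on the ghost curve, not on the intended one
assert (R[1]**2 - R[0]**3 - a * R[0]) % p == b_ghost
```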

Let's now see how to apply all this to the puzzle.

Solving the Puzzle

Subgroup Membership Checks

Now that we know about small subgroup attacks, a natural idea is to check whether $G_1$ and $G_2$ are in the "correct" subgroups, i.e., the subgroups of large prime order $r$ specified by the BLS12-381 parameters. Fortunately, the arkworks library has methods for that:

use ark_bls12_381::Fr;
use ark_ec::AffineCurve;
use prompt::{puzzle, welcome};
use std::str::FromStr;
use trusted_setup::data::puzzle_data;
use trusted_setup::PUZZLE_DESCRIPTION;

fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);
    let (_ts1, _ts2) = puzzle_data();

    // --snip--

    // check whether G1 and G2 are on the curve and in the correct subgroup
    if _ts1[0].is_on_curve() {
        println!("G1 is on the curve.");
    } else {
        println!("G1 is not on the curve.");
    }

    if _ts1[0].is_in_correct_subgroup_assuming_on_curve() {
        println!("G1 is in the correct subgroup.");
    } else {
        println!("G1 is not in the correct subgroup.");
    }

    if _ts2[0].is_on_curve() {
        println!("G2 is on the curve.");
    } else {
        println!("G2 is not on the curve.");
    }

    if _ts2[0].is_in_correct_subgroup_assuming_on_curve() {
        println!("G2 is in the correct subgroup.");
    } else {
        println!("G2 is not in the correct subgroup.");
    }

    // printing points to copy-paste them in sage script
    println!("G1 = {}", _ts1[0]);
    println!("s * G1 = {}", _ts1[1]);
    println!("G2 = {}", _ts2[0]);
    println!("s * G2 = {}", _ts2[1]);

    /* Your solution here! (s in decimal)*/
    let s = Fr::from_str("114939083266787167213538091034071020048").unwrap();
    println!("Checking the solution...");
    assert_eq!(_ts1[0].mul(s), _ts1[1]);
    assert_eq!(_ts2[0].mul(s), _ts2[1]);
    println!("It works!");
}

which yields

G1 is on the curve.
G1 is not in the correct subgroup.
G2 is on the curve.
G2 is not in the correct subgroup.

Nice! So we know that we must mount a small subgroup attack to retrieve the secret scalar s and solve the puzzle.

You can have a look at the code of methods is_on_curve and is_in_correct_subgroup_assuming_on_curve here. Note that if the point being tested is not known for certain to be on the curve, one should call both methods: indeed, is_in_correct_subgroup_assuming_on_curve could return true when applied to a point which is not on the curve (see invalid curve attacks).

Solving the Puzzle with Sage

From now on we will be using Sage as it is quite convenient to do all kinds of computations on elliptic curves, as we'll see.

Recall that BLS12-381 actually consists of two curves in short Weierstrass form:

  • defined over the prime field where is a 381-bit prime,
  • defined over the quadratic extension of obtained from the irreducible polynomial

The BLS12-381 parameters also specify two points and (more usually denoted and but this conflicts with the puzzle notation), both of prime order but we won't need them here.

Although I'm not aware of any normative reference for BLS12-381, all the parameters can be found in the IETF Internet-Draft for pairing-friendly curves. With that, we can construct the two curves in Sage (see here for the Sage documentation about elliptic curves over finite fields):

p = 0x1a0111ea397fe69a4b1ba7b6434bacd764774b84f38512bf6730d2a0f6b0f6241eabfffeb153ffffb9feffffffffaaab
F1 = GF(p)
a1 = 0
b1 = 4
E1 = EllipticCurve(F1, [a1, b1])

R.<x> = PolynomialRing(F1)
F2.<u> = F1.extension(x^2+1)
a2 = 0
b2 = 4*(1+u)
E2 = EllipticCurve(F2, [a2, b2])

Although we know that G1 is not in the subgroup of large prime order, we don't know its order yet. We will compute it with Sage. For this, we display this point by using println!("G1 = {}", _ts1[0]); in the main function of the puzzle and we get:

G1 = GroupAffine(x=Fp384 "(0F99F411A5F6C484EC5CAD7B9F9C0F01A3D2BB73759BB95567F1FE4910331D32B95ED87E36681230273C9A6677BE3A69)",
 y=Fp384 "(12978C5E13A226B039CE22A0F4961D329747F0B78350988DAB4C1263455C826418A667CA97AC55576228FC7AA77D33E5)")

We can now copy-paste these values into our Sage file to define G1 there and ask Sage to compute its order and even factor it:

G1 = E1(0x0F99F411A5F6C484EC5CAD7B9F9C0F01A3D2BB73759BB95567F1FE4910331D32B95ED87E36681230273C9A6677BE3A69, \
        0x12978C5E13A226B039CE22A0F4961D329747F0B78350988DAB4C1263455C826418A667CA97AC55576228FC7AA77D33E5)
n1 = G1.order()
L1 = list(n1.factor())
print("The factorization of G1's order n1 is:")
for l in L1:
    print(l)

which prints the list of prime factors together with their multiplicities:

The factorization of G1's order n1 is:
(3, 1)
(11, 1)
(10177, 1)
(859267, 1)
(52437899, 1)
(52435875175126190479447740508185965837690552500527637822603658699938581184513, 1)

The largest prime factor, denoted r in the following, is actually the order of the subgroups one is supposed to work with when using BLS12-381. It's 255 bits long, so no way we're gonna compute discrete logarithms there, of course. On the other hand, the other factors seem small enough for a small subgroup attack to compute s mod k1, where k1 = n1/r. As k1 is a 64-bit number (check with print(numerical_approx(log(n1/r, 2)))), we'll be able to get roughly 64 bits of information about s, a good start.

Recall the overall principle of the attack as described earlier. Let H1 = s*G1 (this is _ts1[1] from the puzzle data). We compute r*G1 (whose order is k1 = n1/r) and r*H1. Then r*H1 = s*(r*G1) = (s mod k1)*(r*G1), where the last equality follows from the fact that r*G1 has order k1. We can now compute the discrete logarithm of r*H1 in base r*G1, which will give us our first equation s1 = s mod k1.
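
Before running the real computation, the principle can be mimicked in a few lines of plain Python in the toy additive group Z_n (all numbers here are made up and tiny, and the discrete log is brute-forced):

```python
# Toy version of the subgroup-clearing trick in the additive group Z_n with
# n = k * r: multiplying by r projects onto the subgroup of order k, where
# the discrete log reveals s mod k.
k, r = 360, 1009           # smooth part and "large prime" part (toy values)
n = k * r
s = 314159                 # the "secret" scalar
G = 1                      # generator of (Z_n, +), of order n
H = s * G % n              # public: H = s * G
rG, rH = r * G % n, r * H % n
# rH = s * (rG) = (s mod k) * (rG), since rG has order k; brute-force the DL
s_mod_k = next(i for i in range(k) if i * rG % n == rH)
print("s mod k =", s_mod_k)
assert s_mod_k == s % k
```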

Sage has a generic discrete_log function (where generic means it works over any group) which uses various algorithms (Pohlig-Hellman, Pollard-rho, ...) and that we can call directly here:

H1 = E1(0x16C2385B2093CC3EDBC0F2257E8F23E98E775F8F6628767E5F4FC0E495285B95B1505F487102FE083E65DC8E9E3A9181, \
         0x0F4B73F63C6FD1F924EAE2982426FC94FBD03FCEE12D9FB01BAF52BE1246A14C53C152D64ED312494A2BC32C4A3E7F9A)
s1 = discrete_log(r*H1, r*G1, operation='+')
print("s mod k1 =")
print(s1)

We get:

s mod k1 =
2335387132884273659

How do we continue from here? Could we use the information contained in the other points of _ts1? Repeating all the above with _ts1[2] = s^2 * G1, for example, would allow us to compute s^2 mod k1, but this is simply (s mod k1)^2 mod k1, which we can already compute from s mod k1, so this does not give us new information.

Maybe we could use the remaining points then? This seems implausible though: as they are computed from random scalars independent of s, we could have generated points with the same distribution ourselves by drawing the random values and computing the corresponding scalar multiplications. These extra points look rather like a red herring.

Hence, we now consider the points on the second curve E2. Again, we define the point G2 in Sage and try to compute its order:

G2 = E2(0x1173F10AD9F2DBEE8B6C0BB2624B05D72EEC87925F5C3633E2C000E699A580B842D3F35AF1BE77517C86AEBCA1130AE4 \
      + 0x0434043A97DA28EF7100AE559167FC613F057B85451476ABABB27CFF0238A32831A0B4D14BA83C4F97247C8AC339841F * u, \
        0x0BEBEC70446CB91BB3D4DC5C8412915E99D612D8807C950AB06BC41583F528FDA9F42EC0FE7CD2991638187EF44258D3 \
      + 0x19528E3B5C90C73A7092BB9AFDC73F86C838F551CCD9DBBA5CC6244CF76AB3372193DBE5B62383FAAE728728D4C1E649 * u)
n2 = G2.order()

Unlike for G1, where Sage returned the answer pretty quickly, this seems to take quite some time (I killed the process after a few minutes). To understand why, one has to keep two things in mind:

  • First, the second curve is much bigger than the first one. By Hasse's bound, its order is close to p^2 (hence roughly 762 bits long). Sage can compute it very quickly with E2.cardinality(); however, trying to factor it again seems to take quite long, presumably because it has at least two large prime factors.
  • Second, computing the order of G2 efficiently actually requires the factorization of the order of the ambient group, i.e., of |E2| (see Algorithm 4.79 of Chapter 4 of the Handbook of Applied Cryptography).
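
To see concretely why the factorization is needed, here is a minimal Python sketch of such an order computation in the toy additive group Z_n (made-up parameters): the order is obtained by stripping primes, one at a time, from a factored multiple, which is impossible without knowing the primes.

```python
# Sketch of computing an element's order from a *factored* multiple of it
# (in the spirit of Algorithm 4.79 of the Handbook of Applied Cryptography),
# here in the toy additive group Z_n.
def order_from_factored_multiple(g, n, m, primes):
    """Order of g in (Z_n, +), given a multiple m of that order and the
    primes dividing m."""
    for p in primes:
        # strip p from m as long as m/p is still a multiple of g's order
        while m % p == 0 and (m // p) * g % n == 0:
            m //= p
    return m

n = 2**3 * 3**2 * 5        # |Z_n| = 360
g = 12                     # order of 12 in Z_360 is 360/gcd(360, 12) = 30
ord_g = order_from_factored_multiple(g, n, n, [2, 3, 5])
print(ord_g)               # → 30
```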

Hence, the reason why G2.order() takes so much time is presumably that it tries to factor |E2| first. Could we help Sage here? We know at least one large prime factor of |E2|, namely the prime order r of the subgroup defined by the BLS12-381 parameters. The number c2 = |E2|/r is called the cofactor of this subgroup in E2. We can compute it in Sage and try to factor it:

E2_order = E2.cardinality()
c2 = E2_order/r
L2 = list(c2.factor())
print("The factorization of cofactor c2 = |E2|/r is:")
for l in L2:
    print(l)

This gives:

The factorization of cofactor c2 = |E2|/r is:
(13, 2)
(23, 2)
(2713, 1)
(11953, 1)
(262069, 1)
(402096035359507321594726366720466575392706800671181159425656785868777272553337714697862511267018014931937703598282857976535744623203249, 1)

We can see that the cofactor has a very large (448-bit long) prime factor that will be denoted rp in the following. What prevented Sage from factoring |E2| efficiently was the product r * rp of two large primes, but Sage can factor c2 very quickly because it has only one large prime factor.

Now that we know the factorization of c2, can we hint it to Sage so that it can compute G2's order efficiently? It turns out that there is another function called order_from_multiple to which we can pass a multiple of G2's order (we will use |E2|, as G2's order necessarily divides it) together with the factorization of that multiple. So we give it a try (after inserting the missing factor r in the list L2 to obtain the factorization of |E2|):

L2.insert(5, (r,1))
n2 = order_from_multiple(G2, E2_order, factorization=L2, operation='+')

This time, Sage returns instantly. Trying to factor the result n2 directly again takes time, meaning G2's order is a multiple of both large primes r and rp, but as before we can factor n2/r (which has only one large prime factor) and insert the factor r afterwards to get the result quickly:

L3 = list((n2/r).factor())
print("The factorization of n2/r is:")
for l in L3:
    print(l)
L3.insert(5, (r,1))
print("The factorization of G2's order n2 is:")
for l in L3:
    print(l)

We obtain:

The factorization of G2's order n2 is:
(13, 1)
(23, 1)
(2713, 1)
(11953, 1)
(262069, 1)
(52435875175126190479447740508185965837690552500527637822603658699938581184513, 1)
(402096035359507321594726366720466575392706800671181159425656785868777272553337714697862511267018014931937703598282857976535744623203249, 1)

Now we have all the information we need and can perform a second small subgroup attack by clearing the large factors r and rp from G2's order, which will give us s mod k2, where k2 = n2/(r*rp) is a 52-bit integer. So we compute r*rp*G2 and r*rp*H2 (where H2 = s*G2 is defined from _ts2[1] as before) and ask Sage to compute the discrete logarithm of r*rp*H2 in base r*rp*G2. There is a catch though: as we're working in E2, whose order Sage cannot factor quickly, we need to pass Sage the order of r*rp*G2, namely k2:

s2 = discrete_log(r * rp * H2, r * rp * G2, ord=k2, operation='+')
print("s mod k2 =")
print(s2)

We get:

s mod k2 =
712318409117070

As k1 and k2 are coprime, we can now combine the two equations s mod k1 = s1 and s mod k2 = s2 using the Chinese remainder theorem to compute s mod k1*k2:

s12 = crt([s1, s2], [k1, k2])
print("s mod k1 * k2 =")
print(s12)

We get

s mod k1 * k2 =
5592216610550884993006174526481245
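
For the record, Sage's crt call can be reproduced in a few lines of plain Python using a modular inverse; below, k1 and k2 are assembled from the prime factors computed earlier (with r and rp excluded).

```python
# Reproducing the crt() step in plain Python, with k1 and k2 assembled from
# the small prime factors found above (the large primes r and rp excluded).
k1 = 3 * 11 * 10177 * 859267 * 52437899
k2 = 13 * 23 * 2713 * 11953 * 262069
s1 = 2335387132884273659
s2 = 712318409117070

def crt(r1, m1, r2, m2):
    """The x in [0, m1*m2) with x = r1 mod m1 and x = r2 mod m2,
    for coprime m1 and m2."""
    # pow(m1, -1, m2) is the modular inverse of m1 mod m2 (Python >= 3.8)
    x = r1 + m1 * ((r2 - r1) * pow(m1, -1, m2) % m2)
    return x % (m1 * m2)

s12 = crt(s1, k1, s2, k2)
print("s mod k1*k2 =", s12)
assert s12 % k1 == s1 and s12 % k2 == s2
```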

As k1*k2 is a 115-bit integer and s is 128 bits long, we are missing roughly 13 bits of information, which is small enough to allow exhaustive search: writing k = k1*k2, we can loop through all integers i < 2^13 and check whether (i*k + s12) * G1 == H1:

for i in range(2^13):
    s = i*k + s12
    if s * G1 == H1:
        print("discrete log found:")
        print(s)
        break

We're finally done:

discrete log found:
114939083266787167213538091034071020048

Conclusion

Keep in mind that it is possible to create instances of the GroupAffine type which are not in the correct subgroup of the corresponding curve through the deserialize_unchecked function. Hence, subgroup membership tests should always be performed before creating an instance whose coordinates come from an untrusted source. If you feel this departs from Rust's type-safety philosophy, it is possible to define an enum such as

enum ECPoint<P> {
    Checked(GroupAffine<P>),
    Unchecked(GroupAffine<P>),
}

and to implement a public constructor for it that never returns a Checked variant for a point that has not passed the subgroup membership test.

Puzzle 3: Double Trouble

Bob has developed a new zero-knowledge inner-product proof that allows proving that
the inner product of a hidden, committed vector `a` with a public vector `b`
equals the claimed value `v` that is committed. He's released the protocol
license-free below, but still wants to profit from his invention. So, he
developed a proprietary prover that he claims is 2x faster than the standard one
described below, but without sacrificing zero-knowledge: it still hides all
information about the committed vector `a`. To back up his claim, he has
published a few proofs generated by this proprietary prover for the same `a` but
with respect to different `b` vectors, and has challenged people to recover `a`
from just these proofs.

Can you rise to the challenge and recover the vector `a`?


The inner-product proof is obtained by applying the Fiat--Shamir transform to
the following sigma protocol:

Before proof:
During proof of inner product with public vector b:
        Prover                                           Verifier
=================================================================================================
Offline phase (before `b` is available):
1. Prover computes
    C_a := PedersenCOMM(a; α)
         = sum_i (a_i * G_i) + α * H
    where G_i and H are random group elements,
    and α is sampled randomly.
                            --------- C_a ---------->

Online phase for a given public vector `b` (can be repeated for different `b`s):

1. Prover samples a random vector r
    and random elements ρ, τ, υ.
2. Prover computes
    C_r := PedersenCOMM(r; ρ)
    C_1 := PedersenCOMM(<a, b>; τ) // <x, y> denotes inner product of x and y.
    C_2 := PedersenCOMM(<r, b>; υ)
                            ---- C_r, C_1, C_2 ----->
                            <- random challenge γ ---
3. Prover computes
    s := a + γr,
    u := α + γρ
    t := τ + γυ,
                            -------- s, u, t ------->
                                                // Check that `s` really is a + γr,
                                                Check PedersenCOMM(s; u) = C_a + γC_r
                                                // Check that the inner product is committed in C_1.
                                                Check PedersenCOMM(<s, b>; t) = C_1 + γC_2
==================================================================================================
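
As a sanity check of the algebra, here is a toy Python sketch of the protocol above in a purely linear stand-in "group": random scalars mod a prime play the role of the generators G_i and H. This has no security whatsoever (discrete logs are trivial here) and only verifies that the two verifier equations hold for an honest prover; all parameters are made up.

```python
# Mechanical check of the two verifier equations of the sigma protocol, in
# the toy "group" (Z_q, +) with random public scalars as generators.
import random

q = (1 << 61) - 1                            # a prime modulus (2^61 - 1)
n = 4                                        # vector length (toy)
rng = random.Random(0)

G = [rng.randrange(1, q) for _ in range(n)]  # stand-ins for generators G_i
H = rng.randrange(1, q)                      # stand-in for hiding generator

def commit(vec, rand):                       # PedersenCOMM(vec; rand)
    return (sum(v * g for v, g in zip(vec, G)) + rand * H) % q

def inner(x, y):                             # <x, y>
    return sum(xi * yi for xi, yi in zip(x, y)) % q

# Offline phase: commit to a
a = [rng.randrange(q) for _ in range(n)]
alpha = rng.randrange(q)
C_a = commit(a, alpha)

# Online phase for a public vector b
b = [rng.randrange(q) for _ in range(n)]
r_vec = [rng.randrange(q) for _ in range(n)]
rho, tau, ups = (rng.randrange(q) for _ in range(3))
C_r = commit(r_vec, rho)
C_1 = commit([inner(a, b)], tau)
C_2 = commit([inner(r_vec, b)], ups)

gamma = rng.randrange(1, q)                  # the verifier's challenge
s = [(ai + gamma * ri) % q for ai, ri in zip(a, r_vec)]
u = (alpha + gamma * rho) % q
t = (tau + gamma * ups) % q

# Verifier's checks
assert commit(s, u) == (C_a + gamma * C_r) % q
assert commit([inner(s, b)], t) == (C_1 + gamma * C_2) % q
print("both verifier checks pass")
```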

This puzzle is about a zero-knowledge proof system for proving a kind of inner-product relation between committed values. The puzzle instructions ask us to retrieve the witness of some instance and hence to break the zero-knowledge property of the scheme. It is recommended to read the chapters about commitments and zero-knowledge proofs before starting.

Initial Inspection

First, let us try to define more precisely the goal of the puzzle just from the instructions, without looking into the code yet.

Notation

Given a group of prime order, we will write vectors of scalars and vectors of group elements in bold font. For two vectors of scalars a = (a_1, ..., a_n) and b = (b_1, ..., b_n), their inner product is defined as <a, b> = a_1 b_1 + ... + a_n b_n. Similarly, given a vector of scalars a and a vector of group elements G = (G_1, ..., G_n), we let <a, G> = a_1 G_1 + ... + a_n G_n.

Definition of the Proof System

Let us first try to specify the language for which the proof system is designed. We assume that the cyclic group, its prime order, and the generators used for the commitments are agreed upon by the prover and the verifier. Then, from the puzzle's description (and also from the code, as we will see), the language of interest is defined by the relation

Recall that the language defined by this relation consists of all instances such that there exists a witness such that

The protocol itself goes as follows:

The protocol is complete, meaning that for any instance in the language, an honestly generated proof is always accepted by the verifier. This can be checked by expanding the definitions of s, u, and t in the two verifier equations.

Proof of Knowledge

One can also check that the proof system is extractable, which follows from a property called special soundness: for any instance in the language, given two accepting transcripts with the same commitments but different challenges, one can compute a witness. Indeed, consider two accepting transcripts for some instance with identical commitments and distinct challenges. Since both transcripts are accepting, the verifier's first equation holds for both; subtracting the two equations and multiplying by the inverse of the difference of the challenges yields a vector and a commitment randomness satisfying the relation defining the language, which is indeed a witness that the instance is in the language.
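
The extraction step can be checked mechanically with toy scalars mod a prime (made-up values, and a length-1 vector for brevity):

```python
# Toy check of the special-soundness extractor: from two accepting
# transcripts sharing the same commitments but with different challenges
# g1 != g2, the witness pops out by linear algebra mod a prime q.
q = (1 << 61) - 1            # a prime (2^61 - 1)
a, r = 123456789, 987654321  # secret value and masking value (toy)
alpha, rho = 42, 1337        # commitment randomness (toy)

def response(ch):
    # prover's response (s, u) for challenge ch, with fixed a, r, alpha, rho
    return ((a + ch * r) % q, (alpha + ch * rho) % q)

g1, g2 = 111, 222            # two distinct challenges
(s1, u1), (s2, u2) = response(g1), response(g2)

# s1 = a + g1*r and s2 = a + g2*r, so (g1*s2 - g2*s1) = (g1 - g2)*a mod q
inv = pow(g1 - g2, -1, q)
a_extracted = (g1 * s2 - g2 * s1) * inv % q
alpha_extracted = (g1 * u2 - g2 * u1) * inv % q
assert a_extracted == a and alpha_extracted == alpha
print("extracted witness:", a_extracted, alpha_extracted)
```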

Zero-Knowledge?

The puzzle instructions ask us to find a, which is part of the witness, from a few proofs generated by Bob. If the protocol is zero-knowledge, this shouldn't be possible. Is it zero-knowledge, though, from a theoretical point of view?

The answer is yes. Namely, the interactive protocol as defined above can be shown to be honest-verifier zero-knowledge. The simulator, whose task is to generate a fake transcript distributed as a true transcript between a prover and an honest verifier, without knowing the witness, works as follows:

Something Strange

There is something odd with the way we described the proof system: vector b plays no role in the relation. In particular, note that the proof that the system is extractable did not use the second equation checked by the verifier (meaning extractability still holds even if this check is omitted).

For this reason, it would make more sense to actually include the commitment to the inner product in the instance and its randomness in the witness, and to define the relation as

By doing this, the protocol would now be specified as follows:

However, this is not important for solving the puzzle and we will stick to the original, albeit unnatural, description.

Exploring the Code

Let us recall the proof system here before digging into the code:

The package directory is organized as follows:

zkhack-double-trouble
├── Cargo.toml
└── src
    ├── bin
    │   └── verify-double-trouble.rs
    ├── inner_product_argument
    │   ├── data_structures.rs
    │   └── utils.rs
    ├── data.rs
    ├── inner_product_argument.rs
    └── lib.rs

The proof system is implemented in the inner_product_argument module of the library crate and closely follows the specification from the puzzle's description. The puzzle uses the ark-ed-on-bls12-381 library, which implements the so-called Jubjub curve developed by the Zcash team. This curve was designed to have its base field equal to the scalar field of BLS12-381, making it possible to efficiently prove statements about cryptographic schemes based on Jubjub (such as Pedersen commitments, Schnorr signatures, etc.) using a proof system based on BLS12-381. The affine group and the scalar field of this curve are brought into scope with

use ark_ed_on_bls12_381::{EdwardsAffine as GAffine, Fr};

Two structures Instance and Witness corresponding respectively to the instance and the witness are defined directly in src/inner_product_argument.rs:

pub struct Instance {
    pub comm_a: GAffine,
    pub b: Vec<Fr>,
}

pub struct Witness {
    pub a: Vec<Fr>,
    pub comm_a_rand: Fr,
}

Four additional structures are defined in src/inner_product_argument/data_structures.rs:

  • the commitment key CommitKey
  • the proof commitment (first message sent by the prover in the interactive protocol) ProofCommitment
  • the proof response (second message sent by the prover in the interactive protocol) ProofResponse
  • and a fourth structure Proof which simply combines ProofCommitment and ProofResponse.

Here is the code defining these four structures:

pub struct CommitKey {
    pub generators: Vec<GAffine>,
    pub hiding_generator: GAffine,
}

pub struct ProofCommitment {
    pub comm_r: GAffine,
    pub comm_1: GAffine,
    pub comm_2: GAffine,
}

pub struct ProofResponse {
    pub s: Vec<Fr>,
    pub u: Fr,
    pub t: Fr,
}

pub struct Proof {
    pub commitment: ProofCommitment,
    pub response: ProofResponse,
}

The rest of the code in the inner_product_argument module and sub-modules does not show anything surprising. As said in the puzzle description, the proof system is made non-interactive using the Fiat-Shamir transform. Namely, the challenge is computed by hashing the commitment key, the instance, and the commitment:

    let challenge = challenge(ck, instance, &commitment);

where function challenge is defined in inner_product_argument/utils.rs:

pub fn b2s_hash_to_field<C: CanonicalSerialize>(input: &C) -> Fr {
    let bytes = input.hash::<blake2::Blake2s>();
    Fr::from_le_bytes_mod_order(&bytes)
}

pub fn challenge(ck: &CommitKey, instance: &Instance, proof_comm: &ProofCommitment) -> Fr {
    b2s_hash_to_field(&(ck.clone(), instance.clone(), proof_comm.clone()))
}

Let's take a look at the binary crate and its main function:

fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);
    let (ck, [instance_and_proof_1, instance_and_proof_2]) = puzzle_data();
    let (instance1, proof1) = instance_and_proof_1;
    let (instance2, proof2) = instance_and_proof_2;
    assert!(verify(&ck, &instance1, &proof1));
    assert!(verify(&ck, &instance2, &proof2));

    let (a, comm_a_rand): (Vec<Fr>, Fr) = {
        // Your solution here!
        todo!()
    };
    assert_eq!(
        ck.commit_with_explicit_randomness(&a, comm_a_rand),
        instance1.comm_a
    );
    assert_eq!(
        ck.commit_with_explicit_randomness(&a, comm_a_rand),
        instance2.comm_a
    );
}

The puzzle_data function simply deserializes some data and returns a commitment key ck and two instance/proof pairs. We can print the structures we're interested in:

#![allow(unused, unreachable_code)]
use ark_ed_on_bls12_381::Fr;
use ark_ff::Field;
use double_trouble::data::puzzle_data;
use double_trouble::inner_product_argument::utils::challenge;
use double_trouble::verify;
use double_trouble::PUZZLE_DESCRIPTION;
use prompt::{puzzle, welcome};

// additional items brought into scope for puzzle solving
use double_trouble::CommitKey;
use ark_ec::AffineCurve;

fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);
    let (ck, [instance_and_proof_1, instance_and_proof_2]) = puzzle_data();
    let (instance1, proof1) = instance_and_proof_1;
    let (instance2, proof2) = instance_and_proof_2;
    assert!(verify(&ck, &instance1, &proof1));
    assert!(verify(&ck, &instance2, &proof2));

    // --snip--

    println!("commitment key:");
    for (i, ck_i) in ck.generators.iter().enumerate() {
        println!("ck.generators[{}] = {}", i, ck_i);
    }
    println!("ck.hiding_generator = {}\n", ck.hiding_generator);
    println!("instance 1, C_a:\n {}\n", instance1.comm_a);
    println!("instance 2, C_a:\n {}\n", instance2.comm_a);
    println!("instance 1, b:");
    for (i, b_i) in instance1.b.iter().enumerate() {
        println!("instance1.b[{}] = {}", i, b_i);
    }
    println!("");
    println!("instance 2, b:");
    for (i, b_i) in instance2.b.iter().enumerate() {
        println!("instance2.b[{}] = {}", i, b_i);
    }
    println!("");

    // --snip--

    assert_eq!(instance1, instance2);
    assert_eq!(ck, CommitKey::sample(8));

    // --snip--

    println!("proof1, comm_r:\n {}", proof1.commitment.comm_r);
    println!("proof1, comm_1:\n {}", proof1.commitment.comm_1);
    println!("proof1, comm_2:\n {}\n", proof1.commitment.comm_2);

    println!("proof2, comm_r:\n {}", proof2.commitment.comm_r);
    println!("proof2, comm_1:\n {}", proof2.commitment.comm_1);
    println!("proof2, comm_2:\n {}\n", proof2.commitment.comm_2);

    // --snip--

    if proof1.commitment.comm_r.mul(2) == proof2.commitment.comm_r {
        println!("C_r in the second proof is twice C_r in the first proof\n");
    }

    // --snip--

    let gamma1 = challenge(&ck, &instance1, &proof1.commitment);
    let gamma2 = challenge(&ck, &instance2, &proof2.commitment);
    let s1 = proof1.response.s;
    let s2 = proof2.response.s;
    let u1 = proof1.response.u;
    let u2 = proof2.response.u;
    let k = (gamma1 - Fr::from(2) * gamma2).inverse().unwrap();
    let my_a: Vec<Fr> = s1
        .iter()
        .zip(s2.iter())
        .map(|(c1, c2)| k * (gamma1 * c2 - Fr::from(2) * gamma2 * c1))
        .collect();
    let my_comm_a_rand = k * (gamma1 * u2 - Fr::from(2) * gamma2 * u1);

    let (a, comm_a_rand): (Vec<Fr>, Fr) = {
        // Your solution here!
        (my_a, my_comm_a_rand)
    };
    assert_eq!(
        ck.commit_with_explicit_randomness(&a, comm_a_rand),
        instance1.comm_a
    );
    assert_eq!(
        ck.commit_with_explicit_randomness(&a, comm_a_rand),
        instance2.comm_a
    );
    println!("Puzzle solved!");
}

We get:

commitment key:
ck.generators[0] = GroupAffine(x=Fp256 "(198B8C3FC05A64DF64DEC6C0C9CF997E1AFA1DBB5C191ED0DFD5C771467F089D)", y=Fp256 "(5AE26A746DDBBCC5ECC3C970E715AEC48BB2551DD9DDE8AE7A7DA7032E161577)")
ck.generators[1] = GroupAffine(x=Fp256 "(4B92D491ACE817177026CD40C5020D04B30759F240CFDFF8DD795629E3307C5E)", y=Fp256 "(6604E7622969E8E968970E74DB648CF8226DCB0B67FF8E6313A3B9CAD7353701)")
ck.generators[2] = GroupAffine(x=Fp256 "(70F5E9698EF8F51B85D089DBBCC2F25190C905F6113976696F109307D347ACBA)", y=Fp256 "(3BD293E00769FE7963674BFA0745B4FE316AF189856418E679DBEF49B00A9085)")
ck.generators[3] = GroupAffine(x=Fp256 "(3FF005D6FE8FE85EA40051BF5051464AB69F0B587BE571B2800C08F4B93AD452)", y=Fp256 "(445D24D6D0EDEFC80B3785F27613FD072E4E69EBFC6A0B9716036F19E15C6ABF)")
ck.generators[4] = GroupAffine(x=Fp256 "(5174580F00FB73C60E3CA1D0A0EBF328FDAC3A018D15F6ED81B39D833927C079)", y=Fp256 "(26053B0E29A8E735673D8ACC0C0353EA6326DF6F81A2EE0AC65A8C1A853241F5)")
ck.generators[5] = GroupAffine(x=Fp256 "(65B0213A6DE2BC0DDAD9648179372A2B3939B06A1062CF0D58F731ABF7D6B742)", y=Fp256 "(11C6904D697A2638441B3125D0A3316507A03C16D79FCB3F5A19CF3ECFD649E6)")
ck.generators[6] = GroupAffine(x=Fp256 "(50644971731D58AFB54EB92D6F0C7700F06774B2AF10E72B646C0D5A93AA6347)", y=Fp256 "(1C282A3A58C4AD917587EBE68DC84A4CAC0E4A914FA83618439373528617AAAF)")
ck.generators[7] = GroupAffine(x=Fp256 "(3181568CB6D37D00412E3348F2C08C1579DB32492A768F41EAC49D8E7E6F9BBF)", y=Fp256 "(27EDAEA9068A61645B11E8D5C15E9569340E59E3963AE78CAADD2282451F10F4)")
ck.hiding_generator = GroupAffine(x=Fp256 "(3A2ED8E0E81BED90A83FA22E58FA8A0F08752AAB03CD4BA9BB558965B9A57B32)", y=Fp256 "(3C603EF0D0BB80987AD83208034C552F8919C5F8FEACC5404DEBCC16FE3B947F)")

instance 1, C_a:
 GroupAffine(x=Fp256 "(6AE271E04FBB0AE9FB89506FF7180F5C06A8D60F802D934987965F694228BF8A)", y=Fp256 "(2BFBFA9CCF2151F01E71A069366DAD9398960B64684888D1AABB50D4D57BDF32)")

instance 2, C_a:
 GroupAffine(x=Fp256 "(6AE271E04FBB0AE9FB89506FF7180F5C06A8D60F802D934987965F694228BF8A)", y=Fp256 "(2BFBFA9CCF2151F01E71A069366DAD9398960B64684888D1AABB50D4D57BDF32)")

instance 1, b:
instance1.b[0] = Fp256 "(08180E66A534AADEBC88D09E1397DC7C33E2014115EB973B489E7D5CDBF839CD)"
instance1.b[1] = Fp256 "(036AFB822FAC04AC9191CCEEF5BF4E27ADA6DC0440C88ECF3E06DC2FAFB162E6)"
instance1.b[2] = Fp256 "(0DE7FE23DCF79F2A041E2C21876F9B9AEB3F2BC628E07B87F52DF460408334F2)"
instance1.b[3] = Fp256 "(0891BBE1E3DA5717F7ED59288C9F51186E7BBAE018C9DA56F4BC8B4BBBD7457E)"
instance1.b[4] = Fp256 "(05D81F4C416350A3D02B1685176BFE5A98FA15D51C84DBD47680326F9F005E96)"
instance1.b[5] = Fp256 "(06D5E58667508A24F3A3FFBB244575DE29ECB3408D6EBC6D3DCDEFF02AA9453C)"
instance1.b[6] = Fp256 "(06BC47A67C6BD353EE624051B4C4A6A28E7F8CEDB6ED65A007D897AC071CBDCB)"
instance1.b[7] = Fp256 "(0CF6D9D35E0B6F2309568E5BB7C19448D993D2EFFEF7B3D77C137A26C524315A)"

instance 2, b:
instance2.b[0] = Fp256 "(08180E66A534AADEBC88D09E1397DC7C33E2014115EB973B489E7D5CDBF839CD)"
instance2.b[1] = Fp256 "(036AFB822FAC04AC9191CCEEF5BF4E27ADA6DC0440C88ECF3E06DC2FAFB162E6)"
instance2.b[2] = Fp256 "(0DE7FE23DCF79F2A041E2C21876F9B9AEB3F2BC628E07B87F52DF460408334F2)"
instance2.b[3] = Fp256 "(0891BBE1E3DA5717F7ED59288C9F51186E7BBAE018C9DA56F4BC8B4BBBD7457E)"
instance2.b[4] = Fp256 "(05D81F4C416350A3D02B1685176BFE5A98FA15D51C84DBD47680326F9F005E96)"
instance2.b[5] = Fp256 "(06D5E58667508A24F3A3FFBB244575DE29ECB3408D6EBC6D3DCDEFF02AA9453C)"
instance2.b[6] = Fp256 "(06BC47A67C6BD353EE624051B4C4A6A28E7F8CEDB6ED65A007D897AC071CBDCB)"
instance2.b[7] = Fp256 "(0CF6D9D35E0B6F2309568E5BB7C19448D993D2EFFEF7B3D77C137A26C524315A)"

We can note a couple of interesting things. First, points instance1.comm_a and instance2.comm_a are equal, and so are vectors instance1.b and instance2.b, meaning the two instances are exactly the same (whereas the puzzle description said that the proofs published by Bob were for different b vectors). Second, ck.generators (the vector of generators in the description above) has length 8. Where does this commitment key come from? The CommitKey structure has an associated function for sampling a commitment key:

impl CommitKey {
    pub fn sample(size: usize) -> Self {
        let mut rng = ChaChaRng::from_seed(*b"zkHack IPA puzzle for 2021-10-26");
        let generators = sample_vector::<GAffine, _>(size, &mut rng)
            .into_iter()
            .map(Into::into)
            .collect();
        let hiding_generator = GProjective::rand(&mut rng).into();
        Self {
            generators,
            hiding_generator,
        }
    }
    // ...
}

We can verify that the ck provided in the puzzle data is indeed the commitment key returned by this function. These two observations can be checked directly with

    assert_eq!(instance1, instance2);
    assert_eq!(ck, CommitKey::sample(8));

(For this, one needs to derive the PartialEq trait for structures CommitKey and Instance).

As we explained in the section about Pedersen commitments, knowledge of discrete log relations between the group elements in the commitment key constitutes a trapdoor that allows breaking the binding property of the commitment scheme. However, this does not seem like a promising avenue to solve the puzzle. On the one hand, this trapdoor does not allow breaking the hiding property of the commitment scheme, which is what we would need to recover a. On the other hand, function sample does things correctly by sampling uniformly random and independent group elements using a pseudorandom number generator seeded with the NUMS string "zkHack IPA puzzle for 2021-10-26".

A side note about the code: in function sample, generators could be defined more simply as

        let generators = sample_vector::<GAffine, _>(size, &mut rng);

Indeed, sample_vector::<GAffine, _>(size, &mut rng) returns an object of type Vec<GAffine>, so there is no need to apply into to each element. Likewise, there does not seem to be any good reason for sampling hiding_generator as a GProjective and then casting it into a GAffine using into.

Solving the Puzzle

Where does this code analysis leave us? The proof system seems to be well designed, so presumably the problem lies with the "proprietary prover" developed by Bob. Actually, a very well-known implementation vulnerability of sigma protocols is randomness reuse. In the context of discrete-log-based signatures such as Schnorr or ECDSA, repeating a nonce allows anyone to compute the private key from just two signatures. Vulnerable implementations led, for example, to the jailbreaking of Sony's PlayStation 3 and the theft of some bitcoins from Android wallets. Even if nonces are not repeated, seemingly small biases in nonce randomness [BH19] or partial information leakage (typically through side channels) [ANT+20] can be sufficient to retrieve the private key.

For the proof system of this puzzle, note how the proof that the system is extractable exploits the fact that from two accepting transcripts with the same commitments but different challenges, one can compute a witness. This property, used in a "positive" sense in the security proof, can give rise to a real attack if a prover reuses the same randomness (and hence the same commitments) in two runs of the (interactive) protocol with different challenges. Here, because the Fiat-Shamir transform is used and the challenge is computed by hashing the commitment key, the instance, and the commitments, reusing the randomness would actually result in the same challenge and hence exactly the same transcript! (In the corresponding attack on Schnorr signatures, the challenges are different because the victim signs different messages while reusing the same nonce.) However, one can check that the attack would work if the same randomness were reused for two different instances: the challenges obtained via Fiat-Shamir would then be different (because the hashed instances differ), and the reasoning of the extractability proof still applies.

Can we apply this attack here? Let us display the commitments in the two proofs:

#![allow(unused, unreachable_code)]
use double_trouble::data::puzzle_data;
use double_trouble::verify;
use double_trouble::PUZZLE_DESCRIPTION;
use prompt::{puzzle, welcome};

fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);
    let (ck, [instance_and_proof_1, instance_and_proof_2]) = puzzle_data();
    let (instance1, proof1) = instance_and_proof_1;
    let (instance2, proof2) = instance_and_proof_2;
    assert!(verify(&ck, &instance1, &proof1));
    assert!(verify(&ck, &instance2, &proof2));

    // --snip--

    println!("proof1, comm_r:\n {}", proof1.commitment.comm_r);
    println!("proof1, comm_1:\n {}", proof1.commitment.comm_1);
    println!("proof1, comm_2:\n {}\n", proof1.commitment.comm_2);

    println!("proof2, comm_r:\n {}", proof2.commitment.comm_r);
    println!("proof2, comm_1:\n {}", proof2.commitment.comm_1);
    println!("proof2, comm_2:\n {}\n", proof2.commitment.comm_2);

    // --snip--
}

We get:

proof1, comm_r:
 GroupAffine(x=Fp256 "(54103849E3BA52CCE4C2C7485134A683257413F5B9A1E0DD8B04FAF09D18EC28)", y=Fp256 "(245981A43B6DB2323AB5DD6B59A72428238E1F7416DA0C30E239D3AD8EFC7CF1)")
proof1, comm_1:
 GroupAffine(x=Fp256 "(0ADC9FA9FE8D825BD5DA31F56D60EEB608EA5C47AF990736C3D27FAC048C11E1)", y=Fp256 "(46202BAFFE0145321B52334023D0A64C70B0283EB02A542C788FDF182C06ED4A)")
proof1, comm_2:
 GroupAffine(x=Fp256 "(19FD6B1FBA846B5212FD91F823E1D3CDD944FE641035B2E459876BB67C2A20F5)", y=Fp256 "(63BE5660CDF347C66B93E5CD5F53BED2AC02172EF99A960D5D13B7BE896952BD)")

proof2, comm_r:
 GroupAffine(x=Fp256 "(10098E91DCAF5036082E598F953E71B128BF1DA198D1CC39364272EE6A0FCD20)", y=Fp256 "(2DD073C47A020602A0CEF1C13E6D1365CB0ADC716935AE1A010E1546DF2BF7A1)")
proof2, comm_1:
 GroupAffine(x=Fp256 "(2F6A95827C2DF00431A43567CE757DCA4FABA1439EE6B09EB0A8CE88DF06B68C)", y=Fp256 "(2F06AC079158FC73402C6C4AF49DA4E9A957283439C4B45C25D116F340107C06)")
proof2, comm_2:
 GroupAffine(x=Fp256 "(110DE1B6E88AABFFAA4ED784B5EEF7BF359D5D02C7EDF745A873ED28221C208B)", y=Fp256 "(2A7594A3D6F65B338A8817D79F5ED22FC2751EBDDD5246A88645D25C8510FD85)")

The values of the commitments in the two proofs provided by the puzzle are different, hence we are not dealing with mere "randomness reuse" here. The puzzle description gives us a hint:

he [Bob] developed a proprietary prover that he claims is 2x faster than the standard one described below, but without sacrificing zero-knowledge

So maybe there is a simple relation between the commitments used in the two proofs, allowing several proofs to be computed faster? Indeed, one can check that the commitment comm_r in the second proof is twice the one in the first proof (admittedly, some guesswork is involved here):

#![allow(unused, unreachable_code)]
use double_trouble::data::puzzle_data;
use double_trouble::verify;
use double_trouble::PUZZLE_DESCRIPTION;
use prompt::{puzzle, welcome};

// additional item brought into scope for puzzle solving
use ark_ec::AffineCurve;

fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);
    let (ck, [instance_and_proof_1, instance_and_proof_2]) = puzzle_data();
    let (instance1, proof1) = instance_and_proof_1;
    let (instance2, proof2) = instance_and_proof_2;
    assert!(verify(&ck, &instance1, &proof1));
    assert!(verify(&ck, &instance2, &proof2));

    // --snip--

    if proof1.commitment.comm_r.mul(2) == proof2.commitment.comm_r {
        println!("C_r in the second proof is twice C_r in the first proof\n");
    }

    // --snip--
}

How can we exploit this fact?

Coding the Attack

In all the following, we will denote quantities related to the first proof with superscript $(1)$ and quantities related to the second proof with superscript $(2)$, e.g., $\gamma^{(1)}$, $s^{(2)}$, etc.

We have noticed that $C_r^{(2)} = 2\,C_r^{(1)}$. How can we exploit this information? Presumably, the prover is using $r^{(2)} = 2\,r^{(1)}$ and $\rho_r^{(2)} = 2\,\rho_r^{(1)}$ in the second proof. Hence, writing $r := r^{(1)}$ and $\rho := \rho_r^{(1)}$, and letting $\alpha$ denote the randomness of the commitment $C_a$, we have the following system of equations:

$$
s^{(1)} = a + \gamma^{(1)} r, \qquad u^{(1)} = \alpha + \gamma^{(1)} \rho,
$$
$$
s^{(2)} = a + 2\gamma^{(2)} r, \qquad u^{(2)} = \alpha + 2\gamma^{(2)} \rho.
$$

We can get rid of $r$ by multiplying the equation for $s^{(1)}$ by $2\gamma^{(2)}$ and the equation for $s^{(2)}$ by $\gamma^{(1)}$ and subtracting, which yields $(\gamma^{(1)} - 2\gamma^{(2)})\,a = \gamma^{(1)} s^{(2)} - 2\gamma^{(2)} s^{(1)}$ and hence, letting $k := (\gamma^{(1)} - 2\gamma^{(2)})^{-1}$, $a = k\,(\gamma^{(1)} s^{(2)} - 2\gamma^{(2)} s^{(1)})$. Similarly, from the equations for $u^{(1)}$ and $u^{(2)}$, we obtain $\alpha = k\,(\gamma^{(1)} u^{(2)} - 2\gamma^{(2)} u^{(1)})$.
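As a sanity check of the extraction formula used in the solution code, here is a toy computation over a small prime field (hypothetical values, plain modular arithmetic standing in for the arkworks field types): assuming responses of the form $s^{(i)} = a + \gamma^{(i)} r^{(i)}$ with $r^{(2)} = 2\,r^{(1)}$, the formula recovers $a$.

```rust
// Toy check of the extraction a = k*(g1*s2 - 2*g2*s1), k = (g1 - 2*g2)^(-1),
// under the assumption s_i = a + g_i * r_i with r_2 = 2*r_1.
const Q: i64 = 101; // toy field modulus

fn pow_mod(mut b: i64, mut e: i64) -> i64 {
    let mut acc = 1;
    while e > 0 {
        if e & 1 == 1 {
            acc = acc * b % Q;
        }
        b = b * b % Q;
        e >>= 1;
    }
    acc
}

// modular inverse via Fermat's little theorem (Q prime)
fn inv(a: i64) -> i64 {
    pow_mod(a.rem_euclid(Q), Q - 2)
}

fn extract(s1: i64, s2: i64, g1: i64, g2: i64) -> i64 {
    let k = inv(g1 - 2 * g2);
    (g1 * s2 - 2 * g2 * s1).rem_euclid(Q) * k % Q
}

fn main() {
    let (a, r, g1, g2) = (9, 5, 3, 8); // toy witness, randomness, challenges
    let s1 = (a + g1 * r) % Q; // first response
    let s2 = (a + g2 * 2 * r) % Q; // second response, with doubled randomness
    assert_eq!(extract(s1, s2, g1, g2), a);
    println!("extracted a = {}", extract(s1, s2, g1, g2));
}
```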

Here is the code computing $a$ and $\alpha$ and checking that they yield the correct commitment $C_a$:

#![allow(unused, unreachable_code)]
use ark_ed_on_bls12_381::Fr;
use ark_ff::Field;
use double_trouble::data::puzzle_data;
use double_trouble::inner_product_argument::utils::challenge;
use double_trouble::verify;
use double_trouble::PUZZLE_DESCRIPTION;
use prompt::{puzzle, welcome};

// additional items brought into scope for puzzle solving
use double_trouble::CommitKey;
use ark_ec::AffineCurve;

fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);
    let (ck, [instance_and_proof_1, instance_and_proof_2]) = puzzle_data();
    let (instance1, proof1) = instance_and_proof_1;
    let (instance2, proof2) = instance_and_proof_2;
    assert!(verify(&ck, &instance1, &proof1));
    assert!(verify(&ck, &instance2, &proof2));

    // --snip--

    println!("commitment key:");
    for (i, ck_i) in ck.generators.iter().enumerate() {
        println!("ck.generators[{}] = {}", i, ck_i);
    }
    println!("ck.hiding_generator = {}\n", ck.hiding_generator);
    println!("instance 1, C_a:\n {}\n", instance1.comm_a);
    println!("instance 2, C_a:\n {}\n", instance2.comm_a);
    println!("instance 1, b:");
    for (i, b_i) in instance1.b.iter().enumerate() {
        println!("instance1.b[{}] = {}", i, b_i);
    }
    println!("");
    println!("instance 2, b:");
    for (i, b_i) in instance2.b.iter().enumerate() {
        println!("instance2.b[{}] = {}", i, b_i);
    }
    println!("");

    // --snip--

    assert_eq!(instance1, instance2);
    assert_eq!(ck, CommitKey::sample(8));

    // --snip--

    println!("proof1, comm_r:\n {}", proof1.commitment.comm_r);
    println!("proof1, comm_1:\n {}", proof1.commitment.comm_1);
    println!("proof1, comm_2:\n {}\n", proof1.commitment.comm_2);

    println!("proof2, comm_r:\n {}", proof2.commitment.comm_r);
    println!("proof2, comm_1:\n {}", proof2.commitment.comm_1);
    println!("proof2, comm_2:\n {}\n", proof2.commitment.comm_2);

    // --snip--

    if proof1.commitment.comm_r.mul(2) == proof2.commitment.comm_r {
        println!("C_r in the second proof is twice C_r in the first proof\n");
    }

    // --snip--

    let gamma1 = challenge(&ck, &instance1, &proof1.commitment);
    let gamma2 = challenge(&ck, &instance2, &proof2.commitment);
    let s1 = proof1.response.s;
    let s2 = proof2.response.s;
    let u1 = proof1.response.u;
    let u2 = proof2.response.u;
    let k = (gamma1 - Fr::from(2) * gamma2).inverse().unwrap();
    let my_a: Vec<Fr> = s1
        .iter()
        .zip(s2.iter())
        .map(|(c1, c2)| k * (gamma1 * c2 - Fr::from(2) * gamma2 * c1))
        .collect();
    let my_comm_a_rand = k * (gamma1 * u2 - Fr::from(2) * gamma2 * u1);

    let (a, comm_a_rand): (Vec<Fr>, Fr) = {
        // Your solution here!
        (my_a, my_comm_a_rand)
    };
    assert_eq!(
        ck.commit_with_explicit_randomness(&a, comm_a_rand),
        instance1.comm_a
    );
    assert_eq!(
        ck.commit_with_explicit_randomness(&a, comm_a_rand),
        instance2.comm_a
    );
    println!("Puzzle solved!");
}

Conclusion

The lesson of this puzzle is that one should never try to optimize a sigma protocol (or any kind of ZK proof) by compromising on the quality of the randomness used by the prover. It is not enough for commitments in different runs of a sigma protocol to be different: they must be computed using fresh, independent randomness in each run. In case the prover does not have access to a reliable source of randomness, one can use proof systems satisfying the stronger notion of resettable zero-knowledge [CGGM00].

Puzzle 12: Gamma Ray

Bob was deeply inspired by the Zcash design [1] for private transactions
and had some pretty cool ideas on how to adapt it for his requirements.
He was also inspired by the Mina design for the lightest blockchain and
wanted to combine the two. In order to achieve that, Bob used the MNT6753
cycle of curves to enable efficient infinite recursion, and used elliptic
curve public keys to authorize spends. He released a first version of the
system to the world and Alice soon announced she was able to double spend
by creating two different nullifiers for the same key...

Zcash, Mina, MNT curves, nullifiers... Let's jump into the code to see what this is about.

Code Analysis

The package directory is organized as follows:

puzzle-gamma-ray
├── Cargo.toml
├── leaked_secret.bin
├── leaves.bin
├── proof_keys.bin
└── src
    ├── main.rs
    └── poseidon_parameters.rs

Files leaked_secret.bin, leaves.bin, and proof_keys.bin contain raw data that will be used to initialize variables, as we will see.

The main.rs file brings a lot of items from various arkworks crates into scope, notably for MNT4-753 and MNT6-753 curves, Groth16 proofs, R1CS arithmetization, etc. We will come back to this shortly.

The first thing the main function does is to define a number of variables for the puzzle, in particular:

  • a proving key and a verification key for the Groth16 [Gro16] proof system over the MNT4-753 curve:
    let (pk, vk): (
        <Groth16<MNT4_753> as SNARK<MNT4BigFr>>::ProvingKey,
        <Groth16<MNT4_753> as SNARK<MNT4BigFr>>::VerifyingKey,
    ) = from_file("./proof_keys.bin");
  • a "leaked secret" of type MNT4BigFr (the scalar field of the MNT4-753 curve) used by Alice to spend one of her coins:
    let leaked_secret: MNT4BigFr = from_file("./leaked_secret.bin");
  • a Merkle tree, whose leaf leaf at index i = 2 plays a special role:
    let leaves: Vec<Vec<MNT4BigFr>> = from_file("./leaves.bin");
    // ...
    let leaf_crh_params = poseidon_parameters::poseidon_parameters();
    let i = 2;
    let two_to_one_crh_params = leaf_crh_params.clone();
    // ...
    let tree = MntMerkleTree::new(
        &leaf_crh_params,
        &two_to_one_crh_params,
        leaves.iter().map(|x| x.as_slice()),
    )
    .unwrap();
    let root = tree.root();
    let leaf = &leaves[i];

The hash function used to build the Merkle tree is the SNARK-friendly Poseidon hash function [GKR+21] with parameters specified in the poseidon_parameters.rs file. In particular, the underlying field is also the scalar field MNT4BigFr of the MNT4-753 curve. One can also print the leaves of the Merkle tree:

use ark_ec::AffineRepr;
use ark_ff::PrimeField;
use ark_mnt4_753::{Fr as MNT4BigFr, MNT4_753};
use ark_mnt6_753::G1Affine;
use ark_mnt6_753::{constraints::G1Var, Fr as MNT6BigFr};

use ark_crypto_primitives::merkle_tree::{Config, MerkleTree, Path};
use ark_crypto_primitives::{crh::TwoToOneCRHScheme, snark::SNARK};
use ark_groth16::Groth16;
use ark_r1cs_std::fields::fp::FpVar;
use ark_r1cs_std::prelude::*;
use ark_relations::r1cs::{ConstraintSynthesizer, ConstraintSystemRef, SynthesisError};
use ark_serialize::{CanonicalDeserialize, Read};

use prompt::{puzzle, welcome};

use std::fs::File;
use std::io::Cursor;

pub mod poseidon_parameters;

type ConstraintF = MNT4BigFr;

use ark_crypto_primitives::{
    crh::{poseidon, *},
    merkle_tree::constraints::*,
    merkle_tree::*,
};
use ark_std::rand::SeedableRng;

type LeafH = poseidon::CRH<ConstraintF>;
type LeafHG = poseidon::constraints::CRHGadget<ConstraintF>;

type CompressH = poseidon::TwoToOneCRH<ConstraintF>;
type CompressHG = poseidon::constraints::TwoToOneCRHGadget<ConstraintF>;

type LeafVar = [FpVar<ConstraintF>];
struct MntMerkleTreeParamsVar;
impl ConfigGadget<MntMerkleTreeParams, ConstraintF> for MntMerkleTreeParamsVar {
    type Leaf = LeafVar;
    type LeafDigest = <LeafHG as CRHSchemeGadget<LeafH, ConstraintF>>::OutputVar;
    type LeafInnerConverter = IdentityDigestConverter<FpVar<ConstraintF>>;
    type InnerDigest = <CompressHG as TwoToOneCRHSchemeGadget<CompressH, ConstraintF>>::OutputVar;
    type LeafHash = LeafHG;
    type TwoToOneHash = CompressHG;
}

type MntMerkleTree = MerkleTree<MntMerkleTreeParams>;

struct MntMerkleTreeParams;

impl Config for MntMerkleTreeParams {
    type Leaf = [ConstraintF];

    type LeafDigest = <LeafH as CRHScheme>::Output;
    type LeafInnerDigestConverter = IdentityDigestConverter<ConstraintF>;
    type InnerDigest = <CompressH as TwoToOneCRHScheme>::Output;

    type LeafHash = LeafH;
    type TwoToOneHash = CompressH;
}

#[derive(Clone)]
struct SpendCircuit {
    pub leaf_params: <LeafH as CRHScheme>::Parameters,
    pub two_to_one_params: <LeafH as CRHScheme>::Parameters,
    pub root: <CompressH as TwoToOneCRHScheme>::Output,
    pub proof: Path<MntMerkleTreeParams>,
    pub secret: ConstraintF,
    pub nullifier: ConstraintF,
}

impl ConstraintSynthesizer<ConstraintF> for SpendCircuit {
    fn generate_constraints(
        self,
        cs: ConstraintSystemRef<ConstraintF>,
    ) -> Result<(), SynthesisError> {
        // Allocate Merkle Tree Root
        let root = <LeafHG as CRHSchemeGadget<LeafH, _>>::OutputVar::new_input(
            ark_relations::ns!(cs, "new_digest"),
            || Ok(self.root),
        )?;

        // Allocate Parameters for CRH
        let leaf_crh_params_var =
            <LeafHG as CRHSchemeGadget<LeafH, _>>::ParametersVar::new_constant(
                ark_relations::ns!(cs, "leaf_crh_parameter"),
                &self.leaf_params,
            )?;
        let two_to_one_crh_params_var =
            <CompressHG as TwoToOneCRHSchemeGadget<CompressH, _>>::ParametersVar::new_constant(
                ark_relations::ns!(cs, "two_to_one_crh_parameter"),
                &self.two_to_one_params,
            )?;

        let secret = FpVar::new_witness(ark_relations::ns!(cs, "secret"), || Ok(self.secret))?;
        let secret_bits = secret.to_bits_le()?;
        Boolean::enforce_smaller_or_equal_than_le(&secret_bits, MNT6BigFr::MODULUS)?;

        let nullifier = <LeafHG as CRHSchemeGadget<LeafH, _>>::OutputVar::new_input(
            ark_relations::ns!(cs, "nullifier"),
            || Ok(self.nullifier),
        )?;

        let nullifier_in_circuit =
            <LeafHG as CRHSchemeGadget<LeafH, _>>::evaluate(&leaf_crh_params_var, &[secret])?;
        nullifier_in_circuit.enforce_equal(&nullifier)?;

        let base = G1Var::new_constant(ark_relations::ns!(cs, "base"), G1Affine::generator())?;
        let pk = base.scalar_mul_le(secret_bits.iter())?.to_affine()?;

        // Allocate Leaf
        let leaf_g: Vec<_> = vec![pk.x];

        // Allocate Merkle Tree Path
        let cw: PathVar<MntMerkleTreeParams, ConstraintF, MntMerkleTreeParamsVar> =
            PathVar::new_witness(ark_relations::ns!(cs, "new_witness"), || Ok(&self.proof))?;

        cw.verify_membership(
            &leaf_crh_params_var,
            &two_to_one_crh_params_var,
            &root,
            &leaf_g,
        )?
        .enforce_equal(&Boolean::constant(true))?;

        Ok(())
    }
}

fn from_file<T: CanonicalDeserialize>(path: &str) -> T {
    let mut file = File::open(path).unwrap();
    let mut buffer = Vec::new();
    file.read_to_end(&mut buffer).unwrap();
    T::deserialize_uncompressed_unchecked(Cursor::new(&buffer)).unwrap()
}

fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);

    let rng = &mut ark_std::rand::rngs::StdRng::seed_from_u64(0u64);

    let leaves: Vec<Vec<MNT4BigFr>> = from_file("./leaves.bin");
    let leaked_secret: MNT4BigFr = from_file("./leaked_secret.bin");
    let (pk, vk): (
        <Groth16<MNT4_753> as SNARK<MNT4BigFr>>::ProvingKey,
        <Groth16<MNT4_753> as SNARK<MNT4BigFr>>::VerifyingKey,
    ) = from_file("./proof_keys.bin");

    let leaf_crh_params = poseidon_parameters::poseidon_parameters();
    let i = 2;
    let two_to_one_crh_params = leaf_crh_params.clone();

    let nullifier = <LeafH as CRHScheme>::evaluate(&leaf_crh_params, vec![leaked_secret]).unwrap();

    let tree = MntMerkleTree::new(
        &leaf_crh_params,
        &two_to_one_crh_params,
        leaves.iter().map(|x| x.as_slice()),
    )
    .unwrap();
    let root = tree.root();
    let leaf = &leaves[i];

    let tree_proof = tree.generate_proof(i).unwrap();
    assert!(tree_proof
        .verify(
            &leaf_crh_params,
            &two_to_one_crh_params,
            &root,
            leaf.as_slice()
        )
        .unwrap());

    let c = SpendCircuit {
        leaf_params: leaf_crh_params.clone(),
        two_to_one_params: two_to_one_crh_params.clone(),
        root: root.clone(),
        proof: tree_proof.clone(),
        nullifier: nullifier.clone(),
        secret: leaked_secret.clone(),
    };

    let proof = Groth16::<MNT4_753>::prove(&pk, c.clone(), rng).unwrap();

    // --snip--

    for (i, leaf) in leaves.iter().enumerate() {
        for (j, p) in leaf.iter().enumerate() {
            println!("leaves[{}][{}]: {}", i, j, p);
        }
    }
    println!("");

    assert!(Groth16::<MNT4_753>::verify(&vk, &vec![root, nullifier], &proof).unwrap());

    // --snip--

    /* Enter your solution here */

    // cast leaked_secret as big integer...
    let s: num_bigint::BigUint = leaked_secret.into();
    // ... and then as an element of MNT6BigFr
    let s_as_mnt6bigfr = MNT6BigFr::from_le_bytes_mod_order(&s.to_bytes_le());
    // take the opposite and cast it again as a big integer...
    let secret_hack_as_bigint: num_bigint::BigUint = (-s_as_mnt6bigfr).into();
    // and finally cast it back to an element of MNT4BigFr
    let secret_hack = MNT4BigFr::from_le_bytes_mod_order(&secret_hack_as_bigint.to_bytes_le());
    // compute the corresponding nullifier
    let nullifier_hack =
        <LeafH as CRHScheme>::evaluate(&leaf_crh_params, vec![secret_hack]).unwrap();
    println!("nullifier_hack: {}", nullifier_hack);
    println!("secret_hack: {}", secret_hack_as_bigint);

    /* End of solution */

    assert_ne!(nullifier, nullifier_hack);

    let c2 = SpendCircuit {
        leaf_params: leaf_crh_params.clone(),
        two_to_one_params: two_to_one_crh_params.clone(),
        root: root.clone(),
        proof: tree_proof.clone(),
        nullifier: nullifier_hack.clone(),
        secret: secret_hack.clone(),
    };

    let proof = Groth16::<MNT4_753>::prove(&pk, c2.clone(), rng).unwrap();

    assert!(Groth16::<MNT4_753>::verify(&vk, &vec![root, nullifier_hack], &proof).unwrap());

    println!("Puzzle solved!");
}

const PUZZLE_DESCRIPTION: &str = r"
Bob was deeply inspired by the Zcash design [1] for private transactions [2] and had some pretty cool ideas on how to adapt it for his requirements. He was also inspired by the Mina design for the lightest blockchain and wanted to combine the two. In order to achieve that, Bob used the MNT7653 cycle of curves to enable efficient infinite recursion, and used elliptic curve public keys to authorize spends. He released a first version of the system to the world and Alice soon announced she was able to double spend by creating two different nullifiers for the same key... 

[1] https://zips.z.cash/protocol/protocol.pdf
";

There are four leaves, each consisting of a single MNT4BigFr element. At this point it's not clear what these leaves represent but we will clarify this in a moment.

Then, a Merkle proof (a proof that the tree contains a specific leaf) is computed for the leaf at index i = 2:

    let tree_proof = tree.generate_proof(i).unwrap();

If you're unfamiliar with how Zcash works, the state of the chain is encoded in a Merkle tree where each leaf represents a coin. Attached to this leaf are a public key and a nullifier (originally called a coin serial number in the Zerocash paper [BCG+14]) whose role is to prevent double spends: when a coin is spent, the corresponding nullifier is revealed and recorded, and the protocol ensures that any later transaction using the same nullifier (and hence trying to spend the same coin) is invalid. Note in particular that the leaves of the Merkle tree do not represent UTXOs but rather all coins that ever existed, spent or unspent. For more details about how nullifiers work, this blog post by Ariel Gabizon explains it very well.
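The bookkeeping described above can be sketched in a few lines (a minimal model with hypothetical types, not Zcash's actual data structures): a spend is rejected whenever its nullifier has already been recorded.

```rust
use std::collections::HashSet;

// Minimal model of a nullifier set: the chain records every revealed
// nullifier and rejects any spend whose nullifier was seen before.
struct NullifierSet {
    seen: HashSet<u64>,
}

impl NullifierSet {
    fn new() -> Self {
        Self { seen: HashSet::new() }
    }

    // Returns true if the spend is accepted, i.e., the nullifier was fresh.
    fn try_spend(&mut self, nullifier: u64) -> bool {
        self.seen.insert(nullifier)
    }
}

fn main() {
    let mut chain = NullifierSet::new();
    let nullifier = 0xdead_beef; // in practice, the hash of the spending secret
    assert!(chain.try_spend(nullifier)); // first spend: accepted
    assert!(!chain.try_spend(nullifier)); // same nullifier again: rejected
    println!("double spend rejected");
}
```

This is exactly why finding a second valid nullifier for the same coin, as Alice does in this puzzle, re-enables double spending.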

Here, we can see that the nullifier is computed as the hash of the secret that allows spending a coin:

    let nullifier = <LeafH as CRHScheme>::evaluate(&leaf_crh_params, vec![leaked_secret]).unwrap();

In order to spend the coin represented by leaf at index i = 2, Alice needs to provide a Groth16 proof that her transaction is valid:

    let c = SpendCircuit {
        leaf_params: leaf_crh_params.clone(),
        two_to_one_params: two_to_one_crh_params.clone(),
        root: root.clone(),
        proof: tree_proof.clone(),
        nullifier: nullifier.clone(),
        secret: leaked_secret.clone(),
    };

    let proof = Groth16::<MNT4_753>::prove(&pk, c.clone(), rng).unwrap();

    assert!(Groth16::<MNT4_753>::verify(&vk, &vec![root, nullifier], &proof).unwrap());

We will get into what SpendCircuit is shortly, but before that, let's take a look at the part where we need to work to solve the puzzle:

    /* Enter your solution here */

    let nullifier_hack = MNT4BigFr::from(0);
    let secret_hack = MNT4BigFr::from(0);

    /* End of solution */

    assert_ne!(nullifier, nullifier_hack);

    let c2 = SpendCircuit {
        leaf_params: leaf_crh_params.clone(),
        two_to_one_params: two_to_one_crh_params.clone(),
        root: root.clone(),
        proof: tree_proof.clone(),
        nullifier: nullifier_hack.clone(),
        secret: secret_hack.clone(),
    };

    let proof = Groth16::<MNT4_753>::prove(&pk, c2.clone(), rng).unwrap();

    assert!(Groth16::<MNT4_753>::verify(&vk, &vec![root, nullifier_hack], &proof).unwrap());

As we can see, we must find another nullifier nullifier_hack (different from nullifier) and another secret secret_hack allowing us to spend the same coin again (it is the same coin because the second Groth16 proof uses the same Merkle root root and the same Merkle proof tree_proof as the first one).

Next, let us unravel what the spending circuit does.

Understanding the Spending Circuit

Let us try to understand what the circuit for which Groth16 proofs are generated does. It is specified by the generate_constraints method from the ConstraintSynthesizer trait implemented on the SpendCircuit struct:

impl ConstraintSynthesizer<ConstraintF> for SpendCircuit {
    fn generate_constraints(
        self,
        cs: ConstraintSystemRef<ConstraintF>,
    ) -> Result<(), SynthesisError> {

    // ...

    }
}

This method generates R1CS constraints (over the field ConstraintF = MNT4BigFr, the scalar field of the MNT4-753 curve) to which the Groth16 proof system is then applied. There is no need to understand precisely how R1CS arithmetization works; understanding what the circuit does will be enough to solve the puzzle. For this, having a quick look at the arkworks R1CS tutorial can help. Let's go step by step through the definition of the circuit. A circuit can have public inputs (the "instance", declared with the new_input method of the AllocVar trait) and private inputs (the "witness", declared with the new_witness method of the same trait). The proof generation function takes the public and private inputs and generates a proof that together they "satisfy" the circuit; the verification function takes only the public inputs and the proof and returns 1 (valid) or 0 (invalid). A circuit can also have constants, declared with the new_constant method of the AllocVar trait.

Here, the public inputs consist of the Merkle root and the nullifier:

        // Allocate Merkle Tree Root
        let root = <LeafHG as CRHSchemeGadget<LeafH, _>>::OutputVar::new_input(
            ark_relations::ns!(cs, "new_digest"),
            || Ok(self.root),
        )?;

        // ...

        let nullifier = <LeafHG as CRHSchemeGadget<LeafH, _>>::OutputVar::new_input(
            ark_relations::ns!(cs, "nullifier"),
            || Ok(self.nullifier),
        )?;

The private inputs consist of the secret secret and the Merkle proof (the field proof of the SpendCircuit struct):

        let secret = FpVar::new_witness(ark_relations::ns!(cs, "secret"), || Ok(self.secret))?;

        // ...

        // Allocate Merkle Tree Path
        let cw: PathVar<MntMerkleTreeParams, ConstraintF, MntMerkleTreeParamsVar> =
            PathVar::new_witness(ark_relations::ns!(cs, "new_witness"), || Ok(&self.proof))?;

Then, the generate_constraints method calls a number of "gadgets" implementing the logic of the circuit. First, it checks that the secret does not exceed the modulus of the scalar field of the MNT6-753 curve:

        let secret_bits = secret.to_bits_le()?;
        Boolean::enforce_smaller_or_equal_than_le(&secret_bits, MNT6BigFr::MODULUS)?;

It also checks that the hash of the secret is equal to the nullifier passed as input:

        let nullifier_in_circuit =
            <LeafHG as CRHSchemeGadget<LeafH, _>>::evaluate(&leaf_crh_params_var, &[secret])?;
        nullifier_in_circuit.enforce_equal(&nullifier)?;

Then, it computes the public key associated with secret:

        let base = G1Var::new_constant(ark_relations::ns!(cs, "base"), G1Affine::generator())?;
        let pk = base.scalar_mul_le(secret_bits.iter())?.to_affine()?;

Note that the G1Affine type here represents the group of points of the MNT6-753 curve in short Weierstrass affine representation, meaning pk here is a point on this curve, encoded as a pair of elements of the base field of the MNT6-753 curve, and computed as $\mathsf{pk} = [s] G$, where $G$ is the generator of this group corresponding to the variable base and $s$ is the scalar corresponding to secret.

Finally, the circuit verifies that the Merkle proof passed as private input to the circuit is valid for the root passed as public input and for the leaf defined as pk.x, the $x$-coordinate of pk:

        // Allocate Leaf
        let leaf_g: Vec<_> = vec![pk.x];

        // ...

        cw.verify_membership(
            &leaf_crh_params_var,
            &two_to_one_crh_params_var,
            &root,
            &leaf_g,
        )?
        .enforce_equal(&Boolean::constant(true))?;

Something might seem strange here at first. The point pk lies on the MNT6-753 curve, hence pk.x is an element of its base field. Yet we saw previously that the leaves of the Merkle tree are defined as elements of the scalar field of the MNT4-753 curve. In fact, this is fine because MNT4-753 and MNT6-753 form a "cycle of curves", meaning the scalar field of one is the base field of the other. If $q_4$ and $r_4$ denote respectively the sizes of the base field and the scalar field of MNT4-753, and $q_6$ and $r_6$ denote the sizes of the base field and the scalar field of MNT6-753, then forming a cycle means that $q_4 = r_6$ and $r_4 = q_6$.

Cycles of curves were proposed in [BCTV14] to solve the "field mismatch" problem when composing SNARKs recursively [BCCT13]. For more background, see for example [AHG23].

This concludes our inspection of the spending circuit. In short, to spend a coin, Alice must compute a Groth16 proof of satisfiability of the spending circuit using the secret key corresponding to the ($x$-coordinate of the) public key of the leaf representing the coin, a valid Merkle proof for this public key, and the corresponding nullifier, i.e., the hash of the secret.

We are now ready to solve the puzzle.

Solving the Puzzle

Recall that our task is to find another secret/nullifier pair that will satisfy the spending circuit. We are not allowed to modify the Merkle root nor the Merkle proof, meaning this secret/nullifier pair must correspond to the exact same leaf of the Merkle tree.

But wait, the leaf does not really encode the public key pk, only its $x$-coordinate! This is the key insight to solve the puzzle. Indeed, for any point $P = (x, y)$ on the curve (different from the point at infinity), there is another point on the curve with the same $x$-coordinate, namely $-P = (x, -y)$. This point simply corresponds to secret $r_6 - s$, where $r_6$ is the size of the scalar field of the MNT6-753 curve, since $(r_6 - s) \cdot G = -s \cdot G = -P$.
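The $x$-coordinate collision above can be checked concretely on a toy short Weierstrass curve. The sketch below uses tiny illustrative parameters ($y^2 = x^3 + 2x + 3$ over $\mathbb{F}_{97}$, not the real MNT6-753 curve) and brute-forces a generator and its order $n$, then confirms that $s \cdot G$ and $(n - s) \cdot G$ share an $x$-coordinate and have opposite $y$-coordinates:

```rust
// Toy short Weierstrass curve y^2 = x^3 + A*x + B over F_P.
// Illustrative parameters only; the real puzzle uses MNT6-753.
const P: i64 = 97;
const A: i64 = 2;
const B: i64 = 3;

type Point = Option<(i64, i64)>; // None = point at infinity

fn modp(x: i64) -> i64 {
    ((x % P) + P) % P
}

// Fermat inverse: x^(P-2) mod P
fn inv(x: i64) -> i64 {
    let (mut base, mut e, mut acc) = (modp(x), P - 2, 1i64);
    while e > 0 {
        if e & 1 == 1 {
            acc = modp(acc * base);
        }
        base = modp(base * base);
        e >>= 1;
    }
    acc
}

// Textbook affine addition formulas.
fn add(p1: Point, p2: Point) -> Point {
    match (p1, p2) {
        (None, q) => q,
        (q, None) => q,
        (Some((x1, y1)), Some((x2, y2))) => {
            if x1 == x2 && modp(y1 + y2) == 0 {
                return None; // P + (-P) = infinity
            }
            let lambda = if (x1, y1) == (x2, y2) {
                modp(modp(3 * x1 * x1 + A) * inv(2 * y1))
            } else {
                modp(modp(y2 - y1) * inv(x2 - x1))
            };
            let x3 = modp(lambda * lambda - x1 - x2);
            Some((x3, modp(lambda * (x1 - x3) - y1)))
        }
    }
}

// Double-and-add scalar multiplication.
fn mul(mut k: i64, g: Point) -> Point {
    let (mut acc, mut base): (Point, Point) = (None, g);
    while k > 0 {
        if k & 1 == 1 {
            acc = add(acc, base);
        }
        base = add(base, base);
        k >>= 1;
    }
    acc
}

// Check that s*G and (n-s)*G share an x-coordinate (opposite y).
fn x_coords_match() -> bool {
    // brute-force some point on the curve to serve as generator
    let gpt = (0..P)
        .flat_map(|x| (0..P).map(move |y| (x, y)))
        .find(|&(x, y)| modp(y * y) == modp(x * x * x + A * x + B))
        .unwrap();
    let g: Point = Some(gpt);
    // order of g: smallest n > 0 with n*G = infinity
    let n = (1..).find(|&k| mul(k, g).is_none()).unwrap();
    let s = if n > 5 { 5 } else { 1 };
    let (xs, ys) = mul(s, g).unwrap();
    let (xn, yn) = mul(n - s, g).unwrap();
    xs == xn && yn == modp(-ys)
}

fn main() {
    assert!(x_coords_match());
    println!("s*G and (n-s)*G share the same x-coordinate");
}
```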

Hence, all we have to do to solve the puzzle is to take the opposite of leaked_secret mod $r_6$ and compute the corresponding nullifier! There's a catch though: leaked_secret is defined as an element in the scalar field of MNT4-753, hence simply defining secret_hack = - leaked_secret won't work, as this would compute $r_4 - s$, where $r_4$ is the size of the scalar field of MNT4-753.
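The pitfall is easy to see with small numbers. The sketch below uses toy moduli standing in for the two scalar field sizes (the real $r_4$ and $r_6$ are 753-bit primes; these values are illustrative only) and shows that negating modulo the wrong field gives a different residue:

```rust
// Toy stand-ins for the scalar field sizes of MNT4-753 and MNT6-753.
const R4: u64 = 23; // plays the role of r4
const R6: u64 = 19; // plays the role of r6

/// The additive inverse of `s` modulo `r`.
fn neg_mod(s: u64, r: u64) -> u64 {
    (r - s % r) % r
}

fn main() {
    let s = 7u64;
    // What the puzzle needs: r6 - s.
    assert_eq!(neg_mod(s, R6), 12);
    // What `-leaked_secret` would compute: r4 - s.
    assert_eq!(neg_mod(s, R4), 16);
    // Negating in the wrong field gives a different value.
    assert_ne!(neg_mod(s, R6), neg_mod(s, R4));
    println!("r6 - s = {}, r4 - s = {}", neg_mod(s, R6), neg_mod(s, R4));
}
```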

There are probably several options here, but a simple one is to cast leaked_secret as a big integer first. For this, we need to add the num-bigint crate to the project:

$ cargo add num-bigint

Here is the code solving the puzzle:

use ark_ec::AffineRepr;
use ark_ff::PrimeField;
use ark_mnt4_753::{Fr as MNT4BigFr, MNT4_753};
use ark_mnt6_753::G1Affine;
use ark_mnt6_753::{constraints::G1Var, Fr as MNT6BigFr};

use ark_crypto_primitives::merkle_tree::{Config, MerkleTree, Path};
use ark_crypto_primitives::{crh::TwoToOneCRHScheme, snark::SNARK};
use ark_groth16::Groth16;
use ark_r1cs_std::fields::fp::FpVar;
use ark_r1cs_std::prelude::*;
use ark_relations::r1cs::{ConstraintSynthesizer, ConstraintSystemRef, SynthesisError};
use ark_serialize::{CanonicalDeserialize, Read};

use prompt::{puzzle, welcome};

use std::fs::File;
use std::io::Cursor;

pub mod poseidon_parameters;

type ConstraintF = MNT4BigFr;

use ark_crypto_primitives::{
    crh::{poseidon, *},
    merkle_tree::constraints::*,
    merkle_tree::*,
};
use ark_std::rand::SeedableRng;

type LeafH = poseidon::CRH<ConstraintF>;
type LeafHG = poseidon::constraints::CRHGadget<ConstraintF>;

type CompressH = poseidon::TwoToOneCRH<ConstraintF>;
type CompressHG = poseidon::constraints::TwoToOneCRHGadget<ConstraintF>;

type LeafVar = [FpVar<ConstraintF>];
struct MntMerkleTreeParamsVar;
impl ConfigGadget<MntMerkleTreeParams, ConstraintF> for MntMerkleTreeParamsVar {
    type Leaf = LeafVar;
    type LeafDigest = <LeafHG as CRHSchemeGadget<LeafH, ConstraintF>>::OutputVar;
    type LeafInnerConverter = IdentityDigestConverter<FpVar<ConstraintF>>;
    type InnerDigest = <CompressHG as TwoToOneCRHSchemeGadget<CompressH, ConstraintF>>::OutputVar;
    type LeafHash = LeafHG;
    type TwoToOneHash = CompressHG;
}

type MntMerkleTree = MerkleTree<MntMerkleTreeParams>;

struct MntMerkleTreeParams;

impl Config for MntMerkleTreeParams {
    type Leaf = [ConstraintF];

    type LeafDigest = <LeafH as CRHScheme>::Output;
    type LeafInnerDigestConverter = IdentityDigestConverter<ConstraintF>;
    type InnerDigest = <CompressH as TwoToOneCRHScheme>::Output;

    type LeafHash = LeafH;
    type TwoToOneHash = CompressH;
}

#[derive(Clone)]
struct SpendCircuit {
    pub leaf_params: <LeafH as CRHScheme>::Parameters,
    pub two_to_one_params: <LeafH as CRHScheme>::Parameters,
    pub root: <CompressH as TwoToOneCRHScheme>::Output,
    pub proof: Path<MntMerkleTreeParams>,
    pub secret: ConstraintF,
    pub nullifier: ConstraintF,
}

impl ConstraintSynthesizer<ConstraintF> for SpendCircuit {
    fn generate_constraints(
        self,
        cs: ConstraintSystemRef<ConstraintF>,
    ) -> Result<(), SynthesisError> {
        // Allocate Merkle Tree Root
        let root = <LeafHG as CRHSchemeGadget<LeafH, _>>::OutputVar::new_input(
            ark_relations::ns!(cs, "new_digest"),
            || Ok(self.root),
        )?;

        // Allocate Parameters for CRH
        let leaf_crh_params_var =
            <LeafHG as CRHSchemeGadget<LeafH, _>>::ParametersVar::new_constant(
                ark_relations::ns!(cs, "leaf_crh_parameter"),
                &self.leaf_params,
            )?;
        let two_to_one_crh_params_var =
            <CompressHG as TwoToOneCRHSchemeGadget<CompressH, _>>::ParametersVar::new_constant(
                ark_relations::ns!(cs, "two_to_one_crh_parameter"),
                &self.two_to_one_params,
            )?;

        let secret = FpVar::new_witness(ark_relations::ns!(cs, "secret"), || Ok(self.secret))?;
        let secret_bits = secret.to_bits_le()?;
        Boolean::enforce_smaller_or_equal_than_le(&secret_bits, MNT6BigFr::MODULUS)?;

        let nullifier = <LeafHG as CRHSchemeGadget<LeafH, _>>::OutputVar::new_input(
            ark_relations::ns!(cs, "nullifier"),
            || Ok(self.nullifier),
        )?;

        let nullifier_in_circuit =
            <LeafHG as CRHSchemeGadget<LeafH, _>>::evaluate(&leaf_crh_params_var, &[secret])?;
        nullifier_in_circuit.enforce_equal(&nullifier)?;

        let base = G1Var::new_constant(ark_relations::ns!(cs, "base"), G1Affine::generator())?;
        let pk = base.scalar_mul_le(secret_bits.iter())?.to_affine()?;

        // Allocate Leaf
        let leaf_g: Vec<_> = vec![pk.x];

        // Allocate Merkle Tree Path
        let cw: PathVar<MntMerkleTreeParams, ConstraintF, MntMerkleTreeParamsVar> =
            PathVar::new_witness(ark_relations::ns!(cs, "new_witness"), || Ok(&self.proof))?;

        cw.verify_membership(
            &leaf_crh_params_var,
            &two_to_one_crh_params_var,
            &root,
            &leaf_g,
        )?
        .enforce_equal(&Boolean::constant(true))?;

        Ok(())
    }
}

fn from_file<T: CanonicalDeserialize>(path: &str) -> T {
    let mut file = File::open(path).unwrap();
    let mut buffer = Vec::new();
    file.read_to_end(&mut buffer).unwrap();
    T::deserialize_uncompressed_unchecked(Cursor::new(&buffer)).unwrap()
}

fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);

    let rng = &mut ark_std::rand::rngs::StdRng::seed_from_u64(0u64);

    let leaves: Vec<Vec<MNT4BigFr>> = from_file("./leaves.bin");
    let leaked_secret: MNT4BigFr = from_file("./leaked_secret.bin");
    let (pk, vk): (
        <Groth16<MNT4_753> as SNARK<MNT4BigFr>>::ProvingKey,
        <Groth16<MNT4_753> as SNARK<MNT4BigFr>>::VerifyingKey,
    ) = from_file("./proof_keys.bin");

    let leaf_crh_params = poseidon_parameters::poseidon_parameters();
    let i = 2;
    let two_to_one_crh_params = leaf_crh_params.clone();

    let nullifier = <LeafH as CRHScheme>::evaluate(&leaf_crh_params, vec![leaked_secret]).unwrap();

    let tree = MntMerkleTree::new(
        &leaf_crh_params,
        &two_to_one_crh_params,
        leaves.iter().map(|x| x.as_slice()),
    )
    .unwrap();
    let root = tree.root();
    let leaf = &leaves[i];

    let tree_proof = tree.generate_proof(i).unwrap();
    assert!(tree_proof
        .verify(
            &leaf_crh_params,
            &two_to_one_crh_params,
            &root,
            leaf.as_slice()
        )
        .unwrap());

    let c = SpendCircuit {
        leaf_params: leaf_crh_params.clone(),
        two_to_one_params: two_to_one_crh_params.clone(),
        root: root.clone(),
        proof: tree_proof.clone(),
        nullifier: nullifier.clone(),
        secret: leaked_secret.clone(),
    };

    let proof = Groth16::<MNT4_753>::prove(&pk, c.clone(), rng).unwrap();

    // --snip--

    for (i, leaf) in leaves.iter().enumerate() {
        for (j, p) in leaf.iter().enumerate() {
            println!("leaves[{}][{}]: {}", i, j, p);
        }
    }
    println!("");

    assert!(Groth16::<MNT4_753>::verify(&vk, &vec![root, nullifier], &proof).unwrap());

    // --snip--

    /* Enter your solution here */

    // cast leaked_secret as big integer...
    let s: num_bigint::BigUint = leaked_secret.into();
    // ... and then as an element of MNT6BigFr
    let s_as_mnt6bigfr = MNT6BigFr::from_le_bytes_mod_order(&s.to_bytes_le());
    // take the opposite and cast it again as a big integer...
    let secret_hack_as_bigint: num_bigint::BigUint = (-s_as_mnt6bigfr).into();
    // and finally cast it back to an element of MNT4BigFr
    let secret_hack = MNT4BigFr::from_le_bytes_mod_order(&secret_hack_as_bigint.to_bytes_le());
    // compute the corresponding nullifier
    let nullifier_hack =
        <LeafH as CRHScheme>::evaluate(&leaf_crh_params, vec![secret_hack]).unwrap();
    println!("nullifier_hack: {}", nullifier_hack);
    println!("secret_hack: {}", secret_hack_as_bigint);

    /* End of solution */

    assert_ne!(nullifier, nullifier_hack);

    let c2 = SpendCircuit {
        leaf_params: leaf_crh_params.clone(),
        two_to_one_params: two_to_one_crh_params.clone(),
        root: root.clone(),
        proof: tree_proof.clone(),
        nullifier: nullifier_hack.clone(),
        secret: secret_hack.clone(),
    };

    let proof = Groth16::<MNT4_753>::prove(&pk, c2.clone(), rng).unwrap();

    assert!(Groth16::<MNT4_753>::verify(&vk, &vec![root, nullifier_hack], &proof).unwrap());

    println!("Puzzle solved!");
}

const PUZZLE_DESCRIPTION: &str = r"
Bob was deeply inspired by the Zcash design [1] for private transactions [2] and had some pretty cool ideas on how to adapt it for his requirements. He was also inspired by the Mina design for the lightest blockchain and wanted to combine the two. In order to achieve that, Bob used the MNT7653 cycle of curves to enable efficient infinite recursion, and used elliptic curve public keys to authorize spends. He released a first version of the system to the world and Alice soon announced she was able to double spend by creating two different nullifiers for the same key... 

[1] https://zips.z.cash/protocol/protocol.pdf
";

A Note about Hint 1

The first hint revealed to help solve the puzzle points to Lemma 5.4.7 of the Zcash specifications. It reads:

Let $P = (u, v) \in \mathbb{J}^{(r)}$. Then $(u, -v) \notin \mathbb{J}^{(r)}$.

Here, $\mathbb{J}^{(r)}$ is (a subgroup of) the Jubjub curve developed by the Zcash team. Its base field is actually the scalar field of BLS12-381, allowing one to efficiently prove algebraic statements about this curve using BLS12-381-based SNARKs.

The reason why the "attack" used to solve the puzzle would not apply with this curve is that it is a twisted Edwards curve rather than a curve in short Weierstrass form. In particular, Theorem 5.4.8 in the same document states that the function mapping points in $\mathbb{J}^{(r)}$ to their $u$-coordinate is injective, meaning two distinct points have distinct $u$-coordinates. In this case, it is safe to encode a point by recording only its $u$-coordinate in a leaf.

ZK Hack Puzzle 13: Supervillain

Bob has been designing a new optimized signature scheme for his L1 based on BLS
signatures. Specifically, he wanted to be able to use the most efficient form
of BLS signature aggregation, where you just add the signatures together rather
than having to delinearize them. In order to do that, he designed a
proof-of-possession scheme based on the B-KEA assumption he found in the the
Sapling security analysis paper by Mary Maller [1]. Based the reasoning in the
Power of Proofs-of-Possession paper [2], he concluded that his scheme would be
secure. After he deployed the protocol, he found it was attacked and there was
a malicious block entered the system, fooling all the light nodes...

BLS aggregate signatures, proofs of possession, ... This should be interesting and quite relevant to Ethereum, which has relied on the beacon chain's BLS signature aggregation since the Merge. Let's take a look at the code.

Code Analysis

The package directory is organized as follows:

puzzle-supervillain
├── Cargo.toml
├── public_keys.bin
└── src
    └── main.rs

The code is simpler than in most other puzzles. Let's take a look at main.rs. It first brings a number of items from arkworks crates into scope, in particular related to the BLS12-381 pairing-friendly curve. Let us introduce straightaway some (standard) notation that will help us explain what's going on in this puzzle. In all the following, we will let $\mathbb{G}_1$ and $\mathbb{G}_2$ denote the two groups related to this curve, $r$ denote the order of these groups, and $\mathbb{F}_r$ denote the corresponding scalar field. Types G1Affine and G2Affine respectively correspond to points in $\mathbb{G}_1$ and $\mathbb{G}_2$ represented in short Weierstrass form. We will also let $g_1$ denote the generator of $\mathbb{G}_1$ returned by G1Affine::generator() and $e$ the pairing map from $\mathbb{G}_1 \times \mathbb{G}_2$ to $\mathbb{G}_T$, which for any $P \in \mathbb{G}_1$, $Q \in \mathbb{G}_2$, and $a, b \in \mathbb{F}_r$ satisfies $e(aP, bQ) = ab \cdot e(P, Q)$.

Function main is also quite simple. First, it creates a vector public_keys of public key/proof pairs by deserializing the data in public_keys.bin and checks the proofs (we will come back to what these proofs are and what function pok_verify does in a moment, but the idea is that $\pi_i$ should prove possession of the secret key corresponding to public key $pk_i$):

    let public_keys: Vec<(G1Affine, G2Affine)> = from_file("public_keys.bin");

    public_keys
        .iter()
        .enumerate()
        .for_each(|(i, (pk, proof))| pok_verify(*pk, i, *proof));

There are 9 public key/proof pairs in total. We can print these public keys and proofs if we want, although there's not much remarkable about them:

    for (i, (pk, proof)) in public_keys.iter().enumerate() {
        println!("public_keys[{}].pk: {}", i, pk);
        println!("public_keys[{}].proof: {}\n", i, proof);
    }

We get:

public_keys[0].pk: (3951285727116295734026345521365512737910419062953537242549018568832618561552329351430853683858605302756892560527243, 2015562491477402081445210194864883205939261701444702459066048593747231321865210770475706036490256666079149530034340)
public_keys[0].proof: (QuadExtField(3882041700531663080715209917545481876729765846180025125888908765171220948117125212571143276457991056473137284787111 + 1050510852775817852847416507597900558865419625189113347525854846152071282929138131519646186856851998395894795581147 * u), QuadExtField(2276155031300751614807654043081790005359418219874201281987179783127300973516686184393036674096389076445085471809656 + 1499576108176939010561117214629143885375859964478472578277936997691719552590218342980721074321136489528861749450818 * u))

[...]

public_keys[8].pk: (1590421703439460875501217084904151928024777767932960691388269493213756601481659194276214126863101251608754666663069, 2514873486426372291261215275870411521130979618244175339961890502447807774325646533262394833397969654866179194151855)
public_keys[8].proof: (QuadExtField(3425299122867009301502774777484371853886695233020764827572267590585668332652640989134711487565285919169053024365378 + 3944021846570525607818281571743626433255634014300232163668270031644045523012218166565184866119660182557753392994734 * u), QuadExtField(109895792935386285998226095339950304065932040948382827892877878577064124319340998704951294008715504779991886183613 + 2408179566338427416508441175406184228438312159836037637845488388439414219265617277506646523139939207977761770383651 * u))

Then, function main defines the index of an extra key and a message for which we will have to forge a signature. It expects us to define three things: a new public key, a new proof, and an aggregate signature:

    let new_key_index = public_keys.len();
    let message = b"YOUR GITHUB USERNAME";

    /* Enter solution here */

    let new_key = G1Affine::zero();
    let new_proof = G2Affine::zero();
    let aggregate_signature = G2Affine::zero();

    /* End of solution */

The solution should satisfy two conditions. First, the new proof should be valid for the new public key, and second, the aggregate signature should be a valid BLS signature with respect to some aggregate key:

    pok_verify(new_key, new_key_index, new_proof);
    let aggregate_key = public_keys
        .iter()
        .fold(G1Projective::from(new_key), |acc, (pk, _)| acc + pk)
        .into_affine();
    bls_verify(aggregate_key, aggregate_signature, message)

This aggregate key is defined by adding all public keys from the puzzle data to the new key we must specify in our solution. In other words, letting $pk_9$ denote the new public key, the aggregate key (let us denote it $apk$) is simply the sum of all public keys: $apk = pk_0 + pk_1 + \cdots + pk_9$.

That's a good start. In order to progress, let us recall how BLS signatures work.

BLS Signatures

There are many great online resources about BLS signatures such as here or the IETF draft. See also the corresponding chapter in this book. In order to make this write-up more self-contained, let us quickly recall how they work.

The signature and verification functions are defined by these few lines of code:

fn bls_sign(sk: Fr, msg: &[u8]) -> G2Affine {
    hasher().hash(msg).unwrap().mul(sk).into_affine()
}

fn bls_verify(pk: G1Affine, sig: G2Affine, msg: &[u8]) {
    assert!(Bls12_381::multi_pairing(
        &[pk, G1Affine::generator()],
        &[hasher().hash(msg).unwrap().neg(), sig]
    )
    .is_zero());
}

Here, public keys are elements of group $\mathbb{G}_1$ and signatures are elements of $\mathbb{G}_2$ (this is the choice made in Ethereum, but in general this can be swapped depending on which one should be shorter for a specific use case; elements of $\mathbb{G}_2$ are roughly twice as long as elements of $\mathbb{G}_1$, and arithmetic is slower in $\mathbb{G}_2$ than in $\mathbb{G}_1$).

Given a secret key $sk \in \mathbb{F}_r$, the corresponding public key is $pk = sk \cdot g_1$ (recall that $g_1$ is a commonly agreed generator of $\mathbb{G}_1$).

In order to sign a message $m$, one first hashes it "into" $\mathbb{G}_2$ with some hash function $H$ and multiplies the resulting point by $sk$, meaning the signature is $\sigma = sk \cdot H(m)$.

The verification function, given a public key $pk$, a message $m$, and a signature $\sigma$, asserts whether $e(pk, -H(m)) + e(g_1, \sigma) = 0$, which is equivalent to (but more efficient to compute than) $e(pk, H(m)) = e(g_1, \sigma)$. This works since for a signature computed correctly one has $\sigma = sk \cdot H(m)$ and hence $e(g_1, \sigma) = sk \cdot e(g_1, H(m)) = e(pk, H(m))$. (Note that $\mathbb{G}_T$ is denoted additively here, which would make Vitalik happy; multiplicative notation is more common since $\mathbb{G}_T$ is a multiplicative subgroup of $\mathbb{F}_{q^{12}}^*$, where $q$ is the size of the base field of BLS12-381.)

Hashing into elliptic curves is delicate: this should be done in a way that does not leak any algebraic relation (such as relative discrete logarithms) between resulting points (more formally, $H$ should behave as a random oracle returning a random point for each input). RFC 9380 describes various methods for doing this.

Here, the hash function used for hashing into $\mathbb{G}_2$ is the so-called Wahby-Boneh map [WB19]:

fn hasher() -> MapToCurveBasedHasher<G2Projective, DefaultFieldHasher<Sha256, 128>, WBMap<Config>> {
    let wb_to_curve_hasher =
        MapToCurveBasedHasher::<G2Projective, DefaultFieldHasher<Sha256, 128>, WBMap<Config>>::new(
            &[1, 3, 3, 7],
        )
        .unwrap();
    wb_to_curve_hasher
}

It's pretty complicated, and fortunately there's no need to understand exactly how it works to solve the puzzle.

Aggregating BLS Signatures

BLS signatures have the nice property that they can be aggregated by simply adding them. Namely, if we have $n$ signatures $\sigma_1, \ldots, \sigma_n$ corresponding to public key/message pairs $(pk_1, m_1), \ldots, (pk_n, m_n)$, we can simply take the sum of all signatures $\sigma = \sigma_1 + \cdots + \sigma_n$. Then, to check that the $n$ messages have been signed, one verifies that $e(g_1, \sigma) = \sum_{i=1}^{n} e(pk_i, H(m_i))$. This cuts the cost of verification roughly by half (one only has to compute $n+1$ pairings instead of $2n$ when checking signatures one by one; the cost of the point additions to compute the aggregate signature is negligible compared to the cost of a pairing).

This can be shown to be secure assuming all messages are distinct, as otherwise a so-called rogue-key attack is possible. To see why, let us take the simple case of two signers with respective public keys $pk_0 = sk_0 \cdot g_1$ and $pk_1 = sk_1 \cdot g_1$ who want to sign the same message $m$. (A note about the wording: when all signers want to sign a common message, this is more often called a multi-signature scheme rather than an aggregate signature scheme.) To be valid, the aggregate signature $\sigma$ must satisfy $e(g_1, \sigma) = e(pk_0, H(m)) + e(pk_1, H(m))$. Because the messages are the same, this can be written $e(g_1, \sigma) = e(pk_0 + pk_1, H(m))$. Then, assuming signer 0 announced its public key first, signer 1 could just choose its public key as $pk_1 = \alpha \cdot g_1 - pk_0$ for some known secret $\alpha$. Then, signer 1 can compute an aggregate signature on its own for any message (even messages signer 0 would refuse to sign) simply by computing $\sigma = \alpha \cdot H(m)$: this is a valid signature for $m$ under the "aggregate" key $pk_0 + pk_1 = \alpha \cdot g_1$.
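The key-cancellation step of the rogue-key attack is pure group arithmetic and can be sketched in a toy discrete-log group, here the subgroup of $(\mathbb{Z}/23\mathbb{Z})^*$ generated by $g = 2$ written multiplicatively (toy parameters, not BLS12-381; no pairing is involved, only the collapse of the aggregate key):

```rust
// Rogue-key cancellation in a toy multiplicative group mod P.
// g = 2 generates a subgroup of order 11 in (Z/23Z)*.
const P: u64 = 23;
const G: u64 = 2;

// Square-and-multiply exponentiation mod P.
fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1;
    while e > 0 {
        if e & 1 == 1 {
            acc = acc * b % P;
        }
        b = b * b % P;
        e >>= 1;
    }
    acc
}

// Fermat inverse mod the prime P.
fn inv(x: u64) -> u64 {
    pow_mod(x, P - 2)
}

fn main() {
    let sk0 = 5;               // honest signer's secret key
    let pk0 = pow_mod(G, sk0); // honest public key g^sk0
    let alpha = 3;             // attacker's known "rogue" secret
    // Attacker picks pk1 = g^alpha / pk0 (additively: alpha*g - pk0).
    let pk1 = pow_mod(G, alpha) * inv(pk0) % P;
    // The aggregate key collapses to g^alpha, whose secret the attacker knows.
    assert_eq!(pk0 * pk1 % P, pow_mod(G, alpha));
    println!("aggregate key = g^alpha = {}", pk0 * pk1 % P);
}
```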

There are several solutions to thwart this attack:

  • one can use "augmented messages", meaning signer $i$ signs $(pk_i, m_i)$ instead of just $m_i$; this was suggested in the original paper where aggregate BLS signatures were proposed [BGLS03] and further formalized in [BNN07];
  • one can use "delinearization", meaning each public key $pk_i$ is multiplied by some random-looking scalar $h(pk_i)$ for some hash function $h$ with values in $\mathbb{F}_r$; this was first suggested to solve the corresponding problem for Schnorr multisignatures and later studied for BLS in [BDN18];
  • finally (and this is the solution the puzzle is about) one can use "proofs of possession", as suggested in [RY07] (reference [2] in the puzzle description); this means that each signer must prove that it has access to the secret key corresponding to its public key; this thwarts rogue-key attacks since signer 1 does not know the secret key corresponding to $pk_1 = \alpha \cdot g_1 - pk_0$ and hence cannot provide a proof of possession.

Proofs of Possession

What is a proof of possession (PoP) exactly? There is actually no clear security definition. [RY07] defines it as follows:

A POP attests that a party has access to the secret key associated with his/her public key, which is typically accomplished using the functionality of the key pair’s intended scheme. For signature schemes, the simplest POP has a party sign its certificate request message and send both the message and signature to the CA.

Hence, this is somewhat reminiscent of a proof of knowledge, except there is no formal guarantee that there exists an extractor which is capable of extracting the secret key when granted arbitrary access to the party implementing the PoP. In particular, this makes PoPs more cumbersome to use in security proofs. In a protocol based on a proof of knowledge, the security proof is typically modular, meaning it only relies on the assumption that the PoK satisfies extractability. The protocol is then guaranteed to be secure when used with any PoK meeting the definition (which can be proved independently). On the contrary, since PoPs have no formal security definition, one must provide a new security proof for each PoP scheme one may want to use the protocol with.

It has been proved in [RY07] that BLS multi-signatures are secure when used with the PoP which consists in signing one's own public key with the corresponding secret key. Namely, if my key pair is $(sk, pk = sk \cdot g_1)$, then the proof of possession is the point $\pi = sk \cdot H'(pk)$ on $\mathbb{G}_2$. (There's a slight subtlety here: the hash function $H'$ used to compute PoPs should be different from the hash function $H$ used to actually sign messages; prepending two different constants to the argument of the hash function does the trick.) In fact, this PoP can even be proved to be a proof of knowledge under a very strong assumption called B-KEA: this is shown in the paper by Maller mentioned in the puzzle description (see Lemma 1).

How does the proof used in the puzzle work exactly? It is defined as follows:

fn derive_point_for_pok(i: usize) -> G2Affine {
    let rng = &mut ark_std::rand::rngs::StdRng::seed_from_u64(20399u64);
    G2Affine::rand(rng).mul(Fr::from(i as u64 + 1)).into()
}

#[allow(dead_code)]
fn pok_prove(sk: Fr, i: usize) -> G2Affine {
    derive_point_for_pok(i).mul(sk).into()
}

fn pok_verify(pk: G1Affine, i: usize, proof: G2Affine) {
    assert!(Bls12_381::multi_pairing(
        &[pk, G1Affine::generator()],
        &[derive_point_for_pok(i).neg(), proof]
    )
    .is_zero());
}

First, a point in $\mathbb{G}_2$ is computed as a function of the index $i$ of the public key in the vector of public keys. This point is equal to $(i+1) \cdot R$, where $R$ is a random point in $\mathbb{G}_2$ returned by the rand function seeded with the fixed integer 20399. Importantly, the same point $R$ is used in every proof. The proof of possession of the secret key $sk_i$ for public key $pk_i$ is then the point $\pi_i$ defined as $\pi_i = sk_i \cdot (i+1) \cdot R$. Hence, this kind of looks like a BLS signature, except that the point which is multiplied by the secret key is $(i+1) \cdot R$ rather than $H'(pk_i)$. Can we exploit this?

Solving the Puzzle

We are now ready to gather all the pieces and solve the puzzle. The straightforward idea is to mount a rogue-key attack by choosing some "rogue" secret key $\alpha$ and defining our new public key as $\alpha \cdot g_1$ minus the sum of all other public keys: $pk_9 = \alpha \cdot g_1 - \sum_{i=0}^{8} pk_i$. This way, the aggregate key is simply $apk = \alpha \cdot g_1$. This means that we know the secret key corresponding to $apk$, and hence we can forge a valid signature (with respect to $apk$) for any message $m$ we want by just computing $\sigma = \alpha \cdot H(m)$.

Are we done, then? Well, not exactly, as we now have to come up with a valid proof of possession of the secret key for $pk_9$. But we don't know this secret key! It seems like we have just moved the problem elsewhere.

Let us write formally what the proof for $pk_9$ should be. For $i \in \{0, \ldots, 8\}$, let $sk_i$ be the secret key corresponding to public key $pk_i$. Since $pk_9$ can be written $pk_9 = (\alpha - \sum_{i=0}^{8} sk_i) \cdot g_1$, the secret key corresponding to $pk_9$ is $sk_9 = \alpha - \sum_{i=0}^{8} sk_i$. Recall that the proof for the $i$-th public key is $\pi_i = sk_i \cdot (i+1) \cdot R$. Hence, the valid proof that we must compute is $\pi_9 = sk_9 \cdot 10 \cdot R = \alpha \cdot 10 \cdot R - \sum_{i=0}^{8} sk_i \cdot 10 \cdot R$.

We can compute the first term $\alpha \cdot 10 \cdot R$ since we know $\alpha$ and $R$. On the other hand, we don't know the secret keys $sk_0, \ldots, sk_8$. But we have access to the proofs $\pi_i = sk_i \cdot (i+1) \cdot R$, and this allows us to compute $sk_i \cdot 10 \cdot R = \frac{10}{i+1} \cdot \pi_i$ (where $\frac{10}{i+1}$ is computed in $\mathbb{F}_r$). Hence, we obtain that the proof we're looking for can be computed from the puzzle data as $\pi_9 = \alpha \cdot 10 \cdot R - \sum_{i=0}^{8} \frac{10}{i+1} \cdot \pi_i$.
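This scalar identity can be sanity-checked with plain modular arithmetic: working in a toy field $\mathbb{F}_{101}$ (the real scheme works with group elements over the scalar field of BLS12-381; here $R$ is replaced by its discrete log $\rho$, and all values are illustrative), the forged combination of published proofs equals $(\alpha - \sum_i sk_i) \cdot 10 \cdot \rho$:

```rust
// Scalar-arithmetic check of the forged proof formula in F_r, r = 101 (toy).
const R_MOD: u64 = 101;

// Square-and-multiply exponentiation mod R_MOD.
fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1;
    while e > 0 {
        if e & 1 == 1 {
            acc = acc * b % R_MOD;
        }
        b = b * b % R_MOD;
        e >>= 1;
    }
    acc
}

// Fermat inverse mod the prime R_MOD.
fn inv(x: u64) -> u64 {
    pow_mod(x, R_MOD - 2)
}

fn forged_proof_matches() -> bool {
    let sks: [u64; 9] = [3, 7, 11, 19, 23, 31, 41, 57, 60]; // toy secret keys
    let rho = 17;   // stands in for the dlog of the common random point R
    let alpha = 29; // our rogue secret
    let n = sks.len() as u64; // new_key_index = 9, so the PoK point is 10*R
    // published proofs pi_i = sk_i * (i+1) * rho
    let proofs: Vec<u64> = sks
        .iter()
        .enumerate()
        .map(|(i, sk)| sk * (i as u64 + 1) % R_MOD * rho % R_MOD)
        .collect();
    // forged proof: alpha*10*rho - sum_i (10 / (i+1)) * pi_i
    let mut forged = alpha * (n + 1) % R_MOD * rho % R_MOD;
    for (i, p) in proofs.iter().enumerate() {
        let coeff = (n + 1) * inv(i as u64 + 1) % R_MOD;
        forged = (forged + R_MOD * R_MOD - coeff * p % R_MOD) % R_MOD;
    }
    // expected proof of the unknown aggregate secret: (alpha - sum sk_i)*10*rho
    let sum: u64 = sks.iter().sum::<u64>() % R_MOD;
    let expected = (alpha + R_MOD - sum) % R_MOD * (n + 1) % R_MOD * rho % R_MOD;
    forged == expected
}

fn main() {
    assert!(forged_proof_matches());
    println!("forged proof matches (alpha - sum sk_i) * 10 * R");
}
```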

Here is the code computing the new key, the new proof, and the aggregate signature (the rogue secret $\alpha$ is generated pseudorandomly):

use ark_bls12_381::{g2::Config, Bls12_381, Fr, G1Affine, G1Projective, G2Affine, G2Projective};
use ark_ec::{
    hashing::{curve_maps::wb::WBMap, map_to_curve_hasher::MapToCurveBasedHasher, HashToCurve},
    pairing::Pairing,
    AffineRepr, CurveGroup,
};
use ark_ff::field_hashers::DefaultFieldHasher;
use ark_ff::Field;

use ark_serialize::{CanonicalDeserialize, Read};

use prompt::{puzzle, welcome};

use sha2::Sha256;
use std::fs::File;
use std::io::Cursor;
use std::ops::{Mul, Neg};

use ark_std::{rand::SeedableRng, UniformRand, Zero};

fn derive_point_for_pok(i: usize) -> G2Affine {
    let rng = &mut ark_std::rand::rngs::StdRng::seed_from_u64(20399u64);
    G2Affine::rand(rng).mul(Fr::from(i as u64 + 1)).into()
}

#[allow(dead_code)]
fn pok_prove(sk: Fr, i: usize) -> G2Affine {
    derive_point_for_pok(i).mul(sk).into()
}

fn pok_verify(pk: G1Affine, i: usize, proof: G2Affine) {
    assert!(Bls12_381::multi_pairing(
        &[pk, G1Affine::generator()],
        &[derive_point_for_pok(i).neg(), proof]
    )
    .is_zero());
}

fn hasher() -> MapToCurveBasedHasher<G2Projective, DefaultFieldHasher<Sha256, 128>, WBMap<Config>> {
    let wb_to_curve_hasher =
        MapToCurveBasedHasher::<G2Projective, DefaultFieldHasher<Sha256, 128>, WBMap<Config>>::new(
            &[1, 3, 3, 7],
        )
        .unwrap();
    wb_to_curve_hasher
}

#[allow(dead_code)]
fn bls_sign(sk: Fr, msg: &[u8]) -> G2Affine {
    hasher().hash(msg).unwrap().mul(sk).into_affine()
}

fn bls_verify(pk: G1Affine, sig: G2Affine, msg: &[u8]) {
    assert!(Bls12_381::multi_pairing(
        &[pk, G1Affine::generator()],
        &[hasher().hash(msg).unwrap().neg(), sig]
    )
    .is_zero());
}

fn from_file<T: CanonicalDeserialize>(path: &str) -> T {
    let mut file = File::open(path).unwrap();
    let mut buffer = Vec::new();
    file.read_to_end(&mut buffer).unwrap();
    T::deserialize_uncompressed_unchecked(Cursor::new(&buffer)).unwrap()
}

fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);

    let public_keys: Vec<(G1Affine, G2Affine)> = from_file("public_keys.bin");

    public_keys
        .iter()
        .enumerate()
        .for_each(|(i, (pk, proof))| pok_verify(*pk, i, *proof));

    for (i, (pk, proof)) in public_keys.iter().enumerate() {
        println!("public_keys[{}].pk: {}", i, pk);
        println!("public_keys[{}].proof: {}\n", i, proof);
    }

    let new_key_index = public_keys.len();
    let message = b"yannickseurin";

    // --snip--

    /* Enter solution here */

    let rng = &mut ark_std::rand::rngs::StdRng::seed_from_u64(23898323u64);
    let rogue_secret = Fr::rand(rng);
    let new_key = public_keys
        .iter()
        .fold(G1Affine::generator().mul(rogue_secret), |acc, (pk, _)| {
            acc - pk
        })
        .into_affine();
    let new_proof = public_keys.iter().enumerate().fold(
        pok_prove(rogue_secret, new_key_index),
        |acc, (i, (_, proof))| {
            let correct =
                Fr::from(new_key_index as u64 + 1) * Fr::from(i as u64 + 1).inverse().unwrap();
            (acc - proof.mul(correct).into_affine()).into_affine()
        },
    );
    let aggregate_signature = bls_sign(rogue_secret, message);

    /* End of solution */

    pok_verify(new_key, new_key_index, new_proof);
    let aggregate_key = public_keys
        .iter()
        .fold(G1Projective::from(new_key), |acc, (pk, _)| acc + pk)
        .into_affine();
    bls_verify(aggregate_key, aggregate_signature, message);

    println!("Puzzle solved!");
}

const PUZZLE_DESCRIPTION: &str = r"
Bob has been designing a new optimized signature scheme for his L1 based on BLS signatures. Specifically, he wanted to be able to use the most efficient form of BLS signature aggregation, where you just add the signatures together rather than having to delinearize them. In order to do that, he designed a proof-of-possession scheme based on the B-KEA assumption he found in the the Sapling security analysis paper by Mary Maller [1]. Based the reasoning in the Power of Proofs-of-Possession paper [2], he concluded that his scheme would be secure. After he deployed the protocol, he found it was attacked and there was a malicious block entered the system, fooling all the light nodes...

[1] https://github.com/zcash/sapling-security-analysis/blob/master/MaryMallerUpdated.pdf
[2] https://rist.tech.cornell.edu/papers/pkreg.pdf
";

Conclusion

The proof of possession used in the puzzle departed from what has been proven secure in the literature: instead of hashing the public key into the group, it used points of the form $(i+1) \cdot R$ for a common random point $R$. While this is secure for a single key in isolation, it breaks when multiple public keys are involved, as an attacker can use the PoPs of other cosigners to maul a PoP for a public key for which it does not know the corresponding secret key, ultimately enabling a rogue-key attack.

ZK Hack Puzzle 14: Chaos Theory

Bob designed a new one time scheme, that's based on the tried and true method
of encrypt + sign. He combined ElGamal encryption with BLS signatures in a
clever way, such that you use pairings to verify the encrypted message was
not tampered with. Alice, then, figured out a way to reveal the plaintexts...

The puzzle webpage recommends reading background material about authenticated encryption, which usually refers to the combination of symmetric encryption and MACs, which are symmetric-key primitives. Combining public-key encryption and signatures is more usually called signcryption.

Code Analysis

The package directory is organized as follows:

puzzle-chaos-theory
├── Cargo.toml
├── blob.bin
└── src
    └── main.rs

Let us go through the main.rs file to understand how Bob designed his signcryption scheme. It first brings a number of items from arkworks crates into scope, in particular related to the BLS12-381 pairing-friendly curve. Let us introduce straightaway some (standard) notation that will help us explain mathematically how the signcryption scheme works. In all the following, we will let $\mathbb{G}_1$ and $\mathbb{G}_2$ denote the two groups related to the BLS12-381 curve, $r$ denote the order of these groups, and $\mathbb{F}_r$ denote the corresponding scalar field. Types G1Affine and G2Affine respectively correspond to points in $\mathbb{G}_1$ and $\mathbb{G}_2$ represented in short Weierstrass affine coordinates. We will also let $g_1$ denote the generator of $\mathbb{G}_1$ returned by G1Affine::generator() (or G1Projective::generator() in projective form) and $e$ the pairing map from $\mathbb{G}_1 \times \mathbb{G}_2$ to $\mathbb{G}_T$, which for any $P \in \mathbb{G}_1$, $Q \in \mathbb{G}_2$, and $a, b \in \mathbb{F}_r$ satisfies $e(aP, bQ) = ab \cdot e(P, Q)$.

A hasher function is defined, returning an instance of the so-called Wahby-Boneh map [WB19], which allows hashing into $\mathbb{G}_2$:

fn hasher() -> MapToCurveBasedHasher<G2Projective, DefaultFieldHasher<Sha256, 128>, WBMap<Config>> {
    let wb_to_curve_hasher =
        MapToCurveBasedHasher::<G2Projective, DefaultFieldHasher<Sha256, 128>, WBMap<Config>>::new(
            &[1, 3, 3, 7],
        )
        .unwrap();
    wb_to_curve_hasher
}

In all the following, we will simply let $H$ denote this hash function.

Then, a tuple struct ElGamal holding two G1Affine points corresponding to ciphertexts is defined, together with a method hash_to_curve returning the hash $H(C)$ of the (serialized) ciphertext $C$:

pub struct ElGamal(G1Affine, G1Affine);

impl ElGamal {
    pub fn hash_to_curve(&self) -> G2Affine {
        let mut data = Vec::new();
        self.serialize_uncompressed(&mut data).unwrap();

        hasher().hash(&data).unwrap()
    }
}

Messages are simply G1Affine points wrapped in a Message struct using the newtype pattern:

pub struct Message(G1Affine);

Finally, two structs are defined for respectively the sender and receiver:

struct Sender {
    pub sk: Fr,
    pub pk: G1Affine,
}

pub struct Receiver {
    pk: G1Affine,
}

Public keys for both the sender and the receiver are G1Affine points. Note that the receiver has a public key field, but no secret key field (decryption is not implemented).

Then comes the implementation of the encryption and signature by the sender:

impl Sender {
    pub fn send(&self, m: Message, r: &Receiver) -> ElGamal {
        let c_2: G1Affine = (r.pk.mul(&self.sk) + m.0).into_affine();
        ElGamal(self.pk, c_2)
    }

    pub fn authenticate(&self, c: &ElGamal) -> G2Affine {
        let hash_c = c.hash_to_curve();
        hash_c.mul(&self.sk).into_affine()
    }
}

Let us express what it does mathematically. Let $A$ denote the public key of the sender (self.pk) and $a$ denote the corresponding secret key (self.sk), so that $A = a g_1$; let $B$ denote the public key of the receiver (r.pk) and $M$ denote the message (m). Then the ciphertext returned by function send is
$$C = (C_1, C_2) = (A, aB + M) = (a g_1, aB + M).$$

Hence, this is just ElGamal encryption where the sender's secret key $a$ plays the role of the randomness of the standard ElGamal ciphertext.

The signature returned by function authenticate is the point $S$ in $\mathbb{G}_2$ defined as
$$S = a H(C).$$

Hence, this is simply a BLS signature with secret key $a$ computed on the ciphertext.
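To make these two operations concrete, here is a toy Python sketch of send and authenticate (a sketch only, using insecure mock groups $\mathbb{G}_1 = \mathbb{G}_2 = (\mathbb{Z}_{11}, +)$ with generator $1$ and SHA-256 as a stand-in for the Wahby-Boneh hash; it mimics the structure of Bob's scheme, not the real curve arithmetic):

```python
from hashlib import sha256

n = 11                           # toy group order; G1 = G2 = (Z_11, +), generator 1


def H(C):
    """Toy hash of a ciphertext pair into G2 (kept nonzero on purpose)."""
    return int.from_bytes(sha256(repr(C).encode()).digest(), "big") % (n - 1) + 1


a = 6; A = a * 1 % n             # sender key pair: A = a * g1
b = 9; B = b * 1 % n             # receiver key pair
M = 4                            # message, a "point" of G1

# send: ElGamal encryption using the sender's secret key a as randomness
C = (A, (a * B + M) % n)         # C = (C1, C2) = (a*g1, a*B + M)
# authenticate: BLS signature on the ciphertext, with the same key a
S = a * H(C) % n                 # S = a * H(C)
print("ciphertext:", C, "signature:", S)
```

Note how the scalar $a$ appears both in the ciphertext and in the signature; this reuse is exactly what the attack below exploits.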

Although decryption by the receiver is not implemented, verification of ciphertexts is implemented through a function check_auth on the empty struct Auditor:

impl Auditor {
    pub fn check_auth(sender_pk: G1Affine, c: &ElGamal, s: G2Affine) -> bool {
        let lhs = { Bls12_381::pairing(G1Projective::generator(), s) };

        let hash_c = c.hash_to_curve();
        let rhs = { Bls12_381::pairing(sender_pk, hash_c) };

        lhs == rhs
    }
}

This simply checks that the BLS signature $S$ is valid for public key $A$ and message $C$, namely that
$$e(g_1, S) = e(A, H(C)).$$
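The verification equation can be checked in a toy Python sketch (same caveat as the real code does not appear here: these are insecure mock groups, with $e(P, Q) = g^{PQ} \bmod 23$ standing in for the real pairing):

```python
from hashlib import sha256

p, n, g = 23, 11, 2              # toy GT modulus, group order, GT generator


def e(P, Q):                     # mock pairing G1 x G2 -> GT
    return pow(g, (P * Q) % n, p)


def H(C):                        # toy hash-to-G2, kept nonzero
    return int.from_bytes(sha256(repr(C).encode()).digest(), "big") % (n - 1) + 1


a = 6; A = a % n                 # sender key pair (generator of G1 is 1)
B = 9                            # receiver public key
M = 4                            # message
C = (A, (a * B + M) % n)         # send
S = a * H(C) % n                 # authenticate

# check_auth: e(g1, S) == e(A, H(C)), with g1 = 1
assert e(1, S) == e(A, H(C))
print("signature verifies")
```

Both sides equal $e(g_1, H(C))^a$ by bilinearity, which is why verification succeeds.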

So to summarize, Bob's signcryption scheme encrypts the message using ElGamal and signs the ciphertext using BLS, with the sender's secret key $a$ serving both as the randomness of the ElGamal ciphertext and as the secret key for the BLS signature scheme.

Solving the Puzzle

The main function defines a Blob instance (by deserializing data in blob.bin) containing the sender public key $A$, the ciphertext $C$, the signature $S$, and the receiver public key $B$:

    let blob = Blob::deserialize_uncompressed(data.as_slice()).unwrap();

where Blob is defined as

pub struct Blob {
    pub sender_pk: G1Affine,
    pub c: ElGamal,
    pub s: G2Affine,
    pub rec_pk: G1Affine,
}

It also defines 10 candidate messages:

    let messages = generate_message_space();

where the generate_message_space function is defined as

fn generate_message_space() -> [Message; 10] {
    let g1 = G1Projective::generator();
    let msgs = [
        390183091831u64,
        4987238947234982,
        84327489279482,
        8492374892742,
        5894274824234,
        4982748927426,
        48248927348927427,
        489274982749828,
        99084321987189371,
        8427489729843712893,
    ];
    msgs.iter()
        .map(|&msg_i| Message(g1.mul(Fr::from(msg_i)).into_affine()))
        .collect::<Vec<_>>()
        .try_into()
        .unwrap()
}

Hence, the message is not completely arbitrary: we know a priori that it corresponds to one of the 10 messages in the messages array.

The ciphertext is valid, as checked by the following lines:

    // ensure that blob is correct
    assert!(Auditor::check_auth(blob.sender_pk, &blob.c, blob.s));

ElGamal encryption is IND-CPA secure under the decisional Diffie-Hellman (DDH) assumption (which is believed to hold in group $\mathbb{G}_1$ of BLS12-381), hence we must find a way to exploit the information contained in the signature $S$.

Since there are only 10 possible messages, if we can find a "test function" which is satisfied only by the real message, then we are done (note that this would not be the case if the space of potential messages were too large to exhaustively run the test).

Recall that the message satisfies $C_2 = aB + M$, or equivalently
$$C_2 - M = aB.$$

Also, the signature is
$$S = a H(C).$$

In other words, the discrete logarithm of $C_2 - M$ in base $B$ (in group $\mathbb{G}_1$) is equal to the discrete logarithm of $S$ in base $H(C)$ (in group $\mathbb{G}_2$): both are equal to $a$. But equality of discrete logarithms is exactly the property that a pairing allows one to test! Here is our test function then: for each potential message $M'$, check whether
$$e(C_2 - M', H(C)) = e(B, S).$$

Only for the real message will this equation be satisfied: since $e(B, S) = e(aB, H(C))$ and the pairing is non-degenerate (with $H(C) \neq 0$), $e(C_2 - M', H(C)) = e(B, S)$ implies $C_2 - M' = aB$, i.e., $M' = M$.
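The test function can be exercised end to end in a toy Python sketch (insecure mock groups $\mathbb{G}_1 = \mathbb{G}_2 = (\mathbb{Z}_{11}, +)$ and mock pairing $e(P, Q) = g^{PQ} \bmod 23$, standing in for the real BLS12-381 objects):

```python
from hashlib import sha256

p, n, g = 23, 11, 2              # toy GT modulus, group order, GT generator


def e(P, Q):                     # mock pairing
    return pow(g, (P * Q) % n, p)


def H(C):                        # toy hash-to-G2, kept nonzero
    return int.from_bytes(sha256(repr(C).encode()).digest(), "big") % (n - 1) + 1


a = 6; A = a % n                 # sender key pair
B = 9                            # receiver public key
M = 4                            # real message
C = (A, (a * B + M) % n)         # ciphertext (randomness = sender's key a)
S = a * H(C) % n                 # BLS signature on C

# the attacker knows A, B, C, S and a small candidate message space
candidates = [1, 2, 3, 5, M]
hits = [m for m in candidates if e((C[1] - m) % n, H(C)) == e(B, S)]
assert hits == [M]               # only the real message passes the test
print("recovered message:", hits[0])
```

The list comprehension is the exact analogue of the Rust attack loop below.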

The attack is straightforward to implement:

use ark_bls12_381::{g2::Config, Bls12_381, Fr, G1Affine, G1Projective, G2Affine, G2Projective};
use ark_ec::{
    hashing::{curve_maps::wb::WBMap, map_to_curve_hasher::MapToCurveBasedHasher, HashToCurve},
    pairing::Pairing,
    CurveGroup, Group,
};
use ark_ff::field_hashers::DefaultFieldHasher;
use ark_serialize::{CanonicalDeserialize, CanonicalSerialize};
use sha2::Sha256;
use std::{fs::File, io::Read, ops::Mul};

use prompt::{puzzle, welcome};

#[derive(Debug)]
pub enum Error {
    InvalidMsg,
}

fn hasher() -> MapToCurveBasedHasher<G2Projective, DefaultFieldHasher<Sha256, 128>, WBMap<Config>> {
    let wb_to_curve_hasher =
        MapToCurveBasedHasher::<G2Projective, DefaultFieldHasher<Sha256, 128>, WBMap<Config>>::new(
            &[1, 3, 3, 7],
        )
        .unwrap();
    wb_to_curve_hasher
}

#[derive(CanonicalSerialize, CanonicalDeserialize)]
pub struct ElGamal(G1Affine, G1Affine);

impl ElGamal {
    pub fn hash_to_curve(&self) -> G2Affine {
        let mut data = Vec::new();
        self.serialize_uncompressed(&mut data).unwrap();

        hasher().hash(&data).unwrap()
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Message(G1Affine);

struct Sender {
    pub sk: Fr,
    pub pk: G1Affine,
}

pub struct Receiver {
    pk: G1Affine,
}

pub struct Auditor {}

impl Sender {
    pub fn send(&self, m: Message, r: &Receiver) -> ElGamal {
        let c_2: G1Affine = (r.pk.mul(&self.sk) + m.0).into_affine();
        ElGamal(self.pk, c_2)
    }

    pub fn authenticate(&self, c: &ElGamal) -> G2Affine {
        let hash_c = c.hash_to_curve();
        hash_c.mul(&self.sk).into_affine()
    }
}

impl Auditor {
    pub fn check_auth(sender_pk: G1Affine, c: &ElGamal, s: G2Affine) -> bool {
        let lhs = { Bls12_381::pairing(G1Projective::generator(), s) };

        let hash_c = c.hash_to_curve();
        let rhs = { Bls12_381::pairing(sender_pk, hash_c) };

        lhs == rhs
    }
}

#[derive(CanonicalSerialize, CanonicalDeserialize)]
pub struct Blob {
    pub sender_pk: G1Affine,
    pub c: ElGamal,
    pub s: G2Affine,
    pub rec_pk: G1Affine,
}

fn generate_message_space() -> [Message; 10] {
    let g1 = G1Projective::generator();
    let msgs = [
        390183091831u64,
        4987238947234982,
        84327489279482,
        8492374892742,
        5894274824234,
        4982748927426,
        48248927348927427,
        489274982749828,
        99084321987189371,
        8427489729843712893,
    ];
    msgs.iter()
        .map(|&msg_i| Message(g1.mul(Fr::from(msg_i)).into_affine()))
        .collect::<Vec<_>>()
        .try_into()
        .unwrap()
}

pub fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);

    let messages = generate_message_space();

    let mut file = File::open("blob.bin").unwrap();
    let mut data = Vec::new();
    file.read_to_end(&mut data).unwrap();
    let blob = Blob::deserialize_uncompressed(data.as_slice()).unwrap();

    // ensure that blob is correct
    assert!(Auditor::check_auth(blob.sender_pk, &blob.c, blob.s));

    // --snip--

    /* Implement your attack here, to find the index of the encrypted message */

    for (i, m) in messages.iter().enumerate() {
        // test whether e(C_2 - M', H(C)) == e(B, S), which holds iff M' = M
        let lhs = { Bls12_381::pairing(blob.c.1 - m.0, blob.c.hash_to_curve()) };
        let rhs = { Bls12_381::pairing(blob.rec_pk, blob.s) };
        if lhs == rhs {
            println!("Condition satisfied for message index {}", i);
        }
    }

    /* End of attack */
}

const PUZZLE_DESCRIPTION: &str = r"
Bob designed a new one time scheme, that's based on the tried and true method of encrypt + sign. He combined ElGamal encryption with BLS signatures in a clever way, such that you use pairings to verify the encrypted message was not tampered with. Alice, then, figured out a way to reveal the plaintexts...
";

We find that the encrypted message has index 3.

Conclusion

The main takeaway is that adding a BLS signature computed with the randomness of the ElGamal ciphertext as signing key allowed an attacker to build a test function discriminating the real plaintext.

In order to securely combine a public-key encryption scheme and a signature scheme, one can use generic composition and simply "encrypt-then-sign", but with independent randomness in the encryption part and the signature part. This means that the ElGamal ciphertext should be $(k g_1, kB + M)$ for some random $k \in \mathbb{F}_r$ independent from the sender signing key (and freshly drawn for each ciphertext). The exact security of this method was studied in [ADR02], where it was shown that combining an IND-CPA-secure encryption scheme (such as ElGamal encryption) and an EUF-CMA-secure signature scheme (such as BLS) yields a so-called "outsider-secure" signcryption scheme. Outsider-security means that the sender is protected against forgery as long as the receiver's secret key is not compromised and, conversely, that the confidentiality of messages sent to the receiver is ensured as long as the sender's secret key is not compromised. By contrast, "insider-security" means that the sender is protected even if the receiver's secret key leaks and vice versa.
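One can check in the toy Python sketch (insecure mock groups and mock pairing, as used to illustrate the attack) that with fresh encryption randomness $k$ independent from the signing key, the attack's test function no longer identifies the plaintext:

```python
from hashlib import sha256

p, n, g = 23, 11, 2              # toy GT modulus, group order, GT generator


def e(P, Q):                     # mock pairing
    return pow(g, (P * Q) % n, p)


def H(C):                        # toy hash-to-G2, kept nonzero
    return int.from_bytes(sha256(repr(C).encode()).digest(), "big") % (n - 1) + 1


a = 6                            # sender secret (signing) key
B = 9                            # receiver public key
M = 4                            # real message
k = 3                            # fresh ElGamal randomness, independent from a

C = (k % n, (k * B + M) % n)     # ElGamal with its own randomness k
S = a * H(C) % n                 # BLS signature on C, with key a

# the previous attack's test now fails for every candidate, including M itself
candidates = [1, 2, 3, 5, M]
hits = [m for m in candidates if e((C[1] - m) % n, H(C)) == e(B, S)]
assert hits == []
print("attack defeated")
```

The test compared the discrete log of $C_2 - M'$ in base $B$ with the discrete log of $S$ in base $H(C)$; once these are the independent scalars $k$ and $a$, equality no longer reveals anything about $M$.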

There is actually a more intricate way to combine ElGamal and BLS into a signcryption scheme achieving the stronger notion of insider-security, which was proposed in [LQ04].

The idea is as follows (note that the paper describes the scheme with a symmetric pairing; we transpose it here to an asymmetric pairing). As before, let $(a, A)$ and $(b, B)$ be the sender and receiver secret/public key pairs. To encrypt a message $M$, the sender draws some fresh randomness uniformly at random, computes a BLS signature on the message bound to the receiver's public key, and encrypts the message, the signature, and its own public key $A$ under the receiver's public key $B$. The signed ciphertext is the resulting encryption, which in particular hides the identity of the sender (this is what gives the scheme its key privacy).

References

[ABR01] Michel Abdalla, Mihir Bellare, and Phillip Rogaway. DHIES: An encryption scheme based on the Diffie-Hellman Problem. Manuscript, 2001 (preliminary version in the proceedings of CT-RSA 2001).

[ADR02] Jee Hea An, Yevgeniy Dodis, and Tal Rabin. On the Security of Joint Signature and Encryption. In proceedings of EUROCRYPT 2002.

[AHG23] Diego F. Aranha, Youssef El Housni, and Aurore Guillevic. A survey of elliptic curves for proof systems. In Designs, Codes and Cryptography, 2023.

[ANT+20] Diego F. Aranha, Felipe Rodrigues Novaes, Akira Takahashi, Mehdi Tibouchi, and Yuval Yarom. LadderLeak: Breaking ECDSA with Less than One Bit of Nonce Leakage. In proceedings of ACM CCS 2020.

[BB04] Dan Boneh and Xavier Boyen. Efficient Selective-ID Secure Identity Based Encryption Without Random Oracles. In proceedings of EUROCRYPT 2004.

[BB08] Dan Boneh and Xavier Boyen. Short Signatures Without Random Oracles and the SDH Assumption in Bilinear Groups. In Journal of Cryptology, 2008.

[BCCT13] Nir Bitansky, Ran Canetti, Alessandro Chiesa, and Eran Tromer. Recursive Composition and Bootstrapping for SNARKs and Proof-Carrying Data. In proceedings of STOC 2013.

[BCG+14] Eli Ben-Sasson, Alessandro Chiesa, Christina Garman, Matthew Green, Ian Miers, Eran Tromer, and Madars Virza. Zerocash: Decentralized Anonymous Payments from Bitcoin. In proceedings of IEEE SP 2014.

[BCI+10] Eric Brier, Jean-Sébastien Coron, Thomas Icart, David Madore, Hugues Randriam, and Mehdi Tibouchi. Efficient Indifferentiable Hashing into Ordinary Elliptic Curves. In proceedings of CRYPTO 2010.

[BCM+15] Paulo S. L. M. Barreto, Craig Costello, Rafael Misoczki, Michael Naehrig, Geovandro C. C. F. Pereira, and Gustavo Zanon. Subgroup Security in Pairing-Based Cryptography. In proceedings of Latincrypt 2015.

[BCTV14] Eli Ben-Sasson, Alessandro Chiesa, Eran Tromer, and Madars Virza. Scalable Zero Knowledge via Cycles of Elliptic Curves. In proceedings of CRYPTO 2014.

[BDFG20] Dan Boneh, Justin Drake, Ben Fisch, and Ariel Gabizon. Efficient polynomial commitment schemes for multiple points and polynomials. IACR ePrint report 2020/081, 2020.

[BDN18] Dan Boneh, Manu Drijvers, and Gregory Neven. Compact Multi-signatures for Smaller Blockchains. In proceedings of ASIACRYPT 2018.

[BF03] Dan Boneh and Matthew K. Franklin. Identity-Based Encryption from the Weil Pairing. In SIAM Journal on Computing, 2003.

[BGLS03] Dan Boneh, Craig Gentry, Ben Lynn, and Hovav Shacham. Aggregate and Verifiably Encrypted Signatures from Bilinear Maps. In proceedings of EUROCRYPT 2003.

[BH19] Joachim Breitner and Nadia Heninger. Biased Nonce Sense: Lattice Attacks Against Weak ECDSA Signatures in Cryptocurrencies. In proceedings of FC 2019.

[BL22] Renas Bacho and Julian Loss. On the Adaptive Security of the Threshold BLS Signature Scheme. In proceedings of ACM CCS 2022.

[BLS01] Dan Boneh, Ben Lynn, and Hovav Shacham. Short Signatures from the Weil Pairing. In proceedings of ASIACRYPT 2001.

[BLS04] Dan Boneh, Ben Lynn, and Hovav Shacham. Short Signatures from the Weil Pairing. In Journal of Cryptology, 2004.

[BN05] Paulo S. L. M. Barreto and Michael Naehrig. Pairing-Friendly Elliptic Curves of Prime Order. In proceedings of SAC 2005.

[BN19] Eli Biham and Lior Neumann. Breaking the Bluetooth Pairing - The Fixed Coordinate Invalid Curve Attack. In proceedings of SAC 2019.

[BNN07] Mihir Bellare, Chanathip Namprempre, and Gregory Neven. Unrestricted Aggregate Signatures. In proceedings of ICALP 2007.

[BPW12] David Bernhard, Olivier Pereira, and Bogdan Warinschi. How Not to Prove Yourself: Pitfalls of the Fiat-Shamir Heuristic and Applications to Helios. In proceedings of ASIACRYPT 2012.

[BT04] Jean-Paul Berrut and Lloyd N. Trefethen. Barycentric Lagrange Interpolation. In SIAM Review, 2004.

[CA89] David Chaum and Hans Van Antwerpen. Undeniable Signatures. In proceedings of CRYPTO 1989.

[CF13] Dario Catalano and Dario Fiore. Vector Commitments and Their Applications. In proceedings of PKC 2013.

[CGGM00] Ran Canetti, Oded Goldreich, Shafi Goldwasser, and Silvio Micali. Resettable Zero-Knowledge. In proceedings of STOC 2000.

[Che10] Jung Hee Cheon. Discrete Logarithm Problems with Auxiliary Inputs. In Journal of Cryptology, 2010.

[CHKM10] Sanjit Chatterjee, Darrel Hankerson, Edward Knapp, and Alfred Menezes. Comparing two pairing-based aggregate signature schemes. In Designs, Codes and Cryptography, 2010.

[CJ19] Cas Cremers and Dennis Jackson. Prime, Order Please! Revisiting Small Subgroup and Invalid Curve Attacks on Protocols using Diffie-Hellman. In proceedings of IEEE CSF 2019.

[CM09] Sanjit Chatterjee and Alfred Menezes. On Cryptographic Protocols Employing Asymmetric Pairings - The Role of Ψ Revisited. IACR ePrint report 2009/480, 2009.

[DMWG23] Quang Dao, Jim Miller, Opal Wright, and Paul Grubbs. Weak Fiat-Shamir Attacks on Modern Proof Systems. In proceedings of IEEE SP 2023.

[FKL18] Georg Fuchsbauer, Eike Kiltz, and Julian Loss. The Algebraic Group Model and its Applications. In proceedings of CRYPTO 2018.

[GKR+21] Lorenzo Grassi, Dmitry Khovratovich, Christian Rechberger, Arnab Roy, and Markus Schofnegger. Poseidon: A New Hash Function for Zero-Knowledge Proof Systems. In proceedings of USENIX Security 2021.

[GPS08] Steven D. Galbraith, Kenneth G. Paterson, and Nigel P. Smart. Pairings for cryptographers. In Discrete Applied Mathematics, 2008.

[Gro16] Jens Groth. On the Size of Pairing-Based Non-interactive Arguments. In proceedings of EUROCRYPT 2016.

[GWC19] Ariel Gabizon, Zachary J. Williamson, and Oana Ciobotaru. PLONK: Permutations over Lagrange-bases for Oecumenical Noninteractive arguments of Knowledge. IACR ePrint report 2019/953, 2019.

[Ham15] Mike Hamburg. Decaf: Eliminating Cofactors Through Point Compression. In proceedings of CRYPTO 2015.

[HGP22] Youssef El Housni, Aurore Guillevic, and Thomas Piellard. Co-factor Clearing and Subgroup Membership Testing on Pairing-Friendly Curves. In proceedings of AFRICACRYPT 2022.

[HLPT20] Thomas Haines, Sarah Jamie Lewis, Olivier Pereira, and Vanessa Teague. How not to prove your election outcome. In proceedings of IEEE SP 2020.

[Jou00] Antoine Joux. A One Round Protocol for Tripartite Diffie-Hellman. In proceedings of ANTS 2000.

[JSS15] Tibor Jager, Jörg Schwenk, and Juraj Somorovsky. Practical Invalid Curve Attacks on TLS-ECDH. In proceedings of ESORICS 2015.

[KZG10a] Aniket Kate, Gregory M. Zaverucha, and Ian Goldberg. Constant-Size Commitments to Polynomials and Their Applications. In proceedings of ASIACRYPT 2010.

[KZG10b] Aniket Kate, Gregory M. Zaverucha, and Ian Goldberg. Polynomial Commitments. Full version of [KZG10a].

[Lip10] Helger Lipmaa. Progression-Free Sets and Sublinear Pairing-Based Non-Interactive Zero-Knowledge Arguments. In proceedings of TCC 2012.

[LL97] Chae Hoon Lim and Pil Joong Lee. A Key Recovery Attack on Discrete Log-based Schemes Using a Prime Order Subgroup. In proceedings of CRYPTO 1997.

[LQ04] Benoît Libert and Jean-Jacques Quisquater. Efficient Signcryption with Key Privacy from Gap Diffie-Hellman Groups. In proceedings of PKC 2004.

[MOV91] Alfred Menezes, Tatsuaki Okamoto, and Scott A. Vanstone. Reducing Elliptic Curve Logarithms to Logarithms in a Finite Field. In proceedings of STOC 1991.

[NRBB22] Valeria Nikolaenko, Sam Ragsdale, Joseph Bonneau, and Dan Boneh. Powers-of-Tau to the People: Decentralizing Setup Ceremonies. IACR ePrint report 2022/1592, 2022.

[Ped91] Torben P. Pedersen. Non-Interactive and Information-Theoretic Secure Verifiable Secret Sharing. In proceedings of CRYPTO 1991.

[PST13] Charalampos Papamanthou, Elaine Shi, and Roberto Tamassia. Signatures of Correct Computation. In proceedings of TCC 2013.

[Qua21] Nguyen Thoi Minh Quan. 0. IACR ePrint report 2021/323, 2021.

[Rot22] Lior Rotem. Revisiting the Uber Assumption in the Algebraic Group Model: Fine-Grained Bounds in Hidden-Order Groups and Improved Reductions in Bilinear Groups. In proceedings of ITC 2022.

[RY07] Thomas Ristenpart and Scott Yilek. The Power of Proofs-of-Possession: Securing Multiparty Signatures against Rogue-Key Attacks. In proceedings of EUROCRYPT 2007.

[SV07] Nigel P. Smart and Frederik Vercauteren. On computable isomorphisms in efficient asymmetric pairing-based systems. In Discrete Applied Mathematics, 2007.

[VAS+17] Luke Valenta, David Adrian, Antonio Sanso, Shaanan Cohney, Joshua Fried, Marcella Hastings, J. Alex Halderman, and Nadia Heninger. Measuring small subgroup attacks against Diffie-Hellman. In proceedings of NDSS 2017.

[WB19] Riad S. Wahby and Dan Boneh. Fast and simple constant-time hashing to the BLS12-381 elliptic curve. IACR ePrint report 2019/403, 2019.