Small Subgroup Attacks
Actually, the puzzle webpage was giving us a hint by advising us to read a paper by Cremers and Jackson [CJ19] about so-called small subgroup attacks. Small subgroup attacks (more specifically, small subgroup key recovery attacks, as there are other flavors such as small subgroup confinement attacks that we won't touch here) have been proposed by Lim and Lee in 1997 [LL97] and vulnerable implementations have been found in the wild [VAS+17].
In a nutshell, small subgroup attacks occur when some party, holding some secret scalar is tricked into computing thinking generates some subgroup (of some larger group where the discrete logarithm problem is hard whereas in fact generates another subgroup of where the discrete logarithm problem is much easier. This can happen for example in Diffie-Hellman key exchange or in some blind signing protocols where the party computing the scalar multiplication gets the point from another (potentially malicious) party rather than from trusted parameters. Before going into details, we need to recall some basic facts about groups and their subgroups.
Computing Discrete Logarithms in Groups of Composite Order
Let be a finite group of order Then, by Lagrange's Theorem, the order of any subgroup of divides Moreover, if is cyclic, then every subgroup of is also cyclic (Proposition 5.19) and by the Fundamental Theorem of Cyclic Groups, for each divisor of there exists a unique subgroup of order
Small subgroup attacks derive from a simple observation about the discrete logarithm problem in groups of composite order. Let by a cyclic group of composite order where and are coprime and let be a generator of Say we want to solve the discrete logarithm problem for a group element in base i.e., we want to find the unique integer such that Then, we can take advantage of the structure of group as follows:
- First, we compute and Then has order (why?) and If we let denote the discrete logarithm of in base which can be computed in group operations with generic algorithms, then implies that
- Similarly, one can compute and Then the discrete logarithm of in base can be computed in group operations and satisfies
- Finally, one can combine the two equations and using the Chinese remainder theorem to obtain (modulo
All in all, has been computed in rather than group operations.
The procedure above is the basis of the Pohlig-Hellman algorithm which computes discrete logarithms in a group of order in group operations, assuming the factorization of is known. Hence, if one wants 128-bit security for the discrete logarithm problem, a necessary condition is that the order of the group has a prime factor of size at least 256 bits.
This explains why cryptographers are so obsessed with groups of prime order: by Lagrange's theorem, groups of prime order don't have any subgroups other than the trivial ones (itself and the subgroup of order 1 consisting of the identity element) and hence the discrete logarithm problem cannot be "broken" into smaller pieces as above (which, of course, does not imply that the DL problem cannot by broken by other, non-generic means).
Small Subgroup Attacks
Groups used in cryptography often have composite order. For example, the multiplicative group of integers modulo some prime number has order which is always even. For elliptic curves, although it is possible to construct secure curves with a prime number of points such as secp256k1, many curves that are attractive for efficiency reasons (such as twisted Edwards curves) have order where is prime and is small (usually 4 or 8).
When using such groups of composite order a "base" point of prime order is usually specified. As long as group elements used in a protocol are computed as multiples of one never "gets out" of the prime-order subgroup The index of subgroup i.e., the ratio is often called the cofactor of in a cryptographic context.
However, what happens if Alice, holding some secret value is tricked by an attacker into computing where is not in subgroup ? If is made available to the attacker, then it can use the Pohlig-Hellman algorithm to compute where is the "smooth" part of 's order (meaning, informally, the product of all "small" prime factors of 's order), which might be somewhere between a few bits (if is or to enough to retrieve entirely. Note that does not have have to actually be in a small subgroup for the attack to work, the only condition is that some multiples of be in small subgroups (equivalently, that has small subgroups). If, for example, generates the entire ambient group of order then one can "project" the discrete logarithm on the subgroup of order by computing and and work with the pair instead.
Observe that the maximal amount of information that can leak about secret in a small subgroup attack is bits, hence having a "small" cofactor as 4 or 8 might seem benign. However, there are actually plenty of other ways a small cofactor can mess with your protocol, an interesting example being Monero's multiple-spend bug.
What about pairing-based cryptography? While there exists families of pairing-friendly curve pairs where the first "small" curve has prime order, such as Baretto-Naehrig (BN) curves [BN05], the second "large" curve always has composite order with a very large cofactor. For members of the BLS family, even the first small curve has composite order. Hence, small subgroup attacks are especially relevant when using pairing-based cryptographic primitives. For more information, see [BCM+15].
Subgroup Membership Tests
How can we prevent small subgroup attacks? By performing a subgroup membership test. Given a group of composite order and a prime factor of an element is in the subgroup of order if and only if This test is simple yet rather costly since is large. However, there are a number of tricks to make subgroup membership testing more efficient [HGP22]. For curves with small cofactors (4 or 8), some techniques such as Decaf [Ham15] or Ristretto allow to "eliminate the cofactor" and construct a prime-order group.
Invalid Curve Attacks
Finally, assuming we work with a prime-order curve such as secp256k1, is it safe to use any point received from an untrusted source without verification? If the curve has prime order any point other than 0 has order right? Well, not exactly: if was receive in so-called "uncompressed" form (meaning both coordinates and were explicitly given), might not be on the curve at all! It might be on another curve with a different equation but where the same addition formulas apply. If this "ghost" curve has a smooth order, computing might end up leaking information about the secret scalar exactly as described above. This is called an invalid curve attack and has affected for example some implementations of TLS [JSS15] and the Bluetooth protocol [BN19].
Let's now see how to apply all this to the puzzle.