Chapter status: ✅ in good shape ✅

Related puzzles: Puzzle 2 and Puzzle 4

TODO:

Polynomial Commitment Schemes

Polynomial commitment schemes play a central role in the design of zk-SNARKs. In this section, we introduce them and present the two KZG schemes, DL-KZG and Ped-KZG.

Make sure to read the chapters about polynomials and standard commitments first.

We recall that given a field we let denote the set of all univariate polynomials over of degree at most

Contents

Generalities

As standard commitment schemes, a polynomial commitment scheme (PC scheme for short) involves two parties, a committer (or prover) and a verifier. It allows the committer to send to the verifier a commitment to some secret polynomial and later on prove that evaluates to some specific value at input (usually of the verifier's choosing), potentially for multiple inputs.

Recall that a polynomial of degree at most is specified by the tuple of coefficients such that From a broader perspective, a PC scheme can be seen as the combination of a standard commitment scheme over together with various proof systems for proving assertions about the committed vector In particular, proving that is equivalent to proving that i.e., proving that the inner product of with the vector is equal to Other assertions one may want to prove when designing SNARKs are, for example, that has degree at most which is equivalent to or more complex relations about evaluations of such as for some subset

Syntax

More formally, a PC scheme is parameterized by a maximal degree (one can think of as being given as input to all algorithms) and consists of the five following algorithms (the exact syntax can vary slightly in the literature, here we adhere to the syntax of standard commitment schemes):

  • a setup algorithm which on input the security parameter returns public parameters1 these parameters implicitly specify some finite field

  • a commitment algorithm which on input parameters and a polynomial returns a commitment and a decommitment

  • a "polynomial" verification algorithm which on input parameters a commitment a polynomial and a decommitment returns 1 if is a valid decommitment for and 0 otherwise;

  • a proving algorithm which on input parameters a polynomial a decommitment and a value returns an evaluation and a proof

  • an "evaluation" verification algorithm which on input parameters a commitment a pair and a proof returns 1 if is a valid proof that the polynomial committed to by evaluates to at input and 0 otherwise.

In some cases (in particular for KZG), it might be possible to split the public parameters into a commitment key and a verification key where typically only is needed for algorithms and and only is needed for

As already hinted, the three algorithms and can be regarded together as a standard commitment scheme with message space (with specified by the tuple of its coefficients) while and together form a proof system for statements of the form

As for standard commitment schemes, what we just defined here is the syntax for a non-interactive PC scheme, where the algorithm is run once and for all and then committing and proving an evaluation of the committed polynomial is non-interactive. More generally, committing and evaluation proving could be interactive.

As always, the scheme must be correct, meaning two things: first, must be correct as defined for a standard commitment scheme with message space second, for every security parameter every every and every the following game capturing the nominal execution of algorithms for evaluation proving must return true with probability 1:

Security

Defining security properties for PC schemes is rather subtle. Almost every paper about PC schemes define slightly different sets of security properties depending on the specific application being targeted. Here, we focus on the security properties proposed in the seminal paper about PC schemes [KZG10a], which are also the simplest ones.

First, a PC scheme should be hiding and binding in the standard sense when seen as a commitment to the tuple of coefficients defining the polynomial Let us the recall the corresponding games, that we call POLY-HIDING and POLY-BINDING for clarity:

It turns out that some PC schemes (such as the DL-KZG scheme) do not satisfy the poly-hiding notion (in general, when used to construct SNARKs, poly-hiding matters only if one cares about the SNARK being zero-knowledge). However, they satisfy what we call here evaluation hiding,2 which informally means that for a random polynomial of degree at most given a commitment to and at most evaluations of together with the corresponding proofs, no adversary should be able to guess the value of for a new input This is formalized by the following game:

To be completely explicit, the line means

The condition that the adversary makes at most queries to the oracle is of course necessary: once the commitment has been opened at distinct points the committed polynomial has been completely revealed by virtue of Lagrange interpolation.

Regarding the binding property of evaluation proving, a PC scheme should be evaluation binding, meaning no efficient adversary can produce a commitment and two valid proofs that the committed polynomial evaluates to two different values at the same input More formally, this is captured by the following game:

As for standard commitments, all these properties can hold statistically or computationally, but poly-hiding and poly-binding cannot hold both statistically for a PC scheme.

Informal Description of the KZG Schemes

Two closely related and very efficient PC schemes based on pairings were proposed by Kate, Zaverucha, and Goldberg in 2010 [KZG10a] (see also [KZG10b] for the full paper with security proofs). We will call them (for reasons that will become clear soon) DL-KZG and Ped-KZG. What is usually simply called KZG corresponds to the DL-KZG scheme.

Let us start with a high-level view of DL-KZG. For a maximal degree the commitment and opening part is very similar to the (non-hiding version of the) generalized Pedersen commitment scheme. The public parameters consist of generators of some group of prime order A polynomial is seen as a vector and the corresponding commitment is There is a big difference though with generalized Pedersen commitments: the generators are not independent. They are computed from a single generator and a secret random scalar as These parameters have a precise structure. For this reason, they are also called a structured reference string (SRS). This also implies that they cannot be sampled obliviously of and require a trusted setup (more on this later).

As a result, the commitment is actually in disguise:

Evaluation proving relies on the polynomial remainder theorem: a polynomial satisfies if and only if is divisible by i.e., there exists a polynomial such that The proving algorithm therefore consists in computing the polynomial explicitly and the proof which consists in in disguise, as

Evaluating this polynomial equality at we see that The verification algorithm consists in checking this equality "in the exponent" (or rather "in the scalar multiplication" here as we use additive notation). This is where pairings comes in: is actually a pairing-friendly group coming with related groups and and a pairing The public parameters include a generator of and the group element The verifier can compute and and also knows Then holds iff the following pairing equality does:

The DL-KZG commitment scheme is obviously not hiding because the commitment algorithm is deterministic. The Ped-KZG scheme remedies this problem by adding a commitment to a random polynomial with respect to another tuple of points Evaluation proving is adapted accordingly. The form of the commitment is reminiscent of the (hiding version) of Pedersen commitments, explaining the naming convention.

We now give a detailed description and analysis of the properties of the DL-KZG and Ped-KZG schemes.

The DL-KZG Scheme

Description

Let be an asymmetric pairing group setup algorithm.3 The DL-KZG scheme for a maximal degree is defined as follows:

  • The algorithm, on input the security parameter runs draws random generators and of respectively and draws and returns public parameters Here we assume that pairing parameters are implicitly specified in 4 The field over which polynomials are defined is The public parameters can be split into a commitment key and a verification key

  • The algorithm, on input a commitment key and a polynomial where returns the commitment and an empty decommitment

  • The algorithm, on input a commitment key a commitment and a polynomial where returns 1 if and 0 otherwise.

  • The algorithm, on input a commitment key a polynomial and computes the polynomial5 the group element and returns and the proof

  • The algorithm, on input a verification key a commitment a pair and a proof returns 1 if and 0 otherwise.

It is straightforward to verify that is correct as a standard commitment scheme. Let's check that the scheme is correct with respect to evaluation proving, i.e., if the commitment and the proof have been honestly computed, then the verification passes. If and where and are such that then hence Eq. (19.1) is satisfied and returns 1.

Hiding Security

The DL-KZG scheme as described above is not poly-hiding: the commitment algorithm is deterministic, hence, given two polynomials and and a commitment it is trivial to distinguish whether commits to and by computing the corresponding commitments and comparing with

Regarding the eval-hiding property, note that DL-KZG cannot be statistically eval-hiding. Indeed, an unbounded adversary can compute from the parameters and from the commitment and return to win the EVAL-HIDING game without making any query to the oracle. However, it is eval-hiding under the discrete logarithm assumption (in group

Let us give the intuition before the full-fledged proof. As the committed polynomial is uniformly random in Lagrange interpolation ensures that given at most evaluations of at the value of on any other point is uniformly random so that even an unbounded adversary can guess it with probability at most Hence, the only way an adversary can guess with non-negligible probability is to compute from Together with queries to the oracle, this yields evaluations of allowing to compute with Lagrange interpolation. But computing requires to solve the discrete logarithm problem for challenge

Theorem 19.1. Assume that the DL problem is hard in for Then the DL-KZG scheme is (computationally) eval-hiding. More precisely, for every adversary against the EVAL-HIDING game, there exists an adversary for the DL problem running in time where is the running time of and such that

Proof

Let be an adversary against the eval-hiding property of DL-KZG. Without loss of generality, we assume that makes exactly queries to the oracle. We simply denote EH the EVAL-HIDING game. Let also denote the event that queries the oracle on By definition of the advantage, we have Let us first show that the first term is negligible. Just before returns its answer, has been evaluated on at most points: (when computing the commitment) and the queries of to the oracle. Conditioned on querying on (i.e., has been in fact evaluated on points before returns its output Since is a random polynomial of degree the value of conditioned on these at most evaluations (for any is uniformly random. Hence, even a computationally unbounded adversary can guess with probability at most i.e., Let us now upper bound the second term with a reduction. We construct an adversary that solves the DL problem by simulating game EH to as follows. Let be the DL instance that must solve. draws computes and runs on input Note that implicitly commits to a polynomial such that simulates the oracle as follows: when queries the oracle on some field element draws computes a proof and returns The proof is valid because and hence the verification equation (19.1) is satisfied. Note that cannot answer this way if queries to since is exactly the solution to its DL challenge. In such a case (i.e., when happens), simply aborts. Conditioned on not happening, is sampled through the commitment evaluation and the evaluations corresponding to queries made by with and uniformly random and independent. By Lagrange interpolation, this is equivalent to drawing the coefficients of uniformly at random and hence the EVAL-HIDING game is perfectly simulated. If successfully returns a pair such that then can interpolate the evaluations corresponding to queries together with to recover polynomial and compute which yields the solution to the DL challenge.

Let DL be the discrete logarithm game played with Then where for the last equality we used that conditioned on games DL and EH are identical. Hence, runs in time (where is the running time of plus the time to interpolate which requires at most operations.

Binding Security

The only thing that a commitment commits to, information-theoretically speaking, is the value Hence, the DL-KZG scheme is certainly not statistically poly-binding: an adversary able to compute from the public parameters can very easily decommit any commitment to any polynomial such that However, for an adversary unable to compute from the public parameters, which is an instance of what we call the -co-DL problem, there is only a negligible chance that it can find two polynomials and such that More formally, we have the following result.

Theorem 19.2. Assume that the -co-DL problem is hard for Then the DL-KZG scheme for maximal degree is poly-binding. More precisely, for any adversary against the poly-binding security of DL-KZG for maximal degree there exists an adversary for the -co-DL problem running in time where is the running time of and such that

Proof

Let be an adversary against the poly-binding security of DL-KZG for maximal degree We construct an algorithm for the -co-DL problem as follows. gets pairing group parameters and an instance of the -co-DL problem. The goal of is to compute It runs on public parameters Assume that is successful and returns a commitment and two distinct polynomials and of degree at most such that This implies that hence and is a root of the non-zero polynomial This polynomial can be factored in time with the Cantor–Zassenhaus algorithm, which allows to compute all its roots and find The success probability of is the same as the one of and the running time of is

Eval-binding security relies on a stronger assumption, namely that the so-called -strong Diffie-Hellman (-SDH) problem is hard. This problem is as follows: given compute a pair such that The -SDH problem is usually simply called the -SDH problem.

Theorem 19.3. Assume that the -SDH problem is hard for Then the DL-KZG scheme for maximal degree is eval-binding. More precisely, for any adversary against the eval-binding security of DL-KZG for maximal degree there exists an adversary for the -SDH problem running in time similar to the time of and such that

Proof

Let be an adversary against the eval-binding security of the DL-KZG scheme for maximal degree We construct an adversary for the -SDH problem. gets pairing group parameters and an instance of the -SDH problem. The goal of is to return a pair such that It runs on public parameters Assume that is successful and returns a commitment a field element and two valid value/proof pairs and with Then proceeds as follows. First, it checks whether (e.g., by checking whether is equal to the second group element of the parameters If this is the case, then simply picks an arbitrary element and returns From now on, we assume The validity of the two proofs imply that Taking the inverse of the second equation and multiplying with the first equation, we get successively where for the last implication we used that and which allows us to multiply by The last equation implies that Hence, returns which is a valid solution of the SDH instance. The success probability of is the same as the one of and the running time of is close to the one of

The Ped-KZG Scheme

Description

As discussed in the previous section, the DL-KZG scheme is not poly-hiding because the algorithm is deterministic. It is possible the make the scheme poly-hiding by randomizing the algorithm. Below we present the Ped-KZG scheme. The idea is to add to the basic DL-KZG commitment a commitment to another random and independent polynomial with respect to another generator of The commitment becomes which is very similar to a Pedersen commitment, hence the name. This requires expanding the size of the public parameters and the evaluation proofs. The formal description follows.

  • The algorithm, on input the security parameter runs draws random generators and of and of draws and returns public parameters Here we assume that pairing parameters are implicitly specified in The field over which polynomials are defined is The public parameters can be split into a commitment key and a verification key

  • The algorithm, on input a commitment key and a polynomial where draws a random polynomial with and returns the commitment and the decommitment

  • The algorithm, on input a commitment key a commitment a polynomial where and a decommitment where returns 1 if and 0 otherwise.

  • The algorithm, on input a commitment key a polynomial a decommitment and computes the polynomials the group element and returns and the proof

  • The algorithm, on input a verification key a commitment a pair and a proof returns 1 if and 0 otherwise.

Correctness can be verified in a similar way to DL-KZG.

Hiding Security

Thanks to the commitment randomization, poly-hiding and eval-hiding security both hold statistically for Ped-KZG.

Theorem 19.4. The Ped-KZG scheme is perfectly poly-hiding.

Proof

For any and any polynomial the commitment returned by the algorithm is uniformly random in due to the addition of the term Hence, the oracle in the POLY-HIDING game does not reveal any information about the hidden bit

Theorem 19.5. The Ped-KZG scheme is statistically eval-hiding. More precisely, for any adversary one has

Proof

Let by a (computationally unbounded) adversary against the eval-hiding property of Ped-KZG. We can assume without loss of generality that is given the discrete logarithm of in base and the discrete logarithm of the challenge commitment Let u_i_, be the queries of to oracle and be the corresponding answers. Note that does not bring any additional information to as it can be computed from the other quantities, namely Hence, all in all the adversary is given evaluations of and at the same points together with the value Note that since is a generator of Hence, conditioned on for and the value of is uniformly random and only has evaluations of This implies that the probability that guesses correctly for is

Binding Security

The poly-binding and eval-binding security properties hold under the same assumptions as for DL-KZG. The proofs are slightly more complex and must account for the possibility that the adversary solves the discrete logarithm problem for in base

Theorem 19.6. Assume that the -co-DL problem is hard for Then the Ped-KZG scheme for maximal degree is poly-binding. More precisely, for any adversary against the poly-binding security of Ped-KZG for maximal degree there exists an adversary for the -co-DL problem running in time where is the running time of and such that

Proof

Let be an adversary against the poly-binding security of the Ped-KZG scheme for maximal degree We construct an algorithm for the -co-DL problem as follows. gets pairing group parameters and an instance of the -co-DL problem. The goal of is to compute

Adversary randomly chooses between two indistinguishable ways to embed its instance into the parameters. Namely, it draws and proceeds as follows depending on

  • If then draws and runs on public parameters This implicitly sets and
  • If then draws and runs on public parameters This implicitly sets

Assume that is successful and returns a commitment and two distinct polynomials and of degree at most together with corresponding decommitments and such that This implies that and hence We can distinguish two cases:

  • case If then aborts. Otherwise, since we have Hence, is a root of the non-zero polynomial This polynomial can be factored in time with the Cantor–Zassenhaus algorithm, which allows to compute all its roots and find
  • case If then aborts. Otherwise, since then This implies that Then necessarily as otherwise this would contradict Hence, can compute

The view of is independent from and hence aborts with probability so that The running time of is at most which concludes the proof.

Theorem 19.7. Assume that the -SDH problem is hard for Then the Ped-KZG scheme for maximal degree is eval-binding. More precisely, for any adversary against the eval-binding security of Ped-KZG for maximal degree there exists an adversary for the -SDH problem running in time similar to the time of and such that

Proof

Let be an adversary against the eval-binding security of the Ped-KZG scheme for maximal degree We construct an adversary for the -SDH problem. gets pairing group parameters and an instance of the -SDH problem. The goal of is to return a pair such that

Adversary randomly chooses between two indistinguishable ways to embed its instance into the parameters. Namely, it draws and proceeds as follows depending on

  • If then draws and runs on public parameters This implicitly sets and
  • If then draws and runs on public parameters This implicitly sets

Assume that is successful and returns a commitment a field element and two valid value/proof pairs and with Then proceeds as follows. First, it checks whether (e.g., by checking whether is equal to the second group element of the parameters If this is the case, then simply picks an arbitrary element and returns From now on, we assume The validity of the two proofs imply that Combining these two equations, we get where we used that

We can now distinguish two cases:

  • case If then aborts. Otherwise, since we have and knows the value such that The equation above yields This implies in particular that as otherwise this would imply Hence, which implies that Thus, can return as solution to the -SDH instance.
  • case If then aborts. Otherwise, since we have and the equation above yields which implies We cannot have as this would imply whereas when is successful. Hence, can compute choose an arbitrary and return as solution to the SDH instance.

The view of is independent from and hence aborts with probability so that The running time of is similar to the running time of which concludes the proof.

Discussion

Efficiency

DL-KZG commitments are extremely succinct and rather cheap to verify: a commitment and a proof take one elliptic curve point each (e.g., 48 bytes when using BLS12-381) and verifying an opening essentially takes two pairings. In case one has to verify many openings for the same commitment, the verification equation (19.1) can be equivalently written where and can be computed once and stored for verifying multiple openings, allowing to trade one pairing for one exponentiation in On the other hand, the size of the commitment key and the complexity of algorithms and are linear in the maximal degree of committed polynomials (which when building SNARKs can be quite large).

Trusted Setup

The secret value drawn by the algorithm must be securely deleted once the commitment key has been set up as it allows to break the evaluation binding property of the scheme. Indeed, knowing given an arbitrary commitment one can open this commitment at any point to any value by computing the proof as Then the verification equation (19.1) is satisfied as

This is quite different from the procedure of the Pedersen commitment scheme, for which it is possible to proceed without ever generating any trapdoor explicitly. There is no (efficient) way known to implement the procedure for KZG without explicitly sampling To the best of my knowledge, there is also no proof that this is impossible. The assumption that this is impossible looks quite similar to many "knowledge of exponent" assumptions, hence the claim that running the KZG setup obliviously of is impossible is presumably true but not provable with known techniques. It is, however, possible to run the setup in a decentralized fashion, ensuring that the process is secure as long as a single party behaves honestly (see for example [NRBB22]).

Note that it is possible to check that the trusted setup yielded public parameters having the correct form, namely that there indeed exists such that Say we are given Let be defined as the discrete log of in base i.e., and set Then has the correct form if and only if for every Indeed, one has the following equivalences:

Summary of KZG Properties

DL-KZGPed-KZG
param. size + +
comt. size
proof size +
poly-hiding---perfect
eval-hidingDL in perfect
poly-binding-co-DL-co-DL
eval-binding-SDH-SDH

Multi-evaluation Proofs

We will see that the DL-KZG scheme can be generalized to allow proving multiple evaluations with one single proof consisting of a single element. This technique can also be applied to Ped-KZG but it is much less interesting since the size of the proof for evaluations is one element plus field elements, hence it grows linearly with

Recall that for a polynomial is equivalent to being divisible by How does this generalize to multiple evaluations?

First, let us recall some vocabulary from the section about Lagrange interpolation. An evaluation domain (or simply domain) of size is a subset of size The vanishing polynomial over is the polynomial defined as A multi-evaluation of size is a subset such that for The evaluation domain associated with is We say that a polynomial satisfies a multi-evaluation if for every

The idea of multi-evaluation proofs relies on the generalized polynomial remainder theorem that we restate here. Let be a multi-evaluation of size and Let be the vanishing polynomial for domain and be the Lagrange interpolation polynomial for i.e., the unique polynomial of degree at most such that for every Then satisfies if and only if divides

For one recovers the standard polynomial remainder theorem since for a single point the vanishing polynomial is and the Lagrange interpolation polynomial is the constant polynomial hence satisfies if and only if divides

Syntax and Security Definition

Let us now see how to adapt the syntax of a PC scheme to accommodate multi-evaluation proofs. Concretely, a PC scheme with multi-evaluation proofs consists of five algorithms: and have the same syntax as for a standard PC scheme, while and are replaced respectively by the following two algorithms:

  • a algorithm which on input parameters a polynomial a decommitment and a tuple of distinct field elements, returns a tuple and a proof

  • a algorithm which on input parameters a commitment a multi-evaluation and a proof returns 1 if is a valid proof that the polynomial committed to by satisfies and 0 otherwise.

The correctness property can be straightforwardly adapted: for every security parameter every every every and every subset the following game capturing the nominal execution of algorithms for multi-evaluation proving must return true with probability 1:

We must modify the security definitions accordingly. The poly-hiding and poly-binding notion are identical to the ones defined for a standard PC scheme. The eval-hiding notion is very similar: one only needs to adapt the oracle so that it may be queried on domains of size larger than 1.

The eval-binding notion requires more care: we still want that no adversary can prove that a polynomial evaluates to two different values and at the same input point however, now the adversary has the freedom to prove this for two different multi-evaluations and with the constraint that and To emphasize the difference, we call this adapted security notion multi-binding. It is defined via the following game.

KZG with Multi-evaluation Proofs: Description

The DL-KZG multi-evaluation PC scheme works as follows:

  • The algorithm, on input the security parameter runs draws random generators and of respectively and draws and returns public parameters

  • The and algorithms are defined exactly as for DL-KZG.

  • The algorithm, on input a commitment key a polynomial and a subset of size computes the polynomials and the group element and returns and the proof

  • The algorithm, on input a verification key a commitment a multi-evaluation of size and a proof computes the polynomials and returns 1 if and 0 otherwise.

Observe that the algorithm must compute and For a multi-evaluation of size has degree at most and has degree Hence, if proofs for multi-evaluations of size at most are to be supported, one can restrict the public parameters to and derive a commitment key and a verification key

Security Proof

The proof of Theorem 19.3 can be adapted to show that DL-KZG is multi-binding under a slightly different assumption called -bilinear strong Diffie-Hellman (-BSDH). This problem is as follows: given compute a pair such that Note that Indeed, given a solution for some SDH instance, one can compute a solution for the corresponding BSDH instance. The converse, though, is not known to hold, so that BSDH is presumably a stronger assumption than SDH.

Theorem 19.8. Assume that the -BSDH problem is hard for Then the DL-KZG multi-evaluation scheme for maximal degree and multi-evaluations of size at most is multi-binding. More precisely, for any adversary against the multi-binding security of DL-KZG, there exists an adversary for the -BSDH problem running in time similar to the time of and such that

Proof

Let be an adversary against the multi-binding security of the DL-KZG scheme for maximal degree We construct an adversary for the -BSDH problem. gets pairing group parameters and an instance of the -BSDH problem. The goal of is to return a pair such that Adversary runs on input parameters Assume that returns a commitment a tuple and two valid multi-evaluations/proof pairs and such that and If (which can verify by checking whether is equal to the second group element of the parameters then simply picks an arbitrary element and returns as solution to the -BSDH instance. From now on, we assume that

Let resp. be the evaluation domain corresponding to resp. Let also resp. be the vanishing polynomial for resp. and resp. be the Lagrange interpolation polynomial for resp. Validity of the two proofs imply that Combining these two equations, we obtain We know that is a root of both and Hence, there are polynomials and such that and We also know that polynomial evaluates to at Hence, by the polynomial remainder theorem, there is a polynomial such that Note that can explicitly compute and Injecting this in the previous equation, we get where for the last equality we used that and Hence, can return where is the right-hand side of the last equation, as solution to the -BSDH instance.

As a sanity check, observe that for a single evaluation ( one has and in which case the last equation simplifies to which allows to solve the -SDH problem and recover Theorem 19.3.

A Practical Use Case

Ethereum is planning to use the KZG polynomial commitment scheme for proto-danksharding. Its properties make it a convenient solution to the data availability problem. A distributed trusted setup is being run at the time of writing.

Additional Resources

There are many resources explaining KZG out there, here are a few:


1: As for standard commitment schemes, the name can vary and this is sometimes called a common reference string (crs) or structured reference string (srs) when it does not consist of random bits and has a specific "shape", as it is the case for KZG.

2: In the seminal paper introducing polynomial commitment schemes [KZG10a], evaluation hiding is simply called hiding.

3: KZG polynomial commitments are often described with a symmetric pairing (i.e., but we define them for an asymmetric pairing as this is the preferred option in practice.

4: Quite often, generators and are standard and specified in public parameters alongside and

5: The polynomial is well-defined by the polynomial remainder theorem.