Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Halo2 Soundness Bug Examples #15

Closed
jae-cuz opened this issue Oct 16, 2023 · 9 comments
Closed

Proposal: Halo2 Soundness Bug Examples #15

jae-cuz opened this issue Oct 16, 2023 · 9 comments
Assignees
Labels
Application Proposal Proposal submitted by applicants Revoked This application/grant has suspended

Comments

@jae-cuz
Copy link

jae-cuz commented Oct 16, 2023

General Grant Proposal

Project Overview 📄

Overview

Establish a repository featuring small halo2 circuits, each presenting a unique soundness bug

Project Details

  • Halo2, Soundness Bug, Documentation
  • Examples would show possible soundness bugs, guiding users to avoid them.
  • Reference

Team 👥

Team members

  • Names of team members: Jaewon In
  • Email: [email protected]
  • Telegram handle: jae-cuz
  • Discord handle: jae-cuz

Team Website

Team's experience

PSE Summer Contribution Program

Team Code Repos

Development Roadmap 🔩

This section should break out the development roadmap into a number of milestones. Since the milestones will appear in the grant contract, it helps to describe the functionality we should expect, plus how we can check that such functionality exists.

Below we provide an example roadmap. We recommend that the scope of the work can fit within a 3 month period and that teams structure their roadmap as 2 weeks = 1 milestone.

For each milestone:

  • Please be sure to include a specification of the software. The level of detail must be enough so that we are able to verify that the software meets the specification.
  • Please include total amount of funding requested per milestone.
  • Please note that we require documentation (e.g. tutorials, API specifications, architecture details) in each milestone. This ensures that the code can be widely used by the community.
  • Please provide a test suite, comprising unit and integration tests, along with a guide on how to run these.
  • Please indicate the milestone duration, as well as number of Full-Time Employees working on each milestone, and include the number of days along with their cost per day.

Overview

  • Total Estimated Duration: 6 weeks
  • Full-time equivalent (FTE): 0.5
  • Expected Start Date: Dec 18 2023
  • Expected End Date: Jan 28 2024

Milestone 1 Under-constrained Circuits

  • Estimated Duration: 2 weeks
  • FTE: 0.5
  • Estimated delivery date: Dec 31 2023

Deliverables and Specifications

1. Write vulnerable circuits & exploitation
  • Example vulnerable circuit with identifier mistakes
  • Example vulnerable circuit with logically missing constraints (ex: factoring circuit without constraing the input to be not one)
2. Write mitigations
  • Fix with correctly used identifier
  • Fix with sound constraints
3. Documentation
  • Explain the vulnerability in detail, showing that malicious prover can pass the verification.
  • Suggest mitigation technique using tools for detecting unused identifier, checking constraints, etc.
  • Explain about the role of Config, Chip, and Circuit
  • Suggest mitigation technique using correct template (avoiding copy and paste)

Milestone 2 Assigned but not Constrained

  • Estimated Duration: 2 weeks
  • FTE: 0.5
  • Estimated delivery date: Jan 14 2024

Deliverables and Specifications

1. Write vulnerable circuits & exploitation
  • Example vulnerable circuit where assigned but not constrained
2. Write mitigations
  • Fix with correct constraints
3. Documentation
  • Explain required mathematical background.
  • Explain the vulnerability in detail, showing that malicious prover can pass the verification.
  • Suggest mitigation technique using tools, as well as negative testing.

Milestone 3 Arithmetic Over/Underflow

  • Estimated Duration: 1 week
  • FTE: 0.5
  • Estimated delivery date: Jan 21 2024

Deliverables and Specifications

1. Write vulnerable circuits & exploitation
  • Example vulnerable circuit with arithmetic over/underflow
2. Write mitigations
  • Fix with constraints for over/underflow
3. Documentation
  • Explain the vulnerability in detail, showing that malicious prover can pass the verification.
  • Show real-world bugs
  • Suggest mitigation technique using tools, as well as negative testing.

Milestone 4 Review & Further

  • Estimated Duration: 1 week
  • FTE: 0.5
  • Estimated delivery date: Jan 28 2024

Deliverables and Specifications

1. Review codes and documents
  • Check typos and mistakes
  • Rewrite for any misunderstandings
2. Further research
  • Search for any other vulnerability categories
  • Add documents if a vulnerability was not suited for writing an example

Additional Information ➕

References

@NOOMA-42 NOOMA-42 added the Application Proposal Proposal submitted by applicants label Oct 17, 2023
@NOOMA-42
Copy link
Collaborator

looking for an evaluator for you

@NOOMA-42
Copy link
Collaborator

@ed255 Would you kindly review this proposal?

@ed255
Copy link
Member

ed255 commented Oct 19, 2023

Hi @jae-cuz!
I'm happy to see that you're interested in working on this!
Considering that this is a short period project I would ask that you list which vulnerabilities you plan to tackle. That would make the proposal more specific on its scope, and also help us evaluate if the amount of work fits in the timeline. I'm sure you've already taken a look at https://github.com/0xPARC/zk-bug-tracker#common-vulnerabilities-header where you can find a collection of zk circuit vulnerabilities. Some of those examples where found in non-halo2 circuits, but they may apply to halo2 circuits.

Aside from that, I would recommend to rearrange the roadmap like the following:

  • Identify all the work needed related to a single vulnerability (writing the vulnerable circuit, writing the exploit, writing the mitigation, document it, etc)
  • Then arrange the milestones as the set of vulnerabilities that you'll work on in each milestone; were all the work related to the vulnerabilities in scope would be completed by the end of each the milestone.

This is just a suggestion. I think this approach would help us review the process in the middle of the roadmap much better; and also at any point in time as output you would have complete vulnerability examples with full documentation.
Another benefit of this rearrangement is that it becomes more flexible: Imagine you plan to tackle 7 vulnerabilities, but it takes more time than expected: then you could spend a bit more time on each one and do 5 in total and still get a nice output. In your current plan, if you want to stay in the time scope and something takes more time, you could end up not finishing the documentation part.

@jae-cuz
Copy link
Author

jae-cuz commented Oct 19, 2023

@ed255 Thanks for your review!
Your suggestion was nice and I'm happy to apply them :) Please check my updated proposal.

Also, I wonder if you have any more vulnerability categories other than my list to suggest? Thanks!

@ed255
Copy link
Member

ed255 commented Oct 24, 2023

@ed255 Thanks for your review! Your suggestion was nice and I'm happy to apply them :) Please check my updated proposal.

The current proposal looks good to me! Thanks for updating it according to my feedback :)

Also, I wonder if you have any more vulnerability categories other than my list to suggest? Thanks!

Right now no more categories come to my mind, I think the current list looks good! What I would say is that the category "Under-constrained Circuits" can actually be very big! There are many different kind of situations that fall into under-constrained circuits. If you'd like to get more ideas of examples that fall into "under-constrained circuits" I would recommend to search for public security audits on projects built on top of halo2 (like this https://github.com/nullity00/zk-security-reviews/tree/main/Zcash but hopefully there are more). From those audits you can try find more examples. But overall I think your current list is good :)

@NOOMA-42
Copy link
Collaborator

@ed255 Thanks for the review,
@jae-cuz I will take care of the remaining operation matters. Please wait for further update

@NOOMA-42
Copy link
Collaborator

NOOMA-42 commented Oct 27, 2023

@jae-cuz
I've removed the pricing rate from proposal. Pricing rate will be processed internally and will not be revealed to public.

@adrianmcli
Copy link
Collaborator

This looks good to me!

@miha-stopar
Copy link

miha-stopar commented Dec 5, 2023

  1. One possible under-constrained circuit might be:
  • having two selectors z1 and z2 which should be boolean
  • when z1 = 1, the variable a needs to be some fixed value, let’s say 5
  • when z2 = 1, the variable b needs to be some fixed value, let’s say 7

The constraints might look like:

z1 * (z1 - 1) = 0
z2 * (z2 - 1) = 0
z1 * (a - 5) + z2 * (b - 7) = 0

The first two constraints ensure that z1 and z2 are boolean. The third constraint ensures that:

  • in z1 case, we have a = 5
  • in z2 case, we have b = 7

However, this circuit is under-constrained, because one could set z1 = 0, z2 = 0 and then the variables a and b can be set to any value. The missing constraint is:

z1 + z2 = 1

As a sidenote, we might have only one selector z and then the following constraints would suffice:

z * (z - 1) = 0
z * (a - 5) + (1 - z) * (b - 7) = 0

However, in some circuits we indeed have multiple selectors (like z1, z2 above), for example in the ZKEVM MPT proof we have boolean selectors to specify the proof type (whether it is NonceModification, BalanceModification, StorageModification…).

  1. Another possible under-constrained circuit might be:
  • we have an XML file with the following row specifying the price: <p>81</p> or <p>234</p>
  • the circuit needs to ensure that the price is either 81 or 112
  • let’s say we have an array of bytes representing the row: r
  • we have an additional witness in the boolean variable z that specify whether it’s the first case or the second

The constraints could be:

z * (z - 1) = 0
r[0] = <            // < should be translated into a byte
r[1] = p            // p should be translated into a byte
r[2] = >            // > should be translated into a byte
z * (r[4] - 8) + (1 - z) (r[4] - 2) = 0
z * (r[5] - 1) + (1 - z) (r[5] - 3) = 0
(1 - z) * (r[6] - 4) = 0

This is under-constrained because the constraints for the closing tag </p> are missing - one could set for example <p>812</p> or <p>814444</p> instead of <p>81</p>.

  1. Let’s have a similar scenario as in 2.
  • we have a file where one particular row need to be either abcd or dfga
  • let’s say we have an array of bytes representing this row: r
  • we have an additional witness in the boolean variable z that specify whether it’s the first case or the second

Instead of checking the row byte by byte (one constraint per each byte), we can use a linear combination. Let’s say we have a value rv and we define:

lc1 = a * rv + b * rv^2 + c * rv^3 + d * rv^4    // a should be a byte corresponding to the character “a”, …
lc2 = d * rv + f * rv^2 + g * rv^3 + a * rv^4    // a should be a byte corresponding to the character “a”, …

The constraints would be:

z * (z - 1) = 0
z * (r[0] * rv + r[1] * rv^2 + r[2] * rv^3 + r[3] * rv^4 - lc1) + (1 - z) * (r[0] * rv + r[1] * rv^2 + r[2] * rv^3 + r[3] * rv^4 - lc2)

However, if the value rv is not random (it is known ahead), one could easily set the bytes in r to obtain
lc1 = r[0] * rv + r[1] * rv^2 + r[2] * rv^3 + r[3] * rv^4. For example, one could simply set the first byte of r to be lc1/rv and the remaining bytes to be 0. So it’s essential to use a proper random value rv.

@NOOMA-42 NOOMA-42 added the Revoked This application/grant has suspended label Apr 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Application Proposal Proposal submitted by applicants Revoked This application/grant has suspended
Projects
None yet
Development

No branches or pull requests

5 participants