-
Notifications
You must be signed in to change notification settings - Fork 15
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Signed-off-by: degrigis <[email protected]>
- Loading branch information
Showing
9 changed files
with
243 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
# 🚀 Advanced Topics | ||
|
||
|
||
|
||
# Memory Model | ||
|
||
|
||
|
||
# Chain Together Contracts Execution | ||
|
||
|
||
|
||
|
||
# State Plugins |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,135 @@ | ||
|
||
# 🔥 Core Concepts | ||
|
||
## Overview | ||
greed relies on analyses provided by [Gigahorse](https://github.com/nevillegrech/gigahorse-toolchain), a fantastic analysis framework for EVM-based smart contracts developed by [Dedaub](https://dedaub.com/). While Gigahorse offers precise CFG reconstruction and various out-of-the-box data-flow analyses (such as storage layout partial reconstruction, tainted calls, etc.), it lacks the capability to perform classic symbolic execution of a contract's code. For example, it cannot automatically discover the value of CALLDATA to reach a specific instruction. Additionally, we found accessing the contract's information through Datalog rules (or via the produced artifacts) to be somewhat impractical for creating more complex analyses. This led to the creation of greed! | ||
|
||
When designing greed, we not only wanted to provide a convenient Python wrapper for Gigahorse's results but also offer symbolic execution capabilities similar to the popular [Mythril](https://github.com/Consensys/mythril). Unlike Mythril, we don't offer a one-click solution to discover smart contract vulnerabilities; instead, we provide a powerful and user-friendly smart contract binary analysis platform (those familiar with [angr](https://github.com/angr/angr) will find many similarities with the project). That being said, one can implement all the analyses offered by Mythril on top of greed. | ||
|
||
We believe that greed provides a more flexible way to prototype research tools and smart contract analyses compared to its competitors. This is accomplished through a ([angr](https://github.com/angr/angr)-inspired) modular design of the symbolic executor, the inherited precision of the CFG analysis offered by [Gigahorse](https://github.com/nevillegrech/gigahorse-toolchain), and the conciseness of the TAC-IR that strips away many unnecessary complexity of the stack-based design of the EVM. | ||
|
||
To use greed, it is first necessary to analyze a target contract binary with Gigahorse, obtaining all the decompilation artifacts and CFG reconstruction. Please refer to the [quick start](/quickstart) to do so. | ||
|
||
## Three Address Code | ||
Our symbolic executor, greed, is based on [Gigahorse](https://github.com/nevillegrech/gigahorse-toolchain)'s T(hree) A(ddress) C(ode) intermediate representation. | ||
This is a convenient IR that transforms the (EVM) stack-based opcodes into a register based representation. | ||
|
||
For instance, the classic function `transferFrom` in TAC would look like this: | ||
|
||
```python | ||
function transferFrom(address,address,uint256)() public { | ||
Begin block 0x1ca | ||
prev=[], succ=[0x1d1, 0x1d5] | ||
================================= | ||
0x1cb: v1cb = CALLVALUE | ||
0x1cc: v1cc = ISZERO v1cb | ||
0x1cd: v1cd(0x1d5) = CONST | ||
0x1d0: JUMPI v1cd(0x1d5), v1cc | ||
|
||
Begin block 0x1d1 | ||
prev=[0x1ca], succ=[] | ||
================================= | ||
0x1d1: v1d1(0x0) = CONST | ||
0x1d4: REVERT v1d1(0x0), v1d1(0x0) | ||
... | ||
``` | ||
|
||
One thing to keep in mind when using greed is: every variable defined by the TAC IR is becoming a virtual register. For instance, when symbolic executing the `transferFrom` function, you can access to content of the variable `v1cb` like this: | ||
|
||
```python | ||
>>> state.registers['v1cb'] | ||
``` | ||
To access the value returned by the opcode `CALLVALUE`. | ||
|
||
|
||
## Project | ||
To create a greed Project you should just: | ||
|
||
```python | ||
>>> from greed import Project | ||
>>> proj = Project(target_dir=target_dir) | ||
``` | ||
|
||
Where `target_dir` is the path to the folder holding the Gigahorse analyses artifacts. | ||
The project will import in greed most of the analyses' results generated by Gigahorse (e.g., CFG), and will make them available through handy fields. | ||
|
||
## Basics | ||
|
||
The `Project` object offers some attributes to inspect the contract under analysis. | ||
|
||
|
||
```python | ||
>>> proj.block_at # dictionary of all the basic block in the contract | ||
>>> proj.callgraph # callgraph of the entire contract code | ||
>>> proj.code # raw bytecode | ||
>>> proj.statement_at # dictionary of all the (TAC) opcode in the contract | ||
``` | ||
|
||
### CFG | ||
|
||
Gigahorse provides state-of-the-art CFG reconstructon for smart contract binaries. This information is automatically imported and available via the greed's API. | ||
Every basic block in the CFG is terminated by an unconditional jump (i.e., `JUMP`), a conditional jump (i.e., `JUMPI`) or a terminating opcode (`STOP`/`REVERT`/`RETURN`). | ||
|
||
To access the CFG information of specific functions in the contract you can do the following: | ||
|
||
```python | ||
>>> proj.function_at['0x0'].cfg | ||
>>> proj.function_at['0x0'].cfg.dump("mycfg.dot") # this will dump the CFG in .dot format | ||
>>> proj.function_at['0x0'].cfg.bbs # this will return a list of block objects belonging to the CFG | ||
``` | ||
|
||
For instance, to access a specific `Statement` or `Block` information: | ||
|
||
```python | ||
>>> stmt = proj.statement_at['0x1e'] | ||
>>> block = proj.block_at['0x21'] | ||
``` | ||
|
||
The `Statement` and `Block` objects, once fetched, offer [numerous other attributes]() for advanced inspection. | ||
|
||
### State | ||
Similary to angr, a SimState (i.e., `State`) represents the state of the smart contract at a specific opcode. You can imagine a State as a snapshot of the execution at a specific program location. | ||
|
||
You can create your first state from a project using the `Factory` object: | ||
|
||
```python | ||
>>> first_state = proj.factory.entry_state(xid=1) # the xid represent an unique identifier for the current symbolic execution. | ||
``` | ||
|
||
When no initial context is specified, **every** environment variables (e.g., balance, calldata, calldata size, block, etc...) are gonna be set to a symbolic variable. | ||
Conversely, you can pass an *initial context* by passing a dictionary and specify some concrete values: | ||
|
||
```python | ||
>>> ctx = { | ||
"CALLDATA" : "0x001122330099aabb77ff00", | ||
"CALLDATASIZE" : 11, | ||
"CALLER" : "0x6b6Ae9eDA977833B379f4b49655124b4e5c64086", | ||
"ORIGIN" : "0x6b6Ae9eDA977833B379f4b49655124b4e5c64086", | ||
"ADDRESS" : "0x1f9840a85d5aF5bf1D1762F925BDADdC4201F984", | ||
} | ||
>>> entry_state = proj.factory.entry_state(xid=1,init_ctx=ctx) | ||
``` | ||
|
||
This will set the the CALLDATA for the symbolic execution to the value `0x001122330099aabb77ff00`, a CALLDATASIZE of 11 bytes, `ADDRESS` of the currently executing contract to `0x1f9840a85d5aF5bf1D1762F925BDADdC4201F984` and `CALLER`/`ORIGIN` to `0x6b6Ae9eDA977833B379f4b49655124b4e5c64086`. | ||
The full list of possible concrete values that an user can pass is showed [here](). | ||
|
||
A state offers many API for its inspection: | ||
|
||
```python | ||
>>> first_state.curr_stmt # print the current statement ready for execution | ||
>>> first_state.revert # weather the state reverted or not | ||
>>> first_state.instruction_count # how many instructions have been executed up to this point | ||
>>> first_state.pc # current program counter | ||
>>> first_state.registers # overview of all the virtual registers defined during the execution up to this point | ||
``` | ||
|
||
### Symbolic Execution | ||
Symbolic execution is the process of progressing SimState(s) and propagating the symbolic variables according to the program's execution. | ||
Imagine, for instance, that the condition of a `JUMPI` contains a symbolic variable X. | ||
If the condition checks for `X != 0`, and the constraints over the variable X make it possible for `X != 0` and `X == 0`, greed will create two different State(s), one at the location specified by the JUMPI (adding a constraint of `X != 0` to the state), and one at the fallthrough (adding a constraint of `X == 0`). | ||
|
||
|
||
### Simulation Manager | ||
The simulation manager is a concept borrowed from angr. It is the starting point for beginning symbolic execution and orchestrates the progression of the SimState(s) according to the specified `Exploration Techniques`. | ||
|
||
### Solver |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# 👥 Credits |
Empty file.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
|
||
|
||
# ?FAQ # | ||
|
||
|
||
#### Whaaat |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# Welcome to greed's documentation! | ||
|
||
Welcome to `greed`'s documentation! This documentation is intended to be a quick guide to use greed: our symbolic execution engine for EVM smart contract binaries. | ||
|
||
If `greed` makes you think of `angr`, is because we are basically from the same research group (UCSB Seclab). | ||
Some of the folks developing `greed` are also `angr`'s contributor, thus, many of the choices in this project are inspired by `angr`. | ||
|
||
|
Empty file.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
|
||
# ⚡️ Quick Start | ||
|
||
|
||
## Installing from Source | ||
|
||
???+ note | ||
We strongly suggest using a python virtual environment rather than installing | ||
greed globally. Doing so reduces dependency conflicts and aids in | ||
reproducibility while debugging. Some popular tools that accomplish this | ||
include [virtualenv](https://virtualenv.pypa.io/en/latest/) and [virtualenvwrapper](https://virtualenvwrapper.readthedocs.io/en/latest/). | ||
|
||
|
||
```bash | ||
git clone [email protected]:ucsb-seclab/greed.git | ||
mkvirtualenv greed | ||
workon greed | ||
./setup.sh | ||
``` | ||
|
||
|
||
## Usage | ||
First, the contract needs to be pre-processed with gigahorse. This can be done in two ways: | ||
|
||
```bash | ||
# Analyses will pollute the current working directory | ||
mkdir /tmp/test_contract | ||
cd /tmp/test_contract/ | ||
|
||
# OPTION 1: From the solidity source | ||
cp <contract_source> contract.sol | ||
analyze_source.sh contract.sol | ||
|
||
# OPTION 2: From the contract bytecode | ||
cp <contract_bytecode> contract.hex | ||
analyze_contract_hex.sh contract.hex | ||
``` | ||
|
||
The bytecode analyses should work on any system where gigahorse can be properly compiled. | ||
|
||
|
||
## Troubleshooting | ||
Any Common problem? | ||
|
||
|
||
|
||
## Reporting Bugs | ||
How to report bugs 🐞 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
site_name: greed - A symbolic execution engine for EVM smart contract binaries | ||
theme: | ||
name: readthedocs | ||
features: | ||
- navigation.instant | ||
- navigation.tabs | ||
markdown_extensions: | ||
- tables | ||
- mkdocs-click | ||
- pymdownx.highlight: | ||
anchor_linenums: true | ||
line_spans: __span | ||
pygments_lang_class: true | ||
- pymdownx.inlinehilite | ||
- pymdownx.snippets | ||
- pymdownx.superfences | ||
- admonition | ||
- pymdownx.details | ||
- pymdownx.superfences | ||
|
||
nav: | ||
- Home: index.md | ||
- QuickStart: quickstart.md | ||
- Core Concepts: core.md | ||
- Advanced Topics: advanced.md | ||
- Options: options.md | ||
- Examples: examples.md | ||
- Credits: credits.md | ||
|
||
- FAQ: faq.md | ||
|