Skip to content

Commit

Permalink
added docs
Browse files Browse the repository at this point in the history
Signed-off-by: degrigis <[email protected]>
  • Loading branch information
degrigis committed Dec 16, 2023
1 parent 57d5018 commit 775a5a6
Show file tree
Hide file tree
Showing 9 changed files with 243 additions and 0 deletions.
14 changes: 14 additions & 0 deletions docs/docs/advanced.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
# 🚀 Advanced Topics



# Memory Model



# Chain Together Contracts Execution




# State Plugins
135 changes: 135 additions & 0 deletions docs/docs/core.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,135 @@

# 🔥 Core Concepts

## Overview
greed relies on analyses provided by [Gigahorse](https://github.com/nevillegrech/gigahorse-toolchain), a fantastic analysis framework for EVM-based smart contracts developed by [Dedaub](https://dedaub.com/). While Gigahorse offers precise CFG reconstruction and various out-of-the-box data-flow analyses (such as storage layout partial reconstruction, tainted calls, etc.), it lacks the capability to perform classic symbolic execution of a contract's code. For example, it cannot automatically discover the value of CALLDATA to reach a specific instruction. Additionally, we found accessing the contract's information through Datalog rules (or via the produced artifacts) to be somewhat impractical for creating more complex analyses. This led to the creation of greed!

When designing greed, we not only wanted to provide a convenient Python wrapper for Gigahorse's results but also offer symbolic execution capabilities similar to the popular [Mythril](https://github.com/Consensys/mythril). Unlike Mythril, we don't offer a one-click solution to discover smart contract vulnerabilities; instead, we provide a powerful and user-friendly smart contract binary analysis platform (those familiar with [angr](https://github.com/angr/angr) will find many similarities with the project). That being said, one can implement all the analyses offered by Mythril on top of greed.

We believe that greed provides a more flexible way to prototype research tools and smart contract analyses compared to its competitors. This is accomplished through a ([angr](https://github.com/angr/angr)-inspired) modular design of the symbolic executor, the inherited precision of the CFG analysis offered by [Gigahorse](https://github.com/nevillegrech/gigahorse-toolchain), and the conciseness of the TAC-IR that strips away many unnecessary complexity of the stack-based design of the EVM.

To use greed, it is first necessary to analyze a target contract binary with Gigahorse, obtaining all the decompilation artifacts and CFG reconstruction. Please refer to the [quick start](/quickstart) to do so.

## Three Address Code
Our symbolic executor, greed, is based on [Gigahorse](https://github.com/nevillegrech/gigahorse-toolchain)'s T(hree) A(ddress) C(ode) intermediate representation.
This is a convenient IR that transforms the (EVM) stack-based opcodes into a register based representation.

For instance, the classic function `transferFrom` in TAC would look like this:

```python
function transferFrom(address,address,uint256)() public {
Begin block 0x1ca
prev=[], succ=[0x1d1, 0x1d5]
=================================
0x1cb: v1cb = CALLVALUE
0x1cc: v1cc = ISZERO v1cb
0x1cd: v1cd(0x1d5) = CONST
0x1d0: JUMPI v1cd(0x1d5), v1cc

Begin block 0x1d1
prev=[0x1ca], succ=[]
=================================
0x1d1: v1d1(0x0) = CONST
0x1d4: REVERT v1d1(0x0), v1d1(0x0)
...
```

One thing to keep in mind when using greed is: every variable defined by the TAC IR is becoming a virtual register. For instance, when symbolic executing the `transferFrom` function, you can access to content of the variable `v1cb` like this:

```python
>>> state.registers['v1cb']
```
To access the value returned by the opcode `CALLVALUE`.


## Project
To create a greed Project you should just:

```python
>>> from greed import Project
>>> proj = Project(target_dir=target_dir)
```

Where `target_dir` is the path to the folder holding the Gigahorse analyses artifacts.
The project will import in greed most of the analyses' results generated by Gigahorse (e.g., CFG), and will make them available through handy fields.

## Basics

The `Project` object offers some attributes to inspect the contract under analysis.


```python
>>> proj.block_at # dictionary of all the basic block in the contract
>>> proj.callgraph # callgraph of the entire contract code
>>> proj.code # raw bytecode
>>> proj.statement_at # dictionary of all the (TAC) opcode in the contract
```

### CFG

Gigahorse provides state-of-the-art CFG reconstructon for smart contract binaries. This information is automatically imported and available via the greed's API.
Every basic block in the CFG is terminated by an unconditional jump (i.e., `JUMP`), a conditional jump (i.e., `JUMPI`) or a terminating opcode (`STOP`/`REVERT`/`RETURN`).

To access the CFG information of specific functions in the contract you can do the following:

```python
>>> proj.function_at['0x0'].cfg
>>> proj.function_at['0x0'].cfg.dump("mycfg.dot") # this will dump the CFG in .dot format
>>> proj.function_at['0x0'].cfg.bbs # this will return a list of block objects belonging to the CFG
```

For instance, to access a specific `Statement` or `Block` information:

```python
>>> stmt = proj.statement_at['0x1e']
>>> block = proj.block_at['0x21']
```

The `Statement` and `Block` objects, once fetched, offer [numerous other attributes]() for advanced inspection.

### State
Similary to angr, a SimState (i.e., `State`) represents the state of the smart contract at a specific opcode. You can imagine a State as a snapshot of the execution at a specific program location.

You can create your first state from a project using the `Factory` object:

```python
>>> first_state = proj.factory.entry_state(xid=1) # the xid represent an unique identifier for the current symbolic execution.
```

When no initial context is specified, **every** environment variables (e.g., balance, calldata, calldata size, block, etc...) are gonna be set to a symbolic variable.
Conversely, you can pass an *initial context* by passing a dictionary and specify some concrete values:

```python
>>> ctx = {
"CALLDATA" : "0x001122330099aabb77ff00",
"CALLDATASIZE" : 11,
"CALLER" : "0x6b6Ae9eDA977833B379f4b49655124b4e5c64086",
"ORIGIN" : "0x6b6Ae9eDA977833B379f4b49655124b4e5c64086",
"ADDRESS" : "0x1f9840a85d5aF5bf1D1762F925BDADdC4201F984",
}
>>> entry_state = proj.factory.entry_state(xid=1,init_ctx=ctx)
```

This will set the the CALLDATA for the symbolic execution to the value `0x001122330099aabb77ff00`, a CALLDATASIZE of 11 bytes, `ADDRESS` of the currently executing contract to `0x1f9840a85d5aF5bf1D1762F925BDADdC4201F984` and `CALLER`/`ORIGIN` to `0x6b6Ae9eDA977833B379f4b49655124b4e5c64086`.
The full list of possible concrete values that an user can pass is showed [here]().

A state offers many API for its inspection:

```python
>>> first_state.curr_stmt # print the current statement ready for execution
>>> first_state.revert # weather the state reverted or not
>>> first_state.instruction_count # how many instructions have been executed up to this point
>>> first_state.pc # current program counter
>>> first_state.registers # overview of all the virtual registers defined during the execution up to this point
```

### Symbolic Execution
Symbolic execution is the process of progressing SimState(s) and propagating the symbolic variables according to the program's execution.
Imagine, for instance, that the condition of a `JUMPI` contains a symbolic variable X.
If the condition checks for `X != 0`, and the constraints over the variable X make it possible for `X != 0` and `X == 0`, greed will create two different State(s), one at the location specified by the JUMPI (adding a constraint of `X != 0` to the state), and one at the fallthrough (adding a constraint of `X == 0`).


### Simulation Manager
The simulation manager is a concept borrowed from angr. It is the starting point for beginning symbolic execution and orchestrates the progression of the SimState(s) according to the specified `Exploration Techniques`.

### Solver
1 change: 1 addition & 0 deletions docs/docs/credits.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
# 👥 Credits
Empty file added docs/docs/examples.md
Empty file.
6 changes: 6 additions & 0 deletions docs/docs/faq.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@


# ?FAQ #


#### Whaaat
8 changes: 8 additions & 0 deletions docs/docs/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# Welcome to greed's documentation!

Welcome to `greed`'s documentation! This documentation is intended to be a quick guide to use greed: our symbolic execution engine for EVM smart contract binaries.

If `greed` makes you think of `angr`, is because we are basically from the same research group (UCSB Seclab).
Some of the folks developing `greed` are also `angr`'s contributor, thus, many of the choices in this project are inspired by `angr`.


Empty file added docs/docs/options.md
Empty file.
48 changes: 48 additions & 0 deletions docs/docs/quickstart.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@

# ⚡️ Quick Start


## Installing from Source

???+ note
We strongly suggest using a python virtual environment rather than installing
greed globally. Doing so reduces dependency conflicts and aids in
reproducibility while debugging. Some popular tools that accomplish this
include [virtualenv](https://virtualenv.pypa.io/en/latest/) and [virtualenvwrapper](https://virtualenvwrapper.readthedocs.io/en/latest/).


```bash
git clone [email protected]:ucsb-seclab/greed.git
mkvirtualenv greed
workon greed
./setup.sh
```


## Usage
First, the contract needs to be pre-processed with gigahorse. This can be done in two ways:

```bash
# Analyses will pollute the current working directory
mkdir /tmp/test_contract
cd /tmp/test_contract/

# OPTION 1: From the solidity source
cp <contract_source> contract.sol
analyze_source.sh contract.sol

# OPTION 2: From the contract bytecode
cp <contract_bytecode> contract.hex
analyze_contract_hex.sh contract.hex
```

The bytecode analyses should work on any system where gigahorse can be properly compiled.


## Troubleshooting
Any Common problem?



## Reporting Bugs
How to report bugs 🐞
31 changes: 31 additions & 0 deletions docs/mkdocs.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
site_name: greed - A symbolic execution engine for EVM smart contract binaries
theme:
name: readthedocs
features:
- navigation.instant
- navigation.tabs
markdown_extensions:
- tables
- mkdocs-click
- pymdownx.highlight:
anchor_linenums: true
line_spans: __span
pygments_lang_class: true
- pymdownx.inlinehilite
- pymdownx.snippets
- pymdownx.superfences
- admonition
- pymdownx.details
- pymdownx.superfences

nav:
- Home: index.md
- QuickStart: quickstart.md
- Core Concepts: core.md
- Advanced Topics: advanced.md
- Options: options.md
- Examples: examples.md
- Credits: credits.md

- FAQ: faq.md

0 comments on commit 775a5a6

Please sign in to comment.