Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guide to implementing a new runner + Python runner for PArSEC #241

Open
wants to merge 8 commits into
base: trunk
Choose a base branch
from

Conversation

maurermi
Copy link
Collaborator

@maurermi maurermi commented Dec 3, 2023

This is a Python-based runner and a corresponding guide for adding a new runner. The goal of this extension is to make development of new runner environments more straightforward.

Also embedded here are a script for running the Lua bench test, a ticket logging mechanism for reporting the status of tickets seen by a broker, and a get_row utility for reading directly from the shards. These could be split out into a separate PR if helpful.

Note (for reviewers): This introduces a dependency on Python developer tools. I think it's worth considering making this a separate branch or repository as a result.

@maurermi maurermi force-pushed the generic_runner_final branch 2 times, most recently from 98f2885 to 995709b Compare December 3, 2023 19:01
@maurermi
Copy link
Collaborator Author

maurermi commented Dec 4, 2023

Will address the linting errors shortly

@maurermi maurermi force-pushed the generic_runner_final branch 6 times, most recently from 230f402 to 24f860e Compare December 14, 2023 22:05
Adds a tool to get information directly from the
shards, for debug and testing purposes.

Signed-off-by: Michael Maurer <[email protected]>
Signed-off-by: Michael Maurer <[email protected]>
Signed-off-by: Michael Maurer <[email protected]>
Signed-off-by: Michael Maurer <[email protected]>
+ Includes linting fixes

Signed-off-by: Michael Maurer <[email protected]>
Signed-off-by: Michael Maurer <[email protected]>
@@ -0,0 +1,33 @@
def pay(pay, balance1, balance2):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type hints won't forcibly typecast or error out but helps for reading the code

Suggested change
def pay(pay, balance1, balance2):
def pay(pay: int, balance1: int, balance2: int) -> [int, int]:

pay = int(pay)
if(pay > balance1):
pay = 0
balance1 = balance1 - pay
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
balance1 = balance1 - pay
balance1 -= pay

if(pay > balance1):
pay = 0
balance1 = balance1 - pay
balance2 = balance2 + pay
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
balance2 = balance2 + pay
balance2 += pay

Comment on lines +2 to +4
balance1 = int(balance1)
balance2 = int(balance2)
pay = int(pay)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[optional] could compact it by assigning on one line. Just style preference

Suggested change
balance1 = int(balance1)
balance2 = int(balance2)
pay = int(pay)
balance1, balance2, pay = int(balance1), int(balance2), int(pay)

balance1 = int(balance1)
balance2 = int(balance2)
pay = int(pay)
if(pay > balance1):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spacing after the if is preferable for readability

Suggested change
if(pay > balance1):
if (pay > balance1):

Comment on lines +11 to +18
def accrueInterest(rate, balance1):
balance1 = int(balance1)
rate = float(rate)
print("Balance 1:", balance1)
print("Rate:", rate*100, "%")
balance1 = int(balance1 * (2.71828**rate))
print("Balance 1:", balance1)
return [balance1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would write it like this. Could add a "Balance 1 after rate:" print after the conversion has been applied.

Suggested change
def accrueInterest(rate, balance1):
balance1 = int(balance1)
rate = float(rate)
print("Balance 1:", balance1)
print("Rate:", rate*100, "%")
balance1 = int(balance1 * (2.71828**rate))
print("Balance 1:", balance1)
return [balance1]
def accrueInterest(rate: float, balance1: int) -> list[int]:
rate, balance1 = float(rate), int(balance1)
print(f"Balance 1: {balance1}")
print(f"Rate: {rate*100}%")
balance1 = int(balance1 * (2.71828**rate))
print(f"Balance 1: {balance1}")
return [balance1]

Comment on lines +20 to +32
def pay2(pay1, pay2, balance0, balance1, balance2):
balance0 = int(balance0)
balance1 = int(balance1)
balance2 = int(balance2)
pay1 = int(pay1)
pay2 = int(pay2)
if(pay1 + pay2 > balance0 and pay1 > 0 and pay2 > 0):
pay1 = 0
pay2 = 0
balance0 = balance0 - pay1 - pay2
balance1 = balance1 + pay1
balance2 = balance2 + pay2
return [balance0, balance1, balance2]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def pay2(pay1, pay2, balance0, balance1, balance2):
balance0 = int(balance0)
balance1 = int(balance1)
balance2 = int(balance2)
pay1 = int(pay1)
pay2 = int(pay2)
if(pay1 + pay2 > balance0 and pay1 > 0 and pay2 > 0):
pay1 = 0
pay2 = 0
balance0 = balance0 - pay1 - pay2
balance1 = balance1 + pay1
balance2 = balance2 + pay2
return [balance0, balance1, balance2]
def pay2(pay1, pay2, balance0, balance1, balance2) -> list[int, int, int]:
balance0, balance1, balance2 = int(balance0), int(balance1), int(balance2)
if (pay1 + pay2 > balance0) and (min(pay1, pay2) > 0):
pay1, pay2 = 0, 0
balance0 -= (pay1 + pay2)
balance1 += pay1
balance2 += pay2
return [balance0, balance1, balance2]

@@ -0,0 +1,115 @@
# assumes all return types are strings
def parseContract(file, funcName):
import re
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Best to put imports at the top before functions

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I included the import here instead of above is because python functions which are to be converted to smart contracts need to have their imports defined within the actual function, so this preserves that stylistic behavior. In this particular instance however the import can be defined outside of the function, so I would be happy to make that change.

Further, imports need to be specified in the C++ scope (see runners/py/pyutil.cpp:20) which is interesting to note, because it allows for the system designer to restrict what python libraries are available for smart contracts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, good context, thanks for explaining the logic

Comment on lines +1 to +14
#!/bin/bash

# A sample script for running the Lua bench test

IP="localhost"
PORT="8889"
N_WALLETS=2

./build/tools/bench/parsec/lua/lua_bench --component_id=0 \
--ticket_machine0_endpoint=$IP:7777 --ticket_machine_count=1 \
--shard_count=1 --shard0_count=1 --shard00_endpoint=$IP:5556 \
--agent_count=1 --agent0_endpoint=$IP:$PORT \
--loglevel=TRACE scripts/gen_bytecode.lua $N_WALLETS
echo done
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs rebasing for #253

@@ -41,7 +41,7 @@ fi

if [[ "$OSTYPE" == "linux-gnu"* ]]; then
apt update
apt install -y build-essential wget cmake libgtest-dev libbenchmark-dev lcov git software-properties-common rsync unzip
apt install -y build-essential wget cmake libgtest-dev libbenchmark-dev lcov git software-properties-common rsync unzip python3-dev
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified that this works. It installed python 3.10 in my Ubuntu environment, but in Dockerfile at root dir, it is specified that we use python 3.8 ENV PY_VER=python3.8 (line 30)

@@ -27,6 +27,8 @@ FROM $BASE_IMAGE AS builder
# Copy source
COPY . .

ENV PY_VER=python3.8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 3.8 instead of newer such as 3.10+ ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since #252 was merged, we require python 3.10 in-practice anyway (in terms of what's shipped by the image our docker images are based upon). I'm quite comfortable pushing this to 3.10.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, there is no reason that this needs to be 3.8 and cannot be 3.10 -- at the time of writing I was using python 3.8 but can run it just fine with python 3.10

Comment on lines +95 to +102
COPY --from=builder /opt/tx-processor/build/tools/bench/parsec/py/py_bench ./build/tools/bench/parsec/py/py_bench

# Copy wait script
COPY --from=builder /opt/tx-processor/scripts/wait-for-it.sh ./scripts/wait-for-it.sh

# Copy python scripts
COPY --from=builder /opt/tx-processor/scripts/pythonContractConverter.py ./scripts/pythonContractConverter.py
COPY --from=builder /opt/tx-processor/scripts/paycontract.py ./scripts/paycontract.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reasonable additions, but this is already a strong argument that we need a better maintainability story for runners and how we add them.

(i.e., the scripts directory is going to rapidly become unwieldy—let's be honest, it already is—should we continue adding runners without bound).

return contract


contract = parseContract(file, funcname)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"file" and funcname" are not defined. Is this meant to be a comment on how to run it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this file is actually just used to parse python functions such that arbitrary python functions can be formatted for usage as smart contracts. It cannot be run on its own, but is written to be used within pyutil.cpp:formContract.

In that function, file and funcname are specified within the C++ scope and registered within the Python context, which is why they are not defined here.

I do certainly understand the confusion here, and I don't actually think this is the cleanest way to do this contract parsing -- though it is much more straightforward to define string parsing behavior within python as opposed to C++ which is why I did it. This was just written to leverage the already-in-use Python VM to handle some string operations with the ease of Python string operations.

Comment on lines +1 to +2
# assumes all return types are strings
def parseContract(file, funcName):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would add type hints then

Suggested change
# assumes all return types are strings
def parseContract(file, funcName):
# assumes all return types are strings
def parseContract(file: str, funcName: str) -> str:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully agree that type-hints are a good practice, and make the commented assumption at least parsed (even if not enforced).

I might additionally suggest removing the comment if the type hints are added.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually refers to the return types of functions which are parsed by parseContract, so this behavior cannot be represented as a type hint (or at least not a standard Pythonic type hint). That said, these type hints are correct and could be added without an issue in this context, if helpful.

data_lines = data_lines[idx:]


while(True):
Copy link
Contributor

@rockett-m rockett-m Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for parentheses

Suggested change
while(True):
while True:

Comment on lines +5 to +13
fin = -1
try:
fin = open(file)
except:
print("[ERROR] {fname} cannot be opened".format(fname = file))
return ""

data = fin.read()
fin.close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The with handles file closing automatically

Suggested change
fin = -1
try:
fin = open(file)
except:
print("[ERROR] {fname} cannot be opened".format(fname = file))
return ""
data = fin.read()
fin.close()
data = ''
try:
with open(file, 'r') as fi:
data = fi.read()
except:
print(f"[ERROR] {file} cannot be opened")
return ""

Comment on lines +18 to +20
if(("def " + funcName + "(") not in data):
print("[ERROR] {fname} does not contain a definition for \"{func}\"".format(fname = file, func = funcName))
return ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

f-strings are useful here

Suggested change
if(("def " + funcName + "(") not in data):
print("[ERROR] {fname} does not contain a definition for \"{func}\"".format(fname = file, func = funcName))
return ""
if f"def {funcName}(" not in data:
print(f"[ERROR] {file} does not contain a definition for \"{funcName}\"")
return ""

@HalosGhost
Copy link
Collaborator

Needs rebase.

Copy link
Collaborator

@HalosGhost HalosGhost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments as I skimmed through pre-rebase and testing.

Solid conceptACK; look forward to testing thoroughly and getting it merged!

@@ -27,6 +27,8 @@ FROM $BASE_IMAGE AS builder
# Copy source
COPY . .

ENV PY_VER=python3.8
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since #252 was merged, we require python 3.10 in-practice anyway (in terms of what's shipped by the image our docker images are based upon). I'm quite comfortable pushing this to 3.10.

Comment on lines +95 to +102
COPY --from=builder /opt/tx-processor/build/tools/bench/parsec/py/py_bench ./build/tools/bench/parsec/py/py_bench

# Copy wait script
COPY --from=builder /opt/tx-processor/scripts/wait-for-it.sh ./scripts/wait-for-it.sh

# Copy python scripts
COPY --from=builder /opt/tx-processor/scripts/pythonContractConverter.py ./scripts/pythonContractConverter.py
COPY --from=builder /opt/tx-processor/scripts/paycontract.py ./scripts/paycontract.py
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reasonable additions, but this is already a strong argument that we need a better maintainability story for runners and how we add them.

(i.e., the scripts directory is going to rapidly become unwieldy—let's be honest, it already is—should we continue adding runners without bound).

Comment on lines +1 to +14
#!/bin/bash

# A sample script for running the Lua bench test

IP="localhost"
PORT="8889"
N_WALLETS=2

./build/tools/bench/parsec/lua/lua_bench --component_id=0 \
--ticket_machine0_endpoint=$IP:7777 --ticket_machine_count=1 \
--shard_count=1 --shard0_count=1 --shard00_endpoint=$IP:5556 \
--agent_count=1 --agent0_endpoint=$IP:$PORT \
--loglevel=TRACE scripts/gen_bytecode.lua $N_WALLETS
echo done
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs rebasing for #253


#cp ./scripts/pythonContractConverter.py ./build/tools/bench/parsec/py/

rr record ./build/tools/bench/parsec/py/py_bench --component_id=0 --ticket_machine0_endpoint=$IP:7777 --ticket_machine_count=1 --shard_count=1 --shard0_count=1 --shard00_endpoint=$IP:5556 --agent_count=1 --agent0_endpoint=$IP:$PORT
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love rr, but we don't currently require it or set it up in our environment-setup scripts. ./scripts/native-system-benchmark.sh as added in #259 might offer some inspiration for how to optionally support it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an oversight on my end, should not be included here -- good catch

Comment on lines +1 to +2
# assumes all return types are strings
def parseContract(file, funcName):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully agree that type-hints are a good practice, and make the commented assumption at least parsed (even if not enforced).

I might additionally suggest removing the comment if the type hints are added.

@@ -142,6 +143,16 @@ auto main(int argc, char** argv) -> int {
broker,
log,
cfg.value());
} else if(cfg->m_runner_type == cbdc::parsec::runner_type::py) {
Copy link
Collaborator

@HalosGhost HalosGhost May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(optional) I think this full-block might be a candidate for replacement with switch/case (so that the number of supported runners doesn't necessarily change the start-up time).

However, there's a deeper question that may be worth considering here: are runners a good candidate for pulling into a dynamically-linked library enabling the installation/support of new runners without recompilation? This is not a very small engineering effort; but something like dylib might enable us to load the runners at runtime.

This might be a particularly positive way of pulling runners out of the main repo and modularizing them. Food for thought.

cf. #261

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WRT the first comment, I would agree that we ought to make this a switch statement -- for this, I think what'd be best is defining an enum with runner types

WRT the second comment, I think that's actually a very interesting idea -- and i am definitely for pulling runners out of the main repository. This may be a task for a later PR but I think this is good to keep in mind for sure

// get parameters that are user inputs
size_t bufferOffset = 0;
auto it = 0;
// Known that there are 3 parts of the header
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a touch magical. Is this guaranteed, python-version-dependent, or internally-specified?

If internally-specified, is there benefit to pulling these cases out as an enum?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is internally specified, and the input validation is handled just above this code block -- just to check, are you referring to the cases in the switch statement below this block?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the actual line I highlighted for this comment was line 194 (the comment asserting that there are three parts of the header). 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants