Persist Python interpreter for repeated use #6

Merged
sevein merged 15 commits into main from dev/persistent-interpreter
Jul 23, 2024

Conversation

sevein
Member

@sevein sevein commented Jul 11, 2024

This PR ensures that the Python interpreter is persisted for repeated use, instead of creating a new instance with each method call. This is a low-priority change, as it is only beneficial when using *bagit.BagIt repeatedly, e.g. when validating hundreds of bags.

ErrBusy is returned if you try to use *bagit.BagIt concurrently. If you need to perform concurrent validation, you should have a dedicated *bagit.BagIt instance per goroutine. Adding support for concurrent work in *bagit.BagIt would be possible by implementing an internal pool of interpreters, but this does not seem necessary at the moment.

Running the benchmark with -benchtime=10s shows that the Validate method is significantly faster: within 10 seconds, we can now run Validate approximately 1,200 times compared to the previous 144 times (about 9.3 ms/op versus about 77 ms/op).

Before:

(main) $ go test -run=^BenchmarkValidate -bench=. -benchtime=10s -cpu=1 -count=3
goos: linux
goarch: amd64
pkg: github.com/artefactual-labs/bagit-gython
cpu: Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz
BenchmarkValidate 	     144	  76810063 ns/op
BenchmarkValidate 	     141	  77826339 ns/op
BenchmarkValidate 	     144	  75244650 ns/op
PASS
ok  	github.com/artefactual-labs/bagit-gython	65.955s

After:

(dev/persistent-branch) $ go test -run=^BenchmarkValidate -bench=. -benchtime=10s -cpu=1 -count=3
goos: linux
goarch: amd64
pkg: github.com/artefactual-labs/bagit-gython
cpu: Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz
BenchmarkValidate 	    1171	   9401249 ns/op
BenchmarkValidate 	    1258	   9371621 ns/op
BenchmarkValidate 	    1255	   9158033 ns/op
PASS
ok  	github.com/artefactual-labs/bagit-gython	84.983s
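The speedup implied by the two runs above can be worked out from the reported ns/op values. This is only an arithmetic sketch over the numbers in the benchmark output; the `mean` and `speedup` helpers are illustrative names, not part of the project.

```go
package main

import "fmt"

// mean returns the arithmetic mean of xs.
func mean(xs []float64) float64 {
	var sum float64
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// speedup returns how many times faster the "after" runs are on average.
func speedup(before, after []float64) float64 {
	return mean(before) / mean(after)
}

func main() {
	// ns/op values copied from the benchmark output above.
	before := []float64{76810063, 77826339, 75244650}
	after := []float64{9401249, 9371621, 9158033}

	fmt.Printf("before: %.1f ms/op, after: %.1f ms/op, speedup: %.1fx\n",
		mean(before)/1e6, mean(after)/1e6, speedup(before, after))
}
```

On these figures the persistent interpreter is roughly 8 times faster per Validate call.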

I've made changes to the Python script to improve error handling and reporting. See the following interaction in a terminal, where each request (e.g. INVALID-JSON) always receives a JSON-encoded response, even when the request is malformed, and errors now include a type attribute with the name of the underlying Python exception.

$ python main.py
INVALID-JSON
{"err": "Expecting value: line 1 column 1 (char 0)", "type": "JSONDecodeError"}
{}
{"err": "'None' is not a valid command, use: validate, make, exit", "type": "UnknownCommandError"}
{"name": "validate", "args": {"path": "/etc/motd"}}
{"err": "Expected bagit.txt does not exist: /etc/motd/bagit.txt", "type": "BagError"}
{"name": "exit"}

@sevein sevein force-pushed the dev/persistent-interpreter branch from 1f27016 to d6d2170 Compare July 13, 2024 06:16
@sevein sevein requested a review from djjuhasz July 13, 2024 06:20
Contributor

@djjuhasz djjuhasz left a comment

LGTM 👍

@sevein sevein merged commit d8d3815 into main Jul 23, 2024
4 checks passed
@sevein sevein deleted the dev/persistent-interpreter branch July 23, 2024 11:25