-
Notifications
You must be signed in to change notification settings - Fork 329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speeds up module.add_debug_info and add_metadata by 2x #887
base: main
Are you sure you want to change the base?
Conversation
Looks like I need to run flake8 and address some style issues. Will do that shortly. |
Thanks for the PR! I'm curious about
Is there any public info about your compiler? |
It's not public yet! Hopefully early next year I'll publish it for people to see. There are a couple of sneak peeks here: https://twitter.com/haxor/status/1580959518984241158 I will also share the private project with you, Graham, so you can take a look for context. The part that this PR specifically improves is how I assign Thank you for looking and the attention on this PR! |
Is there anything else you need from me on this PR? Just making sure. Thank you for the reviews! |
Thanks for checking in - it's taking me a little while to review as I'm unfamiliar with most of the parts that are changed in this PR - apologies for the delay. I do have some review notes drafted but I'd like to finish the review before I post them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks for the PR, and apologies for the delay in reviewing.
The performance measurement reported by test_debug_info_caching
is more than doubled (execution time is less than half) when using this patch on my machine, but I can see how it has even more performance benefit for real-world use cases (as you report in the description). I think this is a good change overall.
I have a few questions / comments, which are mainly on the diff.
In addition, I think the debug metadata performance measurement is valuable to have because it'll enable us to work out where a performance regression occurred (if we notice one) but I don't think we can keep it in the unit test suite itself - generally the test suite should not output anything, and the test more than doubles the testsuite execution time on my system. Could you please:
- Copy the code from
test_debug_info_caching
into a standalone example in theexamples
directory, complete with timing and output? - Remove the timing and printing from
test_debug_info_metadata
, and have the test code only execute a single time (as opposed to 10) so that the test takes less time? I think it should be possible to simplify the test and make it much more "straight-line code" since separate functions won't be needed fortimeit
in the context of the test.
I hope the comments are clear and helpful - please let me know if any clarification or additional information would be helpful.
@@ -8,7 +7,7 @@ class DuplicatedNameError(NameError): | |||
class NameScope(object): | |||
def __init__(self): | |||
self._useset = set(['']) | |||
self._basenamemap = defaultdict(int) | |||
self._basenamemap = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a defaultdict slower than a normal dict? It seems that making this change here requires you to then emulate a default dict below - is there a reason not to keep it as a defaultdict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Follow-up - I tried reverting the changes in this file and it seemed to make no performance difference in test_debug_info_caching
)
try: | ||
ident = self._basenamemap[basename] | ||
except KeyError: | ||
ident = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the emulation of a default dict I mentioned in the comment above - could it just be
try: | |
ident = self._basenamemap[basename] | |
except KeyError: | |
ident = 0 | |
ident = self._basenamemap[basename] |
with _basenamemap
remaining a defaultdict
?
@@ -667,6 +667,7 @@ def __init__(self, parent, values, name): | |||
types.MetaDataType(), | |||
name=name) | |||
self.operands = tuple(values) | |||
self.hash_cache = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like hash_cache
is an implementation detail rather than an attribute we'd want to expose - should this be _hash_cache
instead?
# Ensure that the hash is not cached between Python invocations | ||
# due to pickling or other serialization. The hash seed changes | ||
# which will cause these not to match. | ||
self.hash_cache = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a bit odd to mutate state inside __getstate__
- do you see any downside to removing the hash cache from a copy of the dict?
self.hash_cache = None | ||
return self.__dict__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.g.
self.hash_cache = None | |
return self.__dict__ | |
state = self.__dict__.copy() | |
state['hash_cache'] = None | |
return state |
@@ -725,6 +735,7 @@ def __init__(self, parent, is_distinct, kind, operands, name): | |||
self.is_distinct = is_distinct | |||
self.kind = kind | |||
self.operands = tuple(operands) | |||
self.hash_cache = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment on the name to above, i.e. should it be _hash_cache
?
# Ensure that the hash is not cached between Python invocations | ||
# due to pickling or other serialization. The hash seed changes | ||
# which will cause these not to match. | ||
self.hash_cache = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also should this invalidate the hash cache on a copy?
|
||
print('test_debug_info_performance took', total_time, 'to finish') | ||
|
||
self.assertEqual(100004, len(mod._metadatacache)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the expected result could do with a comment explaining its derivation - it took me a while to work out that it's 100000 locations, plus one file, one compile unit, one subprogram, and [di_subprogram, di_subprogram]
.
md = self._metadatacache[key] | ||
|
||
fixed_operands = self._fix_metadata_operands(operands) | ||
key = hash(tuple(fixed_operands)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the PR description, I see
Only generates the key for module._metadatacache a single time instead of twice on cache miss
I'm not sure I can understand where it was generated twice before - am I missing that somewhere?
In any case, I do think the changes here are an improvement - reverting the specific changes to this file does slow down the performance test in test_debug_info_caching
below.
Hi @gmarkall! My apologies for the delay. This is great feedback. I'll start working through it and update the PR once I've addressed everything. Thank you |
@bslatkin Hi! and thank you for your initial work on this. I am the release manager for 0.41.x and am checking on the status of this PR. I see that it has been in progress for quite some time. 0.41.x is now scheduled to be tagged during the week 7th August 2023, which is in 4 weeks time. What do you think, will this PR make it by then? |
Thank you for the encouragement! I'm behind where I wanted to be on this, obviously :) I'll give it a try to turn it around before then. But I'm not sure how much additional review it might take, especially during the summer time. So perhaps I should aim for the release after that? |
It's up to you, just let me know what you prefer. Reviewers are the most scarce resource in the Numba ecosystem so I share your gut feeling that perhaps the 0.42 release is a more realistic goal. Here is the Gnatt chart for the next few months, which shows the approximate tag dates for Numba 0.59 and 0.60 -- these will coincide with llvmlite releases, as the two libraries are released in lockstep: Effectively, the next releases will be in the Nov 2023 (0.42.x) and Feb 2024 (0.43.x) time-frame. |
@bslatkin Hope you are well - it's been a while! I'm having a bit of a tidy-up, and checking in - do you see a future for this PR? |
In my own compiler's codebase, this change reduces the time it takes for my debug info regression test to complete from 180 seconds down to 20 seconds. The more nesting of metadata there is, the larger the effect.