Implement bulk generation module #17

Open: AusIV wants to merge 4 commits into treyhunner:master from AusIV:master

@AusIV AusIV commented Jun 20, 2016

I needed to generate several million names for a sample dataset,
and reading each name out of the data file on every call was very
time-consuming.

I left the original logic untouched, as it's the best approach for
a quick, one-off lookup. I've added a separate module, 'names.bulk',
which offers the same interface but caches the entire file in memory
and picks names with a binary search instead of a scan.

Should resolve: #3
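
For readers following along, here is a minimal sketch of the idea described above: load the weighted names once, keep them in memory, and pick with a binary search rather than a linear scan. It is not the code from this PR. The class name `CachedNames`, the `(name, cumulative_weight)` row layout, and the sample rows are all assumptions made for illustration.

```python
import bisect
import random


class CachedNames:
    """Hypothetical helper: keeps names and cumulative weights in memory."""

    def __init__(self, rows):
        # rows: (name, cumulative_weight) pairs, sorted by cumulative_weight.
        # This layout is an assumption; the real data files are not shown here.
        rows = list(rows)
        self.names = [name for name, _ in rows]
        self.cumulative = [cum for _, cum in rows]

    def pick(self, rng=random):
        # Draw a point in [0, total) and binary-search for the first entry
        # whose cumulative weight is strictly greater than that point.
        point = rng.random() * self.cumulative[-1]
        index = bisect.bisect_right(self.cumulative, point)
        # Defensive clamp so the index can never run past the last entry.
        return self.names[min(index, len(self.names) - 1)]


# Made-up sample rows, parsed once and reused for every pick.
cache = CachedNames([("Alice", 5.0), ("Bob", 8.0), ("Carol", 10.0)])
sample = [cache.pick() for _ in range(5)]
```

Each pick costs O(log n) comparisons after a one-time load, which is where the saving comes from when millions of names are generated in one process.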


coveralls commented Jun 20, 2016

Coverage decreased (-1.3%) to 98.75% when pulling b0eb4b4 on AusIV:master into c485a43 on treyhunner:master.


coveralls commented Jun 20, 2016

Coverage remained the same at 100.0% when pulling 79934f8 on AusIV:master into c485a43 on treyhunner:master.


coveralls commented Jun 20, 2016

Coverage remained the same at 100.0% when pulling 495e650 on AusIV:master into c485a43 on treyhunner:master.


AusIV commented Jun 20, 2016

I'm a bit perplexed. The Travis CI build is failing on Python 3.2 on something that has nothing to do with my changes. It looks like the coverage package itself is what fails on Python 3.2.


coveralls commented Jun 21, 2016

Coverage remained the same at 100.0% when pulling 808e1b0 on AusIV:master into c485a43 on treyhunner:master.

dblackdblack commented

@treyhunner this looks like a great PR.


altendky commented Aug 8, 2019

I didn't bother to look at the open PRs before writing my own (still WIP) in #23, which is similar to this one. This PR at least uses a bisect, but I think random.choices() with weights specified might be better? I don't know how it's actually implemented, but it may be something to consider.
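
As a rough illustration of the `random.choices()` alternative mentioned above, the sketch below draws many names in one call using relative weights. `random.choices` is part of the standard library from Python 3.6 onward. The function name `pick_many` and the sample names and weights are hypothetical and do not come from this PR or from #23.

```python
import random


def pick_many(names, weights, count, rng=random):
    """Return `count` names drawn with replacement, in proportion to `weights`."""
    # `weights` are relative weights, one per name. The standard library
    # converts them to cumulative weights internally before selecting.
    return rng.choices(names, weights=weights, k=count)


# Made-up sample data; a seeded Random instance keeps runs repeatable.
rng = random.Random(0)
batch = pick_many(["Alice", "Bob", "Carol"], [5, 3, 2], 10, rng)
```

If the cumulative weights are already available, passing them through the `cum_weights` keyword instead of `weights` avoids recomputing them on every call.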

alexisszabo commented

@treyhunner Any chance we could get this PR approved? It looks like a big performance boost for apps that call the library repeatedly. Is there anything I could do to help if QA is needed?

Successfully merging this pull request may close these issues.

Provide an API to call name functions without having them open the data files on every call.
5 participants