valid_days unnecessarily slow #108
Comments
Thank you very much for this. To date this package has not been optimized for performance, so that is definitely something that could be improved. The difference in speed you are seeing comes from the fact that every call to valid_days() builds a pandas date_range with the holiday calendars, and that construction is expensive. Since you are calling the function inside the loop, it goes through the holiday construction and date_range() on every iteration. In contrast, your fast solution calls schedule(), which also does the date_range() and holiday construction, but only once, since you call it outside the loop. There are definitely opportunities for optimizing the library and I would welcome any PR you would like to contribute.
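To make the cost pattern concrete, here is a minimal sketch using only the Python standard library. It is not the library's code: `build_business_days` is a hypothetical stand-in for the expensive holiday and date_range() construction, it skips weekends only, and the dates are made up for illustration. The counter shows how many times the expensive step runs in each pattern.

```python
from datetime import date, timedelta

build_calls = 0  # counts how often the expensive step runs


def build_business_days(start, end):
    """Hypothetical stand-in for the costly holiday/date_range construction."""
    global build_calls
    build_calls += 1
    days = []
    current = start
    while current <= end:
        if current.weekday() < 5:  # Monday-Friday only; a real calendar also drops holidays
            days.append(current)
        current += timedelta(days=1)
    return days


def count_days_slow(start, end):
    # Rebuilds the whole calendar on every call, like calling valid_days() in a loop.
    return len(build_business_days(start, end))


today = date(2020, 1, 6)
horizon = date(2020, 12, 31)
expirations = [date(2020, 1, 17), date(2020, 2, 21), date(2020, 3, 20)]

# Pattern 1: one expensive build per expiration.
slow = [count_days_slow(today, exp) for exp in expirations]
slow_builds = build_calls

# Pattern 2: build once outside the loop, then reuse the result.
build_calls = 0
calendar = build_business_days(today, horizon)
fast = [sum(1 for d in calendar if d <= exp) for exp in expirations]
fast_builds = build_calls
```

Both patterns produce the same counts, but the first runs the expensive step once per expiration while the second runs it once in total.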
I am interested in helping. As I don't have a strong computer science background, I have zero experience submitting to GitHub, writing unit tests, and the like. If I submit a proposed fix (I assume that 'PR' means something similar to proposed fix), I would probably need a bit of hand-holding to do all of the other stuff right (unit testing, proper conventions, etc.). Just wanted to give a heads-up on my situation. Also, I am wrapped up fixing other performance issues with my app right now, so I won't be able to work on this issue until that is done.
No problem at all. We were all beginners at one point. There are some great online resources on how to create a pull request; here is one: https://www.digitalocean.com/community/tutorials/how-to-create-a-pull-request-on-github. I or others would be happy to help.
This is addressed in #117.
This is my first true GitHub issue submission, so my apologies if I buggered up the formatting or left out useful info.
Name: pandas-market-calendars
Version: 1.3.5
I was using the valid_days function to calculate the trade days remaining until options expiration dates. This was extremely slow: about 500ms per call to valid_days. Overall, this added 16 seconds to my app after calculating trade days for one stock's option chain (both puts and calls). The 16 seconds is slightly inflated, because I was calculating each date twice (once for puts and once for calls). But still, 8 seconds is a lot for such a simple set of function calls.
I was able to work around the issue by using the schedule function, caching the schedule, and then indexing and counting schedule to get the same data.
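The shape of that workaround can be sketched roughly as follows, using only the standard library. This is an illustration under stated assumptions, not the code from my app: `cached_sessions` is a hypothetical stand-in for a cached schedule index (it skips weekends only, no holidays), `trade_days_remaining` is a made-up helper name, and counting both today and the expiration day as trade days is my own assumption.

```python
from bisect import bisect_left, bisect_right
from datetime import date, timedelta
from functools import lru_cache


@lru_cache(maxsize=None)
def cached_sessions(start, end):
    """Hypothetical stand-in for a cached schedule: sorted weekdays in [start, end]."""
    sessions = []
    current = start
    while current <= end:
        if current.weekday() < 5:  # weekends only; a real schedule also excludes holidays
            sessions.append(current)
        current += timedelta(days=1)
    return tuple(sessions)  # immutable, so the cached value cannot be modified by callers


def trade_days_remaining(sessions, today, expiration):
    """Count sessions in [today, expiration] with two binary searches."""
    lo = bisect_left(sessions, today)
    hi = bisect_right(sessions, expiration)
    return max(hi - lo, 0)  # 0 if the expiration is already in the past


# Build the session list once, then reuse it for every expiration in the chain.
sessions = cached_sessions(date(2020, 1, 1), date(2020, 12, 31))
today = date(2020, 1, 6)
expirations = [date(2020, 1, 17), date(2020, 2, 21)]
remaining = {exp: trade_days_remaining(sessions, today, exp) for exp in expirations}
```

Each lookup after the first build is two binary searches over the cached tuple, so puts and calls sharing an expiration cost next to nothing.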
valid_days shouldn't be this slow. On one level, this is a feature request, but valid_days is so slow, it might also be fair to consider this a bug.
Here is code to reproduce the issue:
And the code's output: