slow conversion of dates from one tz to another #779
Comments
I don't know about python pandas, but with Linux's glibc when you convert a What you may want to do if you're converting a lot of |
Clearly, approximation is not acceptable for the task I'm pursuing. I'm not sure what pandas does, though. Thanks for the help, but it looks more complicated than this. I've stepped through the code. In |
Is using Documentation: https://howardhinnant.github.io/date/tz.html#Installation |
Is that different from USE_SYSTEM_TZ_DB (already ON)? ZonedTime forced me to parallelize / adapt lots of things (CSV parser, vector zt change, vector zt serialization with only 1 zt, custom string parser). |
I don't use the CMake systems associated with this project and deeply regret their introduction. They introduce way too much complication for compiling a single source file: tz.cpp. Their over-encapsulation is a continual source of errors. For initializing a |
I don't have a quick fix for this besides
Consider this HelloWorld timing test of converting 10,000 local times from New York to London:
#include "date/tz.h"
#include <cassert>
#include <chrono>
#include <iostream>
#include <stdexcept>
#include <numeric>
#include <vector>
int
main()
{
    using namespace date;
    using namespace std;
    using namespace std::chrono;
    auto tz_ny = locate_zone("America/New_York");
    auto tz_lon = locate_zone("Europe/London");
    // Fill the input with 10,000 New York local times taken "now".
    vector<local_time<system_clock::duration>> v_ny(10'000);
    for (auto& tp : v_ny)
        tp = tz_ny->to_local(system_clock::now());
    vector<local_time<system_clock::duration>> v_lon;
    v_lon.reserve(v_ny.size());
    // Time only the conversion loop: NY local -> UTC -> London local.
    auto t0 = steady_clock::now();
    for (auto tp : v_ny)
        v_lon.push_back(tz_lon->to_local(tz_ny->to_sys(tp)));
    auto t1 = steady_clock::now();
    cout << t1 - t0 << '\n';
}
For me this prints out about:
However, I can optimize it by caching recently used time zones and infos with:
template <class Duration>
auto
to_local(date::sys_time<Duration> tp, date::time_zone const* tz)
{
    using namespace date;
    // Per-thread cache of the last zone and the sys_info that covered the last tp.
    thread_local time_zone const* tz_save = tz;
    thread_local sys_info info = tz->get_info(tp);
    // Refresh only if the zone changed or tp falls outside [info.begin, info.end).
    if (tz != tz_save || !(info.begin <= tp && tp < info.end))
    {
        tz_save = tz;
        info = tz->get_info(tp);
    }
    return local_time<Duration>{tp.time_since_epoch() + info.offset};
}
template <class Duration>
auto
to_sys(date::local_time<Duration> tp, date::time_zone const* tz)
{
    using namespace date;
    // Look up the local_info and reject ambiguous or nonexistent local times.
    auto get_info = [](date::local_time<Duration> tp, date::time_zone const* tz)
        {
            auto info = tz->get_info(tp);
            if (info.result != local_info::unique)
                throw std::runtime_error("local time point is not unique");
            return info;
        };
    thread_local time_zone const* tz_save = tz;
    thread_local local_info info = get_info(tp, tz);
    if (tz != tz_save)
    {
        // Zone changed: the cached info is useless, so look it up again.
        tz_save = tz;
        info = get_info(tp, tz);
        return sys_time<Duration>{tp.time_since_epoch() - info.first.offset};
    }
    // Try the cached offset and keep the result if it lands in the cached range.
    sys_time<Duration> utc_tp{tp.time_since_epoch() - info.first.offset};
    if (info.first.begin <= utc_tp && utc_tp < info.first.end)
        return utc_tp;
    info = get_info(tp, tz);
    return sys_time<Duration>{tp.time_since_epoch() - info.first.offset};
}
The test changes from:
    for (auto tp : v_ny)
        v_lon.push_back(tz_lon->to_local(tz_ny->to_sys(tp)));
to:
    for (auto tp : v_ny)
        v_lon.push_back(to_local(to_sys(tp, tz_ny), tz_lon));
Now my output is more like:
About 33x faster. Of course this won't be effective if the time zones involved change a lot between calls, or if the time points are not clumped closely enough that consecutive ones usually fall within the same offset interval. YMMV. |
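To make the caveat about clumped time points concrete, here is a minimal, library-free sketch of the same single-entry caching idea. The names OffsetRange, ToyZone and CachedConverter are hypothetical stand-ins invented for this illustration and are not part of the date library; the slow path is deliberately a linear scan so the counter shows how often the cache is bypassed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one sys_info record: a half-open range
// [begin, end) of UTC seconds during which a single offset applies.
struct OffsetRange {
    std::int64_t begin;
    std::int64_t end;
    std::int64_t offset;
};

// Hypothetical stand-in for a time zone: ranges sorted by `begin`.
struct ToyZone {
    std::vector<OffsetRange> ranges;
    mutable std::size_t slow_lookups = 0;  // counts calls to the slow path

    // Deliberately linear "slow path" lookup.
    const OffsetRange& find(std::int64_t utc) const {
        assert(!ranges.empty());
        ++slow_lookups;
        for (const auto& r : ranges)
            if (r.begin <= utc && utc < r.end)
                return r;
        return ranges.back();  // outside the table: fall back to the last range
    }
};

// Remembers the last range used and only takes the slow path on a miss.
class CachedConverter {
public:
    explicit CachedConverter(const ToyZone& z) : zone_(&z) {}

    std::int64_t to_local(std::int64_t utc) {
        if (!(cached_.begin <= utc && utc < cached_.end))
            cached_ = zone_->find(utc);  // cache miss
        return utc + cached_.offset;
    }

private:
    const ToyZone* zone_;
    OffsetRange cached_{0, 0, 0};  // empty range, so the first call always misses
};
```

With inputs sorted or clustered in time, slow_lookups stays close to the number of distinct ranges touched; with inputs that alternate between ranges, it approaches the number of conversions and the cache buys nothing.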
CMake is a necessary evil. Without it (and vcpkg) C++ folks look like idiots when they try to bring a library in, compared to Python people. |
Do you know if it is possible for clients of a repository to manage a CMake-less repository with their own CMake scripts? |
Thanks a lot, that indeed helped. In more than one place. With // I convert 7 million dates in 1 sec in debug. And I handled default value as well. The whole thing flies. |
One can use FetchContent to populate one's own tree at configure time and include the library in the build. I quite honestly think that the CMakeLists.txt correctly transcribes the various build parameters if one needs to move away from the vcpkg defaults (which I did for Windows while I struggled with pybind11). |
I'm toying with a new tz database format that has the potential to obsolete any performance concerns. But it is very early days on that... |
It's possible to produce your own FindDate.cmake file, which would allow a CMake-based project to use this library; however, it's generally much less error-prone when using an installed library or a strictly header-only library that doesn't have many flags. Providing your own CMake files, however, solves issues that weren't apparent when you first made date.h
Of these, CMake might be the easiest to do, even though it has its problems. It also has the advantage of being used in vcpkg which takes care of many Microsoft-based developers. |
Converting a large number of tz-aware dates to another time zone is painfully slow. This compares poorly with, say, Python pandas.
Most of the time is spent in
sys_info find_rule(...)
which seems to use a linear search. Maybe a lower_bound could/should be used initially, but the code is not easy to follow and I could not devise a solution by myself.
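For readers unfamiliar with the suggestion, the following is a rough sketch of what replacing a linear scan with a binary search looks like on a simplified model: a sorted vector of hypothetical Transition records. It is not the library's find_rule, which works on rule data that is more involved than a flat sorted table, so this only illustrates the general technique. Transition, offset_linear and offset_binary are names made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical transition record: from `at` (UTC seconds) onward,
// `offset` (seconds) applies until the next transition.
struct Transition {
    std::int64_t at;
    std::int64_t offset;
};

// Linear scan over transitions sorted by `at`: O(n) per lookup.
inline std::int64_t offset_linear(const std::vector<Transition>& ts,
                                  std::int64_t utc) {
    assert(!ts.empty());
    std::int64_t result = ts.front().offset;  // used if utc precedes all transitions
    for (const auto& t : ts) {
        if (t.at > utc)
            break;
        result = t.offset;
    }
    return result;
}

// Binary search over the same sorted data: O(log n) per lookup.
inline std::int64_t offset_binary(const std::vector<Transition>& ts,
                                  std::int64_t utc) {
    assert(!ts.empty());
    // Find the first transition that starts strictly after utc.
    auto it = std::upper_bound(
        ts.begin(), ts.end(), utc,
        [](std::int64_t value, const Transition& t) { return value < t.at; });
    if (it == ts.begin())
        return ts.front().offset;  // utc precedes all transitions
    return std::prev(it)->offset;  // last transition at or before utc
}
```

Both functions return the same offset for any input as long as the vector is sorted by `at`; std::upper_bound is used here rather than std::lower_bound so that a time point exactly on a transition picks up the new offset.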