While playing around with #282 I noticed an asymmetry in how leap days contribute to the distribution of YoY slopes. Ignoring filtering and first/last year complications for a moment, each day in the aggregated series is supposed to contribute to one forward and one backward slope. However, there seems to be a small bug related to leap days where a single point can contribute to three slopes instead of two.
To reproduce:
import pandas as pd
import rdtools
import matplotlib.pyplot as plt

daily_pm = pd.Series(1, index=pd.date_range('2014-01-01', '2017-12-31', freq='d'))
daily_pm.loc['2015-02-28'] = 0  # outlier point that interacts with a leap day
rd, ci, calc_info = rdtools.degradation.degradation_year_on_year(daily_pm)
fig = rdtools.plotting.degradation_summary_plots(rd, ci, calc_info, daily_pm)
fig.axes[1].set_ylim(0, 10)  # shrink y-axis to show detail
Note that in the histogram plot, the left-most bin has height=1 and the right-most bin has height=2. So a single outlier day that interacts with a leap day creates one big negative slope but two big positive slopes. Examining the df variable inside rdtools.degradation.degradation_year_on_year confirms this -- Feb 28 gets paired with Feb 28, but it also gets paired with Feb 29 in one direction:
df.loc['2015-02'].tail():
df.loc['2016-02'].tail():
I suspect, but did not verify, that this has to do with pd.merge_asof's default choice of direction='backward'. Possible solutions:
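As a check on the suspected mechanism (not a proposed fix), the double pairing can be reproduced outside rdtools. The sketch below assumes the pairing step shifts each date forward by one year and then joins with pd.merge_asof using direction='backward'. That is an assumption made for illustration, not a copy of the actual code in degradation_year_on_year, and the names series, shifted, pairs, and the 8-day tolerance are all hypothetical.

```python
import pandas as pd

# Daily series spanning the 2016 leap day, with one outlier on 2015-02-28.
idx = pd.date_range('2015-02-26', '2016-03-02', freq='D')
series = pd.DataFrame({'dt': idx, 'energy': 1.0})
series.loc[series['dt'] == pd.Timestamp('2015-02-28'), 'energy'] = 0.0

# Hypothetical pairing step: shift every date forward by one calendar year.
# No date shifts onto 2016-02-29, because 2015 has no Feb 29.
shifted = series.rename(columns={'dt': 'dt_right', 'energy': 'energy_right'})
shifted['dt_shifted'] = shifted['dt_right'] + pd.DateOffset(years=1)

# A backward as-of merge matches each left date to the latest shifted date
# that is <= it, so 2016-02-29 falls back to the row shifted from 2015-02-28.
pairs = pd.merge_asof(
    series, shifted,
    left_on='dt', right_on='dt_shifted',
    direction='backward',
    tolerance=pd.Timedelta('8D'),  # assumed tolerance, for illustration only
)

# Count how many later days were paired with the outlier day.
n_pairs_with_outlier = int((pairs['dt_right'] == pd.Timestamp('2015-02-28')).sum())
print(pairs[pairs['dt'] >= pd.Timestamp('2016-02-27')][['dt', 'dt_right', 'energy_right']])
print(n_pairs_with_outlier)
```

Under these assumptions both 2016-02-28 and 2016-02-29 are matched to 2015-02-28, which is consistent with the extra positive slope in the histogram.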