DEP: Deprecate all data reading functionality via pandas-datareader; ensure independence from SPY and FF #536
Changes from 19 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -80,6 +80,62 @@ If you find a bug, feel free to [open an issue](https://github.com/quantopian/py | |
You can also join our [mailing list](https://groups.google.com/forum/#!forum/pyfolio) or | ||
our [Gitter channel](https://gitter.im/quantopian/pyfolio). | ||
|
||
## Support | ||
|
||
Please [open an issue](https://github.com/quantopian/pyfolio/issues/new) for support. | ||
|
||
### Deprecated: Data Reading via `pandas-datareader` | ||
|
||
As of early 2018, Yahoo Finance has suffered major API breaks with no stable | ||
replacement, and the Google Finance API has not been stable since late 2017 | ||
[(source)](https://github.com/pydata/pandas-datareader/blob/da18fbd7621d473828d7fa81dfa5e0f9516b6793/README.rst). | ||
In recent months it has become a greater and greater strain on the `empyrical` | ||
and `pyfolio` development teams to maintain support for fetching data through | ||
`pandas-datareader` and other third-party libraries, as these APIs are known to | ||
be unstable. | ||
|
||
As a result, all `empyrical` (and therefore `pyfolio`, which is a downstream | ||
dependency) support for data reading functionality has been deprecated and will | ||
be removed in a future version. | ||
|
||
Users should beware that the following functions are now deprecated: | ||
|
||
- `pyfolio.utils.default_returns_func` | ||
- `pyfolio.utils.get_fama_french` | ||
- `pyfolio.utils.get_returns_cached` | ||
- `pyfolio.utils.get_symbol_returns_from_yahoo` | ||
- `pyfolio.utils.get_treasury_yield` | ||
- `pyfolio.utils.cache_dir` | ||
- `pyfolio.utils.ensure_directory` | ||
- `pyfolio.utils.data_path` | ||
- `pyfolio.utils._1_bday_ago` | ||
- `pyfolio.utils.load_portfolio_risk_factors` | ||
- `pyfolio.utils.register_return_func` | ||
- `pyfolio.utils.get_symbol_rets` | ||
|
||
Users should expect regular failures from the following functions, pending | ||
patches to the Yahoo or Google Finance API: | ||
|
||
- `pyfolio.utils.default_returns_func` | ||
- `pyfolio.utils.get_symbol_returns_from_yahoo` | ||
- `pyfolio.utils.get_symbol_rets` | ||
|
||
For alternative data sources, we suggest the following: | ||
|
||
1. Migrate your research workflow to the Quantopian Research environment, | ||
where there is [free and flexible data access to over 57 | ||
datasets](https://www.quantopian.com/data) | ||
2. Make use of any remaining functional APIs supported by | ||
`pandas-datareader`. These include: | ||
|
||
- [Morningstar](https://pydata.github.io/pandas-datareader/stable/remote_data.html#remote-data-morningstar) | ||
- [Quandl](https://pydata.github.io/pandas-datareader/stable/remote_data.html#remote-data-quandl) | ||
|
||
Please note that you may need to create free accounts with these data | ||
providers and receive an API key in order to access data. These API keys | ||
should be set as environment variables, or passed as an argument to | ||
`pandas-datareader`. | ||
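A minimal sketch of the "environment variable or argument" pattern described above. The helper name `resolve_api_key` is hypothetical and not part of `pyfolio`, `empyrical`, or `pandas-datareader`; the default variable name `QUANDL_API_KEY` is an assumption chosen for illustration, so check your provider's documentation for the name it actually reads.

```python
import os


def resolve_api_key(explicit_key=None, env_var="QUANDL_API_KEY"):
    """Return an API key, preferring an explicit argument over the environment.

    Hypothetical helper for illustration only: `env_var` is an assumed
    variable name, not one guaranteed by any particular data provider.
    """
    # An explicitly passed key always wins.
    if explicit_key:
        return explicit_key

    # Otherwise fall back to the environment variable.
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            "No API key found: pass one explicitly or set %s." % env_var
        )
    return key
```

The resolved key can then be handed to whichever `pandas-datareader` reader you use. The exact keyword that reader accepts depends on the installed version, so consult its documentation rather than relying on this sketch.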
|
||
Review comment: Well written!
||
|
||
## Contributing | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
# | ||
- # Copyright 2017 Quantopian, Inc. | ||
+ # Copyright 2018 Quantopian, Inc. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
|
@@ -140,72 +140,6 @@ def axes_style(style='darkgrid', rc=None): | |
return sns.axes_style(style=style, rc=rc) | ||
|
||
|
||
def plot_rolling_fama_french(returns, | ||
factor_returns=None, | ||
rolling_window=APPROX_BDAYS_PER_MONTH * 6, | ||
legend_loc='best', | ||
ax=None, **kwargs): | ||
""" | ||
Plots rolling Fama-French single factor betas. | ||
|
||
Specifically, plots SMB, HML, and UMD vs. date with a legend. | ||
|
||
Parameters | ||
---------- | ||
returns : pd.Series | ||
Daily returns of the strategy, noncumulative. | ||
- See full explanation in tears.create_full_tear_sheet. | ||
factor_returns : pd.DataFrame, optional | ||
data set containing the Fama-French risk factors. See | ||
utils.load_portfolio_risk_factors. | ||
rolling_window : int, optional | ||
The days window over which to compute the beta. | ||
legend_loc : matplotlib.loc, optional | ||
The location of the legend on the plot. | ||
ax : matplotlib.Axes, optional | ||
Axes upon which to plot. | ||
**kwargs, optional | ||
Passed to plotting function. | ||
|
||
Returns | ||
------- | ||
ax : matplotlib.Axes | ||
The axes that were plotted on. | ||
""" | ||
|
||
if ax is None: | ||
ax = plt.gca() | ||
|
||
ax.set_title( | ||
"Rolling Fama-French single factor betas (%.0f-month)" % ( | ||
rolling_window / APPROX_BDAYS_PER_MONTH | ||
) | ||
) | ||
|
||
ax.set_ylabel('Beta') | ||
|
||
rolling_beta = timeseries.rolling_regression( | ||
returns, | ||
factor_returns=factor_returns, | ||
rolling_window=rolling_window) | ||
|
||
rolling_beta = rolling_beta[['SMB', 'HML', 'Mom']] | ||
rolling_beta.plot(alpha=0.7, ax=ax, **kwargs) | ||
|
||
ax.axhline(0.0, color='black') | ||
ax.legend(['Small cap (SMB)', | ||
'High growth (HML)', | ||
'Momentum (UMD)'], | ||
loc=legend_loc, frameon=True, framealpha=0.5) | ||
|
||
y_axis_formatter = FuncFormatter(utils.two_dec_places) | ||
ax.yaxis.set_major_formatter(FuncFormatter(y_axis_formatter)) | ||
ax.axhline(0.0, color='black') | ||
ax.set_xlabel('') | ||
ax.set_ylim((-1.0, 1.0)) | ||
return ax | ||
|
||
|
||
def plot_monthly_returns_heatmap(returns, ax=None, **kwargs): | ||
""" | ||
Plots a heatmap of returns by month. | ||
|
@@ -566,9 +500,8 @@ def plot_perf_stats(returns, factor_returns, ax=None): | |
returns : pd.Series | ||
Daily returns of the strategy, noncumulative. | ||
- See full explanation in tears.create_full_tear_sheet. | ||
- factor_returns : pd.DataFrame, optional | ||
- data set containing the Fama-French risk factors. See | ||
- utils.load_portfolio_risk_factors. | ||
+ factor_returns : pd.DataFrame | ||
+ Data set containing the risk factors. | ||
Review comment: Can we be more specific about what we expect here? "containing the risk factors" doesn't tell me much about what I should be passing here.
||
ax : matplotlib.Axes, optional | ||
Axes upon which to plot. | ||
|
||
|
@@ -601,7 +534,7 @@ def plot_perf_stats(returns, factor_returns, ax=None): | |
] | ||
|
||
|
||
- def show_perf_stats(returns, factor_returns, positions=None, | ||
+ def show_perf_stats(returns, factor_returns=None, positions=None, | ||
transactions=None, turnover_denom='AGB', | ||
live_start_date=None, bootstrap=False, | ||
header_rows=None): | ||
|
@@ -619,7 +552,7 @@ def show_perf_stats(returns, factor_returns, positions=None, | |
returns : pd.Series | ||
Daily returns of the strategy, noncumulative. | ||
- See full explanation in tears.create_full_tear_sheet. | ||
- factor_returns : pd.Series | ||
+ factor_returns : pd.Series, optional | ||
Daily noncumulative returns of the benchmark. | ||
Review comment: The description here seems like it doesn't match the parameter name. What does this parameter actually do?
Review comment: In my mind, there isn't much ambiguity in this docstring: it's the daily returns of some benchmark factor (possibly a risk factor). E.g., it could be the returns of SPY, or the returns of any of the Fama-French factors. Perhaps "benchmark factor" would be more explicit? I've changed the docstring to reflect that.
Review comment: With some searching, I'm realizing that the same parameter has very spotty docstrings in different functions. I'm changing all functions to be this same docstring:
Review comment: That docstring is definitely an improvement. My general objection to this parameter name is that
||
- This is in the same style as returns. | ||
positions : pd.DataFrame, optional | ||
|
@@ -841,7 +774,7 @@ def cone(in_sample_returns (pd.Series), | |
ax.set_yscale('log' if logy else 'linear') | ||
|
||
if volatility_match and factor_returns is None: | ||
- raise ValueError('volatility_match requires passing of' | ||
+ raise ValueError('volatility_match requires passing of ' | ||
'factor_returns.') | ||
elif volatility_match and factor_returns is not None: | ||
bmark_vol = factor_returns.loc[returns.index].std() | ||
|
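The one-character change to the error message in the hunk above matters because Python joins adjacent string literals with no separator. A short illustration, using only the two message fragments shown in the diff (the variable names `before` and `after` are made up for this example):

```python
# Adjacent string literals are concatenated at compile time, with nothing
# inserted between them, so a missing trailing space fuses two words.
before = ('volatility_match requires passing of'
          'factor_returns.')
after = ('volatility_match requires passing of '
         'factor_returns.')

print(before)  # volatility_match requires passing offactor_returns.
print(after)   # volatility_match requires passing of factor_returns.
```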
@@ -909,7 +842,7 @@ def plot_rolling_beta(returns, factor_returns, legend_loc='best', | |
returns : pd.Series | ||
Daily returns of the strategy, noncumulative. | ||
- See full explanation in tears.create_full_tear_sheet. | ||
- factor_returns : pd.Series, optional | ||
+ factor_returns : pd.Series | ||
Review comment: Same general note on this docstring.
||
Daily noncumulative returns of the benchmark. | ||
- This is in the same style as returns. | ||
legend_loc : matplotlib.loc, optional | ||
|
@@ -1005,7 +938,7 @@ def plot_rolling_volatility(returns, factor_returns=None, | |
|
||
ax.set_ylabel('Volatility') | ||
ax.set_xlabel('') | ||
- if factor_returns.empty:
Review comment: Given that this would have crashed before if
Review comment: @ssanderson I'm not sure I see how this function would crash if
Review comment: On this line, we're accessing
Review comment: @ssanderson yes, which is why I changed it to
Review comment: @eigenfoo my point here was that, since
Review comment: Ah I see! No, that was intentional: see this issue. The idea is that we want to make pyfolio as benchmark-independent as possible: if a benchmark is passed, everything works as required. If not, pyfolio simply skips the analyses that depend on benchmarks.
||
+ if factor_returns is None: | ||
ax.legend(['Volatility', 'Average volatility'], | ||
loc=legend_loc, frameon=True, framealpha=0.5) | ||
else: | ||
|
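The benchmark-optional behavior discussed in the thread above can be sketched roughly as follows. This is an illustration under stated assumptions, not pyfolio's implementation: the helper `volatility_legend_labels` is hypothetical, and the three-label list for the benchmark case is assumed, since the `else` branch is not shown in the diff.

```python
def volatility_legend_labels(factor_returns=None):
    """Pick legend labels depending on whether a benchmark was supplied.

    Hypothetical helper: the labels returned when a benchmark is present
    are an assumption made for this sketch.
    """
    # Compare against None first: a missing benchmark has no attributes.
    if factor_returns is None:
        return ['Volatility', 'Average volatility']
    return ['Volatility', 'Benchmark volatility', 'Average volatility']


# With the default of None, an attribute check such as `.empty` fails
# before any benchmark-dependent logic can be skipped.
factor_returns = None
try:
    factor_returns.empty
except AttributeError as exc:
    print("attribute check on a missing benchmark failed:", exc)
```

Testing `is None` lets callers omit the benchmark entirely, which matches the stated goal of skipping benchmark-dependent analyses rather than raising.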
Review comment: Rather than emphasizing that the readers are deprecated, I would focus on making pyfolio independent of a benchmark being present; that's an enhancement. Then discuss how empyrical no longer provides them, so if you need them, you have to find them.
Review comment: If that's the case I'd rather put that in the release notes. I'll remove the README update here.