DEP: Deprecate all data reading functionality via pandas-datareader; ensure independence from SPY and FF #536

eigenfoo · 2018-05-18T19:15:06Z

Closes #538 #534 #532 #530 #525 #495 #484 #435 #413 by deprecating all data reading functionality.
Closes #499 by removing rolling Fama-French plots.

Similar to quantopian/empyrical#97

To quote from the README:

As of early 2018, Yahoo Finance has suffered major API breaks with no stable replacement, and the Google Finance API has not been stable since late 2017 (source). In recent months it has become a greater and greater strain on the empyrical and pyfolio development teams to maintain support for fetching data through pandas-datareader and other third-party libraries, as these APIs are known to be unstable.

As a result, all empyrical (and therefore pyfolio, which is a downstream dependency) support for data reading functionality has been deprecated and will be removed in a future version.

Users should beware that the following functions are now deprecated:

pyfolio.utils.default_returns_func
pyfolio.utils.get_fama_french
pyfolio.utils.get_returns_cached
pyfolio.utils.get_symbol_returns_from_yahoo
pyfolio.utils.get_treasury_yield
pyfolio.utils.cache_dir
pyfolio.utils.ensure_directory
pyfolio.utils.data_path
pyfolio.utils.load_portfolio_risk_factors

Users should expect regular failures from the following functions, pending patches to the Yahoo or Google Finance API:

pyfolio.utils.default_returns_func
pyfolio.utils.get_symbol_returns_from_yahoo
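
For readers unfamiliar with how a function is typically marked deprecated in Python, the following is a minimal sketch using only the standard library. It is not the mechanism pyfolio or empyrical actually uses; the `deprecated` decorator and `get_treasury_yield_stub` are hypothetical names chosen for illustration.

```python
import functools
import warnings


def deprecated(reason):
    """Return a decorator that emits a DeprecationWarning on each call.

    Hypothetical helper for illustration; not part of pyfolio or empyrical.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                "{} is deprecated: {}".format(func.__name__, reason),
                category=DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated("data reading will be removed in a future version")
def get_treasury_yield_stub():
    # Hypothetical stand-in: the real function fetched data over the network.
    return [0.02, 0.021]
```

The wrapped function keeps working, so existing callers are warned rather than broken until the function is removed.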

eigenfoo · 2018-05-21T14:33:12Z

It seems as if most of pyfolio is already independent of benchmarks. Most functionality in timeseries.py and plotting.py already checks whether benchmark_rets is None. So this is not as painful a PR as we initially expected, but there's still a lot to do, mainly testing that all this nonsense actually works.

Brief summary of results:

pyfolio no longer supports any data reading functionality at all: it is incumbent on users to pass in all data, even if this data is SPY or the Fama-French factors.
pyfolio should automatically adapt all tear sheet outputs to show whatever analysis and plots it can, given the data it has. This makes pyfolio agnostic to data sources: because we are removing support for any data reading, it does not make sense to tailor anything to a particular dataset.
Small changes to API: some arguments are now required, some are now optional. I've made sure that any changes produce graceful errors.
Unit tests have been changed to reflect this new API.
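
The behaviour summarised above (required inputs fail with a clear error, optional inputs such as the benchmark simply switch sections on or off) can be sketched as follows. This is an illustrative outline only: `plan_tear_sheet` and the section names are hypothetical and do not correspond to pyfolio's real functions.

```python
def plan_tear_sheet(returns, benchmark_rets=None, positions=None):
    """Return the names of the sections that can be drawn from the inputs.

    All names here are hypothetical; this is not pyfolio's real API.
    """
    if returns is None:
        # Fail early with an actionable message instead of a deep traceback.
        raise ValueError(
            "returns is required: data is no longer fetched for you, "
            "so pass your own returns series."
        )

    # Sections that need nothing beyond the strategy's own returns.
    sections = ["cumulative_returns", "drawdown"]
    if benchmark_rets is not None:
        # Benchmark-relative analysis only makes sense with a benchmark.
        sections += ["rolling_beta", "returns_vs_benchmark"]
    if positions is not None:
        sections.append("exposure")
    return sections
```

The design choice is that a missing optional input never raises; it only shortens the list of sections that get produced.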

eigenfoo · 2018-05-21T16:14:57Z

This PR still isn't ready for merging, btw. Unit tests are done but I will need to do some manual tests to make sure everything is plotting correctly, etc.

In the meantime, @twiecki @richafrank requesting review now.

eigenfoo · 2018-05-22T18:18:21Z

@twiecki @richafrank finished manual tests, ready to review and merge now.

Note that for best results, we should ship both pyfolio and empyrical with my latest PRs together. Perhaps new releases?

twiecki · 2018-05-23T12:33:16Z

README.md

+   providers and receive an API key in order to access data. These API keys
+   should be set as environment variables, or passed as an argument to
+   `pandas-datareader`.
+


Well written!

twiecki · 2018-05-23T12:38:06Z

pyfolio/tears.py

@@ -925,12 +913,15 @@ def create_interesting_times_tear_sheet(
    bubble burst, EZB IR, Great Recession (August 2007, March and September
    of 2008, Q1 & Q2 2009), flash crash, April and October 2014.

+    benchmark_rets must be passed, as it is meaningless to analyze performance


Not sure we should make that decision for the user; many events in there are not related to macro market events.

eigenfoo · 2018-05-23

So we should change the interesting times tear sheet to simply plot cumulative returns during interesting times? That should probably be a separate PR: I'll file an issue for that though.

eigenfoo · 2018-05-23

Filed under #538

twiecki · 2018-05-23

My read of the code suggested it was already possible to run it without a benchmark.

eigenfoo · 2018-05-23

You're right, not too much work to do it.

twiecki

Nice, quite a big change that will make our lives so much easier.

twiecki · 2018-05-23T12:40:57Z

Assigning @richafrank to assign an engineering reviewer.

eigenfoo · 2018-06-11T20:20:34Z

@twiecki finished with engineering feedback; ready for another pass.

twiecki · 2018-06-12T08:08:00Z

Thanks @eigenfoo. @ssanderson can you check if your feedback was appropriately incorporated?

ssanderson · 2018-06-13T15:38:35Z

pyfolio/tests/test_tears.py

@@ -116,7 +118,9 @@ def test_create_round_trip_tear_sheet_breakdown(self, kwargs):
    @cleanup
    def test_create_interesting_times_tear_sheet_breakdown(self,
                                                           kwargs):
+        # FIXME needs benchmark_rets


Does this still need to be fixed?

ssanderson · 2018-06-13T15:57:14Z

pyfolio/utils.py

+cache_dir = empyrical.utils.cache_dir
+ensure_directory = empyrical.utils.ensure_directory
+data_path = empyrical.utils.data_path
+_1_bday_ago = empyrical.utils._1_bday_ago


This looks unused in pyfolio. Do we think external consumers were depending on this?

I couldn't say; I included it just in case though.

Let's remove these.

ssanderson · 2018-06-13T16:07:03Z

pyfolio/utils.py

    """
+    If the returns is longer than the benchmark's, limit strategy returns.


This description is a bit ambiguous. I would naively interpret "longer than" to mean "has more observations than", but it looks to me like the actual behavior here is that we remove any values from rets that are older than the oldest observation in benchmark_rets. I also wouldn't know what "limit" means without more context.

I would describe the currently-implemented behavior here as something like:

def clip_returns_to_benchmark(rets, benchmark_rets):
    """Drop entries from rets that precede the first entry in benchmark_rets.
    """

That said, is there a reason we're only clipping the left side here? Naively, I'd expect to have similar problems if rets/benchmark_rets don't align in either direction.

Interesting, I had just moved the code without thinking about it. The function now clips from both ends.
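
A minimal sketch of what clipping from both ends could look like is shown below. It is not the code merged in this PR: it borrows the function name suggested in the review and assumes both inputs are pandas Series with sorted DatetimeIndex values.

```python
import pandas as pd


def clip_returns_to_benchmark(rets, benchmark_rets):
    """Keep only the entries of rets that fall within the date range
    spanned by benchmark_rets (both endpoints inclusive).

    Sketch only; assumes both inputs have a sorted DatetimeIndex.
    """
    start = benchmark_rets.index[0]
    end = benchmark_rets.index[-1]
    # Label-based slicing on a sorted DatetimeIndex includes both endpoints.
    return rets.loc[start:end]


# Strategy returns cover 10 days; the benchmark covers only 5 days inside them.
rets = pd.Series(0.01, index=pd.date_range("2018-01-01", periods=10, freq="D"))
benchmark_rets = pd.Series(
    0.0, index=pd.date_range("2018-01-03", periods=5, freq="D")
)
clipped = clip_returns_to_benchmark(rets, benchmark_rets)
```

Here `clipped` drops the two days before the benchmark starts and the three days after it ends, which addresses the misalignment in either direction raised in the review.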

eigenfoo · 2018-06-13T19:37:58Z

@twiecki @ssanderson done!

twiecki · 2018-06-14T14:51:44Z

README.md

+
+### Deprecated: Data Reading via `pandas-datareader`
+
+As of early 2018, Yahoo Finance has suffered major API breaks with no stable


Rather than emphasize that the readers are deprecated, I would focus on making pyfolio independent of a benchmark being present; that's an enhancement. Then discuss how empyrical does not provide them anymore, so if you need them, you have to find them.

If that's the case I'd rather put that in the release notes. I'll remove the README update here.

twiecki · 2018-06-14T16:21:48Z

Still needs to be added to the release notes.

twiecki · 2018-06-14T17:42:19Z

Thanks @eigenfoo!

EPGM-DES · 2018-11-17T17:01:18Z

@eigenfoo I was able to access Yahoo Finance historic prices using fix-yahoo-finance. However, when I use the price dataframe, I receive the following error:

----pyfolio.create_returns_tear_sheet(stock_df)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'pyfolio' has no attribute 'create_returns_tear_sheet'

I looked through the function in tears.py but it's unclear if there is another dependency on pandas-datareader in the function. Using fix-yahoo-finance seemed like it could be a helpful, short-term solution to the Google and Yahoo Finance API instability.

eigenfoo · 2018-11-18T14:58:28Z

Hi @EPGM-DES, for future reference, please open up a new GitHub issue for a separate problem you're having: I directed you to this PR since it was what caused your original problem.

Without knowing more about your environment or code, it's hard for me to help you with your error. Could you open another issue with more details?

eigenfoo added 12 commits May 18, 2018 15:08

DEP: deprecate all data reading functionality

3253719

DOC: Update README about deprecation

dca4ba5

REV: get_utc_timestamp should not deprecated

c0b7ec7

DEP: deprecate some other funcs

b3e1469

MAINT: remove set_context from tears

ed4e11d

DOC: Update copyright year

1bb6f39

MAINT: returns tearsheet now indep of benchmarks

8e36a2c

MAINT: interesting times tearsheet now indep of benchmarks

893bd76

MAINT: bayesian tearsheet now indep of benchmarks

a75944f

MAINT: simple tearsheet now indep of benchmarks

7d7a9cd

MAINT: timeseries.py now indep of benchmarks

d12cf4a

BUG: flagged bug for later

fbb511b

eigenfoo added 7 commits May 21, 2018 10:40

MAINT: remove fama-french plots

e27a853

DOC: update documentation and copyright years

8c9ea2b

MAINT: one more pass through

1ff60fb

DOC: flag things to do

8821774

MAINT: numbers checked

4b11da6

TST: updated tests

91a6910

TST: fixed tests

5a749ec

eigenfoo changed the title ~~DEP: Deprecate all data reading functionality via pandas-datareader~~ DEP: Deprecate all data reading functionality via pandas-datareader; ensure independence from SPY and FF May 21, 2018

twiecki reviewed May 23, 2018

twiecki approved these changes May 23, 2018

twiecki assigned richafrank May 23, 2018

eigenfoo mentioned this pull request May 23, 2018

MAINT: do not require benchmarks for interesting times tear sheet #538

Closed

MAINT: make benchmark optional for interesting times tear sheet

60ebc82

DOC: remove suggestions from readme

b6b4518

ssanderson reviewed Jun 13, 2018

eigenfoo added 3 commits June 13, 2018 11:48

DOC: remove fixme

dcbbd93

REV: put back set_context

f9d3012

DOC: add docstrings for set_context

ea215db

ssanderson reviewed Jun 13, 2018

eigenfoo mentioned this pull request Jun 13, 2018

BUG: set_context handled inconsistently #542

Open

MAINT: fix clip_rets_to_bench_rets

cee52d2

twiecki reviewed Jun 14, 2018

eigenfoo added 2 commits June 14, 2018 10:59

DEP: remove deprecated functions

2a6a81f

DOC: remove readme updates

915f6df

DOC: added WHATSNEW

389a248

twiecki merged commit 50d8190 into quantopian:master Jun 14, 2018

eigenfoo mentioned this pull request Jun 14, 2018

MAINT: add NOT_PASSED_SENTINEL #543

Merged

This was referenced Aug 1, 2018

ValueError: Found input variables with inconsistent numbers of samples: [127, 185] #486

Closed

Fix travis issues #530

Closed

eigenfoo mentioned this pull request Aug 2, 2018

Error with backtrader and pyfolio #555

Closed

brian-from-quantrocket pushed a commit to quantrocket-llc/pyfolio that referenced this pull request Aug 7, 2018

Merge pull request quantopian#536 from eigenfoo/master

0155ebd

DEP: Deprecate all data reading functionality via pandas-datareader; ensure independence from SPY and FF

eigenfoo mentioned this pull request Aug 23, 2018

plot_rolling_sharpe - add benchmark/factor #561

Closed

eigenfoo mentioned this pull request Sep 10, 2018

fix error when benchmark_rets is None #568

Merged

eigenfoo mentioned this pull request Nov 16, 2018

AttributeError: module 'pyfolio.utils has no attribute 'get_stock_returns' #580

Open

MirTunio mentioned this pull request Apr 13, 2021

AttributeError #658

Open
