Missfit of tlines #425

Closed
TheGrowingPlant443 opened this issue Jul 29, 2021 · 4 comments
Labels
question Further information is requested

Comments

@TheGrowingPlant443

TheGrowingPlant443 commented Jul 29, 2021

I may have done something wrong but this may also be a bug.

I have data from Yahoo Finance for the ticker symbol AMC (2021-07-22 09:30:00-04:00 to 2021-07-28 15:59:00-04:00). I want to use the tlines kwarg of mpf.plot(), but I get a result that is not supposed to happen.

Code:

import yfinance as yf
import pandas as pd
import mplfinance as mpf

tickername = "AMC"

Ticker = yf.Ticker(tickername)

df = Ticker.history(period='1d', interval='1m', start='2021-07-22', end='2021-07-28')

datapairs = [('2021-07-22 09:38:00-04:00', '2021-07-22 09:58:00-04:00'), 
             ('2021-07-22 09:58:00-04:00', '2021-07-22 10:02:00-04:00'),
             ('2021-07-22 10:02:00-04:00', '2021-07-22 10:15:00-04:00'),
             ('2021-07-22 10:15:00-04:00', '2021-07-22 10:24:00-04:00'),
             ('2021-07-22 10:24:00-04:00', '2021-07-22 10:33:00-04:00'),
             ('2021-07-22 10:33:00-04:00', '2021-07-22 10:36:00-04:00'),
             ('2021-07-22 10:36:00-04:00', '2021-07-22 11:22:00-04:00')]

mpf.plot(df, tlines=[dict(tlines=datapairs,tline_use='high',colors='g')])

My problem is that the fit is wrong. How do I fix this?

[image: chart, 2021-07-22 09:30:00 to 2021-07-22 11:12:00]

@DanielGoldfarb
Collaborator

@TheGrowingPlant443

Can you please, after df = Ticker.history(period='1d', interval='1m', start='2021-07-22', end='2021-07-28'), call

df.to_csv('AMC_20210722_20210728.csv')

... and then post the csv file here, or put it somewhere that I can get to it (or email to [email protected]) so that I can be sure I am working with the exact same data you are working with. I will take a look. Thank you. --Daniel

@TheGrowingPlant443
Author

AMC_20210722_20210728.zip

@DanielGoldfarb

Thanks for the quick response. I have linked to a zip-folder with the csv file in it.

@DanielGoldfarb
Collaborator

DanielGoldfarb commented Jul 29, 2021

This is indeed a bug that was reported previously here.

The explanation is as follows: Whenever plotting a time series with matplotlib, the datetimes must be converted to low-level matplotlib dates. Mplfinance does this for you. Normally this works fine, but when your dataframe's datetime index contains a timezone offset, matplotlib's date conversion always converts the time zone to UTC. (I'm not sure why matplotlib does that.)

In order to avoid this automatic conversion to UTC, mplfinance detects if a timezone offset is present in the dataframe index, and if so, then it "localizes" the pandas Timestamp objects. This generally works fine.

The bug is this: when I added code to localize the pandas Timestamps (see issue #236), I forgot to add code to also localize any datetimes passed in as tlines and/or alines specifiers. Effectively, then, your trend lines are being drawn using price data that is shifted by the timezone offset (in your case, 4 hours).
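
To make the 4-hour shift concrete, here is a minimal sketch using only pandas. It does not reproduce mplfinance's internal code; it only illustrates the difference between converting a timestamp to UTC and "localizing" it (dropping the offset while keeping the wall-clock time). The variable names are illustrative.

```python
import pandas as pd

# A tlines specifier as written in the report, with its -04:00 offset.
ts = pd.Timestamp('2021-07-22 09:38:00-04:00')

# Converting to UTC moves the wall-clock time forward by the offset.
as_utc = ts.tz_convert('UTC')

# "Localizing" drops the offset but keeps the local wall-clock time.
naive_local = ts.tz_localize(None)

# The same instant expressed as a naive UTC wall-clock time.
naive_utc = as_utc.tz_localize(None)

# The gap between the two naive times is the timezone offset.
shift = naive_utc - naive_local
print(naive_local, naive_utc, shift)
```

If the index is handled one way and the tlines datetimes the other way, the two end up apart by exactly this shift.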

This will definitely be fixed eventually. In the meantime there are a couple of workarounds that you can try:

  1. The simplest workaround is to set kwarg tz_localize=False when calling mpf.plot(). This will allow your tlines date specifiers (like '2021-07-22 09:38:00-04:00') to match the offset in your data. Your trend lines will be correct, but your trading day will appear to start at 13:30 looking something like this:

image

  2. Alternatively, you can write some code to remove the time zone offset from your dataframe index, and then specify your tlines datetimes using no timezone offset:

    df = Ticker.history(period='1d', interval='1m', start='2021-07-22', end='2021-07-28')
    
    # Remove offsets from timestamps in the dataframe index:
    df.index = pd.DatetimeIndex(df.index.tz_localize(None).to_pydatetime())
    
    # Now continue as before, but with no timezone offsets in the `tlines` specification:
    datapairs = [('2021-07-22 09:38:00', '2021-07-22 09:58:00'),
                 ('2021-07-22 09:58:00', '2021-07-22 10:02:00'),
                 ('2021-07-22 10:02:00', '2021-07-22 10:15:00'),
                 ('2021-07-22 10:15:00', '2021-07-22 10:24:00'),
                 ('2021-07-22 10:24:00', '2021-07-22 10:33:00'),
                 ('2021-07-22 10:33:00', '2021-07-22 10:36:00'),
                 ('2021-07-22 10:36:00', '2021-07-22 11:22:00')
                ]
    
    mpf.plot(df, tlines=[dict(tlines=datapairs,tline_use='high',colors='g')])

The result is:
image
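
For the second workaround, retyping every tlines pair without its offset is easy to get wrong. As a rough sketch, the offsets could instead be stripped programmatically with pandas. Note that strip_offsets is a hypothetical helper name chosen for this example; it is not part of mplfinance.

```python
import pandas as pd

def strip_offsets(pairs):
    """Return date pairs with timezone offsets dropped, keeping local wall-clock time.

    Hypothetical helper for illustration; not an mplfinance function.
    """
    def to_naive(value):
        ts = pd.Timestamp(value)
        # tz_localize(None) removes the offset without shifting the time.
        return ts.tz_localize(None) if ts.tzinfo is not None else ts
    return [tuple(str(to_naive(v)) for v in pair) for pair in pairs]

datapairs = [('2021-07-22 09:38:00-04:00', '2021-07-22 09:58:00-04:00'),
             ('2021-07-22 09:58:00-04:00', '2021-07-22 10:02:00-04:00')]

naive_pairs = strip_offsets(datapairs)
print(naive_pairs)
```

The resulting naive_pairs can then be passed as the tlines value, together with a dataframe whose index has had its offset removed in the same way.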

@TheGrowingPlant443
Author

Thank you so much for the help. This works perfectly 👍
