IMPORTANT: Adding parquet functionality in #1290
-
Thank you for the great tool!
-
Rob, et al: I guess another way of phrasing the question is, can I go straight from a clean install of 1.80 into parquet without ever touching mongodb or arctic? Thanks.
-
A question, Rob: when you say pull and install the latest commit, I assume you mean the droparctic branch?
-
I have done a full new install of the latest version [05184ea], the master branch. As you can see, the test bed is a virtual machine (which I believe should not make a difference). Regards
-
So I've tried to do this as well, in a VM.
I have the parquet_store variable set up in the private_config with the correct directory, and have even changed the default config, but for some reason it won't read it and keeps looking in /home/me.
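In case it helps with debugging, one way to check which parquet_store value is actually being picked up is to read the config files directly. This is only a rough sketch; the checkout location and config file paths below are assumptions about a standard install, so adjust them to your own setup.

```python
# Rough sketch: print the parquet_store value from the config files pysystemtrade
# would typically read. File locations are assumptions -- adjust to your checkout.
from pathlib import Path
import yaml

repo = Path.home() / "pysystemtrade"  # assumed checkout location
candidates = [
    repo / "private" / "private_config.yaml",      # private overrides (assumed path)
    repo / "sysdata" / "config" / "defaults.yaml",  # shipped defaults (assumed path)
]

for path in candidates:
    if not path.exists():
        print(f"{path}: not found")
        continue
    config = yaml.safe_load(path.read_text()) or {}
    print(f"{path}: parquet_store = {config.get('parquet_store', '<not set>')}")
```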
-
Apologies if this is a bad question: I am in the early stages of sourcing data from Barchart using bcutils, with whatever remains coming through IB. My goal is to bring multiple prices up to date with as many contracts as possible to select from in dynamic optimization. In examining the code in pysystemtrade to help with the "Futures Workflow Process", there appears to be an embedded reliance on Arctic in many of these scripts. How challenging will it be to run down all the Arctic dependencies in the code and replace them with Parquet, or is that even necessary? Has anyone started this process fresh with Parquet instead of migrating from Arctic?
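To get a feel for the scope myself, I put together a throwaway scan of the repo for arctic references. It's just a sketch (the checkout path is an assumption), not anything taken from pysystemtrade itself:

```python
# Throwaway sketch: list modules under a pysystemtrade checkout that reference arctic.
# The repo path is an assumption -- point it at your own clone.
from pathlib import Path

repo = Path.home() / "pysystemtrade"  # assumed checkout location

for py_file in sorted(repo.rglob("*.py")):
    text = py_file.read_text(errors="ignore")
    if "import arctic" in text or "sysdata.arctic" in text:
        print(py_file.relative_to(repo))
```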
-
I've been working through all the scripts that update historical futures prices as well as multiple and adjusted prices, and I'm unsure where this data is ultimately being stored, given the codebase vs the changelog. The dataBlob object seems to handle data management and versatile storage options, but it's unclear whether historical prices, multiple prices, and adjusted prices are being stored in Parquet, CSV, or MongoDB, or some combination of the three. For example, pysystemtrade has moved static config data (e.g. /data/futures/csvconfig/) from Mongo to CSV files, but when it comes to storing historical prices, it seems that dataBlob might leave open the option to use any of these formats. In update_historical_prices, I'm having trouble tracking whether data ends up in Parquet, or whether it's still going to MongoDB as well. Can anyone clarify the intended storage location for historical, multiple, and adjusted prices in this context? Also, can someone clarify what the specific role of mongo is for all price data at this point? Is there any metadata being stored in mongo for price or contract data?
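One crude way I've been answering the "where did it actually go" question is to look at what gets written under the parquet store after an update run. This is only a sketch: the store path is whatever your parquet_store is set to, and the idea that the top-level subdirectories map to data types is my assumption, not something from the docs.

```python
# Sketch: count parquet files per top-level subdirectory under the parquet store,
# which (assumption) roughly maps to data type. Store path is an assumption --
# use your own parquet_store setting.
from collections import Counter
from pathlib import Path

parquet_store = Path("/home/rob/data/parquet")  # assumed; match your private config

counts = Counter(
    p.relative_to(parquet_store).parts[0]
    for p in parquet_store.rglob("*.parquet")
)
for data_type, n in sorted(counts.items()):
    print(f"{data_type}: {n} parquet file(s)")
```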
-
As discussed here, this is to ensure that dependencies on arctic db can be removed, and so that the required packages and python version can be updated to the current versions.
IF YOU WANT TO USE PARQUET (recommended):
Add the following to your private config:
parquet_store: '/home/rob/data/parquet/'
(you do not need to create it, or the subdirectories; this is done automatically)
Then run the migration script:
python3 ~/pysystemtrade/sysinit/transfer/backup_arctic_to_parquet.py
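(Not part of the steps above, but a quick way to sanity-check the migration is to read back one of the written files with pandas and eyeball it. The store path below is just the example value from the config line, the directory layout is whatever the script produced, and pandas.read_parquet is the only API relied on.)

```python
# Optional sanity check (not an official step): read back one migrated parquet file.
# Store path is an assumption -- match your own parquet_store setting.
from pathlib import Path
import pandas as pd

parquet_store = Path("/home/rob/data/parquet")  # assumed; match your private config

files = sorted(parquet_store.rglob("*.parquet"))
if files:
    df = pd.read_parquet(files[0])
    print(files[0])
    print(df.tail())
else:
    print(f"No parquet files found under {parquet_store}")
```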
IF YOU DO NOT WANT TO USE PARQUET BUT WANT TO GET THE LATEST COMMIT:
Change sysproduction.data.production_data_objects to point back to arctic, and change sysdata.sim.db_futures_sim_data to point back to arctic.
WHAT WILL HAPPEN IF YOU DON'T DO EITHER AND PULL THE LATEST COMMIT