- Imports `.csv` files into the database:
  - From the weather API
  - From the calculator

  Uses various parallelism methods (`map_async`, `apply_async`, `map`) from Python's native `multiprocessing.Pool` class (a minimal sketch of this approach is given after this list).

  Strategies under development:
  - Splitting into smaller files
  - Implementation in a low-level language like Rust
- Adjusts weather data from a 5-minute to a 15-minute time step to match the calculator's step.

  Upcoming features:
  - Dynamic specification of initial and target time steps
- Creates SQL sub-tables containing the selected data range (time range, temperature, etc.) generated from the original table.
- Manipulates database data to generate a `.json` file for visualization.
- Provides a data preview with the option to filter out aberrant values.
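The actual import pipeline lives in the repository's modules; as a minimal sketch of the parallel-import idea only, the block below splits a dataframe into chunks and writes them through a `multiprocessing.Pool`. The connection URL, table name, and the `insert_chunk` helper are assumptions for illustration.

```python
# Minimal sketch of a parallel CSV import with multiprocessing.Pool.
# The connection URL, table name and insert_chunk helper are illustrative
# assumptions, not the project's actual code.
from multiprocessing import Pool

import numpy as np
import pandas as pd
from sqlalchemy import create_engine

DB_URL = "postgresql://user:password@localhost:5432/heliocity"  # assumed


def insert_chunk(chunk: pd.DataFrame) -> int:
    """Write one chunk of rows to the target table and return the row count."""
    # Each worker opens its own engine: engines do not pickle across processes.
    engine = create_engine(DB_URL)
    chunk.to_sql("meteo_data", engine, if_exists="append", index=False)
    return len(chunk)


if __name__ == "__main__":
    df = pd.read_csv("./data/meteo_data.csv")
    chunks = np.array_split(df, 8)  # one portion per worker
    with Pool(processes=8) as pool:
        # map blocks until all chunks are done; map_async / apply_async
        # are the non-blocking variants.
        counts = pool.map(insert_chunk, chunks)
    print(f"inserted {sum(counts)} rows")
```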
Guidelines for a Linux environment
- Configured and running PostgreSQL server.
- Creation and configuration of a new database.
- Edit the `config.json` file with the necessary parameters for connecting to the database.
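The exact keys expected in `config.json` are defined by the project; the snippet below assumes a simple layout (host, port, dbname, user, password) and sketches how such a file might be loaded to open a connection with `psycopg2`.

```python
# Sketch of loading config.json and opening a PostgreSQL connection.
# The key names below are assumptions; use the keys the project actually expects.
import json

import psycopg2

with open("config.json") as f:
    cfg = json.load(f)

conn = psycopg2.connect(
    host=cfg["host"],            # e.g. "localhost"
    port=cfg.get("port", 5432),
    dbname=cfg["dbname"],
    user=cfg["user"],
    password=cfg["password"],
)
print(conn.get_dsn_parameters())  # quick check that the connection works
conn.close()
```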
git clone https://github.com/DevprojectEkla/HelioCity
cd HelioCity
python -m venv env
source env/bin/activate # On Linux
pip install -r requirements.txt
mkdir data
The `main.py` file can be launched with arguments; otherwise, a series of prompts will ask for:
- The table name (either an existing table name or the name for a new table to be created in the database from the imported file).
- If applicable, the name of the `.csv` file to import into the database.
- Optionally, use the `-f` flag to specify a simple import method; without the flag, the parallelism-based import is used by default.
python main.py [table_name] [path_to_csv_file] [-f]
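For example, importing the sample weather file into a table named `meteo_data` with the simple (non-parallel) method could look like this (the table name here is only an example):

python main.py meteo_data ./data/meteo_data.csv -f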
- Import a table from `./data/meteo_data.csv` in preprocessing or `./data/test_helio.csv` in post-processing.
- Filter out aberrant data and specify a time interval.
- Insert a new variable called `python_calc` into a table for time-based representation (an illustrative sketch follows this list).
  * In this scenario, it involves preprocessing wind chill temperature as a function of temperature, wind speed, and relative humidity. In post-processing, it is a test calculation (to be adjusted with a relevant formula).
- Generate a `.json` file from this preview data for future use in another context.
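The project's own preprocessing handles the `python_calc` column; purely as an illustration, the sketch below derives such a column with the standard wind-chill approximation (temperature and wind speed only, no humidity term), assuming columns named `temperature` (°C) and `wind_speed` (km/h) and an assumed connection URL.

```python
# Illustration only: derive a python_calc column from existing columns.
# Column names and the wind-chill formula are assumptions, not the project's code.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@localhost:5432/heliocity")  # assumed

df = pd.read_sql_table("meteo_data", engine)

# Standard wind-chill index (T in degrees C, wind speed v in km/h;
# nominally valid for T <= 10 degC and v >= 4.8 km/h).
t = df["temperature"]
v = df["wind_speed"]
df["python_calc"] = 13.12 + 0.6215 * t - 11.37 * v**0.16 + 0.3965 * t * v**0.16

df.to_sql("meteo_data_with_calc", engine, if_exists="replace", index=False)
```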
python database_handler.py
You will be prompted for:
- The name of the new table to create (default: `meteo_data`).
- The path to the `.csv` data file (default: `./data/meteo_data.csv`).
- The data origin (weather or calculator). Calculator columns are processed in portions with an adjustable number of lines: answer `'y'` if it is a large file, `'n'` or `''` otherwise.
Warning: Importing large `.csv` files from the calculator can take some time depending on the computer's memory capacity. Adjust the number of lines per portion to the available memory.
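As a rough illustration of this portion-by-portion approach (not the project's actual import code), pandas can read the file in chunks so memory use stays bounded; the chunk size, table name, and connection URL below are assumptions to adjust.

```python
# Sketch of a memory-bounded import: read the .csv in portions and append
# each portion to the table. The chunk size and table name are assumptions.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@localhost:5432/heliocity")  # assumed

for portion in pd.read_csv("./data/test_helio.csv", chunksize=100_000):
    portion.to_sql("test_helio", engine, if_exists="append", index=False)
```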
Data manipulations can be performed using the `DatabaseSelector` class to create new tables in the database. It allows the following (sketched after this list):
- Creating sub-tables by interval of interest.
- Aggregating weather data at the calculator's time step.
- Inserting calculated variables from existing table variables.
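The class implements these operations itself; the block below is only a rough sketch of the kind of SQL and pandas they correspond to, assuming a `meteo_data` table with a `timestamp` column and an assumed connection URL.

```python
# Rough sketch of the three kinds of operations; not the DatabaseSelector API.
# Table names, column names and the connection URL are assumptions.
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://user:password@localhost:5432/heliocity")  # assumed

# 1. Sub-table restricted to an interval of interest.
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE meteo_january AS "
        "SELECT * FROM meteo_data "
        "WHERE timestamp >= '2023-01-01' AND timestamp < '2023-02-01'"
    ))

# 2. Aggregate 5-minute weather data to the calculator's 15-minute step.
df = pd.read_sql_table("meteo_data", engine, parse_dates=["timestamp"])
df_15min = df.set_index("timestamp").resample("15min").mean(numeric_only=True)
df_15min.to_sql("meteo_data_15min", engine, if_exists="replace")

# 3. Insert a variable calculated from existing columns (hypothetical formula).
df_15min["python_calc"] = df_15min["temperature"] * 1.8 + 32
df_15min.to_sql("meteo_data_15min", engine, if_exists="replace")
```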
For a test, simply run the command:
python database_selector.py
Follow the instructions...
This class only reads from the database and does not write to it. It facilitates easy manipulation of data in dataframes for visualization and is used to generate `.json` output.
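As a sketch of that read-only flow only, assuming a `meteo_data` table, a numeric `temperature` column, and an assumed connection URL, the data can be pulled into a dataframe, filtered for aberrant values, and dumped to `.json`:

```python
# Sketch of a read-only preview: query, filter aberrant values, export .json.
# Table and column names, the bounds and the connection URL are assumptions.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@localhost:5432/heliocity")  # assumed

df = pd.read_sql_query("SELECT * FROM meteo_data", engine)

# Drop aberrant temperature readings outside a plausible range.
df = df[df["temperature"].between(-50, 60)]

df.to_json("./data/preview.json", orient="records", date_format="iso")
```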