Entreprenauts Ledger Parser

Specialized HTML parser that converts downloaded ledger HTML pages into CSV files for import into Google Sheets. It is currently designed with a focus on extracting and compiling per-truck financial data, to assist with semi operations data analysis on The Entreprenauts.

DISCLAIMER: The parser has not been tested or updated since prior to the semi revamp, so it is provided as-is with no guarantee of functionality.

Installation

There are currently no releases, and no plans for any. However, the program can be run as a script. First-time setup requires the following:

Install Python. You may also need to install the Python package installer, pip.
Clone or download the repo to a directory of your choice (clone with git clone https://github.com/Tropingenie/repo_Meta_Ledger_Parser.git, or download it as a ZIP from the repository page).
Open a terminal of your choice and navigate to the directory the program was downloaded to.
Run the command pip install -r requirements.txt to download the dependencies.

Usage

The script is designed to work alongside dedicated spreadsheet software. Therefore, for very large outputs the console will display a truncated table. All data is exported to the file processed_ledger.csv, which I recommend importing into Google Sheets for further analysis.

Modes

The parser runs in one of two modes: ID mode, or date mode.

ID Mode

In ID mode, the parser will parse all ledger entries above a certain ID. This is useful if you need to bring a partially filled table up to date, without parsing duplicate data.

Date Mode

In date mode, the parser will parse all ledger entries that fall on a certain day. This is useful for compiling daily per-truck financial reports.
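The difference between the two modes can be illustrated with a small sketch. This is not the code in scraper.py; the row layout (keys id, date, amount) and the function names filter_by_id and filter_by_date are assumptions made up for illustration.

```python
from datetime import date

# Hypothetical ledger rows; the real ledger's columns are not specified here.
entries = [
    {"id": 101, "date": date(2021, 3, 1), "amount": -50.0},
    {"id": 102, "date": date(2021, 3, 1), "amount": 120.0},
    {"id": 103, "date": date(2021, 3, 2), "amount": 75.5},
]


def filter_by_id(rows, last_seen_id):
    """ID mode: keep only rows whose ID is strictly above last_seen_id."""
    return [row for row in rows if row["id"] > last_seen_id]


def filter_by_date(rows, day):
    """Date mode: keep only rows dated exactly on the given day."""
    return [row for row in rows if row["date"] == day]


print([row["id"] for row in filter_by_id(entries, 101)])                   # [102, 103]
print([row["id"] for row in filter_by_date(entries, date(2021, 3, 1))])    # [101, 102]
```

In ID mode the cutoff itself is excluded, which is why an already-saved entry's ID can be passed without producing a duplicate row.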

Running the Parser

To run the parser, you need to run the program with at least one command-line argument: either the ID of the oldest ledger entry you have saved, or a date. Optionally, follow it with the path(s) to one or more HTML files containing the ledger.

Generically, the parser can be run using:

python scraper.py "YYYY-MM-DD"
python scraper.py "OldestID"

See below for specific examples.
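Since the first argument can be either a date or an ID, a script has to decide which one it was given. One possible way to do that is sketched here; the helper name classify_selector is hypothetical, and scraper.py may decide differently.

```python
from datetime import date, datetime


def classify_selector(arg):
    """Return ("date", date) for YYYY-MM-DD input, otherwise ("id", int).

    Hypothetical helper: tries the date format first, then falls back
    to treating the argument as an integer ledger ID.
    """
    try:
        return "date", datetime.strptime(arg, "%Y-%m-%d").date()
    except ValueError:
        pass
    # Raises ValueError if the argument is neither a date nor an integer.
    return "id", int(arg)


print(classify_selector("2021-03-01"))  # ('date', datetime.date(2021, 3, 1))
print(classify_selector("12345"))       # ('id', 12345)
```

In a real script, the argument would come from sys.argv[1], with any remaining arguments treated as HTML file paths.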

ID Mode

General steps are:

Download all pages of the ledger you want parsed to .html files
Determine the oldest ledger entry you want parsed, and take the ID of the ledger entry ONE BELOW the oldest entry you want parsed
Open your terminal in the directory of the scraper.py file and run python scraper.py OldestID+1 page_1.html ... page_n.html (where OldestID+1 is the ledger entry's ID one below the last ledger entry you want parsed, and page_1.html ... page_n.html are the ledger pages you want parsed)
The output will be printed to the terminal if it is short enough, and exported to a .csv file located in the same directory that scraper.py is in
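The last step (print a short preview, export everything to CSV) could look roughly like the sketch below. It uses only the standard library csv module; the function name export_rows, the max_print limit, and the row keys are assumptions, and only the file name processed_ledger.csv comes from this README.

```python
import csv


def export_rows(rows, path="processed_ledger.csv", max_print=20):
    """Write every row to a CSV file, printing at most max_print rows.

    rows is a list of dicts sharing the same keys (hypothetical layout).
    """
    if not rows:
        return
    fieldnames = list(rows[0].keys())
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    # The console only gets a preview; the file always gets everything.
    for row in rows[:max_print]:
        print(row)
    if len(rows) > max_print:
        print(f"... {len(rows) - max_print} more rows in {path}")
```

Calling export_rows(rows) with the default arguments writes processed_ledger.csv into the current working directory, ready to be imported into a spreadsheet.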

For example, to parse the first three pages of my ledger, I would run python scraper.py OldestID+1 page_1.html page_2.html page_3.html

Date Mode

Running the parser in date mode is identical to running it in ID mode, except that you specify a date (in YYYY-MM-DD format, as in the ledger) instead of an ID:

python scraper.py YYYY-MM-DD page_1.html ... page_n.html

For example, to parse two pages of my ledger in date mode, I would run python scraper.py YYYY-MM-DD page_1.html page_2.html

Downloading the ledger pages

Currently, ledger pages must be downloaded before they can be parsed. This can be done simply enough by going into your ledger on Entreprenauts, right-clicking anywhere, and selecting "Save As".

Note that you need to ensure the HTML containing the table is saved. If you have downloaded the page and are getting "no table found" errors, check that the table is actually in the .html file. An HTML table is enclosed in <table> and </table> tags, with each row in a <tr> element.

To avoid this issue, it is recommended to download the entire webpage through the "Save As" dialogue, instead of just the HTML file.
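A quick way to check a saved page before running the parser is to count its table tags. The sketch below uses Python's built-in html.parser module; the names TableFinder and count_tables are made up for this example and are not part of scraper.py.

```python
from html.parser import HTMLParser


class TableFinder(HTMLParser):
    """Counts opening <table> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.table_count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == "table":
            self.table_count += 1


def count_tables(html_text):
    """Return how many <table> elements start in html_text."""
    finder = TableFinder()
    finder.feed(html_text)
    finder.close()
    return finder.table_count


sample = (
    "<html><body><table><tr><th>ID</th></tr>"
    "<tr><td>101</td></tr></table></body></html>"
)
print(count_tables(sample))  # 1
```

To check a downloaded page, read the file's text and pass it to count_tables; a result of 0 means the saved file does not contain the ledger table.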
