Enhance docstrings using google format #9

Closed
wants to merge 14 commits into from
10 changes: 5 additions & 5 deletions .github/workflows/docs.yml
@@ -22,22 +22,22 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Setup Pages
         id: pages
-        uses: actions/configure-pages@v3
+        uses: actions/configure-pages@v4
       - name: Install poetry
         run: pipx install poetry
       - name: Setup Python
-        uses: actions/setup-python@v3
+        uses: actions/setup-python@v5
       - name: Install dependencies
         run: poetry install
       - name: Build with Sphinx
         run: |
           poetry run sphinx-build docs _site
       - name: Upload artifact
         # Automatically uploads an artifact from the './_site' directory by default
-        uses: actions/upload-pages-artifact@v2
+        uses: actions/upload-pages-artifact@v4

   # Deployment job
   deploy:
@@ -49,4 +49,4 @@ jobs:
     steps:
       - name: Deploy to GitHub Pages
         id: deployment
-        uses: actions/deploy-pages@v2
+        uses: actions/deploy-pages@v4
File renamed without changes.
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml → .github/workflows/testing.yml
@@ -1,4 +1,4 @@
-name: Build
+name: Testing

 on:
   push:
@@ -11,7 +11,7 @@ permissions:
   contents: read

 jobs:
-  build:
+  testing:

     runs-on: ${{matrix.os}}

661 changes: 661 additions & 0 deletions LICENSE.md

Large diffs are not rendered by default.

14 changes: 9 additions & 5 deletions README.md
@@ -18,6 +18,9 @@ To install the VTL Engine on any Operating System, you can use pip:
 pip install vtlengine
 ```

+*Note: it is recommended to install the VTL Engine in a virtual environment.*
+
+
 ## Usage

 ### Semantic Analysis
@@ -26,9 +29,10 @@ Here is an example:

 ```python

-from API import semantic_analysis
+from vtlengine import semantic_analysis
 from pathlib import Path
-base_path = Path(__file__).parent / "testSuite/API/data/"
+
+base_path = Path(__file__).parent / "tests/API/data/"
 script = base_path / Path("vtl/1.vtl")
 datastructures = base_path / Path("DataStructure/input")
 value_domains = base_path / Path("ValueDomain/VD_1.json")
@@ -46,10 +50,10 @@ To execute a VTL script, please use the run function. Here is an example:

 ```python

-from API import run
+from vtlengine import run
 from pathlib import Path

-base_path = Path(__file__).parent / "testSuite/API/data/"
+base_path = Path(__file__).parent / "tests/API/data/"
 script = base_path / Path("vtl/1.vtl")
 datastructures = base_path / Path("DataStructure/input")
 datapoints = base_path / Path("DataSet/input")
@@ -60,7 +64,7 @@ external_routines = None

 run(script=script, data_structures=datastructures, datapoints=datapoints,
     value_domains=value_domains, external_routines=external_routines,
-    output_path=output_folder, return_only_persistent=True
+    output_folder=output_folder, return_only_persistent=True
 )
 ```
 The VTL engine will load each datapoints file as being needed, reducing the memory footprint.
6 changes: 3 additions & 3 deletions docs/api.rst
@@ -1,6 +1,6 @@
-###########
-API Package
-###########
+###
+API
+###
 The ``API`` package contains all the methods to load data into the vtl engine. It has a function to ensure if the
 operation can be performed, and another function to prepare it to be operated.

8 changes: 7 additions & 1 deletion docs/index.rst
@@ -2,7 +2,9 @@ VTL Engine Documentation
 ########################

 The VTL Engine is a Python library that allows you to validate and run VTL scripts.
-It is a Python-based library around the `VTL Language 2.0 <http://sdmx.org/?page_id=5096>`_ VTL Language 2.1 will be adapted soon.
+It is a Python-based library around the `VTL Language 2.0 <http://sdmx.org/?page_id=5096>`_
+
+*VTL Language 2.1 will be supported soon.*

 Installation
 ************
@@ -21,6 +23,10 @@ To install the VTL Engine on any Operating System, you can use pip:

     pip install vtlengine

+.. important::
+    It is recommended to install the VTL Engine in a virtual environment.
+    Please follow `these steps <https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/>`_
+
 .. toctree::

     index
55 changes: 55 additions & 0 deletions docs/walkthrough.rst
@@ -1,4 +1,59 @@
########################
10 minutes to VTL Engine
########################

Summarizes the main functions of the VTL Engine

*****************
Semantic Analysis
*****************
To perform the validation of a VTL script, please use the semantic_analysis function.
Here is an example:

.. code-block:: python

from vtlengine import semantic_analysis
from pathlib import Path

base_path = Path(__file__).parent / "tests/API/data/"
script = base_path / Path("vtl/1.vtl")
datastructures = base_path / Path("DataStructure/input")
value_domains = base_path / Path("ValueDomain/VD_1.json")
external_routines = base_path / Path("sql/1.sql")

semantic_analysis(script=script, data_structures=datastructures,
value_domains=value_domains, external_routines=external_routines)


The semantic analysis function will return a dictionary of the computed datasets and their structure.

*****************
Run VTL Scripts
*****************

To execute a VTL script, please use the run function. Here is an example:

.. code-block:: python

from vtlengine import run
from pathlib import Path

base_path = Path(__file__).parent / "tests/API/data/"
script = base_path / Path("vtl/1.vtl")
datastructures = base_path / Path("DataStructure/input")
datapoints = base_path / Path("DataSet/input")
output_folder = base_path / Path("DataSet/output")

value_domains = None
external_routines = None

run(script=script, data_structures=datastructures, datapoints=datapoints,
value_domains=value_domains, external_routines=external_routines,
output_folder=output_folder, return_only_persistent=True
)

The VTL engine will load each datapoints file as being needed, reducing the memory footprint.
When the output parameter is set, the engine will write the result of the computation
to the output folder, else it will include the data in the dictionary of the computed datasets.

For more information on usage, please refer to the `API documentation <https://docs.vtlengine.meaningfuldata.eu/api.html>`_
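
The walkthrough says datapoints files are loaded as needed to reduce the memory footprint. A minimal sketch of that idea, not the engine's actual loader: it assumes the datapoints folder holds one CSV file per dataset, and `iter_datapoints` is a hypothetical helper name that does not appear in this PR.

```python
import csv
from pathlib import Path
from typing import Dict, Iterator, List, Tuple


def iter_datapoints(folder: Path) -> Iterator[Tuple[str, List[Dict[str, str]]]]:
    """Yield (dataset_name, rows) one CSV file at a time.

    Hypothetical helper: only the file currently being consumed is held in
    memory, which is the idea behind loading datapoints as needed.
    """
    for csv_path in sorted(folder.glob("*.csv")):
        # Assumption: the file stem is the dataset name (e.g. DS_1.csv -> DS_1).
        with csv_path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.DictReader(handle))
        yield csv_path.stem, rows
```

Because the function is a generator, a caller that processes each dataset and then discards it never has more than one file's rows in memory at once.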
35 changes: 23 additions & 12 deletions main.py
@@ -3,22 +3,33 @@

 from vtlengine.API import run

-base_path = Path(__file__).parent / 'development' / 'data'
+dev_name = 'BOP'
+base_path = Path(__file__).parent / 'development' / 'data' / dev_name
 input_dp = base_path / 'dataPoints' / 'input'
 output_dp = base_path / 'dataPoints' / 'output'
 input_ds = base_path / 'dataStructures' / 'input'
 ext_routines = base_path / 'externalRoutines'
 vds = base_path / 'valueDomains'
-vtl = base_path / 'vtl' / 'monthVal.vtl'
+vtl = base_path / 'vtl' / f'{dev_name}.vtl'

 if __name__ == '__main__':
-    start = time()
-    run(
-        script=vtl,
-        data_structures=input_ds,
-        datapoints=input_dp,
-        value_domains=vds,
-        output_path=output_dp,
-    )
-    end = time()
-    print(f"Execution time: {round(end - start, 2)}s")
+    time_vector = []
+    num_executions = 3
+    for i in range(num_executions):
+        start = time()
+        run(
+            script=vtl,
+            data_structures=input_ds,
+            datapoints=input_dp,
+            value_domains=vds,
+            output_folder=output_dp,
+        )
+        end = time()
+        total_time = round(end - start, 2)
+        time_vector.append(total_time)
+        print(f'Execution ({i + 1}/{num_executions}): {total_time}s')
+    print('-' * 30)
+    print(f'Average time: {round(sum(time_vector) / num_executions, 2)}s')
+    print(f'Min time: {min(time_vector)}s')
+    print(f'Max time: {max(time_vector)}s')
+    print(f'Total time: {round(sum(time_vector), 2)}s')
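
The timing loop in `main.py` rounds each run before averaging and measures with `time()`. A minimal sketch of the same measurement as a reusable helper, assuming `time.perf_counter` is acceptable for this script; `benchmark` and `summarize` are hypothetical names that are not part of this PR.

```python
from statistics import mean
from time import perf_counter


def benchmark(fn, repeats=3):
    """Call fn() `repeats` times and return the wall-clock time of each run in seconds."""
    timings = []
    for _ in range(repeats):
        start = perf_counter()  # monotonic clock, unaffected by system clock changes
        fn()
        timings.append(perf_counter() - start)
    return timings


def summarize(timings):
    """Round only when reporting, so the average is not computed from rounded values."""
    return {
        "average": round(mean(timings), 2),
        "min": round(min(timings), 2),
        "max": round(max(timings), 2),
        "total": round(sum(timings), 2),
    }
```

With this shape, the script body could pass a zero-argument callable that wraps the `run(...)` call to `benchmark` and print the dictionary returned by `summarize`.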
40 changes: 18 additions & 22 deletions poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

21 changes: 14 additions & 7 deletions pyproject.toml
@@ -1,26 +1,33 @@
[tool.poetry]
name = "vtlengine"
version = "0.1.0-rc7"
version = "1.0"
description = "Run and Validate VTL Scripts"
authors = ["MeaningfulData <[email protected]>"]
license = "Apache-2.0"
license = "AGPL-3.0"
readme = "README.md"
classifiers = [
"Development Status :: 5 - Production/Stable",
"Development Status :: 4 - Beta",
"Intended Audience :: Developers",
"Intended Audience :: Information Technology",
"Intended Audience :: Science/Research",
]

[tool.poetry.dependencies]
python = "^3.10"
antlr4-python3-runtime="4.9.3"
# PyPi dependencies
duckdb="^1.1.1"
networkx="^3.3"
pandas={extras = ["performance", "aws"], version = "^2.2"}
sqlglot="^25.23.2"
numba="^0.60.0"
s3fs="^2024.9.0"
pyarrow = "^17.0.0"

# APT dependencies
antlr4-python3-runtime="4.9.2"
networkx="^2.8.8"
numexpr="^2.9.0"
pandas="^2.1.4"
bottleneck="^1.3.4"
sqlglot="^22.2.0"

[tool.poetry.dev-dependencies]
pytest = "^7.3"
pytest-cov = "^5.0.0"
Binary file added requirements.txt
Binary file not shown.
9 changes: 7 additions & 2 deletions src/vtlengine/API/_InternalApi.py
@@ -197,7 +197,10 @@ def load_vtl(input: Union[str, Path]):
     the file.
     """
     if isinstance(input, str):
-        return input
+        if os.path.exists(input):
+            input = Path(input)
+        else:
+            return input
     if not isinstance(input, Path):
         raise Exception('Invalid vtl file. Input is not a Path object')
     if not input.exists():
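
The `load_vtl` change above lets a string be either a path to a script or the script text itself. A self-contained sketch of that resolution order, under two assumptions: the part of `load_vtl` not shown in the diff reads the file as text, and specific exception types are used here in place of the generic `Exception` in the source. `resolve_script` is a hypothetical name.

```python
import os
from pathlib import Path
from typing import Union


def resolve_script(source: Union[str, Path]) -> str:
    """Hypothetical stand-in for load_vtl: return VTL text from a string or a file."""
    if isinstance(source, str):
        if os.path.exists(source):
            source = Path(source)  # the string names an existing file
        else:
            return source  # otherwise treat the string as the script itself
    if not isinstance(source, Path):
        raise TypeError("Invalid vtl file. Input is not a Path object")
    if not source.exists():
        raise FileNotFoundError(f"Invalid vtl file. {source} does not exist")
    # Assumption: the remainder of load_vtl reads the file contents as text.
    return source.read_text(encoding="utf-8")
```

One consequence of this order is that a mistyped path passed as a string is returned unchanged and handled as script text, so the error would surface later, at parse time, instead of as a missing-file error.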
@@ -319,7 +322,9 @@ def _check_output_folder(output_folder: Union[str, Path]):
         except Exception:
             raise Exception('Output folder must be a Path or S3 URI to a directory')

-    if not isinstance(output_folder, Path) or not output_folder.is_dir():
+    if not isinstance(output_folder, Path):
+        raise Exception('Output folder must be a Path or S3 URI to a directory')
+    if not output_folder.exists():
         if output_folder.suffix != '':
             raise Exception('Output folder must be a Path or S3 URI to a directory')
         os.mkdir(output_folder)
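
As far as this hunk shows, the revised `_check_output_folder` creates a missing folder but no longer rejects an existing path that is a regular file. A minimal sketch of the local-path branch with that case handled, assuming that rejecting files and creating missing parent folders are both wanted; `ensure_output_folder` is a hypothetical name and the S3 branch is left out.

```python
from pathlib import Path


def ensure_output_folder(output_folder: Path) -> Path:
    """Hypothetical sketch of the local-path branch; S3 URIs are out of scope here."""
    if not isinstance(output_folder, Path):
        raise TypeError("Output folder must be a Path or S3 URI to a directory")
    if output_folder.exists():
        # Assumption: an existing regular file should be rejected, which the
        # revised check does not do once is_dir() is dropped.
        if not output_folder.is_dir():
            raise NotADirectoryError(f"{output_folder} is not a directory")
        return output_folder
    if output_folder.suffix != "":
        # A suffix such as ".csv" suggests the caller passed a file path.
        raise ValueError("Output folder must be a Path or S3 URI to a directory")
    # Assumption: creating missing parents is acceptable; os.mkdir would fail on them.
    output_folder.mkdir(parents=True, exist_ok=True)
    return output_folder
```

The suffix test is a heuristic: a folder name that contains a dot, such as `results.v2`, is rejected when it does not exist yet, both here and in the PR's version.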