Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement VSC AST using python dataclasses #32

Merged
merged 25 commits into from
Sep 21, 2022

Conversation

miketsukerman
Copy link
Collaborator

@miketsukerman miketsukerman commented Aug 19, 2022

Vehicle Service Catalog AST using python dataclasses

The general idea is to use https://docs.python.org/3/library/dataclasses.html instead of a python dictionary to describe the types from which an AST would consist.

Example code

Example python code for the method element. (https://github.com/COVESA/vehicle_service_catalog/blob/master/syntax.md#namespace-list-object-methods)

methods:
    - name: move
      description: Set the desired seat position
      input: 
         - name: seat
           description: The desired seat position
           datatype: movement.seat_t

will correspond to the following python code:

@dataclass
class Method:
    name: str
    description: Optional[str] = None
    error: Optional[List[Error]] = field(default_factory=EmptyList)
    input: Optional[List[Argument]] = field(default_factory=EmptyList)
    output: Optional[List[Argument]] = field(default_factory=EmptyList)

The fully AST python code is in the vsc_ast.py.

Reading AST from YAML file

from vsc.model import vsc_parser
service = vsc_parser.get_ast_from_file('input.yaml')

Building AST in the code

from vsc.model import vsc_ast
service = vsc_ast.AST(name='test', description='test', major_version=1, minor_version=0)

Additional changes

The PR contains additional changes. During the work on the dataclasses AST, the vsc-tools python module structure has been improved. Now we have vsc module with generators, model, scripts, templates submodules. vsc_ast.py contains AST specification implemented using python dataclasses. vsc_generator.py is responsible for the code generation. vsc_parser.py provides API to load and parse YAML files into AST classes.

├── LICENSE
├── Pipfile
├── Pipfile.lock
├── README.md
├── requirements.txt
├── setup.py
├── tests
│   ├── gen_test.py
│   ├── __init__.py
│   ├── input.yaml
│   ├── result
│   └── template
└── vsc
    ├── generators
    │   └── __init__.py
    ├── __init__.py
    ├── model
    │   ├── __init__.py
    │   ├── vsc_ast.py
    │   ├── vsc_generator.py
    │   └── vsc_parser.py
    ├── scripts
    │   ├── generator.py
    │   └── __init__.py
    └── templates
        ├── Argument-simple_doc.html
        ├── AST-simple_doc.tpl
        ├── dtdl.md
        ├── dtdl.tpl
        ├── Error-simple_doc.html
        ├── __init__.py
        ├── Member-simple_doc.html
        ├── Namespace-simple_doc.html
        ├── protobuf.md
        ├── protobuf.tpl
        ├── sds-bamm-aspect-model.md
        ├── sds-bamm-aspect-model.tpl
        ├── sds-bamm-macros.tpl
        ├── Service-simple_doc.html
        ├── simple_overview.tpl
        └── test.tpl

We added setup.py to simplify the installation of the module using pip or python3 setup.py install (python3 setup.py develop).

Installation from git

pip3 install git+https://github.com/COVESA/vsc-tools.git

Command line tools

Newly added setup.py will install a command line tool. The tool called vscgen can be used to generate the code using default templates or custom templates.

vscgen <path-to-yaml-file> <template-file-name>

The list of templates can be found in the README.md.

Mikhail Tsukerman [email protected] on behalf of MBiTION GmbH.

https://github.com/mercedes-benz/daimler-foss/blob/master/PROVIDER_INFORMATION.md

@gunnarx
Copy link
Collaborator

gunnarx commented Aug 19, 2022

First I cherry-picked and pushed e8ec7ee because it seems to be needed for the GH actions to work now.
(I found this out in PR #31).

But now a new failure.

@miketsukerman Could you take a look at the buildtest failure now? It looks like the import statement fails.
Did you run gen_test.py manually (or run pytest?) because I can't make it work from any directory but I only did a few quick tests.

Some hints: Before your changes in f0d886a I had a bit of a hack to set the search path before import: (see HERE)
Maybe you can find a better way and still make it work?

@miketsukerman
Copy link
Collaborator Author

First I cherry-picked and pushed e8ec7ee because it seems to be needed for the GH actions to work now. (I found this out in PR #31).

But now a new failure.

@miketsukerman Could you take a look at the buildtest failure now? It looks like the import statement fails. Did you run gen_test.py manually (or run pytest?) because I can't make it work from any directory but I only did a few quick tests.

Some hints: Before your changes in f0d886a I had a bit of a hack to set the search path before import: (see HERE) Maybe you can find a better way and still make it work?

I run the test from pycharm with dedicated python virtual environment.

# This is maybe not ideal way but seems efficient enough
from pathlib import Path
import sys
proj_root = Path(__file__).parents[1]
sys.path.append(str(proj_root) + "/model")

Should not be needed in general. I was asking myself why this repository doesn't have setup.py? I think if setup.py would be added should eliminate the problem.

@miketsukerman miketsukerman force-pushed the use-dataclasses-for-ast branch from 9d3f568 to d0a687b Compare August 19, 2022 19:46
@miketsukerman miketsukerman marked this pull request as ready for review August 20, 2022 07:39
@miketsukerman miketsukerman changed the title Use python dataclasses for AST Draft: Use python dataclasses for AST Aug 20, 2022
@miketsukerman miketsukerman changed the title Draft: Use python dataclasses for AST Use python dataclasses for AST Aug 20, 2022
@miketsukerman miketsukerman marked this pull request as draft August 20, 2022 07:43
@miketsukerman miketsukerman force-pushed the use-dataclasses-for-ast branch from c870c29 to eaf983d Compare August 22, 2022 14:13
@gunnarx
Copy link
Collaborator

gunnarx commented Aug 29, 2022

@miketsukerman In addition to what we talked about (not parsing some of the tree correctly) I saw today that it is missing
Enumerations. link to spec Did you plan to add them later (they are a little tricky I know)

EDIT: Actually lots more. .... the Namespace class is missing structs, methods, events, properties, includes?

I had just assumed you would take the previous code and convert from there. Because I went through the whole specification in detail when I wrote it and made sure the "ast definition" was as close as I could manage. So I recommend reusing the "schema" and just convert it to the classes. ^^^
We might look over the optionality once more however.

It will be hard to test this before we have the AST classes close to complete, I think.

Will you do that update, please?

Pipfile Outdated
@@ -13,3 +13,7 @@ pytest = ">=2.7.2"

[requires]
python_version = "3.10"

[packages.e1839a8]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a pip expert, but previously for VSS I got the impression that it is good practice to always have a Pipfile.lock as well as the Pipfile to "lock" the build to specific versions. Or do we see that as not needed here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will bring it back later. There were issues with pipenv on Linux. During the experiments I have removed it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erikbosch I have reverted Pipenv.lock file back, please check and resolve the thread.

generated = vsc_generator._gen_with_text_template(ast, templatefile.read())
def test_gen(tmp_path):
# The files named 'input.yaml', 'template' and 'result' are in the tests directory
ast_root = vsc_parser.get_ast_from_file(os.path.join(TestPath,'input_old.yaml'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a test to support the "old" syntax (without explicit "service"). Do we intend to support both or is this just a temporary test?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are in the process of migration and I've kept both. But, in the end, only one syntax will be left.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erikbosch I have reverted most of the changes I've made with "syntax". Some minor things are still there, like input instead of in, etc., but it is not avoidable.

Copy link
Collaborator

@gunnarx gunnarx Aug 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erikbosch Honestly it's all a bit of a mess at the moment! While merging and updating some of my code on top of this I noticed we had a mix of proposed syntax changes from @miketsukerman, a few from myself, the templates, the specification, different implementations in progress, unit-tests, examples files in VSC repo - all had various levels of differences. To get a stable situation I have also rolled back a lot, to a syntax that I don't feel is the final/best one.
But I'd rather we start from there and change incrementally. From now on, let's try to make sure all PRs will modify parts in sync. (This is still Draft PR of course, but nonetheless).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gunnarx let's call it refactoring :)

@gunnarx
Copy link
Collaborator

gunnarx commented Aug 30, 2022

Yes these latest commits are similar changes I had to make…. Also, we already chatted about this but just to document: I think the updated parsing function will instantiate all parts immediately so the post init functions here won’t be necessary. I will find some time later tomorrow to compare and merge it all once again, and then we will see if it works as expected.

@miketsukerman
Copy link
Collaborator Author

Yes these latest commits are similar changes I had to make…. Also, we already chatted about this but just to document: I think the updated parsing function will instantiate all parts immediately so the post init functions here won’t be necessary. I will find some time later tomorrow to compare and merge it all once again, and then we will see if it works as expected.

yes, that would be great. If it works with templates we are good to go I guess.

@miketsukerman miketsukerman force-pushed the use-dataclasses-for-ast branch 2 times, most recently from bc59c2e to 8eeeebf Compare September 1, 2022 13:41
@miketsukerman miketsukerman marked this pull request as ready for review September 1, 2022 13:51
@miketsukerman miketsukerman force-pushed the use-dataclasses-for-ast branch 6 times, most recently from e25331c to b6fa1af Compare September 5, 2022 11:40
@gunnarx
Copy link
Collaborator

gunnarx commented Sep 6, 2022

@miketsukerman We can't work like this... There's no way to reasonably merge changes together when you have commits like this recent push. It was a problem before as well that things were not separated the right way, and the squashing together of commits is rarely the solution to bad commits - it is likely to create even worse commits.

Too many of these commits contain too many changes that are not related to each other. The effect of this is that they can't be used separately or effectively combined with other changes without lots of merge conflicts. (b6fa1af is only one example).

Basically, I wonder why you didn't continue the work on top of my changes? Do you think it's better to just do this on your own?

There are even deeper issues here that I can explain separately, but before I go too far - is all this just because you need some help to know how to separate things in git, or are you kind of just developing without regard for others?

@miketsukerman
Copy link
Collaborator Author

@miketsukerman We can't work like this... There's no way to reasonably merge changes together when you have commits like this recent push. It was a problem before as well that things were not separated the right way, and the squashing together of commits is rarely the solution to bad commits - it is likely to create even worse commits.

Too many of these commits contain too many changes that are not related to each other. The effect of this is that they can't be used separately or effectively combined with other changes without lots of merge conflicts. (b6fa1af is only one example).

Basically, I wonder why you didn't continue the work on top of my changes? Do you think it's better to just do this on your own?

There are even deeper issues here that I can explain separately, but before I go too far - is all this just because you need some help to know how to separate things in git, or are you kind of just developing without regard for others?

I am doing experiments in this MR. I think, the idea to have two MRs in parallel is not so gut. I am sorry that I have disappointed you with the way how I approach this. For my excuse I can tell that I want to explore all the things first, then I can prepare the changes in, maybe, a new clean MR.

Would be better if we will have one MR, for example, this one and you will comment on it and I will fix it simply. Rebasing for me onto your MR is difficult as well.

When we will solve all the issues, I will open one smaller MR.

@miketsukerman miketsukerman force-pushed the use-dataclasses-for-ast branch 2 times, most recently from e8c85b3 to a427070 Compare September 7, 2022 09:49
@@ -58,7 +58,7 @@ def get_template(filename):
return jinja_env.get_template(filename)

# gen() function to be called from templates
def gen(node : AST, template_file = None):
def gen(node : Namespace, template_file = None):
Copy link
Collaborator

@gunnarx gunnarx Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The idea to have the type as AST came from the original implementation where all node-types were sub-classes inheriting from the AST class. Thus the type of node "is-a" AST. But that class hierarchy does not exist anymore.

  2. Even if the top-level node will be Namespace, the purpose of the gen() function is that it can accept any node type.

So... in this version where the common type (parent class) AST does not exist, I would guess that it is just as good to not give a type hint at all.

Copy link
Collaborator Author

@miketsukerman miketsukerman Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Once you mentioned that you wanted to make everything according to the specification and in the specification, there is no root element except Namespace. No problem, I will fix it later today and bring AST back.

  2. Hm, for this purpose in python there is a generic type, something like that would be

https://docs.python.org/3/library/typing.html#typing.Generic

from dataclass import is_dataclass
from typing import TypeVar, Generic

T = TypeVar['T']

def gen(node: Generic[T], template_file = None):
    assert is_dataclass(node)

I will try this, may be a good idea to write this function like that, but check the passed types for some criteria.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gunnarx done, you can check.

Copy link
Collaborator Author

@miketsukerman miketsukerman Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, just a thought

def gen(node: Any, template_file = None):
...

would work

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The idea to have the type as AST came from the original implementation where all node-types were sub-classes inheriting from the AST class. Thus the type of node "is-a" AST. But that class hierarchy does not exist anymore.

OK... point 1 was just information, not really a change request. As I said the class hierarchy does not exist now, so therefore there is (was) no AST parent type. However 2. suggests that the gen() function should have a different type, like Any since we don't have a more specific common parent type anymore.

  1. Once you mentioned that you wanted to make everything according to the specification and in the specification, there is no root element except Namespace. No problem, I will fix it later today and bring AST back.

Yes of course I wanted the behavior to be according to specification. So I understand why that led you to put Namespace in some places, I'm just providing feedback and explanation on where I think the change makes sense and not.

Well, just a thought

def gen(node: Any, template_file = None):
...
  • Any seems correct with what we have now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK... point 1 was just information, not really a change request. As I said the class hierarchy does not exist now, so therefore there is (was) no AST parent type.

I know. I was not sure completely, waiting for your feedback.

@@ -12,14 +12,14 @@

# This performs the following related functions:

# 1. As input.yaml we expect an AST as provided by the parser module
# 1. As input.yaml we expect an Namespace as provided by the parser module
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still say that the whole thing that is provided into the generator is an Abstract Syntax Tree, even if in the current syntax definition the top level node type is Namespace.

I guess in this comment the word AST could be considered the concrete class name, but it could also be considered the "concept" of providing an AST from the parser. I would personally keep the name AST.

Copy link
Collaborator Author

@miketsukerman miketsukerman Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've returned AST node back.

@dataclass
class AST(Namespace):
    pass

@miketsukerman
Copy link
Collaborator Author

miketsukerman commented Sep 7, 2022

@gunnarx some topics to think about:

  1. Do we need code below in the vsc_parser.py? We have vscgen instead.
# If run as a script, generate a single YAML file using a single root template
if __name__ == "__main__":
    if not len(sys.argv) == 3:
        usage()
    ast = vsc_ast.read_ast_from_yaml_file(sys.argv[1])
    templatename = sys.argv[2]
    print(gen(ast, templatename))
  1. I would like to move some code from the vsc_generator.py to templates module
# Set up Jinja
jinja_env = jinja2.Environment(
        # Use the subdirectory 'templates' relative to this file's location
        loader=jinja2.FileSystemLoader(TemplatePath),

        # Templates with these extension gets automatic auto escape for HTML
        # It's more annoying for code generation, so passing empty list for now.
        autoescape=jinja2.select_autoescape([])
        )

# We want the control blocks in the template to NOT result in any extra
# white space when rendering templates. However, this might be a choice
# made by each generator, so we need to export the ability to keep these
# settings public for other code to modify them.
jinja_env.trim_blocks = True
jinja_env.lstrip_blocks = True
jinja_env.undefined = jinja2.StrictUndefined

default_templates = {}

That would make vsc_generator.py more abstract.

miketsukerman and others added 24 commits September 21, 2022 13:47
* Add python setuptools setup.py script.
* Move python code to the vsc folder.
* Move templates folder to the vsc folder.
* Install vscgen tool to simplify codegenerator use via cmd.
* Update github actions (use python 3.10 and vscgen).

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
* Implement vehicle service catalog AST using python dataclasses.
* Implement parsing of YAML vsc format into vsc AST format.

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
* Add simple tests for newly written dataclasses AST.
* Add new example YAML files input2.yaml and input3.yaml.

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
Take over typic gitignore for python projects from
https://github.com/github/gitignore/blob/main/Python.gitignore

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
All the previous code in this file, to read YAML into an AST tree, is
now superseded by vsc_ast.py.

There are a few files and scripts referring to this file so at the
moment it remains as a wrapper around vst_ast but soon it should
probably be replaced completely.
(should be no functional change)
- Methods need to be inside a namespace
- includes: should be a list
- Change one test to output name of main file instead of first namespace

See also previous comment
From what I can tell, the new parsing function instantiates all needed
fields of all objects in the AST tree (or leaves the field as
the default (= None), if something is Optional and not provided by
the input YAML.

I therefore removed all the post_init functions, here as a separate
commit in case we need to test further, rollback, or something.

unit-tests and basic template generation seem to run fine (except for
those templates that need update, but that is known)
In vscgen tool template argument becomes optional. When vscgen
<yaml.file> called it would use default templates from
vsc.templates.default_templates dictionary.

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
After short discussion we have decided to bring back the AST node.
This node is always a root, but inherits everything from the Namespace class.

And vsc_parser.gen() function now accepts object of any type. Later we
might want to introduce some checks for the passed node, like
assert dataclass.is_dataclass(node).

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
Moving all code related to YAML processing to the vsc_parser.py
from vsc_ast.py to keep in vsc_ast module only AST nodes.

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
Use Any instead of Generic[T] for gen() function.

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
Improve setup.py script to properly package and install files inside the
templates folder. Now setup.py will install the *.tpl and *.html files
into the respective python3 packages directory. This allows to use
module independently.

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
An attempt to reverse dependencies by moving all the jinja2 related code
to the vsc.templates module. This makes vsc_generator.py more abstract
and in theory removes dependency on the concrete template engine.

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
Making description field empty by default instead of None.
Removing Optional[..] from the Lists and using default_factory
assigning empty lists by default to avoid additional checks in the
templates code.

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
Bring back Optional[List[...]] and Optional[str] to some
of the fields to bring AST closer to the specification. It would
not change the runtime behaviour, lists would have default value [], etc.
Fix arraysize type in some of the classes from Optional[str]
to Optional[int].

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
Change python version from 3.10.6 to 3.10.7.

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
Comments added to the AST classes, using information from
the VSC specification and format of python docstrings.

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
@miketsukerman miketsukerman force-pushed the use-dataclasses-for-ast branch from 4f21890 to f138492 Compare September 21, 2022 11:57
I noticed some discrepancy in the readme file and rawly fixed all the
incosistencies I have been able to find out. Maybe there are bigger
changes needed, but I will leave it for the future PRs.

Signed-off-by: Mikhail Tsukerman <[email protected]>
Signed-off-by: Magnus Feuer <[email protected]>
@miketsukerman miketsukerman force-pushed the use-dataclasses-for-ast branch from f138492 to fd47e9c Compare September 21, 2022 11:59
@miketsukerman
Copy link
Collaborator Author

Meeting 19 September: OK to merge after adding appropriate license-headers and signoffs.

Updated PR description with short summary. Added license headers etc.

@gunnarx gunnarx merged commit 3fdd7e2 into COVESA:master Sep 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants