Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Cleanup import statements #870

Open
mgovers opened this issue Jan 14, 2025 · 4 comments
Open

[FEATURE] Cleanup import statements #870

mgovers opened this issue Jan 14, 2025 · 4 comments
Assignees
Labels
feature New feature or request good first issue Indicates a good issue for first-time contributors

Comments

@mgovers
Copy link
Member

mgovers commented Jan 14, 2025

The module graph for the Python wrapper is fairly convoluted (see https://power-grid-model.readthedocs.io/en/stable/advanced_documentation/python-wrapper-design.html). This is a direct consequence of the fact that many

Background

Definitions

We make the following distinctions:

  • modules v.s. symbols:
    • modules: .py files (or, more correctly, anything that can be imported from)
    • symbols: anything that exists within a module: classes, functions, variables, imported modules, etc. within a file (that means, anything that can marked import from a module)
  • 'public' v.s. 'protected' v.s. 'private':
    • public: any module or symbol that is publicly available (appears in API)
    • protected: anything that is meant for internal usage, and that users should not directly use (prefixed _)
    • private: anything that is restricted to a module or class that outside users should not even have knowledge of (prefixed __; double underscore)

In the PGM, 'protected' and 'private' are used interchangably, as we don't really have actual symbols marked strictly private with double underscore.

Conventions on definitions

In addition, we have the following conventions:

  • 'public' functionality is considered stable and changes in name or behavior are considered breaking.
    • we intend to not do any breaking changes without a major version bump
      • therefore any such change within the same major version is either a bug or a bug fix
      • we will always communicate such changes
  • 'protected' or 'private' functionality is considered unstable and the behavior and names may change over time.
    • Using any such functionality is considered danger mode and the user is responsible for migrating and handling bugs that arise from it
  • any public symbol that lives in a private module (including public submodules of a private module) are considered private from the level of the private module

Note: in Python, nothing is truly private and anything can technically be publicly available if you really want it. However, there is an intentionality argument that says that we can treat protected and private symbols as such. Breaking with the intentionality of the functionality is danger mode.

Conventions on changes

  • Keep existing public symbols public, always; it's a one-way path
  • Keep private symbols private
  • Make new imports private as much as possible (rename if needed)
  • If you create a new module, either place it in the private _core module and/or make it private at least until it's stable
  • If you create a new symbol within a module, make it private at least until it's stable
  • If you have to make new public functionality, make sure to double-check; it's a one-way path
    • Make sure that the docstring is up-to-date
    • Make sure that the symbol is present in the API reference in the docs

Conventions in code

To make sure we maintain the intentionality across the code, we do the following tricks in the code base:

# safe importing modules
import module_a   # exposes publicly; only do this you know what you're doing
import _module_b  # reuse internally

# safe importing modules under a different name
import _module_b as _mod_b    # reuse internally
import module_a as _mod_a     # reuse internally under a new name

# unsafe importing modules; DON'T DO THIS, EVER!!!
import module_a as mod_a      # exposes as a new public symbol; weird
import _module_b as module_b  # exposes as a new public symbol; very dangerous

# unsafe reusing. BEWARE!!! this is a one-way path
from module_a import func_c    # exposes a public symbol from another module; usually bad practice
                               # because users may import from multiple files
from _module_b import func_d   # exposes a function from a private module publicly;
                               # useful when you want to prevent breaking changes now and in the future;
                               # only do this if if you know what you're doing

# unsafe reusing with renaming. BEWARE!!! this is a one-way path
from module_a import func_c as func_e   # expose a public symbol from another module under a different name;
                                        # why ever do this???
from _module_b import func_d as func_f  # expose a public symbol from a private module;
                                        # useful if the name in the private module does not suit this module

# unsafe exposing private functionality; DON'T DO THIS, EVER!!!
from module_a import _func_g as func_g   # expose a private symbol from another module
from _module_b import _func_h as func_h  # expose a public symbol from anot

# safe reusing and good practice; rename new private symbols as you see fit
from module_a import func_c as _func_g   # reuse a public symbol from another module;
from _module_b import func_d as _func_h  # reuse a public symbol from another private module;

# safe reusing but usually bad practice (unless in tests), but not harmful; change as you see fit
from module_a import _func_e              # reuse a private symbol from another module;
from _module_b import _func_f             # reuse a private symbol from another private module;
from module_a import _func_e as _func_g   # reuse a private symbol from another module;
from _module_b import _func_f as _func_h  # reuse a private symbol from another private module;

TODO

Try to find a way to make the module graph less convoluted by changing expositions. Make sure to follow the above conventions.

Feel free to create new private modules and private functions where necessary.

Relates to #869 .

@mgovers mgovers added the feature New feature or request label Jan 14, 2025
@mgovers mgovers added the good first issue Indicates a good issue for first-time contributors label Jan 14, 2025
@emabre
Copy link
Contributor

emabre commented Feb 6, 2025

Hi @mgovers,
This one seems to be still unassigned, so I could think of picking it up, unless you need it very quickly (when would you need it?), or you have other plans for this.

I also have a couple of technical questions:

  • there are cases of public symbols whose locations are not documented in the API. I guess that even in that case they should stay public at any cost, right? For example, ValidationException is not listed here https://power-grid-model.readthedocs.io/en/stable/api_reference/python-api-reference.html as part of the validation module, even though the code python -c 'from power_grid_model.validation import ValidationException' does not error out.
  • similarly, there are modules that import lots of symbols as public, but those symbols are importable by the user in other ways, too. Even in that case those symbols should remain public and importable from all the locations where they currently use to, right? As an example, consider that this code works python -c 'from power_grid_model.validation.assertions import CalculationType', but a user could also import it as python -c 'from power_grid_model.enum import CalculationType'.

Anyways, I don't think that changing the expositions of the symbols in the examples above from public to private would directly simplify the package diagram. The reason why I ask is just to have a basis to start some more deep reasoning.
Thanks

@mgovers
Copy link
Member Author

mgovers commented Feb 7, 2025

Hi @emabre ,

Thank you for contacting us again! This definitely is still available. There's no time pressure on this, so feel free to pick this up. Can you assign yourself to this issue?

On the technical questions:

  • there are cases of public symbols whose locations are not documented in the API. I guess that even in that case they should stay public at any cost, right? For example, ValidationException is not listed here https://power-grid-model.readthedocs.io/en/stable/api_reference/python-api-reference.html as part of the validation module, even though the code python -c 'from power_grid_model.validation import ValidationException' does not error out.

Oh that's not good. They should indeed be listed in the API reference and should definitely remain public. Can you please try to make an exhaustive list of things that are publicly available that are missing from documentation? We can then go over it on a case-by-case basis. You can do this in a separate PR to isolate the changes. Good find!

  • similarly, there are modules that import lots of symbols as public, but those symbols are importable by the user in other ways, too. Even in that case those symbols should remain public and importable from all the locations where they currently use to, right? As an example, consider that this code works python -c 'from power_grid_model.validation.assertions import CalculationType', but a user could also import it as python -c 'from power_grid_model.enum import CalculationType'.

In short, I believe that only things that are either input types or return types to public functions should remain public. The rest can become private. I think we can do another PR like #787 if needed (see also this announcement).

Anyways, I don't think that changing the expositions of the symbols in the examples above from public to private would directly simplify the package diagram. The reason why I ask is just to have a basis to start some more deep reasoning.

Changing exposition in itself doesn't necessarily improve the dependency graph, but it is an enabler for it. There are 2 ways you can go about this:

  • first change the exposition and then clean up
  • or the other way around: first clean the graph up and use temporary exposition imports for backwards compatibility, and then at a later stage fix the backwards compatibility

Cleaning the graph would e.g. amount to moving functions and class declarations to other modules and create another module that acts as an exposition. The import statement can then change to the exposition module. E.g. the following change can happen naturally (where D is e.g. the _core module and D/E is e.g. _core/data_handling):

A -- B --- C                   A -- B ---- C
| \   \   /  \                 | \   \      \
|  \-- D/E   |       ===>      |  \-- D/E -- D/C2
|          \ |                 |        \     \
 \-----------D/F               \-------------- D/F

Most of all, we just want to get rid of the links back from _core to the main power_grid_model module (D/E -> C in the above). Dependencies should go one way, not bidirectionally. I think if you can achieve that, then that would already be a great feat in itself. After that, we can always re-evaluate and decide if we want to go even further.

Hope this helps.

Martijn

@emabre
Copy link
Contributor

emabre commented Feb 7, 2025

Hi @mgovers , thanks a lot for your explanation.
Unfortunately, it seems that I lack the necessary permissions to assign myself to this issue, as I see no clickable button near the assignee's tab on the right.
Thanks again,
Ema

@mgovers
Copy link
Member Author

mgovers commented Feb 7, 2025

Hi @mgovers , thanks a lot for your explanation. Unfortunately, it seems that I lack the necessary permissions to assign myself to this issue, as I see no clickable button near the assignee's tab on the right. Thanks again, Ema

Apologies, I didn't realize that that requires additional permissions. Done. Enjoy!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request good first issue Indicates a good issue for first-time contributors
Projects
Status: No status
Development

No branches or pull requests

2 participants