Proposal: Scanner Modularization #479
rumpelsepp started this conversation in Ideas
TL;DR: This breaks every API, so this qualifies for gallia 2.0.
Problem Statement
Startup Performance
gallia suffers from a very slow startup, which causes a bad user experience, especially on slow devices like the Raspberry Pi. The reason lies in gallia's architecture, which is strongly coupled with the CLI. The whole codebase is imported when --help is issued, resulting in a lot of syscalls and filesystem traversal.
Developer Experience
Further, the developer experience for modules that use multiple scanner classes in sequential order (e.g. a session scanner depending on the results of a discover scanner) is bad. This is also caused by the tight coupling of the CLI and the scanner modules. It is currently impossible to run a scanner from within Python code without specifying a command line invocation string.
Another problem with using scanners as modules is the creation of an event loop. .entry_point() creates its own event loop. Multiple event loops in a single CPython invocation cause odd problems due to garbage collection.
Testing
Testing scanner modules is cumbersome, since they rely on CLI arguments and untyped argparse.Namespace containers. Further, there is no definition of scanner outputs/results in the sense of a function return value.
Configuration
The configuration file gallia.toml has no type checks. A moved config value was not discovered by a tester and silently broke a test setup; see also #377.
Logging
The logging module uses exit handlers with atexit. This causes problems when multiple scanners are used and a logfile per scanner is desired.
Proposed Solution
Separating CLI and Scanners
As a first step, the scanner classes need to be separated from any code related to the CLI. Further, this code should be replaced by typed configuration containers. These containers should carry all information that is required to:
gallia.toml configuration file and benefit from the typing feature.
While at it, the source code tree under commands could perhaps be restructured, since it will be separated from the CLI.
Event Loop
Due to the event loop problem, entry_point() must be async.
Testing
Define outputs of scanners.
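To make the three points above more concrete (typed config containers, an async entry_point(), and defined scanner outputs), here is a minimal sketch. It is not taken from the gallia codebase: ScannerConfig, DiscoverResult, DiscoverScanner, SessionScanner, run_chain and all field names are hypothetical and only illustrate the shape such an API might take.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ScannerConfig:
    """Typed stand-in for argparse.Namespace; the fields are made up."""

    target: str
    timeout: float = 2.0

    def __post_init__(self) -> None:
        # Reject wrongly typed values early, e.g. ones read from a config file.
        if not isinstance(self.target, str):
            raise TypeError("target must be a str")
        if not isinstance(self.timeout, (int, float)):
            raise TypeError("timeout must be a number")


@dataclass
class DiscoverResult:
    """Explicit scanner output that a caller or a test can inspect."""

    found_ids: list[int] = field(default_factory=list)


class DiscoverScanner:
    """Hypothetical scanner without any CLI knowledge."""

    def __init__(self, config: ScannerConfig) -> None:
        self.config = config

    async def entry_point(self) -> DiscoverResult:
        # Placeholder for real I/O; it only yields to the running event loop.
        await asyncio.sleep(0)
        return DiscoverResult(found_ids=[0x10, 0x11])


class SessionScanner:
    """Hypothetical follow-up scanner that consumes discovery results."""

    def __init__(self, config: ScannerConfig, ids: list[int]) -> None:
        self.config = config
        self.ids = ids

    async def entry_point(self) -> dict[int, str]:
        await asyncio.sleep(0)
        return {i: "default" for i in self.ids}


async def run_chain(config: ScannerConfig) -> dict[int, str]:
    # Both scanners share the caller's event loop; none creates its own.
    discovered = await DiscoverScanner(config).entry_point()
    return await SessionScanner(config, discovered.found_ids).entry_point()


if __name__ == "__main__":
    print(asyncio.run(run_chain(ScannerConfig(target="demo"))))
```

Since entry_point() is a coroutine and returns a value, chaining a session scanner after a discover scanner becomes ordinary Python, and only the outermost caller decides where the single event loop lives.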
Alternatives Considered
Regarding slow tab completion: Tab completion scripts could be generated for each shell. Since we often use gallia from git and rely on the tab completion, this would create an additional maintenance cost. I assume that there won't even be any volunteers for this. ^^
Implementation Draft
Living draft; will be modified as needed
From within Python code, this could be used standalone; simply do:
Testing currently does not exist for scanners (ok, there is the vecu runner, but this only tests whether the scanners crash). There is pytest-mock to mimic the network. AFAIC this could be used somehow like this.
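As a rough illustration of the idea, the following sketch mocks the network for a scanner whose entry_point() returns a value. It uses unittest.mock from the standard library instead of pytest-mock (which builds on it), so it runs without extra dependencies. PingScanner, its transport parameter, the request() method and the byte values are assumptions made for the example, not gallia's actual API.

```python
import asyncio
from unittest.mock import AsyncMock


class PingScanner:
    """Hypothetical scanner that takes its transport as a parameter."""

    def __init__(self, transport) -> None:
        self.transport = transport

    async def entry_point(self) -> bool:
        # Send an illustrative request and report whether the reply matches.
        reply = await self.transport.request(b"\x3e\x00")
        return reply == b"\x7e\x00"


def test_ping_scanner_positive_reply() -> None:
    # AsyncMock replaces the network; no hardware or virtual ECU is needed.
    transport = AsyncMock()
    transport.request.return_value = b"\x7e\x00"

    assert asyncio.run(PingScanner(transport).entry_point()) is True
    transport.request.assert_awaited_once_with(b"\x3e\x00")


def test_ping_scanner_unexpected_reply() -> None:
    transport = AsyncMock()
    transport.request.return_value = b"\x7f\x3e\x11"

    assert asyncio.run(PingScanner(transport).entry_point()) is False
```

This only works because the scanner receives its dependencies as arguments and returns a result; with CLI-driven scanners there is nothing to assert on except "did not crash".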
Filenames
log.jsonl.zst to conform to the JSON Lines standard: https://jsonlines.org/
Open for Discussion
lookup_command_class() work without importing the whole codebase?
COMMAND, SUBCOMMAND, and so on? An idea might be an embedded class in the proposed CommandConfig -> CommandConfig.CLIConfig.
References