Centralize malicious #132

Draft · wants to merge 9 commits into base: main

Contributor

@Robin5605 Robin5605 commented Jul 25, 2023

Blocked by #131
Closes #95

  • Add a new constant in constants.py that allows tweaking the score threshold
  • New endpoint: GET /scans, which returns a list of all packages scanned since `since` and a list of malicious packages.
    • A malicious package is any package with a score greater than or equal to the threshold defined in the constants

Add an inspector URL to the records in `first_safe_second_malicious.json`
that have a score greater than 0 (since packages that match rules *must*
have an inspector URL). This was done because of the conditional in L34-35 of
`src/endpoints/scans.py`:
```py
if scan.inspector_url is None:
    continue
```
The previous conditional, which checks the score, should ensure that
`inspector_url` is never actually null in a production environment, where
we get real data rather than our own test data. However,
since we use one big table for everything, most of our columns are
nullable and "runtime checked." This means that even though
`inspector_url` is never null when the score is greater than
0, the type checker doesn't know this. So this is mostly a check to
appease the type checker.
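The narrowing described above can be sketched in isolation. This is a minimal illustration, not the code from the PR: the `Scan` dataclass and the `inspector_urls` helper are hypothetical stand-ins for the real table row and endpoint logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scan:
    # Hypothetical stand-in for a row of the "one big table";
    # most columns are nullable, so they are typed as Optional.
    name: str
    version: str
    score: Optional[int] = None
    inspector_url: Optional[str] = None


def inspector_urls(scans: list[Scan]) -> list[str]:
    """Collect the inspector URLs of scans that matched at least one rule."""
    urls: list[str] = []
    for scan in scans:
        if scan.score is None or scan.score <= 0:
            continue
        # With real data this branch should never be taken, but without it
        # the type checker still sees Optional[str] here instead of str.
        if scan.inspector_url is None:
            continue
        urls.append(scan.inspector_url)
    return urls
```

The second check is also what makes test fixtures without an inspector URL silently drop out of the result, which is why the fixture had to be updated.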

Add a `score_threshold` field to the `MainframeSettings` configuration
class in `constants.py` that determines the minimum score required
for a scan to show up in the `malicious_packages` field of the `GET /scans`
response. This score determines which packages are considered "malicious."
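A rough sketch of what such a setting could look like. The real `MainframeSettings` class is not shown in this PR page, so this uses a plain frozen dataclass as a stand-in; the default value of 5, the `from_env` helper, and the `SCORE_THRESHOLD` variable name are all assumptions made for illustration.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class MainframeSettings:
    # Hypothetical default; the PR does not state the real value.
    score_threshold: int = 5

    @classmethod
    def from_env(cls, environ: Mapping[str, str] = os.environ) -> "MainframeSettings":
        """Build settings, letting an environment variable override the default."""
        raw = environ.get("SCORE_THRESHOLD")  # hypothetical variable name
        if raw is None:
            return cls()
        return cls(score_threshold=int(raw))
```

Keeping the threshold in one settings object means every consumer reads the same definition of "malicious" instead of hard-coding its own cutoff.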

Add a new `GET /scans` endpoint under `src/mainframe/endpoints/scans.py`.
This endpoint takes one query string parameter, `since`, which is a
UNIX epoch timestamp. It returns two fields: `all_scans` and
`malicious_packages`. `all_scans` returns the package name and version
of all packages that were scanned since `since`, while
`malicious_packages` returns a list of packages that have a score at or
above the set `score_threshold`.
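The response shape described in this commit message can be sketched as a pure function. This is an assumption-laden illustration rather than the endpoint itself: `Scan`, its `finished_at` field, `build_scans_response`, and the exact keys inside each list entry are hypothetical, and the comparison uses `>=`, following the PR description's "greater than or equal to".

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Scan:
    # Hypothetical row model; field names are assumptions.
    name: str
    version: str
    finished_at: datetime  # timezone-aware
    score: Optional[int] = None
    inspector_url: Optional[str] = None


def build_scans_response(scans: list[Scan], since: int, score_threshold: int) -> dict:
    """Split scans finished at or after `since` into the two response fields."""
    cutoff = datetime.fromtimestamp(since, tz=timezone.utc)
    all_scans: list[dict] = []
    malicious_packages: list[dict] = []
    for scan in scans:
        if scan.finished_at < cutoff:
            continue
        all_scans.append({"name": scan.name, "version": scan.version})
        if scan.score is None or scan.score < score_threshold:
            continue
        if scan.inspector_url is None:  # narrows the type; see the PR description
            continue
        malicious_packages.append(
            {
                "name": scan.name,
                "version": scan.version,
                "score": scan.score,
                "inspector_url": scan.inspector_url,
            }
        )
    return {"all_scans": all_scans, "malicious_packages": malicious_packages}
```

In the real service the filtering by `since` would more likely happen in the database query; doing it in Python here just keeps the sketch self-contained.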
@import-pandas-as-numpy
Member

I thought we had a /scans since endpoint already? I feel like I've seen it in the logs. Unless you moved things around.

@Robin5605
Contributor Author

Yeah, that's what this PR originally added. I wanted to see if a websocket would be feasible, but I don't think it is anymore. We'd have to handle things like CD redeployments, disconnects, etc. Stateless HTTP might just be better.
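For comparison with the websocket idea, a stateless client only has to remember its own `since` cursor between requests, so redeployments and disconnects need no special handling. A minimal sketch, assuming a JSON response and a hypothetical base URL; `scans_url`, `poll_once`, and the injectable `fetch` callable are made up for illustration.

```python
import json
import time
from typing import Callable, Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def scans_url(base_url: str, since: int) -> str:
    """Build the GET /scans URL for a given `since` timestamp."""
    return f"{base_url.rstrip('/')}/scans?{urlencode({'since': since})}"


def _http_get_json(url: str) -> dict:
    with urlopen(url) as response:
        return json.load(response)


def poll_once(
    base_url: str,
    since: int,
    fetch: Optional[Callable[[str], dict]] = None,
) -> tuple[dict, int]:
    """Fetch scans newer than `since`; return the payload and the next cursor."""
    # Record the time before the request so nothing scanned mid-request is skipped.
    next_since = int(time.time())
    payload = (fetch or _http_get_json)(scans_url(base_url, since))
    return payload, next_since
```

A caller would loop over `poll_once`, feeding each returned cursor into the next call; scans that land between the timestamp and the response may be seen twice, so consumers should tolerate duplicates.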

@Robin5605 Robin5605 requested review from a team as code owners April 5, 2024 14:43
@shenanigansd shenanigansd marked this pull request as draft May 30, 2024 01:09
Development

Successfully merging this pull request may close these issues.

Centralize what flags as "malicious"
2 participants