Roadmap - Documentation #3

Open
rothoma2 opened this issue May 27, 2024 · 0 comments
rothoma2 commented May 27, 2024

This issue documents the vision for the roadmap and how the pieces fit together.

Foundational Datasets

At the very base of the roadmap is a set of datasets collected from different sources. These datasets of fresh, regularly updated samples of different types of data (malware, phishing emails, malicious domains, malicious URLs, etc.) are used to train, continuously evaluate, and retrain ML models. It is necessary to write and maintain automation scripts to harvest and collect these datasets. The language for this tooling is Python.
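As a rough illustration of what one of these collection scripts might do, here is a minimal sketch that merges newly harvested samples into an existing dataset and skips duplicates by content hash. The `Sample` record, its field names, and the `make_sample` and `merge_samples` helpers are hypothetical and not part of the project; fetching from real sources is left out.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Sample:
    """Hypothetical record for one collected sample."""
    kind: str          # e.g. "phishing_email", "malicious_url"
    source: str        # where the sample was harvested from
    content: str       # raw sample content (text form, for this sketch)
    sha256: str        # content hash, used for deduplication
    collected_at: str  # ISO 8601 timestamp in UTC


def make_sample(kind, source, content, now=None):
    """Build a Sample, hashing the content and stamping the collection time."""
    now = now or datetime.now(timezone.utc)
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return Sample(kind, source, content, digest, now.isoformat())


def merge_samples(existing, incoming):
    """Return existing samples plus incoming ones not yet seen.

    Two samples count as duplicates when they share kind and content hash,
    regardless of which source reported them.
    """
    seen = {(s.kind, s.sha256) for s in existing}
    merged = list(existing)
    for sample in incoming:
        key = (sample.kind, sample.sha256)
        if key not in seen:
            seen.add(key)
            merged.append(sample)
    return merged
```

Keying on the content hash rather than the source means the same URL or email reported by two feeds is stored once, which keeps retraining data from being skewed by popular samples.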

Foundational Models

A set of base ML models of the kind used in the industry. There is a lot of research on the features and success of these models. We will maintain the scripts to train the models and publish both the training tooling and the trained models. Over time we could approach researchers to see if they want to do research with us or improve one of our models. If a model drops below a certain performance threshold, one idea would be to create an open Kaggle competition to get data scientists to compete and bring the model back above an acceptable threshold.
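The threshold idea could be checked automatically each time a model is re-evaluated on fresh samples. The sketch below is only an assumption about how that might look: it uses plain accuracy as the metric and a made-up default threshold of 0.95, neither of which is specified in this roadmap, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def needs_attention(y_true, y_pred, threshold=0.95):
    """Return True when the model scores below the acceptable threshold.

    The 0.95 default is an illustrative assumption, not a project decision.
    """
    return accuracy(y_true, y_pred) < threshold
```

A scheduled job could call `needs_attention` after each evaluation run and open an issue, or kick off the competition process, when it returns `True`.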

Tools

Some good security tools, built either to remove unneeded functionality or to add functionality that aids prevention or detection. The language of choice is Python.

Threat Feeds

Another valuable service is an aggregated IOC (indicator of compromise) feed for cybersecurity, with entries that are valuable to add to other platforms such as DNS or firewalls.
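To make the aggregation step concrete, here is a minimal sketch that merges several plain-text domain feeds into one deduplicated, sorted list. The input format (one domain per line, `#` starting a comment) and the function names are assumptions for illustration; real feeds vary, and URL indicators would need different parsing because `#` can appear inside a URL.

```python
def parse_feed(text):
    """Extract normalized domain indicators from one plain-text feed."""
    indicators = set()
    for line in text.splitlines():
        # Drop trailing comments, surrounding whitespace, and letter case.
        entry = line.split("#", 1)[0].strip().lower()
        # Treat "evil.test." and "evil.test" as the same domain.
        entry = entry.rstrip(".")
        if entry:
            indicators.add(entry)
    return indicators


def aggregate(feeds):
    """Union all feeds and return a sorted, deduplicated indicator list."""
    merged = set()
    for text in feeds:
        merged |= parse_feed(text)
    return sorted(merged)
```

For example, with two small made-up feeds:

```python
feeds = [
    "# feed A\nBad.Example\nevil.test.\n",
    "evil.test\n\nphish.example # seen 2024\n",
]
print(aggregate(feeds))  # ['bad.example', 'evil.test', 'phish.example']
```

A sorted, one-entry-per-line output is easy to diff between publications and simple for DNS or firewall blocklists to consume.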

OSS Modules

Once we have foundational datasets and models that are well maintained and frequently published, we can approach certain open source projects and present to their maintainers our intention to write plugins/modules that expand their functionality by leveraging inference from the ML models. These modules will likely require some work and will have to be written in C. At that point, hundreds or thousands of open source users could enable these modules in their systems and benefit from the work of the alliance.

Projects
Status: In Progress