Why are CVE list entries not conforming to any specified schema? #110

Closed
cookiengineer opened this issue Nov 29, 2022 · 4 comments

cookiengineer commented Nov 29, 2022

(In reference to the current CVE list which is available for git clone via https://github.com/CVEProject/cvelist)

When parsing the cvelist repository, the following schema validation errors occur for hundreds of CVEs. I'm referring to CVE entries with data_version set to 4.0; entries in earlier schema versions, which are still present in the dataset to varying degrees, have the same problems.

  • impact / cvss can be either an Array or an Object
  • impact / cvss / baseScore can be either a Number or a String
  • generator can be either an Object or a String
  • credit can be either an Array of text values or a String
  • affects / vendor / vendor_data / vendor_name / product / product_data / product_name / version / version_data / version_value contains n/a, N/A, Not available, Multiple, Unspecified, Various and dozens of other spellings of "not available", some of them with HTML-encoded entities.
  • version_value contains Unicode characters (Chinese separation idioms) in the version string, so a version like 1.2.3 cannot be parsed or compared when ASCII is assumed.
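
To make the inconsistencies above concrete, here is a minimal Python sketch of how a consumer might normalize them before validation. The helper names (as_list, as_score, clean_version) and the PLACEHOLDERS set are hypothetical, the placeholder list only covers the spellings named above, and the sketch assumes the non-ASCII separators are fullwidth forms that NFKC normalization folds to ASCII, which may not hold for every entry.

```python
import html
import unicodedata

# Placeholder spellings taken from the list above; the real dataset has more.
PLACEHOLDERS = {"n/a", "not available", "multiple", "unspecified", "various"}


def as_list(value):
    """Wrap a lone object or string in a list; pass lists through unchanged."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def as_score(value):
    """Coerce a baseScore given as a number or a string to float, else None."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def clean_version(value):
    """Decode HTML entities, fold fullwidth characters to ASCII via NFKC,
    and map known placeholder spellings to None."""
    text = unicodedata.normalize("NFKC", html.unescape(str(value))).strip()
    return None if text.casefold() in PLACEHOLDERS else text
```

With these, as_list(record["impact"]["cvss"]) would always give a list to iterate over, regardless of whether the entry stored an Array or an Object.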

I opened an issue regarding the CVE v5 export (which was run once and has been incomplete ever since), and it has essentially been ignored for the past year.

  • Who is working on this?
  • How is the export done?
  • How can we fix the dataset?
  • What can the community do to improve the dataset?
  • Will pull-requests to the dataset be merged upstream? (44 Pull Requests are open and ignored right now)

I can provide a list of schema-violating CVEs, if necessary. But it's a very long list.
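
Such a list could be produced with a short script along these lines. This is only a sketch: the expected types (cvss as an Object, baseScore as a Number, generator as an Object, credit as an Array) are an assumption about what the schema intends, the CVE-*.json file pattern is an assumption about the repository layout, and the function names are hypothetical.

```python
import json
from pathlib import Path


def violations(record):
    """Yield field names whose type differs from the assumed expectation."""
    impact = record.get("impact")
    cvss = impact.get("cvss") if isinstance(impact, dict) else None
    if cvss is not None and not isinstance(cvss, dict):
        yield "impact.cvss"
    if isinstance(cvss, dict) and isinstance(cvss.get("baseScore"), str):
        yield "impact.cvss.baseScore"
    if isinstance(record.get("generator"), str):
        yield "generator"
    if isinstance(record.get("credit"), str):
        yield "credit"


def scan(root):
    """Map each offending file path under root to its violated fields."""
    report = {}
    for path in sorted(Path(root).rglob("CVE-*.json")):
        try:
            record = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # Covers unreadable files, bad encodings, and invalid JSON.
            report[str(path)] = ["unreadable"]
            continue
        if isinstance(record, dict):
            found = list(violations(record))
        else:
            found = ["not an object"]
        if found:
            report[str(path)] = found
    return report
```

Calling scan() on the path of a local clone returns a dictionary that can be dumped to a file and attached here.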

cookiengineer changed the title from "Why is nothing of the CVE list conforming to any kind of schema?" to "Why are CVE list entries not conforming to any specified schema?" on Nov 29, 2022
kurtseifried (Contributor) commented Nov 29, 2022 via email

zmanion (Contributor) commented Jul 11, 2023

Does this help?

https://www.cve.org/Media/News/item/blog/2023/03/29/CVE-Downloads-in-JSON-5-Format

The current and future format is CVE JSON 5.0. There are known conversion/maintenance problems with CVE JSON 4.0 that will not be addressed, as 4.0 will be deprecated this year.

zmanion (Contributor) commented Jul 11, 2023

> Will pull-requests to the dataset be merged upstream? (44 Pull Requests are open and ignored right now)

If these are PRs to change the content of CVE records, I believe they will not be accepted, because GitHub is no longer being used for updates.

mprpic (Collaborator) commented Apr 24, 2024

The new cvelist is available at https://github.com/CVEProject/cvelistv5 as a read-only copy. All CVE records should be updated through the CVE Services API: https://github.com/cveProject/cve-services. Closing!

mprpic closed this as not planned on Apr 24, 2024