Why are CVE list entries not conforming to any specified schema? #110
Comments
Agreed. I wanted to check and enforce schemas for data the global security database pulls in, but I can't for the CVE cvelist data, as there are just too many broken entries. My suggestion, and original intent with the standard, was to have a core required set of data that had to conform, with everything else optional (like HTTP headers), but a lot of core stuff is broken in cvelist.
-Kurt
On Nov 28, 2022, at 11:10 PM, Cookie Engineer ***@***.***> wrote:
(In reference to the current CVE list which is available for git clone via https://github.com/CVEProject/cvelist)
When parsing the cvelist repository, the following schema validation errors happen for hundreds of CVEs:
impact / cvss can be either an Array or an Object
impact / cvss / baseScore can be either a Number or a String
generator can be either an Object or a String
credit can be either an Array of text values or a String
affects / vendor / vendor_data / vendor_name / product / product_data / product_name / version / version_data / version_value contains n/a, N/A, Not available, Multiple, Unspecified, Various and dozens of other variants of text meaning "not available", including HTML entities in encoded form.
version_value contains Unicode characters (Chinese separator punctuation) in the version string. So 1.2.3 becomes unparseable and incomparable when ASCII is assumed.
I opened up an issue in regards to the CVE v5 export (which was done one time, and has been incomplete ever since), and it was basically ignored for the last year.
Who is working on this?
How is the export done?
How can we fix the dataset?
What can the community do to improve the dataset?
Will pull-requests to the dataset be merged upstream? (44 Pull Requests are open and ignored right now)
I can provide a list of schema-violating CVEs, if necessary. But it's a very long list.
Does this help? https://www.cve.org/Media/News/item/blog/2023/03/29/CVE-Downloads-in-JSON-5-Format CVE JSON 5.0 is the current and future format. There are known conversion/maintenance problems with CVE JSON 4.0 that will not be addressed, as 4.0 will be deprecated this year.
If these are PRs to change the content of CVE records, I believe they will not be accepted. GitHub is no longer being used for updates.
The new cvelist is available at https://github.com/CVEProject/cvelistv5 as a read-only copy. All CVE records should be updated through the CVE Services API: https://github.com/cveProject/cve-services. Closing!
(In reference to the current CVE list which is available for git clone via https://github.com/CVEProject/cvelist)
When parsing the cvelist repository, the following schema validation errors happen for hundreds of CVEs. In my case, I'm talking about CVE entries with the data_version set to 4.0 (not the previous schema variants, which are still in varying degrees in the dataset, and have the same problems).
impact / cvss can be either an Array or an Object
impact / cvss / baseScore can be either a Number or a String
generator can be either an Object or a String
credit can be either an Array of text values or a String
affects / vendor / vendor_data / vendor_name / product / product_data / product_name / version / version_data / version_value contains n/a, N/A, Not available, Multiple, Unspecified, Various and dozens of other variants of text meaning "not available", including HTML entities in encoded form.
version_value contains Unicode characters (Chinese separator punctuation) in the version string. So 1.2.3 becomes unparseable and incomparable when ASCII is assumed.
I opened up an issue in regards to the CVE v5 export (which was done one time, and has been incomplete ever since), and it was basically ignored for the last year.
I can provide a list of schema-violating CVEs, if necessary. But it's a very long list.
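For anyone who has to consume the 4.0 data in the meantime, a tolerant reader can coerce the inconsistent shapes listed above into one predictable form. The following is only a rough sketch, not an official tool: the function names, the `raw` key used to wrap a string generator, the output dictionary layout, and the short placeholder list are all assumptions made for illustration, and the guess that the problematic separators are full-width or ideographic full stops may not cover every case in the dataset.

```python
import html
import unicodedata

# Placeholder spellings named in the issue; the real dataset has many more.
PLACEHOLDERS = {"n/a", "not available", "multiple", "unspecified", "various"}


def as_list(value):
    """Wrap a lone object or string in a list; pass lists through unchanged."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def as_score(value):
    """Return baseScore as a float, or None if it cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def clean_version(value):
    """Decode HTML entities, fold full-width characters, drop placeholders."""
    if not isinstance(value, str):
        return None
    text = unicodedata.normalize("NFKC", html.unescape(value)).strip()
    # NFKC does not map the ideographic full stop (U+3002) to ".", so do it here.
    text = text.replace("\u3002", ".")
    if not text or text.lower() in PLACEHOLDERS:
        return None
    return text


def normalize(record):
    """Pull a few fields out of a 4.0 record in one consistent shape."""
    impact = record.get("impact")
    cvss_items = as_list(impact.get("cvss")) if isinstance(impact, dict) else []
    scores = [as_score(c.get("baseScore")) for c in cvss_items if isinstance(c, dict)]

    generator = record.get("generator")
    if isinstance(generator, str):
        generator = {"raw": generator}  # hypothetical wrapper key

    versions = []
    affects = record.get("affects") or {}
    # Assumes the intermediate containers are objects when present.
    for vendor in as_list(affects.get("vendor", {}).get("vendor_data")):
        for product in as_list(vendor.get("product", {}).get("product_data")):
            for entry in as_list(product.get("version", {}).get("version_data")):
                versions.append(clean_version(entry.get("version_value")))

    return {
        "scores": scores,
        "generator": generator,
        "credit": as_list(record.get("credit")),
        "versions": versions,
    }
```

Calling `normalize(json.load(f))` on each file in a cvelist checkout would give downstream code a single shape to work with, with `None` marking values that could not be interpreted. It does not fix the dataset itself.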