
Restructuring the Software class hierarchy #596

Open
3 of 16 tasks
ajnelson-nist opened this issue Mar 6, 2024 · 18 comments · May be fixed by #598 or #597

Comments

@ajnelson-nist
Contributor

ajnelson-nist commented Mar 6, 2024

Disclaimer

Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.

Background

UCO Issue 583 proposed several revisions around representations pertaining to software and its configuration.

This Issue focuses on one set of changes pertaining to a restructure of the class hierarchy pertaining to software, so some changes from 583 can be discussed and implemented for UCO 1.4.0.

Requirements

Requirement 583-2

This requirement is ported from Issue 583:

Ability to characterize different types of software objects
At a minimum this should include Software, Code, Application, Script, Library, Package, Process, Compiler, BuildUtility, SoftwareBuild, OperatingSystem, and ServicePack.

Risk / Benefit analysis

Benefits

This benefit is ported from Issue 583:

  • Clarity and consistency of different forms of software observable objects

Risks

These risks are in addition to those listed on Issue 583.

  • The rdfs:comment definition of ProcessThread does not seem entirely coherent with ProcessThread being a subclass of Process. This software overhaul provides an opportunity for clarification.
  • It is unclear whether any observable:Software subclasses should be considered disjoint with other observable:Software subclasses. The rich and adaptive behavioral nature of software might make it impractical to designate any of these classes disjoint.
  • The new class observable:Package has some usage modes where it is an observable:File and where it is not. Take for example the wheel distribution (URL ending .whl) of case-utils, as listed here. The .whl file that was prepared for upload could be considered both an observable:Package, because it is an installable artifact, and observable:File, because it's a file on the build system's file system. However, the object on PyPI might not be classifiable as an observable:File.
    • The file-or-not point is a point for debate somewhat out of scope of this proposal. UCO models File as a subclass of FileSystemObject. PyPI, and other package management ecosystems, might not store blobs like this as files. They're free to store the backing contents of these URLs as blobs in relational database tables or NoSQL stores if they wanted to. But this is generally invisible to the package consumer.

Competencies demonstrated

(For the sake of discussion, these examples avoid the UCO rule ending IRIs with UUIDs.)

Competency 1

On a laptop, a directory contains a lone, regular file that contains Python code.

#!/usr/bin/env python3
print("Hello, world!")

The SHA3-256 hash of this file's contents is 496e34e7fe23cf69f078cd1fe860b98b2e91101194773b2f144656c0bab877c3.
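A digest of this kind can be computed with Python's standard `hashlib` module. This is a minimal sketch: the helper name `sha3_256_hex` is hypothetical, and the byte string below is only an assumed rendering of the file. The exact bytes hashed (for example, whether the file ends with a trailing newline) determine the digest, so the sketch does not claim to reproduce the value quoted above.

```python
import hashlib


def sha3_256_hex(data: bytes) -> str:
    """Return the SHA3-256 digest of data as lowercase hexadecimal."""
    return hashlib.sha3_256(data).hexdigest()


# Assumed file contents; the real file's bytes may differ (e.g. line endings).
script_bytes = b'#!/usr/bin/env python3\nprint("Hello, world!")\n'
digest = sha3_256_hex(script_bytes)
print(digest)
```

The resulting hexadecimal string is the kind of value that would be recorded under types:hashValue with the xsd:hexBinary datatype.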

The snippet below characterizes this Python file with concepts predating this restructuring proposal: there is a File; separately there is a ContentData; and last there is a Relationship stating that the File contains that ContentData, for all times that the Relationship holds. (Let's assume the Relationship still holds.)

Note: This demonstration purposefully avoids attaching a ContentDataFacet directly to the File.

kb:File-1
	a
		observable:File ,
		observable:Script
		;
	core:hasFacet kb:FileFacet-2 ;
	.
kb:FileFacet-2
	a observable:FileFacet ;
	observable:fileName "hello.py" ;
	.
kb:ContentData-3
	a observable:ContentData ;
	core:hasFacet kb:ContentDataFacet-4 ;
	.
kb:ContentDataFacet-4
	a observable:ContentDataFacet ;
	types:hash kb:Hash-5 ;
	.
kb:Hash-5
	a types:Hash ;
	types:hashMethod "SHA3-256"^^vocabulary:HashNameVocab ;
	types:hashValue "496e34e7fe23cf69f078cd1fe860b98b2e91101194773b2f144656c0bab877c3"^^xsd:hexBinary ;
	.
kb:Relationship-6
	a observable:ObservableRelationship ;
	core:isDirectional true ;
	core:kindOfRelationship "Contained_Within" ;
	core:source kb:ContentData-3 ;
	core:target kb:File-1 ;
	.

Competency Question 1.1

Which objects, between the File, ContentData and ObservableRelationship, are classified as, or constitute, the following?

  • observable:Application
  • observable:Code
  • observable:Script

Result 1.1

TODO

Competency 2

An Ubuntu server runs a service called mywebapp. Running the command service mywebapp status reports three tasks associated with the service. The primary task has PID 10001, and two other worker tasks have PIDs 10002 and 10003. A graph containing these objects contains at least the following:

kb:Process-10001
	a
		observable:LinuxService ,
		observable:LinuxTask
		;
	core:hasFacet kb:ProcessFacet-1 ;
	.
kb:ProcessFacet-1
	a observable:ProcessFacet ;
	observable:pid 10001 ;
	.

kb:Process-10002
	a observable:LinuxTask ;
	core:hasFacet kb:ProcessFacet-2 ;
	.
kb:ProcessFacet-2
	a observable:ProcessFacet ;
	observable:parent kb:Process-10001 ;
	observable:pid 10002 ;
	.

kb:Process-10003
	a observable:LinuxTask ;
	core:hasFacet kb:ProcessFacet-3 ;
	.
kb:ProcessFacet-3
	a observable:ProcessFacet ;
	observable:parent kb:Process-10001 ;
	observable:pid 10003 ;
	.

(NOTE: observable:parent might require a revision to its modeling, due to the potential for processes to become daemons, orphans, zombies - each of which severs the original parent link. The community should consider this an invitation to propose updating practices pertaining to observable:parent, and whether deprecation is appropriate.)

Competency Question 2.1

Which objects are classified as observable:Tasks?

SELECT ?nTask
WHERE {
  ?nTask a/rdfs:subClassOf* observable:Task ;
}

Result 2.1

  • kb:Process-10001
  • kb:Process-10002
  • kb:Process-10003

Competency Question 2.2

Which objects are classified as observable:Services?

SELECT ?nService
WHERE {
  ?nService a/rdfs:subClassOf* observable:Service ;
}

Result 2.2

  • kb:Process-10001
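The `a/rdfs:subClassOf*` property path used in Competency Questions 2.1 and 2.2 can be emulated in plain Python to check Results 2.1 and 2.2. This is only a sketch over hand-copied data, not a replacement for running the SPARQL queries. The LinuxService-to-Service edge is stated in this proposal; the LinuxTask-to-Task edge is an assumption made here for illustration.

```python
# Assumed subclass edges (child -> parents). Only LinuxService -> Service is
# stated in this proposal; the LinuxTask -> Task edge is an assumption.
SUBCLASS_OF = {
    "observable:LinuxService": {"observable:Service"},
    "observable:LinuxTask": {"observable:Task"},
}

# rdf:type statements copied from the Competency 2 graph.
TYPES = {
    "kb:Process-10001": {"observable:LinuxService", "observable:LinuxTask"},
    "kb:Process-10002": {"observable:LinuxTask"},
    "kb:Process-10003": {"observable:LinuxTask"},
}


def superclasses(cls):
    """Return cls plus every class reachable via zero or more subclass edges."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def instances_of(target):
    """Emulate `?n a/rdfs:subClassOf* target` over the dictionaries above."""
    return sorted(
        node
        for node, classes in TYPES.items()
        if any(target in superclasses(c) for c in classes)
    )


print(instances_of("observable:Task"))
print(instances_of("observable:Service"))
```

Under those assumptions, the first call lists all three processes and the second lists only kb:Process-10001.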

Competency Question 2.3

Which processes are currently, or were previously, non-primary tasks for the service kb:Process-10001? If a process was previously a task, when is the relationship known to have ended?

Note this requires terminable parent-child relationship objects; and also, this example applies a custom string for core:kindOfRelationship. (Another proposal about strongly-typed ObservableRelationships linking child processes to parents would complement this example well.)

SELECT ?nTask ?lEndTime
WHERE {
  ?nRelationship
    core:kindOfRelationship "Child_Process_Of_Process" ;
    core:source ?nTask ;
    core:target kb:Process-10001 ;
    .
  OPTIONAL {
    ?nRelationship
      core:endTime ?lEndTime ;
      .
  }
}

Result 2.3

Assume that the example is modified to remove these statements (which removes reliance on the mutable observable:parent property) ...

kb:ProcessFacet-2 observable:parent kb:Process-10001 .
kb:ProcessFacet-3 observable:parent kb:Process-10001 .

... and to add these instead:

kb:Relationship-10002-10001
	a observable:ObservableRelationship ;
	core:isDirectional true ;
	core:kindOfRelationship "Child_Process_Of_Process" ;
	core:source kb:Process-10002 ;
	core:target kb:Process-10001 ;
	.
kb:Relationship-10003-10001
	a observable:ObservableRelationship ;
	core:isDirectional true ;
	core:kindOfRelationship "Child_Process_Of_Process" ;
	core:source kb:Process-10003 ;
	core:target kb:Process-10001 ;
	.

To motivate modeling terminable relationships, consider this extra example data, which includes a representation that some process was spawned and became detached from the website service:

kb:Process-1
	a observable:Process ;
	core:description "/sbin/init" ;
	.
kb:Process-10987
	a observable:Process ;
	.

kb:Relationship-10987-10001
	a observable:ObservableRelationship ;
	core:isDirectional true ;
	core:kindOfRelationship "Child_Process_Of_Process" ;
	core:source kb:Process-10987 ;
	core:target kb:Process-10001 ;
	core:endTime "2023-12-25T08:14:15.9Z"^^xsd:dateTime ;
	.
kb:Relationship-10987-1
	a observable:ObservableRelationship ;
	core:isDirectional true ;
	core:kindOfRelationship "Child_Process_Of_Process" ;
	core:source kb:Process-10987 ;
	core:target kb:Process-1 ;
	core:startTime "2023-12-25T08:14:15.9Z"^^xsd:dateTime ;
	.

Then, the query portion pertaining to detached processes would show a process that left home for the holiday:

| ?nTask | ?lEndTime |
| --- | --- |
| kb:Process-10002 | |
| kb:Process-10003 | |
| kb:Process-10987 | 2023-12-25T08:14:15.9Z |
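The behavior of the OPTIONAL clause in the query for Competency Question 2.3 can be sketched in plain Python. The `Relationship` tuple and the `child_tasks` helper are hypothetical names for illustration; the data is hand-copied from the relationship objects above, and a missing core:endTime is represented as `None`.

```python
from typing import NamedTuple, Optional


class Relationship(NamedTuple):
    """Hypothetical stand-in for an observable:ObservableRelationship node."""
    kind: str
    source: str
    target: str
    end_time: Optional[str] = None


KIND = "Child_Process_Of_Process"
RELATIONSHIPS = [
    Relationship(KIND, "kb:Process-10002", "kb:Process-10001"),
    Relationship(KIND, "kb:Process-10003", "kb:Process-10001"),
    Relationship(KIND, "kb:Process-10987", "kb:Process-10001",
                 "2023-12-25T08:14:15.9Z"),
    Relationship(KIND, "kb:Process-10987", "kb:Process-1"),
]


def child_tasks(service):
    """Return (task, end_time) rows; end_time stays None when unbound."""
    rows = [
        (r.source, r.end_time)
        for r in RELATIONSHIPS
        if r.kind == KIND and r.target == service
    ]
    return sorted(rows, key=lambda row: row[0])


for task, end_time in child_tasks("kb:Process-10001"):
    print(task, end_time or "")
```

The relationship from kb:Process-10987 to kb:Process-1 is not returned, because its target is not the queried service.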

Solution suggestion

This diagram is ported from Issue 583's solution suggestion:

(Diagram: Semantically Structuring Software ObservableObjects)

Since the initial implementation sketch of Issue 583, the following changes have been made:

  • observable:LinuxService, a subclass of observable:Service and sibling to observable:WindowsService, has been added.
  • Issue 583 included some subclass rearrangement that would not be considered a backwards-compatible change. For existing classes that will change their position in the subclass hierarchy, shapes are added for UCO 1.4.0, to warn users their current instances should be multi-typed to line up with what will be the parents in UCO 2.0.0.

Coordination

  • Tracking in Jira ticket OCUCO-312
  • Administrative review completed, proposal announced to Ontology Committees (OCs) on 2024-03-05
  • Requirements to be discussed in OC meeting, 2024-05-30 (rescheduled from Mar. 14)
  • Requirements to be discussed in OC meeting, TBD
  • Requirements Review vote has not occurred
  • Requirements development phase completed.
  • Solution announced to OCs on TODO-date
  • Solutions Approval to be discussed in OC meeting, date TBD
  • Solutions Approval vote has not occurred
  • Solutions development phase completed.
  • Backwards-compatible implementation merged into develop for the next release
  • develop state with backwards-compatible implementation merged into develop-2.0.0
  • Backwards-incompatible implementation merged into develop-2.0.0
  • Milestone linked
  • Documentation logged in pending release page
  • Prerelease publication: CASE develop branch updated to track UCO's updated develop branch
  • Prerelease publication: CASE develop-2.0.0 branch updated to track UCO's updated develop-2.0.0 branch
@ajnelson-nist
Contributor Author

@sbarnum , there are a few points needed to finish preparing this proposal well enough for a Requirements Review vote:

  • Can you please supply definitions for the new classes added in this Issue's Pull Request?
  • Can you please state whether any of the freshly-rearranged classes are intended to be disjoint with any others? I've inlined my guesses on disjointedness in the proposal.
  • Can you please give your response to Competency Question 1.1?

ajnelson-nist added a commit that referenced this issue Mar 6, 2024
This patch also updates a test result from one of the to-be-rearranged
classes.

A follow-on patch will regenerate Make-managed files.

References:
* RDFLib/pySHACL#222
* #596
* https://www.w3.org/TR/shacl/#NodeConstraintComponent

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue Mar 6, 2024
References:
* #596

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist
Contributor Author

Update on the implementation: The initial PR tried inlining some anonymous sh:NodeShapes to gently warn that some additional types should be applied. Those were written on an incorrect understanding of how sh:node and sh:property work - it seems that if a supplemental shape attached by sh:node fails validation, even with sh:Info-level severity, the entire shape fails.

This patch, particularly between pre-line 8740 and post-line 13771, changes the implementation style to move all of those "gentle warning" shapes into anonymous shapes trailing at the end of observable.ttl. They are removed in the UCO 2.0.0 PR.

I moved them to anonymous nodes because it felt unhelpful to devise shape IRIs for temporary shapes: they would no longer be relevant after UCO 2.0.0, but once introduced as IRIs we might need to retain them permanently for backwards compatibility.

At least with how UCO's current testing infrastructure works, there is a slight difference in the validation reporting depending on whether the shape is identified with a blank node or with an IRI.

So, there is a question to address before Solutions Approval: Should these "gentle warning" shapes be given IRIs, or is it fine to have them be blank nodes? Absent requests otherwise, they will be left as blank nodes.

ajnelson-nist added a commit that referenced this issue May 8, 2024
… class

This applies a practice being tried in Issue 602.

A follow-on patch will regenerate Make-managed files.

References:
* #596
* #602

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue May 8, 2024
References:
* #596

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist
Contributor Author

So, there is a question to address before Solutions Approval: Should these "gentle warning" shapes be given IRIs, or is it fine to have them be blank nodes? Absent requests otherwise, they will be left as blank nodes.

From Issue 602, I found a middle ground: The temporary shapes are blank nodes, but are linked to their associated classes with rdfs:seeAlso. I did just confirm that this will have the blank node shape render on the generated documentation page. The next-minor and next-major PRs have been updated with this implementation.

ajnelson-nist added a commit to casework/CASE-Archive that referenced this issue May 10, 2024
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue May 10, 2024
A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue May 10, 2024
References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue May 10, 2024
A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue May 10, 2024
References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Corpora that referenced this issue May 10, 2024
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist
Contributor Author

This Issue is awaiting English definitions to be added to the new classes before we consider whether the requirements are sufficiently specified.

@ajnelson-nist
Contributor Author

There was some question in last week's call on the mutability of a process's parent-reference.

Here is a demonstration in macOS or Linux of a process changing its parent, using a Bash shell:

sleep 60

That command runs an idle process for 60 seconds, and the process remains in the foreground of the shell. Terminating the parent shell would recursively terminate the child sleep process.

sleep 60 &

That command runs an idle process and backgrounds the process. The shell is still the process's parent - terminating the parent shell would recursively terminate the child sleep process.

nohup sleep 60 &

That command runs like sleep 60 &, except now if the shell is terminated, the root process init inherits the child. This can be seen with ps -ef | grep sleep - see the "PPID" (parent process ID) column.
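The same change of parent can be observed programmatically. The sketch below is POSIX-only (it relies on `os.fork`) and uses a hypothetical helper name, `observe_reparenting`. It does not involve nohup or signals: it simply lets an intermediate process exit while its own child is still running, and has that child report the parent process ID it sees afterwards. Which process adopts the orphan (init or another designated reaper) depends on the system.

```python
import os
import time


def observe_reparenting(timeout=5.0):
    """Fork a child that forks a grandchild, then let the child exit.

    Returns (original_ppid, new_ppid) as observed by the grandchild.
    POSIX only: relies on os.fork().
    """
    read_fd, write_fd = os.pipe()
    child_pid = os.fork()
    if child_pid == 0:
        # Intermediate process: plays the role of the terminated shell.
        os.close(read_fd)
        original_parent = os.getpid()
        if os.fork() == 0:
            # Grandchild: plays the role of the orphaned sleep process.
            deadline = time.monotonic() + timeout
            while (os.getppid() == original_parent
                   and time.monotonic() < deadline):
                time.sleep(0.01)
            report = f"{original_parent} {os.getppid()}"
            os.write(write_fd, report.encode())
            os._exit(0)
        os._exit(0)  # Exit right away, orphaning the grandchild.
    # Original process: reap the intermediate child and read the report.
    os.close(write_fd)
    os.waitpid(child_pid, 0)
    with os.fdopen(read_fd) as report_file:
        before, after = map(int, report_file.read().split())
    return before, after


if __name__ == "__main__":
    before, after = observe_reparenting()
    print(f"PPID before: {before}, PPID after: {after}")
```

The two reported values differ once the intermediate process has exited, which is the mutability that a single observable:parent statement cannot capture.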

@ajnelson-nist
Contributor Author

@sbarnum : I was looking at the classes in this proposal and thinking of how to demonstrate them. "Service pack" is giving me some confusion versus the other classes.

How would you instantiate Windows XP Service Pack 2 (relevant for at least a lot of available forensic reference data), in these ways:

  • As the software sitting on a hologrammed DVD.
  • As a running operating system.

I expect in both of these cases several of the software types will apply to each node.

ajnelson-nist added a commit to ucoProject/UCO-Profile-gufo that referenced this issue Jun 7, 2024
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
@sbarnum
Contributor

sbarnum commented Oct 24, 2024

This Issue is awaiting English definitions to be added to the new classes before we consider whether the requirements are sufficiently specified.

Here are proposed English definitions for each of the new proposed classes as well as tweaks to a few existing definitions to better align overall and to address some issues that came about from creating clear distinct definitions for Task, ProcessThread, and Process.
Fleshing these definitions out also led to a need to slightly alter the Software class taxonomy as Task and ProcessThread are heavily related to Process but should not be subclasses.
Here is the new updated diagram:

(Diagram: Software Deployment Overview-2 - ObservableObjects)

  • BuildUtility
    • A Build Utility is a software-based tool that automates portions or all of the process of creating executable software from source code
  • Compiler
    • A Compiler is a software program that translates source code written in a high-level language (e.g., C++, Python, Java) into machine code that can be understood and executed by a computer processor.
  • DeploymentScript
    • A Deployment Script is a software script used to deploy artifacts, packages, modules, patches, or other resources into an intended execution environment
  • LinuxService
    • A Linux Service (often referred to as a daemon) is a Service running within a Linux operating system, similar to the way a Windows Service runs on Windows.
  • LinuxTask
    • A Linux Task is a set of software computer instructions loaded into memory with the potential to be scheduled for execution within the Linux operating system.
  • Package
    • A Package is a body of software consisting of a collection of individual software (programs, libraries, files, etc.) packaged together to collectively serve a broader purpose
  • Process
    • rdfs:comment "A Process is an instance of a software program that is being executed within a scope having dedicated memory, address space, execution variables, code instructions, state, security info, file handles, etc. Process execution consists of one or more component threads sharing the process resources."@en ;
  • ProcessThread
    • rdfs:comment "A Process Thread is the smallest sequence of programmed instructions that can be managed independently by a scheduler on a computer, which is typically a part of the operating system. It is a scheduled running instantiation of one or more tasks (including CPU flags, counters, timers, stack, etc.) as a component of a process. Multiple threads can exist within one process, executing concurrently and sharing resources such as memory, while different processes do not share these resources. In particular, the threads of a process share its executable code and the values of its dynamically allocated variables and non-thread-local global variables at any given time. based on [https://en.wikipedia.org/wiki/Thread_(computing)]"@en ;
  • Script
    • A Script is a software consisting of computer instructions that can be interpreted and executed in real-time (typically by an interpreter rather than directly by a computer processor) without requiring advance compilation
  • Service
    • A Service is a process that runs in the background rather than under the control of an interactive user. Services are typically long-running and can be configured to start when the operating system starts and continue as long as the operating system is running.
  • ServicePack
    • A Service Pack is a software consisting of a collection of software updates or fixes (patches) for a software delivered as an aggregated single package for ease of installation
  • SoftwareBuild
    • A Software Build is a particular executable version of software that has been created from source code and is ready for testing or deployment
  • Task
    • A Task is a set of software computer instructions loaded into memory with the potential to be scheduled for execution
  • WindowsService
  • WindowsTask
    • rdfs:comment "A Windows Task is a set of software computer instructions loaded into memory with the potential to be scheduled for execution within the Windows operating system."@en;
  • WindowsThread
    • rdfs:comment "A Windows thread is a Process Thread within a Windows process."@en ;

ajnelson-nist pushed a commit that referenced this issue Oct 25, 2024
No effects were observed on Make-managed files.

AJN: This is my transcription of Sean's Issue Comment (see references),
with a few minor grammatical and typographical fixes.

References:
* #596 (comment)

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist
Contributor Author

Thank you, @sbarnum, I've incorporated the definition updates.

ajnelson-nist added a commit to ucoProject/UCO-Profile-gufo that referenced this issue Nov 1, 2024
This patch leaves one incompletely-typed class, `ServicePack`, pending
discussion.

References:
* ucoProject/UCO#596
* ucoProject/UCO@faae89b

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist
Contributor Author

@sbarnum : I'm looking at some class-pairings.

  • I suspect Compiler is a subclass of Application. Do you have an example where some Compiler is not an Application?
  • Is there an example of a Library that is not a File? I suspect yes, when considering in-process-memory objects, but we haven't delved into modeling that to date.

@sbarnum
Contributor

sbarnum commented Nov 4, 2024

  • Can you please state whether any of the freshly-rearranged classes are intended to be disjoint with any others. I've inlined my guesses on disjointedness in the proposal.

I think it is very tricky and risky to define disjointness between classes of software, as there is a large amount of inherent potential overlap.
I am not sure I see significant value in trying to tease apart this issue.
That being said, if I had to take a cut at disjointness assertions that would likely be safe, I would go with something like:

• Process disjoint from
	○ Code
	○ Application
	○ Script
	○ Library
	○ ProcessThread
	○ Task
	○ Compiler
	○ BuildUtility
	○ SoftwareBuild
	○ OperatingSystem
	○ ServicePack
• ProcessThread disjoint from
	○ Code
	○ Application
	○ Script
	○ Library
	○ Process
	○ Task
	○ Compiler
	○ BuildUtility
	○ SoftwareBuild
	○ OperatingSystem
	○ ServicePack
• Task disjoint from
	○ Code
	○ Application
	○ Script
	○ Library
	○ Process
	○ ProcessThread
	○ Compiler
	○ BuildUtility
	○ SoftwareBuild
	○ OperatingSystem
	○ ServicePack
• Application disjoint from 
	○ OperatingSystem
	○ Library
• BuildUtility disjoint from
	○ Library
	○ OperatingSystem
• Library disjoint from
	○ OperatingSystem
	○ Compiler
	○ BuildUtility
• ServicePack disjoint from
	○ Application
	○ Library
	○ Compiler
	○ BuildUtility
	○ OperatingSystem
• OperatingSystem disjoint from
	○ Compiler

ajnelson-nist added a commit to ucoProject/UCO-Profile-gufo that referenced this issue Nov 6, 2024
This patch is known to not pass CI due to an already-existing and
unresolved modeling question on ServicePack.

References:
* ucoProject/UCO#596 (comment)

Co-authored-by: Sean Barnum <[email protected]>
Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist
Contributor Author

@sbarnum , I will agree on tricky, but I do think we will find benefits from trying to specify these disjointedness statements. For instance, this helped catch a specification issue with one of the requirements over on Issue 626 (described here).

@ajnelson-nist
Contributor Author

@sbarnum : Also in light of Issue 626 (constraining observable:cpeid), it looks like your disjointedness statements work out so that these subclasses of Software (as proposed in the current Issue) can't have CPEs associated, because they are disjoint with both Application and OperatingSystem.

?nClass
drafting:ServicePack
drafting:Task
uco-observable:Library
uco-observable:Process
uco-observable:ProcessThread

Per this SPARQL query:

PREFIX uco-observable: <https://ontology.unifiedcyberontology.org/uco/observable/>
SELECT ?nClass
WHERE {
  ?nClass
    rdfs:subClassOf* uco-observable:Software ;
    owl:disjointWith
      uco-observable:Application ,
      uco-observable:OperatingSystem
      ;
    .
}
ORDER BY ?nClass
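The result of this query can be cross-checked against the disjointness lists given earlier in this thread with a short Python sketch. Several assumptions apply: the lists are transcribed by hand into a dictionary, disjointness is treated as symmetric (the sketch applies this explicitly), every listed class is assumed to be a subclass of Software so the `rdfs:subClassOf*` filter is omitted, and the helper names are hypothetical.

```python
# Classes that Process, ProcessThread, and Task are each listed as disjoint from.
RUNTIME_COMMON = [
    "Code", "Application", "Script", "Library", "Compiler",
    "BuildUtility", "SoftwareBuild", "OperatingSystem", "ServicePack",
]

# Hand transcription of the disjointness lists proposed in this thread.
DECLARED = {
    "Process": RUNTIME_COMMON + ["ProcessThread", "Task"],
    "ProcessThread": RUNTIME_COMMON + ["Process", "Task"],
    "Task": RUNTIME_COMMON + ["Process", "ProcessThread"],
    "Application": ["OperatingSystem", "Library"],
    "BuildUtility": ["Library", "OperatingSystem"],
    "Library": ["OperatingSystem", "Compiler", "BuildUtility"],
    "ServicePack": ["Application", "Library", "Compiler",
                    "BuildUtility", "OperatingSystem"],
    "OperatingSystem": ["Compiler"],
}


def symmetric_pairs(declared):
    """Return unordered class pairs, treating disjointness as symmetric."""
    return {
        frozenset((cls, other))
        for cls, others in declared.items()
        for other in others
    }


def disjoint_with_all(declared, targets):
    """List classes that are disjoint with every class in targets."""
    pairs = symmetric_pairs(declared)
    classes = set(declared)
    for others in declared.values():
        classes.update(others)
    return sorted(
        c for c in classes
        if all(frozenset((c, t)) in pairs for t in targets)
    )


print(disjoint_with_all(DECLARED, ["Application", "OperatingSystem"]))
# → ['Library', 'Process', 'ProcessThread', 'ServicePack', 'Task']
```

Under those assumptions the sketch yields the same five classes as the query result above.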

Task, Process, and ProcessThread, I see no controversy. I still need your help understanding ServicePack. But what about Library? A vulnerability in a library would be a significant point of interest in supply chain review.

Could you perhaps illustrate how a Library fits into the composition of an Application?

@ajnelson-nist
Contributor Author

@sbarnum , given the definition you provided for ServicePack, why is it not a subclass of Package?

@sbarnum
Contributor

sbarnum commented Nov 8, 2024

@sbarnum : Also in light of Issue 626 (constraining observable:cpeid), it looks like your disjointedness statements work out so that these subclasses of Software (as proposed in the current Issue) can't have CPEs associated, because they are disjoint with both Application and OperatingSystem.

?nClass
drafting:ServicePack
drafting:Task
uco-observable:Library
uco-observable:Process
uco-observable:ProcessThread
Per this SPARQL query:

PREFIX uco-observable: <https://ontology.unifiedcyberontology.org/uco/observable/>
SELECT ?nClass
WHERE {
  ?nClass
    rdfs:subClassOf* uco-observable:Software ;
    owl:disjointWith
      uco-observable:Application ,
      uco-observable:OperatingSystem
      ;
    .
}
ORDER BY ?nClass

Task, Process, and ProcessThread, I see no controversy. I still need your help understanding ServicePack. But what about Library? A vulnerability in a library would be a significant point of interest in supply chain review.

Could you perhaps illustrate how a Library fits into the composition of an Application?

I believe that Application ("An application is a particular software program designed for end users.") and OperatingSystem ("An operating system is the software that manages computer hardware, software resources, and provides common services for computer programs.") are conceptually disjoint. To me that is fairly obvious from their definitions.

I think that CPE also felt this way which is why they have separate 'a' and 'o' types. That being said I do think that CPE missed the ball in that they basically presume ALL software falls into these two categories which I think is clearly inaccurate.

If the approach we need to follow for constraining cpeid does not align with this then I think we either need to remove and ignore the cpeid constraint or simply not declare Application and OperatingSystem disjoint though I doubt the latter will fix the issue given all the other forms of software. I do NOT believe we should in any way attempt to change the semantics of Application and OperatingSystem due to pursuing cpeid constraints.

On the issue of Library, I do not think that a Library should be asserted within the composition of an Application. An Application is not composed of a Library. A Library could be associated with an Application as part of its provenance in development/build but not in its final composition. A particular component from a Library may be part of the composition of an Application but not the Library writ large. If an Application leverages the foo component from Library A and it is known that the bar component from Library A has a vulnerability, the fact that the Application leveraged part of Library A does not mean the vulnerability affects the Application. A particular component (as expressed using Code or possibly by adding a new Component subclass of Software) could be expressed as part of the composition of Library A and of Application X.

Does that make sense?
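As a rough illustration of the foo/bar reasoning above, the component-level view can be sketched with plain Python sets. These are not UCO classes; the set contents and the `affected_components` helper are hypothetical, chosen only to mirror the example of Library A and Application X.

```python
# Hypothetical component names mirroring the example above.
LIBRARY_A_COMPONENTS = {"foo", "bar"}
VULNERABLE_COMPONENTS = {"bar"}
APPLICATION_X_COMPONENTS = {"foo"}


def affected_components(app_components, vulnerable):
    """Return the components of an application that are known vulnerable."""
    return app_components & vulnerable


# Application X shares a component with Library A ...
uses_library_a = bool(APPLICATION_X_COMPONENTS & LIBRARY_A_COMPONENTS)
# ... but none of the components it actually contains is vulnerable.
affected = affected_components(APPLICATION_X_COMPONENTS, VULNERABLE_COMPONENTS)

print(uses_library_a)
print(sorted(affected))
```

Tracking composition at the component level is what allows the vulnerability in bar to be kept separate from an application that only incorporates foo.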

@sbarnum
Contributor

sbarnum commented Nov 8, 2024

@sbarnum , given the definition you provided for ServicePack, why is it not a subclass of Package?

Good point.

I think you are correct, it should be a subclass of Package.

I think it is important to maintain semantic separation between a ServicePack as an object made up of multiple software updates/fixes and a software version identifier referencing a service pack.

The latter (e.g., Microsoft Windows 11 SP1) is an identifier for a body of software consisting of the Windows 11 operating system AND the ServicePack 1 updates/fixes. This is not the same thing as the ServicePack 1 updates/fixes which would be represented by a ServicePack object. You could assert that the ServicePack 1 ServicePack object is associated with the Windows 11 OperatingSystem object or even that as part of an Action that the ServicePack 1 object was applied to the Windows 11 OperatingSystem object resulting in the Windows 11 SP1 OperatingSystem object.

Does that answer your uncertainties around ServicePack?

@sbarnum
Contributor

sbarnum commented Nov 8, 2024

@sbarnum : I'm looking at some class-pairings.

  • I suspect Compiler is a subclass of Application. Do you have an example where some Compiler is not an Application?
  • Is there an example of a Library that is not a File? I suspect yes, when considering in-process-memory objects, but we haven't delved into modeling that to date.

From one angle I guess you could consider a Compiler as a subclass of Application but I tend to not think that this is inherently always the case. I think that compilers can be viewed as much intended for build utilities as for end users, and thus they are not simply applications. This is why I kept them separate and continue to believe that was the right decision.

I definitely do not think of a Library as a File. A Library could be serialized into file form but it itself is NOT a File. As you stated it could be an in-process-memory object but could also be presented to a user within a development application independent of any file or persisted as a set of components within a database or other such structures.

sbarnum commented Nov 8, 2024

> @sbarnum : I was looking at the classes in this proposal and thinking of how to demonstrate them. "Service pack" is giving me some confusion versus the other classes.
>
> How would you instantiate Windows XP Service Pack 2 (relevant for at least a lot of available forensic reference data), in these ways:
>
>   • As the software sitting on a hologrammed DVD.
>   • As a running operating system.
>
> I expect in both of these cases several of the software types will apply to each node.

I would start by highlighting my comment above on the importance of distinguishing between an actual ServicePack of SW update content and a software version identifier for a software that has had a service pack applied.

Beyond that, I am having some trouble understanding where the confusion lies. So, if I don't answer your question please let me know where I am missing the area of confusion.

The below are VERY simple graph diagrams of classes, properties and relationships at a conceptual level (ignoring facets and the like) that attempt to convey how to use UCO to represent a few scenarios.

If you wished to convey the relationship between an OS, a ServicePack for that OS, and the updated version of the OS with the ServicePack applied you could do that with something like:

[Figure: Updating OS using ServicePack]

If you wished to convey a storage medium (such as a DVD) containing a service pack updated version of an OS you could do it with something like:

[Figure: Storage Medium containing OS]

If you wanted to show the ServicePack itself on the storage medium instead of the updated OS then just replace the OperatingSystem object with the ServicePack object. You could also simply leave out the File object hop if you wanted to be even more abstract.

If you wished to convey a service pack updated version of an OS running on a computer you could do it with something like:

[Figure: Computer running OS]

All of these scenarios could also easily be combined with something like:

[Figure: OS ServicePack combined scenarios]

Does that make sense?

Does it provide the clarity you were looking for?

ajnelson-nist added a commit to ucoProject/UCO-Profile-gufo that referenced this issue Nov 12, 2024
…dividual

Thanks to @sbarnum for discussion leading to ServicePack's motion.

Thanks to @plbt5 for discussion leading to the FunctionalComplex
alignment.

No effects were observed on Make-managed files.

References:
* ucoProject/UCO#596 (comment)

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue Nov 12, 2024
No effects were observed on Make-managed files.

References:
* #596

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist

@sbarnum , thank you for the illustration figure. It identified a point where I don't think we agree - indirect, but still in scope of this proposal, and I think important to address to the benefit of at least forensics and software supply chain review.

Your second figure describes some DVD that contains one file, and that one file contains an OperatingSystem. You describe soon after that this object constellation could also apply to a ServicePack object, swapping the OS for a service pack graph-node.

I think that in the general case this constellation is inconsistent with software deployment strategies. Take for example the ISO installation image of Windows 10 [1], which today Microsoft provides here. If you mount and review the contents of that DVD, you will see there are hundreds of non-directory files in it. (There is a similar file tally in a recent Ubuntu [1] installation image.)

No one file among them is "The operating system". It seems to me that the operating system is some subset of the collection of files. Is /setup.exe in the image a DeploymentScript (and a File) that is part of the OS, or is the OS just what is booted into?

There's also a point of nuance that this .iso "provides" two operating systems: one boots a machine well enough from removable media to run an operating system installation application, and the other is the bootable operating system that results from installation and no longer requires the (virtual) DVD.

There's also another interpretation of operating system - the running process that has the hardware-interfacing kernel. You've previously designated Process as disjoint from OperatingSystem. On macOS, do you consider the process with ID 0, kernel_task, just a Process? ID 1, launchd, just a Process? Both distinct from the OperatingSystem? How do we represent an application, or malware, that has a run target of only certain versions of an operating system?

I think significant parts of software supply chain review processes hinge on the representation of operating system, in ways that similarly affect Package and Application. My current understanding is these are supportable with some practices of mereology, or composition/decomposition: If an operating system is taken to be an "Object-aggregate" or "Functional Complex" (terms [1] from BFO (BFO-27) and gUFO, respectively), then we can talk about its component parts, and start noting properties of those components (like hashes) and relating those components back to the larger, aggregate thing (Application/OperatingSystem/Package) to determine things like signatory characteristics of versions. (I mean "component" here as used in those two foundational ontologies, not necessarily as in CPE.)
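One possible reading of "relating component hashes back to the aggregate" is sketched below in Python. This is only an illustration under stated assumptions: `FilePart`, `Aggregate`, and the `fingerprint` scheme are invented for this example and do not come from UCO, BFO, or gUFO.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FilePart:
    """Hypothetical component part of an aggregate, e.g. one file in an installation image."""
    path: str
    content: bytes

    @property
    def sha256(self) -> str:
        return hashlib.sha256(self.content).hexdigest()


@dataclass(frozen=True)
class Aggregate:
    """Hypothetical aggregate (Application/OperatingSystem/Package) made of parts."""
    label: str
    parts: tuple

    def fingerprint(self) -> str:
        """Derive one digest from the (path, hash) pairs of all parts, independent of part order."""
        digest = hashlib.sha256()
        for part in sorted(self.parts, key=lambda p: p.path):
            digest.update(part.path.encode("utf-8"))
            digest.update(b"\0")
            digest.update(part.sha256.encode("ascii"))
            digest.update(b"\n")
        return digest.hexdigest()


setup = FilePart("/setup.exe", b"installer bytes")
kernel_v1 = FilePart("/sources/kernel.bin", b"kernel build 1")
kernel_v2 = FilePart("/sources/kernel.bin", b"kernel build 2")

os_v1 = Aggregate("Example OS v1", (setup, kernel_v1))
os_v1_reordered = Aggregate("Example OS v1", (kernel_v1, setup))
os_v2 = Aggregate("Example OS v2", (setup, kernel_v2))

# The same parts give the same fingerprint; a changed part gives a different one.
print(os_v1.fingerprint() == os_v1_reordered.fingerprint())
print(os_v1.fingerprint() == os_v2.fingerprint())
```

Per-part hashes remain available on each component, so a reviewer could still ask which specific part distinguishes two versions.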

I'd also like to understand how you think SoftwareBuild fits into your recent illustrations. If I compile a code-base, and the output is two installable binaries, interface1 and interface2, is the SoftwareBuild the set (Bundle?) of both of those? Or is interface1 a SoftwareBuild, and interface2 its own SoftwareBuild?

This might help with answering Competency Question 1, which, I apologize, I had forgotten was still unanswered. For what it's worth, my current thinking is that kb:ContentData-3 would also be classified as an observable:Code. And it now seems to me (from reviewing your sketch) that there is a standalone Application, which I'll just call kb:Application-7, which is a separate object from kb:File-1, and some property or Relationship would link kb:File-1 and kb:Application-7 as, roughly, "File 1 constitutes Application 7."

Footnotes

  1. Disclaimer: Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.
