Vulnerabilities in Reused Software

This repository contains the scripts needed to build a dataset of open-source projects and to analyze how their reuse characteristics relate to their security vulnerabilities.

For the ICSR'19 paper version of this dataset, check out the icsr19 branch.

This document presents:

Folder structure
Setting up study environment
Steps for data collection
Steps for analysis

Folder structure

This repository consists of two main directories:

data: stores all files that will be analyzed
tooling: stores all scripts for building and analyzing the dataset

Setting up study environment

The analysis was performed using the following tools:

Linux Mint (v. 19.3)
Python (v. 3)
Anaconda (v. 4.5.12)
Java (v. > 8)
Maven (v. 3.6)

Steps to set up study environment

Install Anaconda
From a terminal, create a conda environment for the study.

$ conda create -n study-env
$ conda activate study-env

From a terminal, install the necessary packages.

$ conda install -c conda-forge notebook maven xmltodict numpy scipy pandas matplotlib seaborn
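
To confirm that the packages were installed into the active environment, a small check like the one below may help. It is only a sketch: the function name `missing_packages` is hypothetical, and it assumes that each conda package exposes a Python module of the same name. Maven is left out because it is not a Python package.

```python
import importlib.util

# Import names assumed to match the conda packages installed above.
# Maven is omitted: it is a command-line tool, not a Python module.
REQUIRED = ["notebook", "xmltodict", "numpy", "scipy", "pandas", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("missing:", missing if missing else "none")
```

Run it with the study-env environment activated; an empty result suggests the Python side of the setup is in place.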

Now, from a terminal, execute the script that downloads the vendor tools.

$ tooling/download-vendor-tools.sh

Next, open tooling/script.py and replace the value of the STUDY_HOME path variable with the path of your locally cloned repository.
Finally, create a JAVA_HOME system variable and export it to the PATH. See more instructions here.
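
Before moving on to data collection, it can be useful to verify the Java part of the setup. The following is a minimal sketch, not part of the repository: `check_java_setup` is a hypothetical helper that looks for the JAVA_HOME variable and for the `java` and `mvn` executables on the PATH.

```python
import os
import shutil


def check_java_setup(env=None):
    """Return a list of problems found; an empty list means the checks passed."""
    env = os.environ if env is None else env
    problems = []

    java_home = env.get("JAVA_HOME")
    if not java_home:
        problems.append("JAVA_HOME is not set")
    elif not os.path.isdir(java_home):
        problems.append(f"JAVA_HOME does not point to a directory: {java_home}")

    # If env has no PATH entry, shutil.which falls back to the process PATH.
    for tool in ("java", "mvn"):
        if shutil.which(tool, path=env.get("PATH")) is None:
            problems.append(f"{tool} not found on PATH")

    return problems


if __name__ == "__main__":
    for problem in check_java_setup():
        print("problem:", problem)
```

The check only confirms that the variable and executables exist; it does not validate the Java or Maven versions listed above.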

Steps for data collection

The steps for data collection are described in the tooling/DataCollection.ipynb, tooling/DataCollectionRQ2.ipynb and tooling/DataCollectionRQ3.ipynb Jupyter notebooks. More specific instructions are included before each substep.

Steps for analysis

The steps for the data analysis are described in the tooling/DataVisualization.ipynb Jupyter notebook. The steps are linear, so the notebook should be executed from top to bottom. Analyzing the dataset requires a local Maven .m2 directory that contains all built projects and the JARs of their dependencies.
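
As a quick sanity check before running the analysis, the sketch below counts the JAR files in the local Maven repository. It assumes Maven's default location, `~/.m2/repository`; the helper name `count_jars` is hypothetical, and a custom repository location configured in Maven's settings would need to be passed in explicitly.

```python
from pathlib import Path


def count_jars(m2_dir=None):
    """Count .jar files under <m2_dir>/repository; return 0 if it is missing."""
    m2 = Path(m2_dir) if m2_dir is not None else Path.home() / ".m2"
    repo = m2 / "repository"
    if not repo.is_dir():
        return 0
    # Walk the repository recursively and count every file ending in .jar.
    return sum(1 for _ in repo.rglob("*.jar"))


if __name__ == "__main__":
    print("JAR files found:", count_jars())
```

A count of zero suggests that the projects have not been built yet, or that the repository lives somewhere other than the default location.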

AntonisGkortzis/Vulnerabilities-in-Reused-Software