This chapter covers simulating data and applying everything learned in chapters 1-6 to catching hackers attempting to authenticate to a website, using rule-based strategies for anomaly detection. After discussing how to build the login_attempt_simulator package, we will build the simulate.py script for running the simulation. The simulation will generate the files in the logs/ and user_data/ directories. Then, we will use the simulated data in the logs/ directory to conduct our analysis in the anomaly_detection.ipynb notebook.
All of the aforementioned files are provided in this directory:
- logs/: Directory containing all simulated log files for the analysis
- user_data/: Directory containing information on the user base used for the simulation (for the simulate.py script to use)
- anomaly_detection.ipynb: Jupyter notebook used to perform our analysis
- simulate.py: Python script for simulating the data using the login_attempt_simulator package
The end-of-chapter exercises will use the simulate.py script to generate a new dataset; solutions to these exercises can be found in the repository's solutions/ch_08/ directory.
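
As a rough illustration of what a rule-based strategy for anomaly detection can look like, the sketch below flags any source IP address whose failed login attempts within a single hour exceed a fixed threshold. It is a minimal sketch and not the analysis from the notebook: the record layout, the sample values, the flag_suspicious_ips function, and the max_failures_per_hour threshold are all hypothetical, and the files in logs/ may use different fields.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log records: (timestamp, source IP, username, success flag).
# These values are made up for illustration; the real log files may differ.
attempts = [
    ("2018-11-01 00:01:10", "10.0.0.1", "alice", True),
    ("2018-11-01 00:02:00", "10.0.0.9", "alice", False),
    ("2018-11-01 00:02:05", "10.0.0.9", "bob", False),
    ("2018-11-01 00:02:09", "10.0.0.9", "carol", False),
    ("2018-11-01 00:02:14", "10.0.0.9", "dave", False),
    ("2018-11-01 01:15:00", "10.0.0.1", "alice", False),
]


def flag_suspicious_ips(attempts, max_failures_per_hour=3):
    """Return sorted (ip, hour) pairs whose failed-login count exceeds the threshold."""
    failures = Counter()
    for timestamp, ip, _username, success in attempts:
        if not success:
            # Truncate the timestamp to the start of its hour to form the bucket.
            hour = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").replace(
                minute=0, second=0
            )
            failures[(ip, hour)] += 1
    return sorted(key for key, count in failures.items() if count > max_failures_per_hour)


flagged = flag_suspicious_ips(attempts)
for ip, hour in flagged:
    print(f"{ip} exceeded the failure threshold during the hour starting {hour}")
```

A fixed threshold like this is easy to explain and audit, but the cutoff has to be chosen by hand; a value that is too low flags legitimate users who mistype a password, while one that is too high lets slow attacks through.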