EE 126 project, with William Jow and David Lin. [Report]
To download the KTH dataset, run `./get_data.sh`. By default, the script will download the KTH videos, the sequences metadata file, and precomputed normalization statistics. It will also split the data into train and test subsets.
Do not rename the files or tamper with the directory structure, as the program parses activities
from filenames and expects a data folder with six subfolders in it (one for each activity's videos).
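To illustrate why the filenames matter, here is a minimal sketch of pulling the activity out of a name such as `person01_boxing_d4_uncomp.avi`. It assumes the `person<NN>_<activity>_d<N>_uncomp.avi` pattern seen in the sample config path. `activity_from_filename` and `KTH_ACTIVITIES` are hypothetical names for this example, not functions from this project.

```python
import os

# The six KTH activity classes (one data subfolder per activity).
KTH_ACTIVITIES = {
    "boxing", "handclapping", "handwaving",
    "jogging", "running", "walking",
}


def activity_from_filename(path):
    """Return the activity encoded in a KTH-style video filename.

    Assumes the second underscore-separated token is the activity,
    as in 'person01_boxing_d4_uncomp.avi'.
    """
    name = os.path.basename(path)
    parts = name.split("_")
    if len(parts) < 2 or parts[1] not in KTH_ACTIVITIES:
        raise ValueError("cannot parse an activity from %r" % name)
    return parts[1]
```

A renamed file (for example `clip7.avi`) raises `ValueError` in this sketch, which is the kind of failure the warning above is meant to prevent.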
Once you have the data, you can run the following commands with `config/quickstart.yaml` as `<path to config>`.
python main.py extract <path to config> # extract features
python main.py build <path to config> # build models
python main.py classify <path to config> # classify activity
Important: install dependencies according to `requirements.txt`. Other versions of the modules are not officially supported, and may or may not work. (For example, it is probably okay to upgrade NumPy, but not `hmmlearn` or `scikit-learn`.) If you want to use a virtual environment (recommended), you may find `quickstart.sh` helpful as a reference.
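If you are unsure whether your environment matches the pinned versions, a small check along these lines may help. It is only a sketch: it assumes `requirements.txt` pins packages with `name==version` lines, and `parse_pins`, `installed_versions`, and `find_mismatches` are hypothetical helpers, not part of this project. `installed_versions` relies on `importlib.metadata`, which requires Python 3.8 or newer.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping comments and blanks."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def installed_versions(names):
    """Look up installed versions; packages that are missing map to None."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(pins, installed):
    """Return {name: (wanted, found)} for every pin that is not satisfied."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pins.items()
        if installed.get(name) != wanted
    }
```

For example, `find_mismatches(pins, installed_versions(pins))` with `pins = parse_pins(open("requirements.txt").read())` returns an empty dict when every pin is satisfied.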
From a user perspective, the program is almost entirely specified by the input config file.
There are examples of such files in the `config` directory, e.g. Owen's.
Descriptions of configurable parameters can be found below. Note that parameters are
grouped by their associated run command.
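The grouping can be pictured as a lookup from run command to config section. The sketch below is an assumption based on the command names and section names in this README, not the project's actual dispatch code. `SECTION_FOR_COMMAND` and `section_for` are hypothetical names, and `config` stands for an already-parsed config (for instance, the dict returned by PyYAML's `yaml.safe_load`).

```python
# Assumed mapping from run command to the config section it reads.
SECTION_FOR_COMMAND = {
    "extract": "extract_features",
    "build": "build_models",
    "classify": "classify_activity",
}


def section_for(command, config):
    """Return the sub-dictionary of `config` that belongs to `command`."""
    try:
        key = SECTION_FOR_COMMAND[command]
    except KeyError:
        raise ValueError("unknown command: %r" % command)
    if key not in config:
        raise KeyError("config has no %r section" % key)
    return config[key]
```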
extract_features:
  # Path to the video from which features should be extracted.
  # Only one of `base_dir`, `video_dir`, and `video_path` will be used;
  # `video_path` has the lowest priority, and is used primarily for debugging.
  video_path: /Users/owen/hmm_activity_recognition/data/kth/train/boxing/person01_boxing_d4_uncomp.avi
  # Path to the top-level folder containing all of the activity video subfolders.
  # Only one of `base_dir`, `video_dir`, and `video_path` will be used;
  # `base_dir` has the highest priority, and is used when training all of the HMMs.
  base_dir: /Users/owen/hmm_activity_recognition/data/kth/train
  # Path to the folder or file to which feature matrices should be saved.
  # Comment out if saving is not desirable, e.g. if debugging.
  save_path: /Users/owen/hmm_activity_recognition/features
  # Set these to True for a little (or a lot?) more output.
  debug: False
  verbose: False
  # Foreground estimation method.
  # 0 means no foreground estimation.
  # 1 means OpenCV background subtraction followed by a noise filter.
  # 2 means average subtraction.
  fg_handler: 0
  # Parameters for Shi-Tomasi corner detection.
  st:
    maxCorners: 200
    qualityLevel: 0.05
    minDistance: 3
    blockSize: 10
  # Parameters for Lucas-Kanade optical flow.
  lk:
    winSize: (15, 15)
    maxLevel: 2
  # Settings for feature inclusion.
  # Each feature will be included if its value is True, and excluded if its value is False.
  feature_toggles:
    optical_flow: False
    freq_optical_flow: False
    dense_optical_flow: False
    # If True, PCA will not be applied.
    freq_dense_optical_flow: True
    divergence: True
    curl: True
    avg_velocity: True
    edge: True
    centroid: True
  # Noise filtering parameters.
  # Used in the first foreground estimation method.
  denoise:
    kernel_size: 5
    threshold: 3
  # Dense optical flow parameters.
  dense_params:
    # Window height as fraction of total height.
    roi_h: 0.8
    # Window width as fraction of total width.
    roi_w: 0.4
    pyr_scale: 0.5
    levels: 3
    winsize: 15
    iterations: 3
    poly_n: 5
    poly_sigma: 1.2
    # Dimensionality after PCA.
    # If None, PCA is not used.
    n_components: 20
    # Number of maximal values to use for each flow direction.
    # If None, all values are used.
    top_k: None
    # Side length of square maximum response window.
    # If None, the entire ROI shall be considered.
    mres_wind: None
  # Divergence parameters.
  div_params:
    # If None, all divergence values are used
    # (instead of taking a histogram of divergence values).
    n_bins: None
    # If None, PCA is not used.
    pca_dim: 8
  # Curl parameters.
  curl_params:
    # If None, all curl values are used
    # (instead of taking a histogram of curl values).
    n_bins: None
    # If None, PCA is not used.
    pca_dim: 8
  # Dimensionality of `freq_optical_flow` feature.
  n_bins: 20
  # Dimensionality after PCA of `edge` feature.
  edge_dim: 20
  # Amount by which frames should be trimmed (at the beginning and the end).
  trim: 1
  # Whether features should be normalized.
  # Set to False to compute global stats for later normalization.
  normalize: False
  # Path to the sequences metadata file.
  sequences_path: /Users/owen/hmm_activity_recognition/data/kth/sequences.txt
  # Path to which mean and variance data should be saved.
  stats_path: /Users/owen/hmm_activity_recognition/data/kth/norm_stats.pkl
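The input-path priority described in the comments above can be sketched as follows. The comments only state that `base_dir` is highest and `video_path` is lowest, so placing `video_dir` in the middle is an inference. `select_input` is a hypothetical helper, not the project's own function.

```python
# Highest priority first. The middle position of `video_dir` is inferred.
INPUT_KEYS = ("base_dir", "video_dir", "video_path")


def select_input(conf):
    """Return (key, path) for the highest-priority input set in `conf`."""
    for key in INPUT_KEYS:
        path = conf.get(key)
        if path:
            return key, path
    raise ValueError("set one of: " + ", ".join(INPUT_KEYS))
```

With both `video_path` and `base_dir` set, as in the example config, this sketch picks `base_dir`; commenting out `base_dir` would fall back to `video_path`.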
build_models:
  # Path to the directory containing saved feature matrices.
  h5_dir: /Users/owen/hmm_activity_recognition/features
  # Path to the directory into which models should be saved.
  model_dir: /Users/owen/hmm_activity_recognition/models
  # Dimensionality to which all feature matrices will be standardized.
  # If using the `optical_flow` feature, this must be provided as an integer.
  # Otherwise, it's probably preferable to leave it unspecified (i.e. set to a non-integral value).
  n_features: infer
  # Whether stats should be computed for output normalization.
  compute_stats: False
  # Configuration for each model in the ensemble.
  # To add another model, just add a list item similar to the one(s) shown below.
  mconf:
    # Number of states in the HMM.
    - n_components: 4
      # Maximum number of iterations we're willing to run.
      n_iter: 20
      # Model type (either `gmm` or `gaussian`).
      m_type: gmm
      # Fraction of sequences the model should train on.
      subsample: 1.0
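To make the `subsample` setting concrete, the sketch below draws a fraction of the training sequences for one model in the ensemble. How the project actually samples is not described here, so the rounding rule, the fixed seed, and the name `subsample_sequences` are all assumptions made for illustration.

```python
import random


def subsample_sequences(sequences, fraction, seed=0):
    """Return a random subset holding roughly `fraction` of `sequences`.

    Sampling is without replacement; a fixed seed keeps runs repeatable.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if not sequences:
        return []
    # Keep at least one sequence so the model always has training data.
    count = max(1, int(round(fraction * len(sequences))))
    return random.Random(seed).sample(list(sequences), count)
```

Giving each `mconf` entry a different `subsample` value (or, in this sketch, a different seed) would let the models in an ensemble see different subsets of the data.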
classify_activity:
  # Whether all activities should be classified, or just one.
  all: True
  # Path to video data. A directory if `all` is True, otherwise a video.
  path: /Users/owen/hmm_activity_recognition/data/kth/test
  # Path to folder in which models are saved.
  model_dir: /Users/owen/hmm_activity_recognition/models
  # Portion of videos that should be classified. Only meaningful if `all` is True.
  eval_fraction: 1.0
  # Dimensionality to which all feature matrices will be standardized.
  # If using the `optical_flow` feature, this must be provided as an integer.
  # Otherwise, it's probably preferable to leave it unspecified (i.e. set to a non-integral value).
  n_features: infer
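For intuition, classification with one set of HMMs per activity is commonly done by scoring a feature sequence under each activity's models and choosing the activity with the highest total log-likelihood. The sketch below shows that idea only. This README does not say how the ensemble's scores are combined, so summing them is an assumption, and `classify` is a hypothetical function. Each scorer is any callable that returns a log-likelihood (with hmmlearn, a fitted model's `score` method would fit this role).

```python
def classify(sequence, models):
    """Pick the activity whose ensemble gives `sequence` the highest score.

    `models` maps an activity name to a list of scorer callables, each
    returning a log-likelihood. Summing the scores is an assumption.
    """
    if not models:
        raise ValueError("no models to classify with")
    scores = {
        activity: sum(scorer(sequence) for scorer in ensemble)
        for activity, ensemble in models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

Returning the full score dictionary alongside the label makes it easier to see how close the runner-up activities were.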