To do

Resampling to obtain similar distributions, e.g. measured with the Earth Mover's Distance (http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/RUBNER/emd.htm)
Kaggle submission ensembling with correlations and submission performance taken into account
Random forest imputation
Model performance boxplot (like Airbnb)
Optimize the parameters of the exponential smoothing methods through train/test splitting
Top-terms classifier should accept sparse matrices
Every estimator class should pass the scikit-learn estimator checks
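
A rough sketch for the exponential smoothing item, to make the intended train/test procedure concrete. It assumes simple (single) exponential smoothing with one parameter alpha, a plain grid search, and one-step-ahead mean squared error on the held-out tail of the series. The names one_step_sse and pick_alpha are hypothetical and not part of any existing code.

```python
def one_step_sse(series, alpha, level):
    """Run simple exponential smoothing over `series`, starting from `level`.

    Returns the sum of squared one-step-ahead errors and the final level.
    """
    sse = 0.0
    for y in series:
        sse += (y - level) ** 2                   # forecast for this step is the current level
        level = alpha * y + (1 - alpha) * level   # standard SES update
    return sse, level


def pick_alpha(series, test_size, grid):
    """Pick the alpha in `grid` with the lowest one-step-ahead MSE on the test tail."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    train, test = series[:-test_size], series[-test_size:]
    best_alpha, best_mse = None, float("inf")
    for alpha in grid:
        # Warm up the level on the training part, initialised at the first observation.
        _, level = one_step_sse(train[1:], alpha, train[0])
        # Score on the test part, still updating the level with the observed values.
        sse, _ = one_step_sse(test, alpha, level)
        mse = sse / len(test)
        if mse < best_mse:
            best_alpha, best_mse = alpha, mse
    return best_alpha, best_mse


# Example: a linear trend, where larger alpha values track the series more closely.
trend = [float(t) for t in range(20)]
grid = [i / 10 for i in range(1, 10)]
alpha, mse = pick_alpha(trend, test_size=5, grid=grid)
```

The same loop would extend to double or triple exponential smoothing by searching over a tuple of parameters instead of a single alpha; a rolling-origin split may be preferable to a single split for short series.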