Predicts classification tags for a body of text using a one-vs-all strategy: each tag gets its own binary prediction against all the other tags. The final output is a file, classification.pkl, that contains one row tuple for each of the top 100 tags in the training data set: ("some tag name", [prediction_values]), where the list holds one prediction value per test case.
- Uses PySpark and Word2Vec
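
The output layout described above can be illustrated with a minimal, standard-library-only sketch. It is not the project's pipeline: the helper names (`top_tags`, `one_vs_rest_labels`, `build_rows`), the toy data, and the `keyword_predict` stand-in are all hypothetical, and the real per-tag predictions would come from the PySpark and Word2Vec model. The sketch also assumes classification.pkl holds a plain list of tuples, and it writes to a temporary directory rather than the project folder.

```python
import os
import pickle
import tempfile
from collections import Counter


def top_tags(train_tags, n=100):
    """Return the n most frequent tags across all training examples."""
    counts = Counter(tag for tags in train_tags for tag in tags)
    return [tag for tag, _ in counts.most_common(n)]


def one_vs_rest_labels(train_tags, tag):
    """Binary labels for one tag: 1 if the example carries it, else 0."""
    return [1 if tag in tags else 0 for tags in train_tags]


def build_rows(tags, predict_fn, test_texts):
    """Build one (tag, predictions) tuple per tag, one value per test case."""
    return [(tag, [predict_fn(tag, text) for text in test_texts]) for tag in tags]


def keyword_predict(tag, text):
    # Hypothetical stand-in for a trained per-tag binary classifier.
    return 1.0 if tag in text else 0.0


# Toy data (assumed for illustration only).
train_tags = [["python", "spark"], ["python", "java"], ["python"], ["java"]]
test_texts = ["python list comprehension", "java generics question"]

tags = top_tags(train_tags, n=2)  # the project uses the top 100
rows = build_rows(tags, keyword_predict, test_texts)

# Write the rows and read them back to confirm the layout.
out_path = os.path.join(tempfile.mkdtemp(), "classification.pkl")
with open(out_path, "wb") as f:
    pickle.dump(rows, f)
with open(out_path, "rb") as f:
    loaded = pickle.load(f)

for tag, predictions in loaded:
    print(tag, predictions)
```

Each row pairs a tag name with a list whose length equals the number of test cases, so the predictions for test case `i` can be read across rows at index `i`.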