3.8 One-hot encoding

Slides

Notes

One-hot encoding converts categorical variables into numerical ones. Each category of a variable becomes its own column, which holds 1 if the row's value belongs to that category and 0 otherwise.
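To make the idea concrete, here is a minimal pure-Python sketch of the encoding described above. The helper `one_hot` and the sample color values are hypothetical and used only for illustration; they are not part of the course code.

```python
def one_hot(values):
    """Return the sorted category names and one 0/1 row per input value."""
    # One column per distinct category, in sorted order (an arbitrary choice).
    categories = sorted(set(values))
    # For each value, put 1 in the column of its category and 0 elsewhere.
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


categories, rows = one_hot(["red", "green", "red", "blue"])
print(categories)  # ['blue', 'green', 'red']
print(rows)        # [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Every row contains exactly one 1, which is where the name "one-hot" comes from.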

Classes, functions, and methods:

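  • df[x].to_dict(orient='records') - convert the columns x of the DataFrame into a list of dictionaries, one dictionary per row.
  • DictVectorizer().fit_transform(x) - Scikit-Learn class for one-hot encoding: it learns the categories from the list of dictionaries x and converts it into a matrix (sparse by default). Numerical variables are left unchanged.
  • DictVectorizer().get_feature_names_out() - return the names of the columns in the resulting matrix. In Scikit-Learn versions before 1.0 this method was called get_feature_names(), which was removed in version 1.2.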

The entire code of this project is available in this Jupyter notebook.

⚠️ The notes are written by the community.
If you see an error here, please create a PR with a fix.
