Personalised Recommendation system using Neural Collaborative Filtering

Step 1: Train NCF Model on Movie Ratings

Objective: Use hard sampling to improve the model.

Interaction Matrix: Created a matrix of users and items (movies).
Labels:
- Positive interaction: labeled as 1.
  - Hard Positive: Rating ≥ 4.
  - Soft Positive: Rating ≥ 3.
- Negative interaction: labeled as 0.
  - Hard Negative: Rating ≤ 1.
  - Soft Negative: 2 ≤ Rating < 3.
Sampling: For each positive interaction, 4 negative interactions were sampled for the same user.
Final Dataset Format:
- index | userid | movieid | interaction | rating.

Outcome: Trained the NCF model on this data.

Step 2: Prepare User, Movie, and Genre Embeddings

User Embedding:
- Movies with positive interactions were selected (interaction = 1).
- Used the movie description metadata and computed embeddings using the Lamini model on Hugging Face.
- User embedding: Calculated as the mean of all positive movie embeddings for that user.
- Reasoning: This dense vector represents the user's preferences, which can then be compared with movie embeddings to predict future likes.
Genre Embedding:
- Movies were grouped by genres (a movie can belong to multiple genres).
- Genre embedding: Calculated as the mean of all movies belonging to that genre.
  - Example: Embedding of "Comedy" = Mean of all comedy movie embeddings.
- User-Genre Similarity: Compared the user embedding with genre embeddings to find genres the user is likely to enjoy.
- Genre-Genre Similarity Matrix: A 17x17 matrix was computed to determine similarities between different genres. This helps in recommending genres related to ones the user already likes.

Step 3: Calculate User Embeddings for All Users and Compute Median Similarity Scores

Repeated the embedding process for all users.
Calculated user-genre similarity for every user and stored results in a nested dictionary (dict of DataFrames).
Median Similarity Score: Computed for each user.
- Genres with similarity scores higher than the median were classified as positive genres for the user.
- Genres with similarity scores lower than the median were classified as negative genres for the user.

Step 4: Identify Actual Positive and Negative Genres for Each User

Actual Positive Genres:
- Intersection of genres where the similarity score is higher than the median and the user has interacted positively with the genre.
- Example: If a user has high similarity with "Action" but low similarity with "Animation," the model would only mark "Action" as a positive genre.
Actual Negative Genres:
- Genres where the similarity score is lower than the median and the user hasn't interacted with them.
Final Dataset:
- Maintained a 1:3 ratio of positive to negative genres.
- userid | movieid | interaction | genre | movie title.

Step 5: Fine-Tune NCF Model on New Data

Fine-tuned the previously trained NCF model on this new dataset.
Final Metrics:
- Precision@k: 0.4
- Recall@k: 0.2-0.3
- NDCG@10: 0.302 (varies by user).
- Genre NDCG@10: 0.84, indicating that most suggested movies belonged to similar or related genres.

Step 6: Frontend Development

Created a frontend for the model using Flask, Spline.js, HTML, and CSS.
Deployed the app as a web service using Azure App Service.
User Interface: Users can hover over movie posters, select them, and view 10 recommended movie posters based on the model's predictions.

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
.github/workflows		.github/workflows
__pycache__		__pycache__
flask_session		flask_session
notebooks_and_tasks		notebooks_and_tasks
scripts		scripts
static/images		static/images
templates		templates
README.md		README.md
app.py		app.py
appspec.yml		appspec.yml
buildspec.yml		buildspec.yml
final_merged_df_for_model.csv		final_merged_df_for_model.csv
get_model_prediction.py		get_model_prediction.py
get_poster.py		get_poster.py
requirements.txt		requirements.txt
txt		txt
wsgi.py		wsgi.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Personalised Recommendation system using Neural Collaborative Filtering

Step 1: Train NCF Model on Movie Ratings

Step 2: Prepare User, Movie, and Genre Embeddings

Step 3: Calculate User Embeddings for All Users and Compute Median Similarity Scores

Step 4: Identify Actual Positive and Negative Genres for Each User

Step 5: Fine-Tune NCF Model on New Data

Step 6: Frontend Development

About

Releases

Packages

Languages

ddcrpf/Neural-Collaborative-Filtering

Folders and files

Latest commit

History

Repository files navigation

Personalised Recommendation system using Neural Collaborative Filtering

Step 1: Train NCF Model on Movie Ratings

Step 2: Prepare User, Movie, and Genre Embeddings

Step 3: Calculate User Embeddings for All Users and Compute Median Similarity Scores

Step 4: Identify Actual Positive and Negative Genres for Each User

Step 5: Fine-Tune NCF Model on New Data

Step 6: Frontend Development

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages