Devubavaria/FLIPKART-GRID-4.0

FLIPKART-GRID 4.0

EXTRACT TRENDS FROM SOCIAL MEDIA

INTRODUCTION

This system was developed by Devanshi Bavaria and Dhruv Shah as part of Flipkart Grid 2022. Its aim is to help fashion retailers identify trend-setting and best-selling products on the Flipkart website.

DESIGN

The system is divided into three main parts: scrapers, models, and mapping.

SCRAPERS

  • Used to scrape data from social media. Currently, the only source scraped is trending tweets from Twitter.
  • All raw scraped data is currently stored on Google Drive, with links to it stored in an Atlas database.
  • The scrapers are designed to run automatically at a fixed interval, so that fresh trending data is always being collected.
  • Implemented in Python using libraries such as selenium and tweepy.
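
The fixed-interval behaviour described above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual scheduler: `run_periodically` and `scrape_trending_tweets` are hypothetical names, and the scrape function is a placeholder where the real selenium or tweepy calls would go.

```python
import time
from typing import Any, Callable, List


def run_periodically(
    job: Callable[[], Any],
    interval_seconds: float,
    max_runs: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Any]:
    """Call `job` `max_runs` times, pausing `interval_seconds` between runs.

    `sleep` is injectable so the loop can be exercised without waiting.
    """
    results: List[Any] = []
    for run in range(max_runs):
        try:
            results.append(job())
        except Exception as exc:
            # One failed scrape should not cancel the later runs;
            # the exception is recorded in place of a result.
            results.append(exc)
        if run < max_runs - 1:
            sleep(interval_seconds)
    return results


def scrape_trending_tweets() -> List[str]:
    """Hypothetical placeholder for the real scraper."""
    return ["#placeholder_trend"]
```

Calling `run_periodically(scrape_trending_tweets, interval_seconds=3600, max_runs=24)` would scrape once an hour for a day. In a deployed setup, a system scheduler such as cron may be a better fit than a long-running loop.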
MODELS

    • Our solution consists of two models. The first is the colour and category detection model, which takes an image as input and determines the type of clothing (e.g. T-shirt, trousers) and its colour.
      • Knowing the type of clothing helps in classifying the clothes. Because classification is based on images, the internal category name that a site gives a product does not interfere with it (e.g. a site may label the category "T-shirts", but the image detection model still places all such items in the 'tshirt' category).
      • Knowing the colour of the clothing helps in preprocessing the image before it is passed to the trendiness detection model. Preprocessing uses OpenCV techniques such as colour detection, image segmentation, and face detection to remove the background and crop the image to just the object of interest.
    • The second model is used to find trending products. It assigns a trendiness score to each product; products with a higher score are more likely to be in trend. The model takes these parameters as input to determine the trendiness score:
      • Preprocessed image
      • Date when the product was released
      • Number of likes/ratings/views
      • Number of sites referring to the product
      • Comments/reviews
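
To make the idea of a trendiness score concrete, here is a small hand-written heuristic that combines the non-image inputs listed above. It is an illustrative assumption, not the trained model linked under MODEL FILES: the `ProductSignals` fields, the 30-day half-life, and the way the terms are combined are all made up for this sketch, and the preprocessed image is left out entirely.

```python
import math
from dataclasses import dataclass
from datetime import date


@dataclass
class ProductSignals:
    """Hypothetical container for the non-image inputs."""
    released: date           # date the product was released
    engagement: int          # likes + ratings + views combined
    referring_sites: int     # number of sites referring to the product
    review_sentiment: float  # assumed summary of comments/reviews in [-1, 1]


def trendiness_score(p: ProductSignals, today: date,
                     half_life_days: float = 30.0) -> float:
    """Return a non-negative score; higher means more likely to be trending."""
    age_days = max((today - p.released).days, 0)
    # Recency halves every `half_life_days` (assumed decay rule).
    recency = 0.5 ** (age_days / half_life_days)
    # Log scaling keeps very large counts from dominating the score.
    popularity = math.log1p(p.engagement)
    reach = math.log1p(p.referring_sites)
    # Map sentiment from [-1, 1] onto a multiplier in [0.5, 1.5].
    sentiment_factor = 1.0 + 0.5 * p.review_sentiment
    return recency * (popularity + reach) * sentiment_factor
```

With this rule, a recently released product outranks an older one that has the same engagement, which matches the intent of using the release date as an input.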

MAPPING

Trending images from the database, together with the extracted keywords and the Flipkart category, sub-category, vertical, product attributes, and search page links, are mapped to Flipkart products, which are displayed as the output.
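
One way the mapping step could turn a detected colour and category into a Flipkart search page link is sketched below. The alias table, the function names, and the `https://www.flipkart.com/search?q=...` URL pattern are assumptions made for illustration; the repository's actual mapping logic may differ.

```python
from urllib.parse import urlencode

# Hypothetical alias table: collapses site-specific labels into the
# single category name the image model would emit.
CATEGORY_ALIASES = {
    "t-shirt": "tshirt",
    "t-shirts": "tshirt",
    "tshirts": "tshirt",
    "trouser": "trousers",
}


def normalise_category(label: str) -> str:
    """Lower-case a category label and apply known aliases."""
    key = label.strip().lower()
    return CATEGORY_ALIASES.get(key, key)


def flipkart_search_link(colour: str, category: str) -> str:
    """Build a search page link from a colour and a category keyword.

    The base URL and `q` parameter are an assumed pattern.
    """
    query = f"{colour.strip().lower()} {normalise_category(category)}"
    return "https://www.flipkart.com/search?" + urlencode({"q": query})
```

For example, `flipkart_search_link("Blue", "T-Shirts")` builds a link whose query is `blue tshirt`, so differently labelled T-shirt categories all resolve to the same search.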

DATASETS

Scraped data (to date): https://drive.google.com/drive/u/1/folders/134GDKE7i0MdBYcIyab3mKH6LdxFl-ci0

Processed scraped data: https://drive.google.com/drive/u/1/folders/134GDKE7i0MdBYcIyab3mKH6LdxFl-ci0

MODEL FILES

Model for predicting trendiness: https://drive.google.com/file/d/1092MDfexBLnap2q9ijxvEG-GmBfw9my2/view?usp=sharing

Model for predicting the colour and type of garment from an image: https://drive.google.com/file/d/1092MDfexBLnap2q9ijxvEG-GmBfw9my2/view?usp=sharing

Supporting binary file: https://drive.google.com/file/d/1-6_5lyHXmDrhEvjJpdQh8jjyXoX-Gvv2/view?usp=sharing