Note: The crawler folder is deprecated. All code has been moved into the website folder. Be careful to configure the paths if you want to rerun the code.
- Text Search Mode
- Faceted search
- Search statistics
- Spell check
- Price intervals, multiple ranking rules, and filter rules
- Image Search Mode
- Combination Search Mode: Text with Image
- Smart Recommendation System
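The text-search features above (full-text matching, a price-interval filter, and facet counts) map naturally onto a single Elasticsearch query body. The sketch below builds such a body; the index and field names (`products`, `title`, `price`, `category`) are illustrative assumptions, not taken from the project code.

```python
# Sketch of an Elasticsearch query body combining full-text search,
# a price-interval filter, and a category facet (terms aggregation).
# Field names are assumptions; see the project's server code for the
# actual mapping.

def build_query(text, min_price=None, max_price=None):
    """Return an Elasticsearch query body for text search with facets."""
    price_range = {}
    if min_price is not None:
        price_range["gte"] = min_price
    if max_price is not None:
        price_range["lte"] = max_price

    return {
        "query": {
            "bool": {
                "must": [{"match": {"title": text}}],
                "filter": (
                    [{"range": {"price": price_range}}] if price_range else []
                ),
            }
        },
        # Facet counts per product category.
        "aggs": {"categories": {"terms": {"field": "category"}}},
    }

body = build_query("mountain bike", min_price=100, max_price=500)
```

Passing the body to the search endpoint returns both the ranked hits and the per-category counts used to render the facet sidebar.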
- Ubuntu 16.04
- Python 2.7
- Pytorch, torchvision
- Elasticsearch

Then install the remaining Python dependencies with pip:
pip install -r pip_list.txt
Use crawler.py to crawl data from the Craigslist website: 35 classes in total (~80,000 products). The crawled files are stored in the website/data/ directory. A tar archive of the raw data is also provided.
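The crawling step boils down to extracting (title, price) pairs from each results page. The standard-library sketch below shows the idea; the class names (`result-title`, `result-price`) mirror Craigslist markup of that era but may have changed, so crawler.py remains the authoritative implementation.

```python
from html.parser import HTMLParser

class ListingParser(HTMLParser):
    """Extract (title, price) pairs from a Craigslist-style results page.
    The CSS class names below are assumptions about the page markup."""

    def __init__(self):
        super().__init__()
        self.listings = []
        self._field = None

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        if "result-title" in cls:
            self._field = "title"
        elif "result-price" in cls:
            self._field = "price"

    def handle_data(self, data):
        if self._field == "title":
            self.listings.append({"title": data.strip(), "price": None})
        elif self._field == "price" and self.listings:
            self.listings[-1]["price"] = data.strip()
        self._field = None

html = '<a class="result-title">Road bike</a><span class="result-price">$120</span>'
parser = ListingParser()
parser.feed(html)
```

Each parsed record would then be serialized as JSON under website/data/.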
Use crawImages_mpi.py to crawl the images and store them locally. The download log is written to downloadImages.log, and the images are stored in website/images/. A filtered JSON file of the remaining products is also generated and stored.
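The image-download step is embarrassingly parallel. The real script distributes work via MPI; the sketch below uses a thread pool instead, and takes the fetch function as a parameter so the success/failure bookkeeping (which drives the filtered JSON) can be seen without network access. Only the log-file name comes from the README; everything else is an assumption.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

# Log file name taken from the README; the MPI distribution of the real
# crawImages_mpi.py is replaced here by a thread pool for illustration.
logging.basicConfig(filename="downloadImages.log", level=logging.INFO)

def download_all(urls, fetch, max_workers=8):
    """Download every URL with `fetch(url) -> bytes`; return the URLs that
    succeeded, so product records whose images failed can be filtered out."""
    ok = []

    def worker(url):
        try:
            data = fetch(url)
            # The real script would write `data` under website/images/ here.
            logging.info("downloaded %s (%d bytes)", url, len(data))
            ok.append(url)
        except Exception as exc:
            logging.warning("failed %s: %s", url, exc)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(worker, urls))
    return ok
```

Products whose URLs are missing from the returned list would be dropped when writing the filtered JSON.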
Use extractFeatureFinal.py to extract the image features. Put the resulting feature file totalRes18feat.txt under website/.
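The file name suggests ResNet-18 features (512 values per image). The loader below assumes a plain-text format of one product per line, an id followed by whitespace-separated feature values; this format is a guess, so check extractFeatureFinal.py for the real layout.

```python
# Assumed format of totalRes18feat.txt: "<product_id> <f1> <f2> ... <fN>"
# per line. This is an illustrative guess, not taken from the project code.

def load_features(lines):
    """Parse feature lines into {product_id: [float, ...]}."""
    feats = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        feats[parts[0]] = [float(x) for x in parts[1:]]
    return feats

sample = ["p001 0.1 0.5 0.25", ""]
feats = load_features(sample)
```

At query time these vectors would be compared (e.g. by cosine similarity) against the feature vector of an uploaded image.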
python insert.py
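Indexing the crawled records is typically done through Elasticsearch's bulk API, which alternates action and source lines. The sketch below shows that shape; the index name `products` and the record fields are illustrative assumptions about what insert.py does.

```python
import json

# Turn crawled product records into newline-delimited bulk-API actions.
# Index name and field names are assumptions; see insert.py for the
# actual indexing logic.

def bulk_actions(records, index="products"):
    """Yield alternating action/source JSON lines for the ES _bulk endpoint."""
    for rec in records:
        yield json.dumps({"index": {"_index": index, "_id": rec["id"]}})
        yield json.dumps({"title": rec["title"], "price": rec["price"]})

lines = list(bulk_actions([{"id": "1", "title": "Lamp", "price": 20}]))
```

Joining the lines with "\n" (plus a trailing newline) gives a body that can be POSTed to the `_bulk` endpoint.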
elasticsearch # remember to add the Elasticsearch bin directory to PATH in ~/.bashrc
python server.py
Then open the site at: localhost:8080/
Ruiyang Ma, Zesen Wang, Zehang Weng, Zitao Zhang