A sample application showcasing a simple e-commerce site built with Vespa. See Use Case - shopping for features and details.
Also included are scripts to convert data from Julian McAuley's Amazon product data set at https://cseweb.ucsd.edu/~jmcauley/datasets.html to a Vespa data feed. This repository contains a small sample of this data from the sports and outdoor category, but you can download other data from the site above and use the scripts to convert.
Requirements:
- Docker Desktop installed and running, with at least 4 GB of memory available to Docker. Refer to Docker memory for details and troubleshooting.
- Alternatively, deploy using Vespa Cloud
- Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
- Architecture: x86_64 or arm64
- Homebrew to install Vespa CLI, or download a Vespa CLI release from GitHub releases.
- Java 17 installed.
- Apache Maven. This sample app uses custom Java components, and Maven is used to build the application.
- python3
- zstd:
brew install zstd
See also Vespa quick start guide.
Validate the environment. The memory available to Docker should be at least 4 GB:
$ docker info | grep "Total Memory"
or
$ podman info | grep "memTotal"
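The same check can be scripted. The sketch below is illustrative only and not part of this sample application: `parse_total_memory_gib`, `docker_memory_ok` and `check_local_docker` are hypothetical helper names, and the code assumes `docker info` prints a line of the form `Total Memory: 7.7GiB`.

```python
import re
import subprocess

MIN_GIB = 4.0  # minimum stated in the requirements

# Units that may follow the number, converted to GiB (assumed output format).
_UNIT_TO_GIB = {"KiB": 1 / 1024**2, "MiB": 1 / 1024, "GiB": 1.0, "TiB": 1024.0}


def parse_total_memory_gib(info_text):
    """Return the 'Total Memory' value found in `docker info` output, in GiB.

    Returns None when no matching line is present.
    """
    match = re.search(r"Total Memory:\s*([\d.]+)\s*(KiB|MiB|GiB|TiB)", info_text)
    if match is None:
        return None
    return float(match.group(1)) * _UNIT_TO_GIB[match.group(2)]


def docker_memory_ok(info_text, minimum_gib=MIN_GIB):
    """True if the reported memory is at least `minimum_gib`."""
    total = parse_total_memory_gib(info_text)
    return total is not None and total >= minimum_gib


def check_local_docker():
    """Run `docker info` (requires a running Docker daemon) and check it."""
    result = subprocess.run(
        ["docker", "info"], capture_output=True, text=True, check=True
    )
    return docker_memory_ok(result.stdout)
```

Calling `check_local_docker()` returns `True` when Docker reports enough memory. The parsing functions can also be used on saved `docker info` output.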
Install Vespa CLI:
$ brew install vespa-cli
For local deployment using Docker image:
$ vespa config set target local
Pull and start the vespa docker container image:
$ docker pull vespaengine/vespa
$ docker run --detach --name vespa --hostname vespa-container \
  --publish 127.0.0.1:8080:8080 --publish 127.0.0.1:19071:19071 \
  vespaengine/vespa
Verify that configuration service (deploy api) is ready:
$ vespa status deploy --wait 300
Download this sample application:
$ vespa clone use-case-shopping myapp && cd myapp
Build the application package:
$ mvn clean package -U
Deploy the application package:
$ vespa deploy --wait 300
It is possible to deploy this app to Vespa Cloud.
Run Vespa System Tests - this runs a set of basic tests to verify that the application is working as expected:
$ vespa test src/test/application/tests/system-test/product-search-test.json
First, create the data feed for products:
$ curl -L -o meta_sports_20k_sample.json.zst https://data.vespa-cloud.com/sample-apps-data/meta_sports_20k_sample.json.zst
$ zstdcat meta_sports_20k_sample.json.zst | ./convert_meta.py > feed_items.json
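To give an idea of what such a conversion produces, here is a minimal sketch of wrapping a product record in a Vespa `put` operation. It is not the repository's `convert_meta.py`. The input layout (one JSON object per line with `asin`, `title` and `price` keys), the `item` namespace and document type, and the helper names `to_vespa_put` and `convert_lines` are all assumptions made for illustration. Check the application's schema for the real field names.

```python
import json


def to_vespa_put(record, namespace="item", doctype="item"):
    """Wrap one product record in a Vespa 'put' feed operation.

    Namespace, document type and field names are illustrative assumptions.
    """
    doc_id = f"id:{namespace}:{doctype}::{record['asin']}"
    fields = {"asin": record["asin"], "title": record.get("title", "")}
    if "price" in record:
        fields["price"] = record["price"]
    return {"put": doc_id, "fields": fields}


def convert_lines(lines):
    """Yield one put operation for each non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield to_vespa_put(json.loads(line))
```

Each operation could then be written with `json.dumps` as one JSON object per line, which is a format `vespa feed` accepts.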
Next, create the data feed for reviews:
$ curl -L -o reviews_sports_24k_sample.json.zst https://data.vespa-cloud.com/sample-apps-data/reviews_sports_24k_sample.json.zst
$ zstdcat reviews_sports_24k_sample.json.zst | ./convert_reviews.py > feed_reviews.json
Next, create the data feed for query suggestions:
$ pip3 install spacy mmh3
$ python3 -m spacy download en_core_web_sm
$ ./create_suggestions.py feed_items.json > feed_suggestions.json
Feed products data:
$ vespa feed feed_items.json
Feed reviews data:
$ vespa feed feed_reviews.json
Feed query suggestions data:
$ vespa feed feed_suggestions.json
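Under the hood, feed operations go through Vespa's `/document/v1` HTTP API, where each document ID maps to a URL path. The sketch below shows that mapping for simple IDs of the form `id:<namespace>:<doctype>::<user-id>`. The function name `document_v1_path` and the example ID in the usage note are hypothetical, and IDs that carry `n=` or `g=` key-values are deliberately not handled.

```python
import urllib.parse


def document_v1_path(document_id):
    """Map 'id:<namespace>:<doctype>::<user-id>' to its /document/v1 path.

    Raises ValueError for anything that is not a simple document ID.
    """
    # Split into at most five parts so colons inside the user-id survive.
    scheme, namespace, doctype, key_values, user_id = document_id.split(":", 4)
    if scheme != "id" or key_values:
        raise ValueError(f"unsupported document ID: {document_id}")
    # Percent-encode the user-specified part so it is safe inside a URL path.
    quoted = urllib.parse.quote(user_id, safe="")
    return f"/document/v1/{namespace}/{doctype}/docid/{quoted}"
```

For example, `document_v1_path("id:item:item::B000TEST")` yields `/document/v1/item/item/docid/B000TEST`. With the local container, that path is served on port 8080.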
Test the application:
$ vespa query "query=golf"
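The same query can be sent from a script through the `/search/` HTTP endpoint. This is a minimal sketch using only the Python standard library. The helper names `build_search_url`, `hit_fields` and `search` are hypothetical, and the default endpoint assumes the local container started earlier with port 8080 published.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(params, endpoint="http://localhost:8080"):
    """Build a /search/ URL from query parameters such as {'query': 'golf'}."""
    return f"{endpoint}/search/?{urllib.parse.urlencode(params)}"


def hit_fields(result):
    """Return the 'fields' dict of every hit in a Vespa query result."""
    children = result.get("root", {}).get("children", [])
    return [child.get("fields", {}) for child in children]


def search(params, endpoint="http://localhost:8080"):
    """Run the query against a running Vespa instance and decode the JSON."""
    url = build_search_url(params, endpoint)
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)
```

With the application deployed and fed, `hit_fields(search({"query": "golf"}))` returns the fields of the matching documents.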
Browse the site: http://localhost:8080/site
Shut down and remove the container:
$ docker rm -f vespa
Instead of using vespa feed, you can use Logstash to feed items and reviews. This way:
- You can more easily adapt this sample application to your own data, for example by making Logstash read from different files or other sources. Logstash is an excellent ETL tool.
- You don't need to convert the reviews to Vespa documents via ./convert_reviews.py.
- You don't need to convert the items to Vespa documents via ./convert_meta.py in order to feed them to Vespa. However, this conversion is still needed for suggestions, as ./create_suggestions.py depends on feed_items.json.
You'll need to install Logstash. Then:
- Install Logstash Output Plugin for Vespa via:
bin/logstash-plugin install logstash-output-vespa_feed
- Change logstash.conf to point to the absolute paths of meta_sports_20k_sample.json and reviews_sports_24k_sample.json. These files still need to be downloaded (as above) and uncompressed:
$ curl -L -o meta_sports_20k_sample.json.zst https://data.vespa-cloud.com/sample-apps-data/meta_sports_20k_sample.json.zst
$ unzstd meta_sports_20k_sample.json.zst
$ curl -L -o reviews_sports_24k_sample.json.zst https://data.vespa-cloud.com/sample-apps-data/reviews_sports_24k_sample.json.zst
$ unzstd reviews_sports_24k_sample.json.zst
- Run Logstash with the modified logstash.conf:
bin/logstash -f $PATH_TO_LOGSTASH_CONF/logstash.conf