Vespa sample applications - e-commerce

A sample application showcasing a simple e-commerce site built with Vespa. See Use Case - shopping for features and details.

Also included are scripts to convert data from Julian McAuley's Amazon product data set at https://cseweb.ucsd.edu/~jmcauley/datasets.html to a Vespa data feed. This repository contains a small sample of this data from the sports and outdoor category, but you can download other data from the site above and use the scripts to convert it.

Requirements:

  • Docker Desktop installed and running, with at least 4 GB of memory available to Docker. Refer to Docker memory for details and troubleshooting.
  • Alternatively, deploy using Vespa Cloud
  • Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
  • Architecture: x86_64 or arm64
  • Homebrew to install Vespa CLI, or download a Vespa CLI release from GitHub releases.
  • Java 17 installed.
  • Apache Maven. This sample app uses custom Java components, and Maven is used to build the application.
  • python3
  • zstd: brew install zstd

See also Vespa quick start guide.

Validate the environment. Docker should have at least 4 GB of memory available:

$ docker info | grep "Total Memory"
or
$ podman info | grep "memTotal"
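The same check can be scripted. The following is a minimal sketch, assuming `docker info` prints a line of the form `Total Memory: 7.667GiB`. The helper names `parse_total_memory_gib` and `docker_has_enough_memory` are illustrative and not part of the sample app.

```python
import re
import subprocess

# Conversion factors to GiB for the unit suffixes this sketch recognizes.
_UNITS_IN_GIB = {"KiB": 1 / 1024**2, "MiB": 1 / 1024, "GiB": 1.0, "TiB": 1024.0}


def parse_total_memory_gib(docker_info_output: str) -> float:
    """Extract the 'Total Memory' figure from `docker info` output, in GiB.

    Assumes a line such as ' Total Memory: 7.667GiB' is present.
    """
    match = re.search(r"Total Memory:\s*([\d.]+)\s*(KiB|MiB|GiB|TiB)", docker_info_output)
    if match is None:
        raise ValueError("no 'Total Memory' line found in docker info output")
    value, unit = match.groups()
    return float(value) * _UNITS_IN_GIB[unit]


def docker_has_enough_memory(minimum_gib: float = 4.0) -> bool:
    """Run `docker info` and compare the reported memory with the minimum."""
    output = subprocess.run(
        ["docker", "info"], capture_output=True, text=True, check=True
    ).stdout
    return parse_total_memory_gib(output) >= minimum_gib
```

Calling `docker_has_enough_memory()` requires a running Docker daemon. The parsing function can be used on its own with saved output.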

Install Vespa CLI:

$ brew install vespa-cli

For local deployment using the Docker image:

$ vespa config set target local

Pull and start the Vespa Docker container image:

$ docker pull vespaengine/vespa
$ docker run --detach --name vespa --hostname vespa-container \
  --publish 127.0.0.1:8080:8080 --publish 127.0.0.1:19071:19071 \
  vespaengine/vespa

Verify that the configuration service (deploy API) is ready:

$ vespa status deploy --wait 300

Download this sample application:

$ vespa clone use-case-shopping myapp && cd myapp

Build the application package:

$ mvn clean package -U

Deploy the application package:

$ vespa deploy --wait 300

Deployment note

It is possible to deploy this app to Vespa Cloud.

Run Vespa System Tests - this runs a set of basic tests to verify that the application is working as expected:

$ vespa test src/test/application/tests/system-test/product-search-test.json

First, create the data feed for products:

$ curl -L -o meta_sports_20k_sample.json.zst https://data.vespa-cloud.com/sample-apps-data/meta_sports_20k_sample.json.zst
$ zstdcat meta_sports_20k_sample.json.zst | ./convert_meta.py > feed_items.json

Next, create the data feed for reviews:

$ curl -L -o reviews_sports_24k_sample.json.zst https://data.vespa-cloud.com/sample-apps-data/reviews_sports_24k_sample.json.zst
$ zstdcat reviews_sports_24k_sample.json.zst | ./convert_reviews.py > feed_reviews.json
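The conversion scripts are not reproduced here. As a rough illustration of what such a step produces, the sketch below turns one JSON object per input line into Vespa's JSON feed format, where each operation carries a `put` document id and a `fields` object. The namespace and document type (`item`), the input keys (`asin`, `title`, `price`), and the function names are assumptions made for this example. It also assumes each input line is strict JSON. The real `convert_meta.py` may parse, select, and transform fields differently.

```python
import json


def to_feed_operation(record: dict) -> dict:
    """Build one Vespa put operation from a product record.

    Vespa document ids have the form id:<namespace>:<document-type>::<local-id>.
    Using "item" for both namespace and document type is an assumption here.
    """
    # Copy only a few illustrative keys; the real script may use others.
    fields = {key: record[key] for key in ("asin", "title", "price") if key in record}
    return {"put": f"id:item:item::{record['asin']}", "fields": fields}


def convert(lines):
    """Yield one feed operation per non-empty input line."""
    for line in lines:
        line = line.strip()
        if line:
            yield to_feed_operation(json.loads(line))


def main(stream):
    """Print operations as JSON Lines, one operation per line."""
    for operation in convert(stream):
        print(json.dumps(operation))
```

To use it as a filter, call `main(sys.stdin)` after `import sys` and redirect the output to a file.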

Next, create the data feed for query suggestions:

$ pip3 install spacy mmh3
$ python3 -m spacy download en_core_web_sm
$ ./create_suggestions.py feed_items.json > feed_suggestions.json

Feed products data:

$ vespa feed feed_items.json

Feed reviews data:

$ vespa feed feed_reviews.json

Feed query suggestions data:

$ vespa feed feed_suggestions.json
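For feeding from your own code rather than the CLI, Vespa also exposes the `/document/v1` HTTP API. The sketch below maps a plain document id to its `/document/v1` path and posts a single document. It assumes the local container from the steps above on port 8080 and ids without `n=` or `g=` modifiers. The helper names and the example id are illustrative only.

```python
import json
import urllib.parse
import urllib.request


def document_v1_path(document_id: str) -> str:
    """Map id:<namespace>:<type>::<local-id> to its /document/v1 path.

    Only plain ids are handled; ids with n= or g= modifiers are rejected.
    """
    prefix, namespace, doc_type, modifiers, local_id = document_id.split(":", 4)
    if prefix != "id" or modifiers:
        raise ValueError(f"unsupported document id: {document_id}")
    encoded_id = urllib.parse.quote(local_id, safe="")
    return f"/document/v1/{namespace}/{doc_type}/docid/{encoded_id}"


def put_document(operation: dict, endpoint: str = "http://localhost:8080") -> dict:
    """POST one put operation ({"put": id, "fields": {...}}) to Vespa."""
    request = urllib.request.Request(
        endpoint + document_v1_path(operation["put"]),
        data=json.dumps({"fields": operation["fields"]}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

For bulk loads, `vespa feed` remains the simpler choice. This sketch sends one request per document and does no retrying.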

Test the application:

$ vespa query "query=golf"
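The same query can be sent to the container's HTTP query API directly. A minimal sketch, assuming the local container from the steps above is listening on port 8080; the helper names are illustrative:

```python
import json
import urllib.parse
import urllib.request


def build_query_url(query: str, endpoint: str = "http://localhost:8080") -> str:
    """Build a URL for Vespa's /search/ query API with a simple query string."""
    return f"{endpoint}/search/?{urllib.parse.urlencode({'query': query})}"


def search(query: str) -> dict:
    """Run the query and return the parsed JSON response."""
    with urllib.request.urlopen(build_query_url(query), timeout=10) as response:
        return json.load(response)


def print_hits(result: dict) -> None:
    """Print the total hit count and the id and relevance of each returned hit."""
    root = result.get("root", {})
    print("total hits:", root.get("fields", {}).get("totalCount"))
    for hit in root.get("children", []):
        print(hit.get("id"), hit.get("relevance"))
```

With the application deployed and fed, `print_hits(search("golf"))` lists the hits returned for the query.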

Browse the site: http://localhost:8080/site

Shut down and remove the container:

$ docker rm -f vespa

Using Logstash to feed items and reviews

Instead of using vespa feed, you can use Logstash to feed items and reviews. This way:

  • You can more easily adapt this sample application to your own data, for example by making Logstash read from different files or other sources. Logstash is a general-purpose ETL tool, so it suits this kind of change.
  • You don't need to convert the reviews to Vespa documents via ./convert_reviews.py.
  • You don't need to convert the items to Vespa documents via ./convert_meta.py in order to feed them to Vespa. However, this is still needed for suggestions, as ./create_suggestions.py depends on feed_items.json.

You'll need to install Logstash. Then:

  1. Install the Logstash Output Plugin for Vespa:
bin/logstash-plugin install logstash-output-vespa_feed
  2. Change logstash.conf to point to the absolute paths of meta_sports_20k_sample.json and reviews_sports_24k_sample.json. These files still need to be downloaded and uncompressed, as mentioned above:
$ curl -L -o meta_sports_20k_sample.json.zst https://data.vespa-cloud.com/sample-apps-data/meta_sports_20k_sample.json.zst
$ unzstd meta_sports_20k_sample.json.zst
$ curl -L -o reviews_sports_24k_sample.json.zst https://data.vespa-cloud.com/sample-apps-data/reviews_sports_24k_sample.json.zst
$ unzstd reviews_sports_24k_sample.json.zst
  3. Run Logstash with the modified logstash.conf:
bin/logstash -f $PATH_TO_LOGSTASH_CONF/logstash.conf