- Overview
- Prerequisites
- Deployment Steps
- Deployment Validation
- Running the Guidance
- Next Steps
- Cleanup
With mobile commerce leading retail growth, seamless in-app visual search unlocks frictionless purchase experiences. Visual search has evolved from a novelty to a business necessity. While technically complex to build at scale, visual search drives measurable gains in engagement, conversion, and revenue when implemented successfully. As consumer expectations and behaviors shift toward more visual and intuitive shopping, brands need robust visual search to deliver next-generation shopping experiences. With visual search powering shopping across channels, brands can offer consumers flexibility and convenience while capturing valuable data on emerging visual trends and consumer preferences.

Visual search allows consumers to take or upload an image to search for visually similar images and products. This enables more intuitive and seamless product discovery, allowing consumers to find products they see around them, or even user-generated image content that matches their particular style and tastes.

Developing accurate and scalable visual search is a complex technical challenge, one that demands considerable investment in technology infrastructure and data management. However, recent advancements in generative AI and multimodal models are enabling exciting new possibilities in visual search.
This repo contains code that creates a visual search solution using services such as Amazon Bedrock, Amazon OpenSearch Serverless, and AWS Lambda.
The solution is an implementation of semantic search based on product images. To enable search by product images, we first need to create a vector store index of multimodal embeddings built from the image and description of every product in the catalog. When you search with an image, the image is run through Anthropic Claude 3 Sonnet to generate a caption, and then both the input image and the generated caption are used to create a multimodal embedding. This multimodal embedding is used to query the vector store index, which returns the requested number of semantic search results based on similarity scores.
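For illustration, such a k-NN index might be defined as follows, assuming an opensearch-py client and Titan Multimodal Embeddings' default 1,024-dimension output; the index and field names here are hypothetical, not necessarily the ones this repo actually creates.

```python
# Hypothetical k-NN index mapping for the multimodal embeddings;
# `client` is an opensearch-py OpenSearch client for the collection.
index_body = {
    "settings": {"index.knn": True},
    "mappings": {
        "properties": {
            "vector_field": {"type": "knn_vector", "dimension": 1024},  # Titan default length
            "product_id": {"type": "keyword"},
            "description": {"type": "text"},
        }
    },
}
client.indices.create(index="products-index", body=index_body)
```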
1. A time-based Amazon EventBridge scheduler invokes an AWS Lambda function to populate the search index with multimodal embeddings and product metadata.
2. The AWS Lambda function first retrieves the product feed stored as a JSON file in Amazon Simple Storage Service (Amazon S3).
3. The Lambda function then invokes the Amazon Titan Multimodal Embeddings model on Amazon Bedrock to create vector embeddings for each product in the catalog, based on the primary image and description of the product.
4. The Lambda function finally persists these vector embeddings as k-NN vectors, along with product metadata, in the vector store (for example, Amazon OpenSearch Serverless, Amazon DocumentDB, or Amazon Aurora). This index is used as the source for semantic image search; see the ingestion sketch after this list.
5. The user initiates a visual search request through the frontend application by uploading a product image.
6. The application uses an Amazon API Gateway REST API to invoke a pre-configured proxy Lambda function that processes the visual search request.
7. The Lambda function first generates a caption for the input image using the Anthropic Claude 3 Sonnet model hosted on Amazon Bedrock. This step is optional: creating the multimodal embedding from both the input image and its caption can improve search results.
8. The Lambda function then invokes the Amazon Titan Multimodal Embeddings model hosted on Amazon Bedrock to generate a multimodal embedding from the input image uploaded by the user and the image caption (if generated in step 7).
9. The Lambda function then performs a k-NN search on the vector store index to find semantically similar results for the embedding generated in step 8; see the search sketch after this list.
10. The semantic search results returned by the vector store are filtered to eliminate duplicates, enriched with product metadata from the search index, and passed back to API Gateway.
11. Finally, the API Gateway response is returned to the client to display the search results.
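To make the ingestion path (steps 2-4) concrete, here is a minimal sketch using boto3 and opensearch-py. It assumes the Titan Multimodal Embeddings model ID `amazon.titan-embed-image-v1`, an OpenSearch Serverless collection, and an index named `products-index` with a k-NN field `vector_field`; the bucket names, keys, and field names are illustrative, not the exact identifiers used by this repo's Lambda code.

```python
import base64
import json

import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

REGION = "us-east-1"
bedrock = boto3.client("bedrock-runtime", region_name=REGION)

# Sign requests to the OpenSearch Serverless collection with SigV4 ("aoss").
auth = AWSV4SignerAuth(boto3.Session().get_credentials(), REGION, "aoss")
client = OpenSearch(
    hosts=[{"host": "<collection-id>.us-east-1.aoss.amazonaws.com", "port": 443}],
    http_auth=auth,
    use_ssl=True,
    connection_class=RequestsHttpConnection,
)

def embed(image_bytes, text):
    """Create a multimodal embedding from a product image and its description."""
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-image-v1",
        body=json.dumps({
            "inputImage": base64.b64encode(image_bytes).decode("utf-8"),
            "inputText": text,
        }),
    )
    return json.loads(response["body"].read())["embedding"]

# Read the product feed (step 2), embed each product (step 3), index it (step 4).
s3 = boto3.client("s3")
feed = json.loads(s3.get_object(Bucket="<product-bucket>", Key="product.json")["Body"].read())
for product in feed:
    image = s3.get_object(Bucket="<image-bucket>", Key=product["image_key"])["Body"].read()
    client.index(
        index="products-index",
        body={
            "vector_field": embed(image, product["description"]),
            "product_id": product["id"],
            "description": product["description"],
        },
    )
```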
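And a sketch of the search path (steps 7-9), reusing the `bedrock` and OpenSearch clients from the ingestion sketch above. The captioning prompt and `k` value are illustrative choices, not necessarily what the deployed Lambda uses.

```python
import json

def caption(image_b64):
    """Step 7: generate a caption for the input image with Claude 3 Sonnet."""
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 200,
            "messages": [{
                "role": "user",
                "content": [
                    {"type": "image", "source": {"type": "base64",
                     "media_type": "image/jpeg", "data": image_b64}},
                    {"type": "text",
                     "text": "Describe the product in this image in one sentence."},
                ],
            }],
        }),
    )
    return json.loads(response["body"].read())["content"][0]["text"]

def embed_query(image_b64, text):
    """Step 8: multimodal embedding from the input image plus its caption."""
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-image-v1",
        body=json.dumps({"inputImage": image_b64, "inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

def search(client, image_b64, k=5):
    """Step 9: k-NN query against the vector store index."""
    embedding = embed_query(image_b64, caption(image_b64))
    results = client.search(
        index="products-index",
        body={"size": k,
              "query": {"knn": {"vector_field": {"vector": embedding, "k": k}}}},
    )
    return [hit["_source"] for hit in results["hits"]["hits"]]
```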
You are responsible for the cost of the AWS services used while running this Guidance. As of June 2024, the cost for running this Guidance with the default settings in the US East (N. Virginia) AWS Region is approximately $412.43 per month for processing 100,000 image searches.
We recommend creating a Budget through AWS Cost Explorer to help manage costs. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this Guidance.
The following table provides a sample cost breakdown for deploying this Guidance with the default parameters in the US East (N. Virginia) Region for one month.
AWS service | Dimensions | Cost [USD] |
---|---|---|
Amazon API Gateway | 100,000 REST API calls per month | $0.35 |
AWS Lambda | 100,000 invocations per month | $0.68 |
Amazon Bedrock Titan Multimodal Embeddings | 100,000 input images with corresponding text descriptions | $166.00 |
Amazon Bedrock Anthropic Claude 3 Sonnet | 100,000 input images per month | $4,677.72 |
Amazon OpenSearch Serverless | 2 OCUs (indexing, search, and query) and 1 GB storage | $350.42 |
These deployment instructions are optimized to work best on a pre-configured Amazon Linux 2023 AWS Cloud9 development environment. Refer to Individual user setup for AWS Cloud9 for more information on how to set up Cloud9 as a user in the AWS account. Deployment using another OS may require additional steps and configured Python libraries (see Third-party tools).
Before deploying the guidance code, ensure that the following required tools have been installed:
- AWS Cloud Development Kit (CDK) >= 2.126.0
- Python >= 3.8
- Amazon Bedrock model access for Anthropic Claude 3 Sonnet and Amazon Titan Multimodal Embeddings
This Guidance uses AWS CDK. If you are using aws-cdk for the first time, see the Bootstrapping section of the AWS Cloud Development Kit (AWS CDK) v2 developer guide to provision the required resources before deploying AWS CDK apps into an AWS environment.
- In the Cloud9 IDE, use the terminal to clone the repository:
git clone https://github.com/aws-solutions-library-samples/guidance-for-simple-visual-search-on-aws
- Change to the repository root folder:
cd guidance-for-simple-visual-search-on-aws
- Initialize the Python virtual environment:
python3 -m venv .venv
- Activate the virtual environment:
source .venv/bin/activate
- Install the necessary Python libraries in the virtual environment:
python -m pip install -r requirements.txt
- Install the necessary Node.js libraries with your preferred package manager. For example, with npm:
npm install
- Verify that the CDK deployment correctly synthesizes the CloudFormation template:
cdk synth
- Deploy the guidance:
cdk deploy
To verify a successful deployment of this guidance, open the CloudFormation console and verify that the status of the stack named `VisualSearchStack` is `CREATE_COMPLETE`.
- Open the AWS Console and go to Lambda.
- Select the checkbox next to the function prefixed with `VisualSearchStack-VisualSearchProductIngestionLamb`.
- Select "Actions".
- Select "Test".
- For the test event, select "Test" to run the Lambda function.
- This ingests the product data into Amazon OpenSearch Serverless by downloading `product.json` from the S3 bucket and product images from Berkeley's S3 bucket `s3://amazon-berkeley-objects`. It also copies the product images to the local S3 bucket.
- Open the API Gateway `prod` stage URL, which looks like `https://xxxxx.execute-api.<region>.amazonaws.com/prod`.
  - You can get the API Gateway URL from the `Outputs` of the CDK execution.
  - Alternatively, go to API Gateway in the AWS Console, select the VisualSearchAPIGateway API, select `Stages` from the left navigation sidebar, select the `prod` stage, and copy the value of `Invoke URL`.
- This opens a sample UI that can be used for visual search.
- Select one of the given images as input.
- Provide the API key in the API Key text box.
  - To get the API key, go to API Gateway in the AWS Console, select `API Keys` from the left navigation sidebar, find the Visual Search API key, and copy its value.
- Click on "Find visually similar products". The search results will be shown.
- Open the AWS Console and go to API Gateway.
- Go to the Visual Search API.
- Find the `prod` stage URL and issue `POST https://xxxxx.execute-api.<region>.amazonaws.com/prod/products/search`, passing a JSON body in the format `{"content": "<base64 encoded image>"}`. A minimal example request follows.
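For instance, with the Python requests library (the URL, image file name, and key value below are placeholders; API Gateway passes API keys in the `x-api-key` header):

```python
import base64
import json

import requests

URL = "https://xxxxx.execute-api.<region>.amazonaws.com/prod/products/search"  # your Invoke URL
API_KEY = "<your API key>"  # from API Gateway's API Keys page

# Base64-encode the query image and POST it in the documented request format.
with open("query-image.jpg", "rb") as f:
    payload = {"content": base64.b64encode(f.read()).decode("utf-8")}

response = requests.post(URL, json=payload, headers={"x-api-key": API_KEY})
response.raise_for_status()
print(json.dumps(response.json(), indent=2))
```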
Several improvements can be made to make this code production-ready.
- Use opensearch-py's bulk load capabilities when inserting data into the vector store for better performance (see the sketch after this list).
- Use Amazon Bedrock's batch inference API during product ingestion.
- Load multiple images of a product.
- Filter the search results from OpenSearch to remove duplicates.
- Deploy the OpenSearch and Lambda in a VPC.
- Consider using the Amazon Bedrock Provisioned Throughput pricing model if more capacity is needed.
- Move product ingestion code to ECS/EKS if the data set is large.
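As a sketch of the first suggestion, opensearch-py's `helpers.bulk` can replace the per-document `index()` calls shown in the ingestion sketch; the index and field names below follow the hypothetical ones used earlier, and `client`, `products`, and `embeddings` are assumed to be defined as in that sketch.

```python
from opensearchpy import helpers

def product_actions(products, embeddings):
    """Yield one bulk action per product instead of one index() call each."""
    for product, embedding in zip(products, embeddings):
        yield {
            "_index": "products-index",
            "_source": {
                "vector_field": embedding,
                "product_id": product["id"],
                "description": product["description"],
            },
        }

# Sends the actions in batches; returns the success count and any errors.
succeeded, errors = helpers.bulk(client, product_actions(products, embeddings))
```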
To delete the deployed resources, use the AWS CDK CLI to run the following steps:
- Using the Cloud9 terminal window, change to the root of the cloned repository:
cd guidance-for-simple-visual-search-on-aws
- Run the command to delete the CloudFormation stack:
cdk destroy
Rajesh Sripathi
Benedict Nartey-Tokoli
Dantis Stephen