Style-Rank

Style-Rank is a unified benchmarking framework for generative styling models in PyTorch. This repository wraps the implementations of several papers in the field of generative image stylization and implements metrics to evaluate the quality of the generated images. We also provide Style-Rank, an evaluation dataset for comparing the models.

This work was developed by Eyal Benaroche, Clément Chadebec, Onur Tasar, and Benjamin Aubin from Jasper Research and Ecole Polytechnique.


Models

| Model | Arxiv | Code | Project Page | Notes |
|---|---|---|---|---|
| StyleAligned | Arxiv | Code | Project Page | |
| VisualStyle | Arxiv | Code | Project Page | |
| IP-Adapter | Arxiv | Code | Project Page | Using the implementation from Diffusers |
| InstantStyle | Arxiv | Code | Project Page | Using the implementation from Diffusers |
| CSGO | Arxiv | Code | Project Page | |
| Style-Shot | Arxiv | Code | Project Page | |

Metrics

We implemented several common metrics to evaluate the quality of the generated images:

  • CLIP-Text metric: cosine similarity between a caption (embedded using CLIPTextModel) and the generated image (embedded using CLIPVisionModel) - using the implementation from Transformers
  • CLIP-Image metric: cosine similarity between two images (embedded using CLIPVisionModel) - using the implementation from Transformers
  • Dino: cosine similarity between two images (embedded using Dinov2Model) - using the implementation from Transformers
  • ImageReward: score from the ImageReward model
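
To make the image-to-image metrics concrete, here is a minimal sketch of a CLIP-based image similarity using the Transformers classes mentioned above; the checkpoint name is an assumption and the repository's own metric classes should be preferred in practice:

import torch
from PIL import Image
from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection

# Hypothetical checkpoint; the repository may use a different CLIP variant.
model = CLIPVisionModelWithProjection.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-large-patch14")

def clip_image_score(image_a: Image.Image, image_b: Image.Image) -> float:
    # Embed both images with the CLIP vision tower and compare the projections.
    inputs = processor(images=[image_a, image_b], return_tensors="pt")
    with torch.no_grad():
        embeds = model(**inputs).image_embeds
    return torch.nn.functional.cosine_similarity(embeds[0:1], embeds[1:2]).item()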

Dataset

The dataset is an aggregation of images from multiple styling papers.

Setup

To get up and running, first create a virtual environment with at least Python 3.10 and activate it.

With venv

python3.10 -m venv envs/style_rank
source envs/style_rank/bin/activate

With conda

conda create -n style_rank python=3.10
conda activate style_rank 

Install the dependencies

Then install the required dependencies (if on GPU) and the repository in editable mode:

pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

Usage

Using the provided code, you can generate stylized images on the provided dataset (or your own, given the right format) and evaluate them using the provided metrics. Results can fluctuate, as the generation is not seeded and the default prompts are sampled from a list of prompts.

Dataset

The dataset is formatted to be used with WebDataset.

You can download it locally:

wget -O data/stylerank_papers.tar "https://huggingface.co/datasets/jasperai/style-rank/resolve/main/stylerank_papers.tar"

Or you can stream it from Hugging Face with webdataset:

import webdataset as wds

url = f"pipe:curl -s -L https://huggingface.co/datasets/jasperai/style-rank/resolve/main/stylerank_papers.tar"
dataset = wds.WebDataset(url).decode('pil')
sample = next(iter(dataset))
sample["jpg"].show()

The dataset contains license, source, url, caption_blip, caption_cogvlm, style_caption and style_captionner metadata located as follows:

sample = {
    '__key__': image_key,
    'jpg': image_data,
    'json': {
        'license': image_license,
        'source': image_source,
        'url': original_dataset_url,
        'caption_blip': blip2_caption,
        'caption_cogvlm': cogvlm_caption,
        'style_caption': style_caption,
        'style_captionner': style_captioner,
    }
}
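
For example, this metadata can be read directly from the streamed dataset; below is a small sketch assuming the json entry decodes to a Python dict as shown above:

import webdataset as wds

url = "pipe:curl -s -L https://huggingface.co/datasets/jasperai/style-rank/resolve/main/stylerank_papers.tar"
dataset = wds.WebDataset(url).decode("pil")

# Print the key, license and style caption of the first few reference images.
for i, sample in enumerate(dataset):
    meta = sample["json"]
    print(sample["__key__"], meta["license"], meta["style_caption"])
    if i >= 2:
        break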

Inference

To generate images using one of the provided models, you can use the scripts provided in the examples/inference folder. For example, to generate images using the StyleAligned model, you can use the following command:

python examples/inference/stylealigned.py [--input-path /path/to/dataset] [--output-path /path/to/output]

Default output path is output/inference/ and the default input path is data/stylerank_papers.tar.

Additionally, you can provide the --json_path argument to use a different json file for the prompts, or use the --prompts argument to provide a list of prompts to use for the generation.

The script iterates through the provided .tar file and generates 4 random images per reference, based on the prompts provided in the prompts.json file, following an evaluation process similar to the one described in the VisualStyle paper.
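
As a rough illustration of that prompt-selection step, the snippet below samples 4 prompts from a prompt list; the exact structure of data/prompts.json is an assumption here, so check the file before relying on it:

import json
import random

# Assumed (hypothetical) structure: a flat list of prompt strings.
with open("data/prompts.json") as f:
    prompts = json.load(f)

selected = random.sample(prompts, k=4)  # 4 prompts, one generated image each
print(selected)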

Folder structure

The folder structure should be as follows:

.
├── README.md
├── data
│   ├── stylerank_papers.tar
│   └── prompts.json
├── examples
│   ├── inference
│   └── report
├── output
│   ├── inference
│   └── metrics
├── requirements.txt
├── setup.py
├── src
│   └── stylerank
└── tests
    ├── reference_images
    ├── test_metrics
    └── test_model

When running an inference script, the model will by default create a folder with its name to store the generated samples, with a sub-folder for each reference image (named after its key) containing the reference and the images generated from the prompts. The folder structure inside the ./output/ folder should look like this:

.
├── inference
│   ├── instant_style
│   │   ├── 0000
│   │   │   ├── prompt_1.png
│   │   │   ├── prompt_2.png
│   │   │   ├── prompt_3.png
│   │   │   ├── prompt_4.png
│   │   │   └── reference.png
│   │   ├── 0001
.   .   .   ....
│   │   └── 0111
│   ├── ip_adapter
│   │   ├── 0000
│   │   ├── 0001
.   .   .   ....
│   │   └── 0111
│   ├── stylealigned
.   .   └── ....
│   └── visualstyle
│       └── ....
└── metrics
    ├── interrupted.csv
    ├── report.csv
    └── metrics.csv

Reports

Given the generated images, you can evaluate the results using the provided metrics. For example, to evaluate the generated images using the CLIP-Text metric, you can use the following command:

python examples/report/metrics.py --metrics ClipText [--input-path /path/to/dataset] [--output-path /path/to/output]

You can run multiple metrics at once by providing a list of metrics to the --metrics argument, e.g.:

python examples/report/metrics.py --metrics "[ClipText, ClipImage, Dinov2, ImageReward]" [--input-path /path/to/dataset] [--output-path /path/to/output]

It will output the results in the /path/to/output/metrics.csv file and the mean for each metric in the /path/to/output/report.csv file.
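
The per-metric means in report.csv can also be recomputed by hand from metrics.csv; this is a minimal sketch assuming one numeric column per metric (check the generated files for the actual column names and paths):

import pandas as pd

# Hypothetical default path; adjust to your --output-path.
df = pd.read_csv("output/metrics/metrics.csv")
print(df.mean(numeric_only=True))  # one mean per metric column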

If you cancel the process, it will automatically save the results in the /path/to/output/interrupted.csv file.

Results

Running the evaluation on the provided stylerank_papers.tar dataset, we get the following results:

| Model | ImageReward ↑ | Clip-Text ↑ | Clip-Image ↑ | Dinov2 ↑ |
|---|---|---|---|---|
| StyleAligned | -1.26 | 19.26 | 68.72 | 36.29 |
| VisualStyle | -0.72 | 22.12 | 66.68 | 20.80 |
| IP-Adapter | -2.03 | 15.01 | 83.66 | 40.50 |
| Style-Shot | -0.38 | 21.34 | 65.04 | 23.04 |
| CSGO | -0.29 | 22.16 | 61.73 | 16.85 |
| InstantStyle | -0.13 | 22.78 | 66.43 | 18.48 |
| Inversion-InstantStyle | -1.30 | 18.90 | 76.60 | 49.42 |

(Figure: Clip-T vs Clip-I results for the evaluated models)

Tests

To make sure the models and metrics are working as expected, install pytest and run the tests with the following command:

pip install pytest
pytest tests/

License

This code is released under the Creative Commons BY-NC 4.0 license.

Citation

If you find this work useful or use it in your research, please consider citing us:

@misc{benaroche2024stylerank,
  title={Style-Rank: Benchmarking stylization for diffusion models},
  author={Eyal Benaroche and Clement Chadebec and Onur Tasar and Benjamin Aubin},
  year={2024},
}
