diff --git a/README.md b/README.md
index 93a26ca..19f5d8a 100644
--- a/README.md
+++ b/README.md
@@ -274,6 +274,7 @@ where each line of [`asset/samples_mini.txt`](asset/samples_mini.txt) contains a
 
 - 32GB VRAM is required for both 0.6B and 1.6B model's training
 
+### 1). Train with image-text pairs in directory
 We provide a training example here and you can also select your desired config file from [config files dir](configs/sana_config) based on your data structure.
 
 To launch Sana training, you will first need to prepare data in the following formats. [Here](asset/example_data) is an example for the data structure for reference.
@@ -310,6 +311,24 @@ bash train_scripts/train.sh \
   --train.train_batch_size=8
 ```
 
+### 2). Train with image-text pairs in directory
+We also provide conversion scripts to convert your data to the required format. You can refer to the [data conversion scripts](asset/data_conversion_scripts) for more details.
+
+```bash
+python tools/convert_ImgDataset_to_WebDatasetMS_format.py
+```
+
+Then Sana's training can be launched via
+```bash
+# Example of training Sana 0.6B with 512x512 resolution from scratch
+bash train_scripts/train.sh \
+  configs/sana_config/512ms/Sana_600M_img512.yaml \
+  --data.data_dir="[asset/example_data_tar]" \
+  --data.type=SanaWebDatasetMS \
+  --model.multi_scale=true \
+  --train.train_batch_size=32
+```
+
 # 💻 4. Metric toolkit
 
 Refer to [Toolkit Manual](asset/docs/metrics_toolkit.md).
diff --git a/tools/ImgDataset2WebDatasetMS.py b/tools/convert_ImgDataset_to_WebDatasetMS_format.py
similarity index 100%
rename from tools/ImgDataset2WebDatasetMS.py
rename to tools/convert_ImgDataset_to_WebDatasetMS_format.py