
[ICCV 2023] Few shot font generation via transferring similarity guided global and quantization local styles


aceliuchanghong/VQ-Font

 
 


Our method is based on Vector Quantization, so we name our few-shot font generation (FFG) method VQ-Font.

The paper can be found at ./Paper_IMG/ | Arxiv | CVF.

env

conda create -n VQFont python=3.7
conda activate VQFont   # or: source activate VQFont
cd VQ-Font
pip install -r requirements.txt
# to regenerate the dependency list after changes:
# pip freeze > requirements.txt

Training font sources

Steps

1. Download fonts.
2. Convert the fonts to images (use datasets/f2p.py; for some reason the character images extracted by datasets/gen_imgs_from_ttf.py do not work with the VAE), split the font folder (datasets/split_folder.py), then start training the VAE model (vae/vae_train.py).
3. Generate the font JSON (to_hex.py).
4. Split the image folders.


Generate all 3500 images:
python datasets/f2p.py --font z_using_files/content_font/HYDaSongJ-2.ttf --out z_using_files/f2p_imgs
Get the training split:
python split_folder.py --input_directory ../z_using_files/f2p_imgs/HYDaSongJ-2
Start training:
nohup python vae_train.py --train_imgs_path ../z_using_files/f2p_imgs/HYDaSongJ-2_train/ --val_imgs_path ../z_using_files/f2p_imgs/HYDaSongJ-2_val/ >train.log &
Validate and filter:
python vae_valid_pic.py --val_imgs_path ../z_using_files/f2p_imgs/HYDaSongJ-2_train
Generate images:
python vae_gen_font.py --val_imgs_path ../z_using_files/f2p_imgs/HYDaSongJ-2_train --model_path ../weight/VQ-VAE_chn_best-HYJ.pth
Generate all 15500 images:
python datasets/f2p.py --font z_using_files/content_font/HYDaSongJ-2.ttf --out z_using_files/all_font_pics --char_file datasets/char_all_15000.txt --image_size 96
Copy and move:
rm z_using_files/all_font_pics/combine/*
cp z_using_files/all_font_pics/HYDaSongJ-2/*.png z_using_files/all_font_pics/combine
cp z_using_files/imgs_2/VQ-VAE_chn_best-HYJ/*.png z_using_files/all_font_pics/combine
Package:
tar czvf 00.tar.gz z_using_files/all_font_pics/combine/*.png
tar czvf 01.tar.gz z_using_files/all_font_pics/HYDaSongJ-2/*.png

vps2:
cd /mnt/data/llch/FontDiffuser/outputs/HYJ_ai
tar zxvf 01/00...
source activate fontdiffuser
python run_gen.py --input outputs/HYJ_ai/ --name HYJ_ai --v v1.0

Data Preparation

Images and Characters

  1. Collect a set of '.ttf' (TrueType) or '.otf' (OpenType) files to generate images for training, and divide them into a source content font, a training set, and a test set. To better learn different styles, the fonts in the training set should be diverse in style.

  2. Specify the characters to be generated (both training and test characters), e.g., the first-level Chinese character table, which contains 3500 Chinese characters.

trian_val_3500: {乙、十、丁、厂、七、卜、人、入、儿、匕、...、etc}
train_3000: {天、成、在、麻、...、etc}
val_500: {熊、湖、战、...、etc}

  3. Convert the characters from the second step into Unicode code points and save them in JSON format; a character can be converted with hex(ord(ch))[2:].upper(). Examples can be found in ./meta/, and a minimal conversion sketch follows the examples below.

trian_val_all_characters: ["4E00", "4E01", "9576", "501F", ...]
train_unis: ["4E00", "4E01", ...]
val_unis: ["9576", "501F", ...]
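
A minimal sketch of this conversion (the character lists and file names here are illustrative, not the repo's actual ones):

# Convert characters to uppercase Unicode hex strings and save them as JSON.
# Character lists and file names are illustrative placeholders.
import json

def to_unis(chars):
    return [hex(ord(ch))[2:].upper() for ch in chars]

train_chars = ["天", "成", "在"]  # e.g. the 3000 training characters
val_chars = ["熊", "湖", "战"]    # e.g. the 500 validation characters

with open("train_unis.json", "w", encoding="utf-8") as f:
    json.dump(to_unis(train_chars), f)
with open("val_unis.json", "w", encoding="utf-8") as f:
    json.dump(to_unis(val_chars), f)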

  4. After that, draw all font images via ./datasets/font2image.py. All images are named 'character + .png', such as '阿.png'. Organize the directory structure as below; train_3000.png stands for the images drawn from train_unis: ["4E00", "4E01", ...]. During VAE training we are really training the embedding model, so only the content-font images are needed: vae_train.py needs the 3000:500 split, and vae_emb.py needs all 3500 images. Only the second-stage few-shot training needs all the fonts collected above, converted to images and divided into the content font plus training and test sets. A minimal rendering sketch follows the directory tree below.

Font Directory
|--| content
|  --| kaiti4train_VAE
|    --| train_3000.png
|    --| ...
|  --| kaiti4val_VAE
|    --| val_500.png
|    --| ...
|  --| kaiti4train_FFG
|    --| trian_val_3500.png
|    --| ...
|--| train
|  --| train_font1
|  --| train_font2
|    --| trian_val_3500.png
|    --| ...
|  --| ...
|--| val
|  --| val_font1
|  --| val_font2
|    --| trian_val_3500.png
|    --| ...
|  --| ...
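
A minimal sketch of drawing one character image from a font file, in the spirit of ./datasets/font2image.py (the actual script's options may differ; the font path and image size here are illustrative):

# Render a single character to a grayscale PNG (illustrative sketch).
from PIL import Image, ImageDraw, ImageFont

def draw_char(ch, font_path, size=128):
    font = ImageFont.truetype(font_path, int(size * 0.8))
    img = Image.new("L", (size, size), 255)        # white background
    draw = ImageDraw.Draw(img)
    # Center the glyph using its bounding box (Pillow >= 8.0).
    left, top, right, bottom = draw.textbbox((0, 0), ch, font=font)
    x = (size - (right - left)) / 2 - left
    y = (size - (bottom - top)) / 2 - top
    draw.text((x, y), ch, fill=0, font=font)       # black glyph
    return img

draw_char("阿", "kaiti.ttf").save("阿.png")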

Build meta files and lmdb environment

  1. First obtain the font images with a script such as datasets/font2image.py.

  2. Then build the meta data as shown below.

Run the script ./build_trainset.sh, or see the example in ./build_dataset/build_meta4train.py:

 python3 ./build_dataset/build_meta4train.py \
 --saving_dir ./results/your_task_name/ \
 --content_font path/to/all_content \
 --train_font_dir path/to/training_font \
 --val_font_dir path/to/validation_font \
 --seen_unis_file path/to/train_unis.json \
 --unseen_unis_file path/to/val_unis.json

Training

The training process is divided into two stages:

1) Pre-train the content encoder and codebook via VQ-VAE:

cd vae && python vae_train.py (during VAE training we are really training the embedding model, so only the content-font images are needed)

2) Train the few-shot font generation model via a GAN:

python3 train.py params....

Pre-train VQ-VAE

When pre-training the VQ-VAE, the reconstructed character targets come from train_unis in the content font. The training process can be found at ./model/VQ-VAE.ipynb.
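
For intuition, here is a minimal sketch of the vector-quantization step at the core of a VQ-VAE; it illustrates the general technique only and is not the repo's exact implementation:

# Minimal VQ-VAE vector-quantization layer (illustrative, not the repo's code).
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes=512, dim=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta  # commitment-loss weight

    def forward(self, z):
        # z: (..., dim) encoder features; snap each vector to its nearest code.
        flat = z.reshape(-1, z.shape[-1])
        dist = torch.cdist(flat, self.codebook.weight)   # (N, num_codes)
        idx = dist.argmin(dim=-1)
        z_q = self.codebook(idx).view_as(z)
        # Codebook loss + commitment loss; straight-through gradient estimator.
        loss = ((z_q - z.detach()) ** 2).mean() + self.beta * ((z_q.detach() - z) ** 2).mean()
        z_q = z + (z_q - z).detach()
        return z_q, idx.view(z.shape[:-1]), loss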

Then use the pre-trained content encoder to compute the similarity between all training and test characters and store it as a dictionary (a sketch follows the example below):

{'4E07': {'4E01': 0.2143, '4E03': 0.2374, ...}, '4E08': {'4E01': 0.1137, '4E03': 0.1020, ...}, ...}
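
A minimal sketch of how such a dictionary might be built; encoder and load_char_image are hypothetical placeholders, not the repo's actual API:

# Build {char_hex: {train_char_hex: cosine_similarity}} with a pre-trained
# content encoder. `encoder` and `load_char_image` are hypothetical placeholders.
import torch
import torch.nn.functional as F

@torch.no_grad()
def build_similarity_dict(encoder, all_unis, train_unis, load_char_image):
    # Encode every character once; keep one flat feature vector per character.
    feats = {u: encoder(load_char_image(u)).flatten()
             for u in set(all_unis) | set(train_unis)}
    return {u: {t: round(F.cosine_similarity(feats[u], feats[t], dim=0).item(), 4)
                for t in train_unis}
            for u in all_unis}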

Few shot font generation

Modify the configuration in the file ./cfgs/custom.yaml

Keys

  • work_dir: the root directory for saved results (keep it the same as the saving_dir above).
  • data_path: path to the data lmdb environment (saving_dir/lmdb).
  • data_meta: path to the train meta file (saving_dir/meta).
  • content_font: the name of the font you want to use as the source font.
  • all_content_char_json: the JSON file that stores all train and val characters.
  • The other values are hyperparameters for training.
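
As a quick sanity check, a minimal sketch of loading the config and verifying the paths above (assumes PyYAML; key names follow the list above, values are yours):

# Load custom.yaml and check it points at the meta/lmdb produced earlier.
import yaml

with open("cfgs/custom.yaml") as f:
    cfg = yaml.safe_load(f)

assert cfg["data_path"].endswith("lmdb"), "data_path should be saving_dir/lmdb"
assert cfg["data_meta"].endswith("meta"), "data_meta should be saving_dir/meta"
print("content font:", cfg["content_font"])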

Run scripts

  • python3 train.py task_name cfgs/custom.yaml
      #--resume path/to/your/pretrain_model.pdparams
    

Test

Run scripts

  • python3 inference.py ./cfgs/custom.yaml \
    --weight path/to/saved_model.pdparams \
    --content_font path/to/content_imgs \
    --img_path path/to/test_imgs \
    --saving_root ./infer_res
    

Explain

  1. Image layout: the first row shows the reference characters, the second row the ground truth, and the third row the characters generated by the model.

0022000-comparable_ufuu_.png

  2. How are the FID, SSIM, and other metrics in the paper computed?
FID: see "GANs trained by a two time-scale update rule converge to a local Nash equilibrium", NeurIPS 2017.
LPIPS: see "The unreasonable effectiveness of deep features as a perceptual metric", CVPR 2018.
  3. During VQ-VAE training the data is normalized to [-0.5, 0.5], but in stage 2 the images fed to the encoder are normalized to [-1, 1]. Is this a problem?
TODO: revisit later.
  4. What do sfsu, sfuu, ufsu, and ufuu in the output PNG names mean?
They denote the combinations of seen/unseen fonts and characters (both the styles and the characters have train/test splits), e.g. sfsu = seen font, seen unicode.
sfsu # seen font, seen character
sfuu # seen font, unseen character
ufsu # unseen font, seen character
ufuu_ # unseen font, unseen character
  5. The load_pretrain_vae_model function does not seem to return anything for gen, so the gen argument looks like dead code. Is this a bug?
TODO: revisit later.
VSCode remote development steps
1. ssh <server_user>@<public_ip>
2. Enter the server login password, or edit the SSH config file to specify the port.
3. Append the contents of C:\Users\<username>\.ssh\id_rsa.pub to ~/.ssh/authorized_keys on the server (generate a key pair with ssh-keygen if you don't have one).

If VSCode reports "there is no formatter for 'python' files installed", add the following to settings.json:
"[python]": {
    "editor.defaultFormatter": "ms-python.black-formatter",
    "editor.formatOnSave": true
  }
