Output person identity and image quality are far behind InstantID and FaceID Plus #15

flankechen · 2024-05-07T02:59:06Z

Thanks for your work and code.
But to be honest, in my tests with the currently released model and weights, person identity and image quality are far behind InstantID or FaceID Plus.

Some of my tests:

1. xujiaqi (许佳琪)

2. liuyifei (刘亦菲)

jshilong · 2024-05-07T03:14:56Z

Thanks for your feedback.

The biggest feature of FlashFace is that it allows precise control of the face through language (age, gender, accessories, expressions, etc.), but this also places higher demands on the language prompt (for example, you can add "a beautiful young woman" to the prompt). You can also increase lamda_feat, face_guidence and step_to_launch_face_guidence to obtain better face similarity.

If you provide your language prompt, I can test it. Providing a face box such as face_bbox = [0.3, 0.1, 0.6, 0.4] (a face box at the center) will also bring better results.

You can also check #11 to see if it helps.
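
For anyone unsure how to pick face_bbox values, here is a minimal sketch. It assumes the box is [x1, y1, x2, y2] with each coordinate given as a fraction of the image width or height; the thread does not spell out the format, so this is inferred from the example values. The helper name `normalize_bbox` and the 500x500 image size are hypothetical and not part of FlashFace.

```python
def normalize_bbox(x1, y1, x2, y2, width, height):
    """Convert a pixel box to fractions of the image size (assumed x1, y1, x2, y2 order)."""
    # Reject boxes that are empty, inverted, or outside the image.
    if not (0 <= x1 < x2 <= width and 0 <= y1 < y2 <= height):
        raise ValueError("box must lie inside the image with x1 < x2 and y1 < y2")
    return [
        round(x1 / width, 3),
        round(y1 / height, 3),
        round(x2 / width, 3),
        round(y2 / height, 3),
    ]


# A face spanning pixels (150, 100) to (300, 250) in a hypothetical 500x500 image.
face_bbox = normalize_bbox(150, 100, 300, 250, width=500, height=500)
print(face_bbox)
```

The resulting list can then be passed as the face_bbox argument in place of hand-picked fractions.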

jshilong · 2024-05-07T03:55:47Z

Because there are few Asian faces in my training set, you can add "Asian" to the prompt, set lamda_feat = 1.2, face_guidence = 3 and step_to_launch_face_guidence = 800, and set face_bbox = [0.3, 0.2, 0.6, 0.5].

xujiaqi

# ordinary people
# Assumes the FlashFace demo notebook has already defined `generate` and `package_dir`.
from PIL import Image
from IPython.display import display

# four reference images: 1.png ... 4.png
face_imgs = [Image.open(f"{package_dir}/jiaqixu/{i+1}.png").convert("RGB") for i in range(4)]
need_detect = True
pos_prompt = 'A beautiful young asian woman, white skin on the street, sunny day, soft light'
# no negative prompt
neg_prompt = None
# face position
face_bbox = [0.3, 0.2, 0.6, 0.5]
# larger values of these three parameters lead to more fidelity but less diversity
lamda_feat = 1.2
face_guidence = 3
step_to_launch_face_guidence = 800

steps = 30
default_text_control_scale = 7.5

default_seed = 0


imgs = generate(
    pos_prompt=pos_prompt,
    neg_prompt=neg_prompt,
    steps=steps,
    face_bbox=face_bbox,
    lamda_feat=lamda_feat,
    face_guidence=face_guidence,
    num_sample=4,
    text_control_scale=default_text_control_scale,
    seed=default_seed,
    step_to_launch_face_guidence=step_to_launch_face_guidence,
    reference_faces=face_imgs,
    need_detect=need_detect,
)


# show the generated images, leaving the first column free for the references
img_size = imgs[0].size
num_imgs = len(imgs)
save_img = Image.new('RGB', (img_size[0] * (num_imgs + 1), img_size[1]))
for i, img in enumerate(imgs):
    save_img.paste(img, ((i + 1) * img_size[0], 0))

# paste the reference face imgs into the first column as a 2x2 grid

resize_w = img_size[0] // 2
resize_h = img_size[1] // 2

for idx, ref_img in enumerate(face_imgs):
    # resize ref_img, keeping its aspect ratio, to fit within (resize_w, resize_h)
    w_ratio = resize_w / ref_img.size[0]
    h_ratio = resize_h / ref_img.size[1]
    ratio = min(w_ratio, h_ratio)
    ref_img = ref_img.resize(
        (int(ref_img.size[0] * ratio), int(ref_img.size[1] * ratio)))

    if idx < 2:
        # top row of the grid
        save_img.paste(ref_img, (idx * resize_w, 0))
    else:
        # bottom row of the grid
        save_img.paste(ref_img, ((idx - 2) * resize_w, resize_h))

display(save_img)
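
The aspect-preserving resize used for the reference thumbnails can be factored into a small helper. This is only a sketch: `fit_within` is a hypothetical name, not part of FlashFace or Pillow, and it works on plain (width, height) tuples so it can be checked without loading any images.

```python
def fit_within(size, box):
    """Scale `size` by a single factor so it fits inside `box`, keeping the aspect ratio."""
    w, h = size
    box_w, box_h = box
    # The smaller ratio is the one that keeps both dimensions inside the box.
    ratio = min(box_w / w, box_h / h)
    return int(w * ratio), int(h * ratio)


landscape = fit_within((1024, 768), (256, 256))
portrait = fit_within((512, 1024), (256, 256))
print(landscape, portrait)
```

In the script this would be used as `ref_img.resize(fit_within(ref_img.size, (resize_w, resize_h)))`. Pillow's own `Image.thumbnail` covers a similar case when the image only needs to shrink.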

jshilong · 2024-05-07T04:17:41Z

liuyifei

# ordinary people
# Assumes the FlashFace demo notebook has already defined `generate` and `package_dir`.
from PIL import Image
from IPython.display import display

# three reference images: 1.png ... 3.png
face_imgs = [Image.open(f"{package_dir}/liuyifei/{i+1}.png").convert("RGB") for i in range(3)]
need_detect = True
pos_prompt = 'A beautiful young asian woman in the garden, sunny day'
# no negative prompt
neg_prompt = None
# face position
face_bbox = [0.3, 0.2, 0.6, 0.5]

# larger values of these three parameters lead to more fidelity but less diversity
lamda_feat = 1.3
face_guidence = 2.3
step_to_launch_face_guidence = 600

steps = 25
default_text_control_scale = 7.5

default_seed = 0


imgs = generate(
    pos_prompt=pos_prompt,
    neg_prompt=neg_prompt,
    steps=steps,
    face_bbox=face_bbox,
    lamda_feat=lamda_feat,
    face_guidence=face_guidence,
    num_sample=4,
    text_control_scale=default_text_control_scale,
    seed=default_seed,
    step_to_launch_face_guidence=step_to_launch_face_guidence,
    reference_faces=face_imgs,
    need_detect=need_detect,
)


# show the generated images, leaving the first column free for the references
img_size = imgs[0].size
num_imgs = len(imgs)
save_img = Image.new('RGB', (img_size[0] * (num_imgs + 1), img_size[1]))
for i, img in enumerate(imgs):
    save_img.paste(img, ((i + 1) * img_size[0], 0))

# paste the reference face imgs into the first column as a 2x2 grid

resize_w = img_size[0] // 2
resize_h = img_size[1] // 2

for idx, ref_img in enumerate(face_imgs):
    # resize ref_img, keeping its aspect ratio, to fit within (resize_w, resize_h)
    w_ratio = resize_w / ref_img.size[0]
    h_ratio = resize_h / ref_img.size[1]
    ratio = min(w_ratio, h_ratio)
    ref_img = ref_img.resize(
        (int(ref_img.size[0] * ratio), int(ref_img.size[1] * ratio)))

    if idx < 2:
        # top row of the grid
        save_img.paste(ref_img, (idx * resize_w, 0))
    else:
        # bottom row of the grid
        save_img.paste(ref_img, ((idx - 2) * resize_w, resize_h))

display(save_img)

jshilong · 2024-05-07T04:23:33Z

In terms of image quality, FlashFace is heavily constrained by SD1.5, which lags significantly behind SDXL-based methods.
Regarding the representation of Asian individuals, the FlashFace training dataset contains few to no Asian faces. Consequently, it may be necessary to add a face bounding box and increase parameters such as lamda_feat, face_guidence, and step_to_launch_face_guidence to improve person identity.
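
Since the right values depend on the identity and the prompt, one way to explore them is a small grid over the three parameters. The following is a sketch only: the candidate values are illustrative, chosen around the settings used earlier in this thread, and `build_sweep` is a hypothetical helper that is not part of FlashFace. Each resulting dict is meant to be merged into the keyword arguments of the notebook's generate call.

```python
from itertools import product


def build_sweep(lamda_feats, face_guidences, launch_steps):
    """Return one keyword-argument dict per combination of the three parameters."""
    keys = ("lamda_feat", "face_guidence", "step_to_launch_face_guidence")
    return [
        dict(zip(keys, combo))
        for combo in product(lamda_feats, face_guidences, launch_steps)
    ]


# Illustrative candidate values, not recommendations from the authors.
sweep = build_sweep([1.0, 1.2, 1.3], [2.0, 2.3, 3.0], [600, 800])
for cfg in sweep:
    print(cfg)
```

Keeping the seed fixed across the sweep makes it easier to attribute differences between outputs to the parameters rather than to sampling noise.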

jshilong · 2024-05-07T07:09:43Z

https://github.com/ali-vilab/FlashFace/blob/main/docs/zh_cn.md

I added some notes on hyperparameter tuning here; they might help you.

jshilong mentioned this issue May 12, 2024

Local deployment test, results are very poor (本地部署测试，效果很垃圾) #17

Open
