AIGC project FaceChain: 2.5k stars in one week, personal portraits from just three photos

Source of this article: Machine Heart Editorial Department

FaceChain is an open source project that uses AI models to generate personal portraits. One week after going online it has already collected 2.5k stars, and today it ranked first on the GitHub Trending list.

Project address: https://github.com/modelscope/facechain

Users need to provide as few as three photos to get a personal portrait in a chosen style, for example a business ID photo.

You can also try the application directly in the ModelScope creative space, with no installation required.

Trial address: https://modelscope.cn/studios/CVstudio/cv_human_portrait/summary

In the project introduction, the authors explain the technical principles behind AI-generated personal portraits and how a generative AI model can become a "portrait artifact", that is, a go-to tool for producing portraits. The rest of this article walks through that explanation.

How personal portraits are generated

Basic principle

The ability to generate personal portraits comes from the text-to-image capability of the Stable Diffusion model: given a piece of text or a series of prompts, it outputs the corresponding images. Two main factors affect the quality of the generated portraits: the style information of the portrait and the personal information of the user being depicted.

To capture these two kinds of information, the project authors use a style LoRA model trained offline and a face LoRA model trained online. LoRA is a fine-tuning approach with relatively few trainable parameters. In Stable Diffusion, information from the input images can be injected into a LoRA model by running text-to-image training on a small number of input images.
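To make the "fewer trainable parameters" point concrete, here is a minimal sketch of the usual LoRA formulation, in which a frozen weight matrix W receives a low-rank update (alpha / rank) * B @ A. It is written in plain Python and is not taken from the FaceChain code; the function names, the toy matrices, and the 768 x 768 layer size are assumptions chosen for illustration.

```python
# Minimal LoRA sketch in plain Python (no deep learning framework).
# The alpha / rank scaling follows the common LoRA convention; none of
# the names below come from the FaceChain code base.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_delta(B, A, alpha):
    """Return the low-rank update (alpha / rank) * B @ A."""
    rank = len(A)  # A has shape (rank, d_in), B has shape (d_out, rank)
    scale = alpha / rank
    return [[scale * v for v in row] for row in matmul(B, A)]

def merge(W, delta):
    """Add a LoRA update to the frozen base weight W."""
    return [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]

def trainable_params(d_out, d_in, rank):
    """Compare full fine-tuning with LoRA for a single weight matrix."""
    return {"full": d_out * d_in, "lora": rank * (d_out + d_in)}

W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # frozen base weight, shape (2, 3)
B = [[1.0], [2.0]]                       # trainable factor, shape (2, 1)
A = [[0.5, 0.0, -0.5]]                   # trainable factor, shape (1, 3)

W_merged = merge(W, lora_delta(B, A, alpha=1.0))
counts = trainable_params(d_out=768, d_in=768, rank=4)  # hypothetical layer size
print(W_merged)
print(counts)
```

For the hypothetical 768 x 768 layer, the rank-4 update trains 6,144 values instead of 589,824, which is why a handful of photos can be enough to fit it.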

The personal portrait model therefore works in two stages: training and inference. The training stage produces the image and text-label data used to fine-tune the Stable Diffusion model and obtain the face LoRA model; the inference stage generates personal portrait images based on the face LoRA model and the style LoRA model.

Training phase

The input of the training phase is the images uploaded by the user, each containing a clear face region, and the output is a face LoRA model.

Specifically, the project first processes the uploaded images with an image rotation model based on orientation classification and a fine-grained face rotation method based on face detection and keypoint models, which yields images containing upright faces. Next, it uses a human parsing model and a portrait skin-retouching model to obtain high-quality face training images. It then uses a face attribute model and a text annotation model, combined with label post-processing, to generate fine-grained labels for the training images. Finally, it fine-tunes the Stable Diffusion model on these images and labels to obtain the face LoRA model.
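The order of these steps can be sketched as a simple function pipeline. This is only an illustration of the flow described above: every stage name is hypothetical, and each stage is a placeholder that records that it ran, whereas in the real project each step calls a dedicated model.

```python
from functools import reduce

def make_stage(name):
    """Build a placeholder stage; a real stage would run a model here."""
    def stage(sample):
        # Return a new record with this stage appended to its history.
        return {**sample, "history": sample["history"] + [name]}
    return stage

# Hypothetical stage names mirroring the order given in the text.
TRAINING_PREP = [
    make_stage(name)
    for name in (
        "rotate_by_orientation",   # image rotation model
        "align_face_keypoints",    # face detection + keypoint-based rotation
        "parse_human_region",      # human parsing model
        "retouch_skin",            # portrait skin retouching model
        "tag_attributes_caption",  # face attribute + text annotation models
        "postprocess_labels",      # label post-processing
    )
]

def prepare(path):
    """Run one uploaded image through every preparation stage in order."""
    sample = {"path": path, "history": []}
    return reduce(lambda s, stage: stage(s), TRAINING_PREP, sample)

prepared = [prepare(p) for p in ("photo_1.jpg", "photo_2.jpg", "photo_3.jpg")]
print(prepared[0]["history"])
```

The resulting image and label pairs would then be handed to the LoRA fine-tuning step, which is outside the scope of this sketch.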

Inference phase

The input of the inference phase is the images uploaded by the user in the training phase together with a preset prompt for generating the personal portrait; the output is a personal portrait image.

In the inference phase, the project first merges the weights of the face LoRA model and the style LoRA model into the Stable Diffusion model. It then uses the model's text-to-image capability to generate initial portrait images from the preset prompt. Next, a face fusion model further refines the facial details of these images; the template face used for fusion is selected from the training images by a face quality assessment model. Finally, a face recognition model computes the similarity between each generated portrait and the template face, the portraits are ranked by this similarity, and the top-ranked images are returned as the final result.
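Two of these steps lend themselves to a small sketch: merging two LoRA updates into the base weights, and ranking generated images by similarity to the template face. The code below assumes flat weight vectors, per-LoRA multipliers of 1.0 and 0.5, and cosine similarity over toy two-dimensional embeddings. All of these are illustrative assumptions; the article does not state the multipliers or the similarity measure FaceChain actually uses.

```python
import math

def merge_loras(base, deltas_with_weights):
    """Return base + sum(weight * delta) for flat weight vectors."""
    merged = list(base)
    for delta, weight in deltas_with_weights:
        merged = [m + weight * d for m, d in zip(merged, delta)]
    return merged

def cosine_similarity(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_by_similarity(template, candidates, top_k):
    """Return the names of the top_k candidates most similar to the template."""
    scored = sorted(
        candidates.items(),
        key=lambda item: cosine_similarity(template, item[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:top_k]]

# Toy weights: the multipliers 1.0 and 0.5 are assumptions for illustration.
base = [1.0, 2.0]
face_delta = [0.5, 0.0]
style_delta = [0.0, -1.0]
merged = merge_loras(base, [(face_delta, 1.0), (style_delta, 0.5)])

# Toy face embeddings for the template and three generated images.
template = [1.0, 0.0]
candidates = {"img_a": [0.9, 0.1], "img_b": [0.0, 1.0], "img_c": [0.5, 0.5]}
best = rank_by_similarity(template, candidates, top_k=2)
print(merged, best)
```

In practice the embeddings would come from a face recognition model and have hundreds of dimensions, but the ranking logic stays the same.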

The project authors describe installation and usage in detail and have open-sourced the code. Interested readers are encouraged to try it out.

Origin blog.csdn.net/lgzlgz3102/article/details/132680428