Source of this article: Machine Heart Editorial Department
An open source project called FaceChain uses AI models to create personal portraits. One week after going online, the project has already collected 2.5k stars, and today it ranked first on the GitHub trending list.
Project address: https://github.com/modelscope/facechain
Users need to provide as few as three photos to obtain personal portraits in a specific style, for example, a business ID photo:
You can also try this application directly in the ModelScope creative space, with no installation required.
Trial address: https://modelscope.cn/studios/CVstudio/cv_human_portrait/summary
In the project introduction, the author explains the technical principles behind AI-generated personal portraits and how a generative AI model can serve as a "portrait-generation tool". Let's take a look at this explanation.
How personal portrait generation works
Fundamentals
The ability of AI to generate personal portraits comes from the text-to-image capability of the Stable Diffusion model: given a piece of text or a series of prompts as input, it outputs corresponding images. Two main factors affect the quality of the generated portraits: photo style information and user identity (face) information.
To capture these two kinds of information, the project authors use an offline-trained style LoRA model and an online-trained face LoRA model. LoRA (Low-Rank Adaptation) is a fine-tuning method with a small number of trainable parameters. In Stable Diffusion, the information in the input images can be injected into a LoRA model by running text-to-image training on just a few input images.
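The parameter savings behind LoRA can be shown with a quick calculation. This is an illustrative sketch, not the FaceChain code: instead of updating every entry of a full d×k weight matrix W, LoRA trains two low-rank factors A (r×k) and B (d×r), so the effective weight becomes W + B·A.

```python
# Minimal sketch of the LoRA idea: compare trainable parameter counts
# between full fine-tuning and a rank-r low-rank update.

def lora_param_counts(d, k, r):
    """Trainable parameters: full fine-tuning vs. LoRA with rank r."""
    full = d * k          # every entry of the d x k matrix W is trainable
    lora = r * (d + k)    # only B (d x r) and A (r x k) are trainable
    return full, lora

full, lora = lora_param_counts(d=768, k=768, r=4)
print(full, lora)  # 589824 vs 6144, roughly 1% of the parameters
```

For typical transformer layer sizes and small ranks, the LoRA update is orders of magnitude smaller than the base weights, which is why a face LoRA can be trained online from only a handful of user photos.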
The personal portrait model therefore works in two stages: training and inference. The training stage generates the image and text label data used to fine-tune the Stable Diffusion model and produces the face LoRA model; the inference stage generates personal portrait images based on the face LoRA model and the style LoRA model.
Training phase
The input to the training phase is a set of user-uploaded images containing clear face regions, and the output is a face LoRA model.
Specifically, the project first processes the user-uploaded images with an image rotation model based on orientation judgment and a face refinement rotation method based on face detection and a keypoint model, yielding images containing upright, front-facing faces. Next, a human parsing model and a portrait skin-beautification model are applied to obtain high-quality face training images. Then, a face attribute model and a text annotation model, combined with label post-processing, generate refined labels for the training images. Finally, these images and labels are used to fine-tune the Stable Diffusion model, producing the face LoRA model.
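The label post-processing step can be pictured as merging the predicted face attributes into a single training caption. The sketch below is purely illustrative: the trigger token and attribute tags are made up for this example and are not FaceChain's actual tags or API.

```python
# Hypothetical sketch of label post-processing: combine a trigger token
# identifying the subject with deduplicated, normalized attribute tags
# to form the text label for one training image.

def build_caption(attributes, trigger="<sks person>"):
    """Join the trigger token with cleaned, deduplicated attribute tags."""
    seen, tags = set(), []
    for tag in attributes:
        tag = tag.strip().lower()   # normalize casing and whitespace
        if tag and tag not in seen: # drop empty and duplicate tags
            seen.add(tag)
            tags.append(tag)
    return ", ".join([trigger] + tags)

print(build_caption(["Smiling", "black hair", "smiling ", "glasses"]))
# → <sks person>, smiling, black hair, glasses
```

Pairing each processed face image with a caption like this gives the fine-tuning stage consistent text supervision, so the LoRA model associates the trigger token with the user's face.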
Inference phase
The inputs to the inference phase are the images the user uploaded in the training phase and preset prompts for generating personal portraits; the output is a set of personal portrait images.
In the inference phase, the project first merges the weights of the face LoRA model and the style LoRA model into the Stable Diffusion model. It then uses the text-to-image capability of the Stable Diffusion model to generate initial personal portrait images from the preset prompts. Next, a face fusion model further refines the facial details of these images; the template face used for fusion is selected from the training images by a face quality assessment model. Finally, a face recognition model computes the similarity between each generated image and the template face, the images are ranked by this similarity, and the top-ranked personal portrait images are returned as the final result.
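The final ranking step can be sketched as scoring each generated image's face embedding against the template face embedding and keeping the best matches. The vectors below are toy values for illustration; a real system would obtain embeddings from a face recognition model, and cosine similarity is assumed here as the metric.

```python
# Sketch of similarity-based ranking: sort candidate face embeddings by
# cosine similarity to the template face and return the top-k indices.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank_by_similarity(template, candidates, top_k=2):
    """Indices of the top_k candidates most similar to the template."""
    order = sorted(range(len(candidates)),
                   key=lambda i: cosine(template, candidates[i]),
                   reverse=True)
    return order[:top_k]

template = [1.0, 0.0, 0.0]
cands = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
print(rank_by_similarity(template, cands))  # → [0, 2]
```

Ranking by identity similarity filters out generations where the diffusion model drifted away from the user's face, so only the most faithful portraits reach the output.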
The project author has described installation and usage in detail and open-sourced the code; interested readers can try it out.