[Tencent Cloud HAI Domain Secrets] - AIGC application helps enterprises reduce costs and increase efficiency

I. Introduction:

In recent years, with the continuous development of deep learning, big data, and artificial intelligence, machine learning has become one of the hottest branches of AI. It trains computer programs on large amounts of data to perform tasks such as intelligent decision-making, speech recognition, and image processing.

Insert image description here

The author has lived through these stages of software development: from programming in the Web era, to distributed programming in the cloud era, to today's AI era. In traditional programming, human programmers write code by hand to implement specific functions; in machine learning, functions are implemented by letting programs learn from data and automatically extract features and patterns.

Model training and inference for artificial intelligence (machine learning), along with high-performance computing, are often prerequisites for putting algorithms, computing power, and big data to work at scale.

The widespread adoption of GPUs has driven the development of AI technology. With the high-speed computing power of GPUs, developers can train models and test algorithms faster. The emergence and development of GPUs also opens up more possibilities for researching new algorithms and new models in the AI field.

Insert image description here

Recently, Tencent Cloud launched "High-Performance Application Service HAI", a GPU application service product for AI and scientific computing. It is application-centric and matches GPU cloud computing resources to the application. As a new GPU product for the AI 2.0 era, it comes with high-performance applications such as LLM, AI painting, and data science pre-installed and ready to plug and play, helping small and medium-sized enterprises and developers deploy these applications quickly.

Insert image description here


II. With "Yu" (Tencent Cloud GPU cloud server) already here, why "Liang" (High-Performance Application Service HAI)?:

In the AI work we usually encounter, most AI servers are built on GPU cloud servers, which cover a wide range of application scenarios. Let us first go over some GPU cloud server concepts: only by knowing both options can we compare their strengths and weaknesses and make a targeted choice.

1. Tencent Cloud GPU server:

GPU cloud server (Cloud GPU Service) is an elastic computing service that provides GPU computing power with strong parallel computing capabilities. As a powerful tool at the IaaS layer, it serves scenarios such as deep learning training, scientific computing, graphics and image processing, and video encoding and decoding. Tencent Cloud provides readily available computing power, effectively relieving your computing pressure and improving business efficiency and competitiveness.

Insert image description here

2. Application scenarios of Tencent Cloud GPU cloud server:

GPU computing application scenarios:

Insert image description here

GPU rendering application scenarios:

Insert image description here


III. Introduction to High-Performance Application Service HAI:

"High-performance application service HAI" has huge computing power and can be used out of the box. Based on the underlying computing power of Tencent Cloud GPU cloud servers, it provides high-performance cloud services out of the box. Focusing on applications, it matches GPU cloud computing resources to help small and medium-sized enterprises and developers quickly deploy high-performance applications such as LLM, AI painting, and data science.

Insert image description here

It is particularly worth mentioning that, for developers, HAI offers JupyterLab, a visual WebUI, and a "visual IDE", which greatly reduce the complexity of debugging and lower the threshold for using applications. After simple training, even non-developers (such as operation and maintenance personnel) can take part.

1. Side-by-side comparison: the student surpasses the master:

In the past, most AI servers were built by ourselves on GPU cloud servers. This approach covers more application scenarios, such as graphics rendering, deep learning, astrophysics, chemical and molecular computing, cloud computing and virtualization, and other compute-intensive industries.

The value of High-Performance Application Service HAI:
It significantly lowers the threshold for using GPU cloud servers and optimizes the product experience from multiple angles, so it is ready to use right out of the box.

Insert image description here

2. A variety of high-performance application deployment scenarios, easy to handle:

Insert image description here

3. Flexible packages respond to different demand scenarios and provide more cost-effective billing methods:

While I am getting familiar with AI applications, I use the basic package, which costs relatively little. When generating models and data sets, I can switch to the advanced package to improve output efficiency.

Insert image description here

4. Tencent Cloud High-Performance Application Service HAI Hands-on Experiment:

This event is a developer technology practice event jointly launched by Tencent Cloud and CSDN. Through hands-on experiments, you can have an in-depth and immersive experience of Tencent Cloud's high-performance application service HAI.

The manual provided for the event is also very detailed, so you can quickly try the AI products related to Tencent Cloud High-Performance Application Service HAI. The event covers multiple application scenarios; whether you are a technical novice or an experienced developer, you can gain something from it.


IV. Introduction to Stable Diffusion:

1. Stable Diffusion:

Stable Diffusion is an image generation model based on the diffusion process that can generate high-quality, high-resolution images. It gradually transforms the noise image into the target image by simulating the diffusion process. This model has strong stability and controllability, and can generate images with diverse effects and good visual effects.

Insert image description here

Stable Diffusion can assist the realization of visual creativity by generating diverse, high-quality images, repairing damaged images, increasing the resolution of images, and applying specific styles to images. It provides more creative tools and materials for visual artists, designers, etc., and promotes innovation and development in the field of visual art.

2. Compare the pain points of self-deployment of GPU servers:

Open the Tencent Cloud GPU Server Console to purchase an instance.

2.1 Install basic software:

sudo apt install wget git

2.2 Install Python 3.10.6:

# Install dependencies
sudo apt install wget git python3 python3-venv
# Remove the default low-version "python" link
which python3
sudo rm /usr/bin/python
# Recreate the symlink so that "python" points to python3
ls -lh /usr/bin | grep python
sudo ln -s /usr/bin/python3 /usr/bin/python
# GPU users need to install the torch build that matches their CUDA version
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
# Switch pip to a mirror index
pip config set global.index-url https://mirrors.ustc.edu.cn/pypi/web/simple
# Install the pinned dependencies (the file ships with the stable-diffusion-webui repository)
pip install -r requirements_versions.txt
# Create and activate a virtual environment
sudo apt-get install python3-venv
python3 -m venv venv_name
source venv_name/bin/activate

2.3 Install CUDA:

# Download CUDA
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
# Install CUDA
sudo sh cuda_11.8.0_520.61.05_linux.run
# Configure environment variables:
# add the two export lines below to ~/.bashrc and save
vim ~/.bashrc
export PATH=/usr/local/cuda-11.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH
# Apply the configuration
source ~/.bashrc

2.4 Install Stable Diffusion:

# Fetch the Stable Diffusion web UI code
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
# Enter the project directory
cd stable-diffusion-webui/
# Install and launch
./webui.sh

Insert image description here

The above is my own attempt at purchasing a Tencent Cloud GPU server, building the environment by hand, and running Stable Diffusion. It took almost an afternoon, even with my previous experience.
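Much of the time in a manual build goes into finding out which step silently failed. The sketch below is one way to sanity-check the result; it is not part of the original procedure. The tool list and the "cuda" marker string are assumptions chosen to mirror the commands above, and the function names are hypothetical.

```python
import os
import shutil


def check_tool(name):
    """Return the resolved path of an executable, or None if it is not on PATH."""
    return shutil.which(name)


def cuda_on_library_path(env=None, marker="cuda"):
    """Report whether any LD_LIBRARY_PATH entry mentions the CUDA install directory."""
    env = os.environ if env is None else env
    entries = env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return any(marker in entry for entry in entries if entry)


def environment_report(env=None):
    """Collect a pass/fail flag for each tool used in the manual setup."""
    tools = ("git", "wget", "python3", "nvcc", "nvidia-smi")  # assumed tool list
    report = {tool: check_tool(tool) is not None for tool in tools}
    report["cuda_on_LD_LIBRARY_PATH"] = cuda_on_library_path(env)
    return report


if __name__ == "__main__":
    for item, ok in environment_report().items():
        print(f"{item:28s} {'OK' if ok else 'MISSING'}")
```

A missing `nvcc` usually means the PATH export was not applied in the current shell, so re-running `source ~/.bashrc` is the first thing to try.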

Insert image description here

3. Compare the effectiveness of HAI:

I completed the first official hands-on experiment, "How to use HAI to easily master AI painting". It took less than 10 minutes to start painting with Stable Diffusion, and less than 20 minutes to finish the whole experiment (as shown below). Many online articles already describe how to use it, so I won't repeat that here.

Insert image description here

The following comparison sets "purchasing and deploying by yourself" against "High-Performance Application Service HAI" on 7 business pain points. HAI greatly lowers the threshold for use and the cost of learning, allowing more companies and developers to join the AI application industry.

Insert image description here


V. Feasibility evaluation of improving the company's business efficiency with "High-Performance Application Service HAI":

Since the advent of AIGC (AI-generated content), AI painting represented by Stable Diffusion has grown explosively, triggering a revolution in productivity.

1. Evaluation of Stable Diffusion AI painting to help designers reduce costs and increase efficiency:

In a traditional design team, the usual designer’s workflow is as follows:

  • Operations put forward requirements for business intent and graphics
  • During design, the designer finds reference pictures that meet the requirements on "Qiantu.com", or draws some first drafts by hand.
  • The designer discusses the style and sketch with the business side and confirms that they meet the business requirements.
  • Model the design and adjust it for some detailed designs
  • Finally, the design is finalized and the design drawings are delivered

During this design phase, the most time-consuming steps before delivery to the business team are as follows:

  • Designers often need to find a large number of reference pictures on "Qiantu.com". This process is time-consuming and laborious, and typically takes half a working day or several times that.
  • If you encounter some more complex interactions, designers still need to draw design sketches by themselves, which is also more time-consuming.

Insert image description here

Now with the aid of AIGC drawing tools, the time for finding reference drawings and modeling and design sketches can be greatly shortened, and it also reduces the time for repeated communication and confirmation with the business. Using Stable Diffusion to generate design reference drawings can quickly confirm the design style with the business. After drawing a line draft, the design drawing is directly generated through Stable Diffusion. The designer then optimizes the details, which greatly improves the efficiency of the entire design process. Even simple drawings can be delivered directly to be completed by business units.

2. Evaluation of PyTorch model AI image recognition to help business units reduce costs and increase efficiency:

PyTorch is an open source machine learning library developed by Facebook Artificial Intelligence Research (FAIR). It provides a Python interface and supports dynamic computation graphs and distributed training.

Insert image description here

Case 1:
In the claims business, we often encounter cases such as vehicle scratches and malicious vehicle damage, and handling them often requires referring to a large number of past cases.
Business pain points:
①. Experienced staff can often resolve real problems quickly, but that depends on their personal experience.
②. With new staff, new scenarios, or new customers, problems may not be handled in a timely manner.
Improvement measures:
①. Implementing image search with PyTorch makes it possible to quickly find matching cases in the case pool.
②. This adds a reference basis and standardization for vehicle compensation, improves work efficiency, and reduces personnel costs.
Case 2:
When reviewing claim vouchers, submitted materials are often highly similar to vouchers in the historical records, which points to claims fraud, similar to what is commonly called "insurance fraud".
Business pain points:
①. Manual screening is inaccurate, time-consuming, and inefficient.
Improvement measures:
①. Use the same PyTorch image search technology; when an anomaly is detected, the case is transferred to the manual review process.
②. This improves the efficiency of claims review while combating fraud such as embezzlement and misappropriation, thereby reducing the risk in insurance claims.
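The hand-off to manual review in Case 2 can be sketched as a simple threshold rule over the similarity scores that an image search returns. This is a minimal illustration under assumptions: the `Match` type, the `route_claim` function, and the 0.95 threshold are hypothetical and would need tuning against real claim data.

```python
from dataclasses import dataclass


@dataclass
class Match:
    case_id: str
    score: float  # similarity to the new claim image; higher means more similar


def route_claim(matches, duplicate_threshold=0.95):
    """Route a new claim based on its nearest historical matches.

    If any historical voucher is nearly identical to the new one, the claim is
    sent to manual review; otherwise the matches serve as reference cases.
    """
    ranked = sorted(matches, key=lambda m: m.score, reverse=True)
    suspicious = [m for m in ranked if m.score >= duplicate_threshold]
    if suspicious:
        return "manual_review", suspicious
    return "auto_process", ranked


if __name__ == "__main__":
    decision, evidence = route_claim([Match("claim-001", 0.97), Match("claim-002", 0.41)])
    print(decision, [m.case_id for m in evidence])
```

Keeping the threshold as a parameter lets the business side trade review workload against fraud risk without touching the search itself.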

3. Evaluation of ChatGLM2-6B model AI dialogue to help business units reduce costs and increase efficiency:

The company's current customer service is a hybrid of traditional human agents and customer service robots that reply automatically with pre-configured, scripted copy. Merchants reduce labor costs, and consumers get faster responses and service.

Insert image description here

ChatGLM2-6B uses Multi-Query Attention, which improves the generation speed and also reduces the memory usage of KV Cache during the generation process. At the same time, ChatGLM2-6B uses Causal Mask for dialogue training, and the KV Cache from previous rounds can be reused during continuous dialogue, further optimizing the memory usage.
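To see why sharing key/value heads shrinks the KV Cache, the arithmetic can be worked through directly: the cache holds one key tensor and one value tensor per layer, and its size scales with the number of key/value heads rather than the number of query heads. The layer count, head counts, head dimension, and sequence length below are illustrative assumptions, not figures taken from the model's published configuration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Size of the KV cache for one sequence.

    The leading factor 2 accounts for storing both keys and values per layer;
    bytes_per_value=2 corresponds to 16-bit floats.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative values (assumptions for this sketch)
LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN = 28, 32, 128, 8192
SHARED_KV_HEADS = 2  # multi-query style: a few K/V heads shared by all query heads

full = kv_cache_bytes(LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN)
shared = kv_cache_bytes(LAYERS, SHARED_KV_HEADS, HEAD_DIM, SEQ_LEN)

if __name__ == "__main__":
    mib = 1024 * 1024
    print(f"one K/V head per query head: {full / mib:.0f} MiB")
    print(f"shared K/V heads:            {shared / mib:.0f} MiB")
    print(f"reduction factor:            {full // shared}x")
```

Under these assumptions the reduction factor equals the ratio of query heads to key/value heads, which is why the saving grows with longer conversations.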

Insert image description here

However, many consumers find the current pre-configured, scripted replies hard to understand, and even when they tell the intelligent customer service so, they get no other response. The AI dialogue of "Tencent Cloud High-Performance Application Service HAI" can help consumers better understand the problems they encounter, which is far more convenient.


VI. Stable Diffusion AI painting practical case reference:

1. AI image processing:

In actual work, business departments often need materials for activities, manuals, invitations, and so on. Unless otherwise specified, designs usually do not account for large-format uses such as "roll-up banners", "posters", and "publication design and printing", so when the picture is enlarged for scenes that require higher resolution and clarity, it appears blurry.

Insert image description here

As shown above, the design needs to be applied to a larger background area. If the design file is enlarged directly, its non-vector elements generally become blurred in scenes that require higher resolution and clarity. In that case, Stable Diffusion can enlarge and repair the image and improve its clarity, saving the labor cost of redrawing.

Let's introduce two of Stable Diffusion's AI upscaling approaches: post-processing and the Ultimate SD script plug-in. There are, of course, many other solutions. When doing high-definition restoration and enlargement with an upscaling algorithm, what matters most in a picture?

1.1 Post-processing plan:

Insert image description here

No. | Operation | Description
1 | Click the "Extras" tab | Corresponds to "post-processing" in the Chinese interface
2 | Upload the images that need high-definition enlargement | ① Single Image processes one image; ② Batch Process uploads multiple images; ③ Batch Process Directory (batch processing folder) selects a folder of images to process in batch
3 | Enter the desired scaling ratio in "Scale By" | Set this value according to the magnification you need
4 | "Upscaler 1" selects the upscaling algorithm | The two algorithm models "R-ESRGAN 4x+" and "R-ESRGAN 4x+ Anime6B" are recommended
5 | Click "Generate" to generate the image |
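Choosing a value for "Scale By" comes down to simple arithmetic: the pixel width the print needs divided by the pixel width you have. The sketch below works one example; the 150 DPI print resolution, the 80 cm banner width, and the 1080-pixel source are assumed figures for illustration only, and the function names are hypothetical.

```python
def required_scale(source_px, print_cm, dpi=150):
    """Scale factor needed so that source_px pixels cover print_cm at the given DPI."""
    target_px = print_cm / 2.54 * dpi  # 2.54 cm per inch
    return target_px / source_px


def upscale_passes(scale, per_pass=4):
    """Number of passes through an upscaler that enlarges per_pass times each run."""
    passes, reached = 0, 1
    while reached < scale:
        reached *= per_pass
        passes += 1
    return passes


if __name__ == "__main__":
    scale = required_scale(source_px=1080, print_cm=80, dpi=150)
    print(f"scale needed: {scale:.2f}")
    print(f"passes with a 4x upscaler: {upscale_passes(scale)}")
```

When the required scale is only slightly above what one pass delivers, a lower print DPI or a larger source image may be cheaper than a second pass.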

1.2 "Script plug-in" solution:

Insert image description here

No. | Operation | Description
1 | Click the "Extensions" tab | Corresponds to "Extensions" in the Chinese interface; installed plug-ins are managed here
2 | Click "script" |
3 | Click "Load from" to load the extension list | ① The extension list is loaded from an accelerated index; ② Loading takes a moment; ③ When loading completes, the view in the lower-left picture appears
4 | Search for "Ultimate SD" in the search area | ① You can search for installed or not-yet-downloaded plug-ins here; ② Matching plug-ins are listed below together with their installation status
5 | Click "Install" to install the "Ultimate SD" plug-in | After installation completes, the view in the upper-right picture appears
6 | In the "Extensions" tab, check whether the installation succeeded | Confirms that the "Ultimate SD" plug-in is installed
7 | Review the "Ultimate SD" plug-in installation information |
8 | Switch to the "Settings" tab |
9 | Click "Reload UI" to restart |

Insert image description here
You can see the expected result: the processing effect is indeed better. One problem is that processing is slow; it took nearly half an hour to generate a 17M image. For real production use, a higher configuration may be needed.
Insert image description here

1.3 Summary:

It can be seen that post-processing enlargement is not as good as redrawing in terms of detail, but it is simple, convenient, fast, and works on any picture; if the requirements are not high, it is a very useful function. By comparison, the SD upscale script yields higher resolution and richer detail, but takes longer to process.
Insert image description here

2. Business promotion poster generation:

In the early stage of design, when the designer has just received the requirement and understood the content elements and design style, they look for reference pictures on websites such as Huaban (Petals) for the business side to confirm. They may also need rough sketches to convey the composition, elements, and so on.

However, such reference pictures often cannot convey the designer's design ideas very well, and it takes a long time to find a suitable picture. At this time, if you enter relevant keywords through Stable Diffusion, you can generate an inspiration reference picture.

Insert image description here

In the picture above, the Mid-Autumn Festival serves as the example theme for generating posters to share with friends. With Stable Diffusion you can generate exquisite holiday material directly through "spells" (prompts), and then add text and other material during design and layout.

Insert image description here

No. | Operation | Description
1 | Click the "txt2img" tab | Generates pictures from text descriptions
2 | Enter the "Prompt" | The text description of the image you want to generate; English descriptions generally give better results
3 | Enter the "Negative prompt" | Describes the characteristics you do not want in the generated image
4 | Adjust "Sampling Steps" | ① If the detail of the generated image is insufficient, increase the sampling steps, at the cost of longer generation time; ② For most samplers, going beyond 50 steps brings little benefit
5 | Click "Generate" to generate the picture |
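The same "Prompt", "Negative prompt", and "Sampling Steps" inputs can also be sent programmatically, which helps when posters are produced in batches. The sketch below assumes the web UI was started with its HTTP API enabled (the `--api` launch flag) and is reachable at a base URL such as `http://127.0.0.1:7860`; the base URL, the example prompts, and the helper names are illustrative assumptions, and the request is only built here, not sent, unless `txt2img` is called.

```python
import json
import urllib.request


def build_txt2img_payload(prompt, negative_prompt="", steps=20, width=512, height=512):
    """Assemble the JSON body for a text-to-image request."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": steps,
        "width": width,
        "height": height,
    }


def txt2img(base_url, payload, timeout=300):
    """POST the payload to the txt2img endpoint and return the decoded JSON response."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    payload = build_txt2img_payload(
        prompt="mid-autumn festival poster, full moon, lanterns",
        negative_prompt="blurry, low quality, text",
        steps=30,
    )
    print(json.dumps(payload, indent=2))
    # To generate an image, call: txt2img("http://127.0.0.1:7860", payload)
```

Separating payload construction from the network call makes it easy to review or log the exact parameters before spending GPU time on them.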

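Then add the corresponding copy elements and lay them out in Photoshop to get a Mid-Autumn Festival poster full of atmosphere. To spare designers the pain of manually copying QR codes into posters, a system built with Vue + Java redraws the QR code with its parameters onto the poster, which reduces the designers' workload and also lowers the chance of errors.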

Insert image description here

Core Java code for redrawing posters that carry a QR code:

// Required imports: java.awt.*, java.awt.image.BufferedImage, java.io.*, java.net.URL,
// java.util.Map, javax.imageio.ImageIO (project-specific classes are not shown)
public static byte[] pressImage(ByteArrayInputStream input, FxPosterDTO poster,
                                float alpha, Map<String, String> textMap) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try {
        // Draw the poster image as the background
        Image target = ImageIO.read(new URL(poster.getPosterUrl()));
        int width = target.getWidth(null);
        int height = target.getHeight(null);
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.drawImage(target, 0, 0, width, height, null);

        // Draw the QR code, offset from the poster center
        Image qrCode = ImageIO.read(input);
        int reX = (width - qrCode.getWidth(null)) / 2;
        int reY = (height - qrCode.getHeight(null)) / 2;
        g.setComposite(AlphaComposite.getInstance(AlphaComposite.SRC_ATOP, alpha));
        g.drawImage(qrCode, reX + poster.getQrCodeX(), reY + poster.getQrCodeY(), null);

        // Replace the logo in the middle of the code
        if (HmbConstants.WECHAT_PROGRAM.equals(poster.getRemark1())) {
            int logoX = (width - 240) / 2;
            int logoY = (height - 240) / 2;
            // fxProject is not defined in this excerpt; it is assumed to exist in the enclosing class
            Image logoIO = ImageIO.read(new URL(fxProject.getRemark1()));
            g.drawImage(logoIO, logoX + poster.getPointX() + LOGO_OFFSET,
                    logoY + poster.getPointY() + LOGO_OFFSET, null);
        }

        // Poster text
        PosterQrcodeReq obFirst = JSON.parseObject(poster.getRemark4(), PosterQrcodeReq.class);
        PosterQrcodeReq obSecond = JSON.parseObject(poster.getRemark5(), PosterQrcodeReq.class);
        String contentFirst = Tools.isBlank(obFirst.getContentFirst()) ? "" : obFirst.getContentFirst();
        String contentSecond = Tools.isBlank(obSecond.getContentSecond()) ? "" : obSecond.getContentSecond();
        Color color = Color.WHITE;
        if (HmbConstants.POSTER_WORD_COLOR_B.equals(obSecond.getContentColor())) {
            color = Color.BLACK;
        }
        if (!Tools.isBlank(textMap)) {
            g.setRenderingHint(RenderingHints.KEY_TEXT_ANTIALIASING, RenderingHints.VALUE_TEXT_ANTIALIAS_ON);
            String waterMarkContent = textMap.get("userName").concat(contentFirst);
            g.setColor(color);
            g.setBackground(Color.WHITE);
            g.setFont(new Font("Microsoft YaHei", Font.BOLD, 24)); // font family, style, size
            g.drawString(waterMarkContent, reX + obFirst.getXFirst(), reY + obFirst.getYFirst());
            g.drawString(contentSecond, reX + obSecond.getXSecond(), reY + obSecond.getYSecond());
        }
        g.dispose();
        // Encode the composed poster; without this step the returned array would be empty
        ImageIO.write(image, FORMAT_NAME, bos);
    } catch (Exception e) {
        log.error(e.getMessage(), e);
    }
    return bos.toByteArray();
}

// Draws a logo (optionally scaled down) into the center of the QR code image.
// Also requires java.awt.geom.RoundRectangle2D.
private static void insertImage(BufferedImage source, InputStream imgPath, boolean needCompress) throws Exception {
    if (imgPath == null) {
        log.warn("Logo input stream is null; nothing to insert");
        return;
    }
    Image src = ImageIO.read(imgPath);
    int width = src.getWidth(null);
    int height = src.getHeight(null);
    if (needCompress) {
        // Compress the logo so it does not exceed WIDTH x HEIGHT
        if (width > WIDTH) {
            width = WIDTH;
        }
        if (height > HEIGHT) {
            height = HEIGHT;
        }
        Image image = src.getScaledInstance(width, height, Image.SCALE_SMOOTH);
        BufferedImage tag = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics g = tag.getGraphics();
        g.drawImage(image, 0, 0, null);
        g.dispose();
        src = image;
    }

    // Center the logo and draw a rounded border around it
    Graphics2D graph = source.createGraphics();
    int x = (QRCODE_SIZE - width) / 2;
    int y = (QRCODE_SIZE - height) / 2;
    graph.drawImage(src, x, y, width, height, null);
    Shape shape = new RoundRectangle2D.Float((float) x, (float) y, (float) width, (float) height, 6.0F, 6.0F);
    graph.setStroke(new BasicStroke(3.0F));
    graph.draw(shape);
    graph.dispose();
}

// Variant that resizes and clips the QR code before drawing it onto the poster.
public static byte[] pressImage(InputStream input, float f, FxPosterDTO req, boolean isFont) throws Exception {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    // Draw the poster image as the background
    Image target = ImageIO.read(new URL(req.getPosterUrl()));
    int width = target.getWidth(null);
    int height = target.getHeight(null);
    BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
    Graphics2D g = image.createGraphics();
    g.drawImage(target, 0, 0, width, height, null);

    // Resize the QR code: 200 px by default, otherwise a percentage of 200 px
    Image qrCode = ImageIO.read(input);
    int qrWidth = req.getQrCodeSize() == 0 ? 200 : 200 * req.getQrCodeSize() / 100;
    BufferedImage resizedQrCode = createResizedCopy(qrCode, qrWidth, qrWidth);

    int reX = (width - resizedQrCode.getWidth(null)) / 2;
    int reY = (height - resizedQrCode.getHeight(null)) / 2;

    // Clip the QR code (rounded corners) and draw it offset from the poster center
    BufferedImage clippedQrCode = setClip(resizedQrCode, 20);
    g.setComposite(AlphaComposite.getInstance(AlphaComposite.SRC_ATOP, f));
    g.drawImage(clippedQrCode, (width - 280) / 2 + req.getQrCodeX(), (height - 280) / 2 + req.getQrCodeY(), null);
    g.setRenderingHint(RenderingHints.KEY_TEXT_ANTIALIASING, RenderingHints.VALUE_TEXT_ANTIALIAS_ON);
    Color textColor = POSTER_WORD_COLOR_B.equals(req.getFontColor()) ? Color.BLACK : Color.WHITE;
    g.setColor(textColor);
    g.setBackground(Color.WHITE);
    if (!isFont) {
        // Draw the masked creator name
        g.setFont(new Font("AR PL UMing CN:style=Light", Font.BOLD, 30));
        g.drawString(PbmCodeUtils.mask(req.getCreateName(), 4, 3), reX + req.getFontX(),
                reY + req.getFontY());
    }
    g.dispose();
    ImageIO.write(image, FORMAT_NAME, bos);

    return bos.toByteArray();
}

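2.3 Summary:

After introducing Stable Diffusion and adding our in-house system for secondary processing of posters, posters can be produced quickly in batches, which speeds up promotion by the business departments and greatly reduces the designers' workload.

  • Stable Diffusion's image generation quickly produces the basic material
  • Photoshop, software dedicated to image processing, handles retouching, graphics editing, and color processing, plus drawing and output; it can add special effects, and combined with other tools it supports high-quality advertising design, art creation, and 3D animation production
  • Our in-house image post-processing function redraws distribution codes for different channels and businesses onto the poster, carrying the QR code parameters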

Insert image description here

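We hope to use AIGC tools to build a fully online, tool-driven workflow in which operations staff configure the festival, style, character, action, and so on to generate operations images automatically.

3.1 Img2img - QR code generation scenario for operations:

4. Summary:

"Tencent Cloud High-Performance Application Service HAI" provides quick deployment of Stable Diffusion and downloading of custom models. Users do not need to download code, install complex dependencies, or understand Git, Python, Docker, and similar technologies; a few mouse clicks in the console's graphical interface start the Stable Diffusion service for painting, so non-technical colleagues can handle it easily.

After talking with the designers, the cases above show that Stable Diffusion can complete only part of a project, or temporary, urgent projects with modest design requirements; where requirements are higher, designers still need to do secondary creation and optimization. AIGC is essentially an efficiency aid, and designers still need to master more advanced skills.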


VII. PyTorch model AI image recognition practical case reference:

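Tencent Cloud VectorDB is a fully managed, self-developed, enterprise-grade distributed database service dedicated to storing, retrieving, and analyzing multi-dimensional vector data. It supports multiple index types and similarity calculation methods; a single index supports vectors at the billion scale, with queries per second (QPS) in the millions and millisecond-level query latency. Besides serving as an external knowledge base for large models to improve answer accuracy, it can be widely used in AI fields such as recommendation systems, NLP services, computer vision, and intelligent customer service.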

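"Vector retrieval" in Tencent Cloud VectorDB is an information retrieval method based on the vector space model. Unstructured data is represented as vectors and stored in the vector database; vector retrieval then finds the target vectors by computing the similarity between the query vector and the stored vectors.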
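The idea behind similarity-based retrieval can be shown with a brute-force sketch: score every stored vector against the query with cosine similarity and keep the best K. This only illustrates the principle; a vector database uses an index rather than a full scan, and the tiny two-dimensional vectors and function names below are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, stored, k=3):
    """Return the k (doc_id, score) pairs most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in stored.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    stored = {"a.jpg": [1.0, 0.0], "b.jpg": [0.0, 1.0], "c.jpg": [1.0, 1.0]}
    for doc_id, score in top_k([1.0, 0.0], stored, k=2):
        print(f"{doc_id}: {score:.4f}")
```

A full scan like this costs time proportional to the number of stored vectors, which is the cost that approximate indexes are designed to avoid at large scale.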

Insert image description here

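With this feature, we can use PyTorch's transforms module and the pre-trained ResNet50 model to convert an input image into a feature vector and store it in the vector database. Matches are then ranked from high to low by the similarity score of their embeddings; the larger the score, the higher the similarity.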

Insert image description here

1. First, install the vector database SDK as the official documentation requires:

Insert image description here

After installation you can see that the download was very fast, peaking at 2.0 MB per second, thanks to the built-in academic acceleration.

Insert image description here

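pandas is a tool built on NumPy, created for data analysis tasks. It brings in a large number of libraries and some standard data models, and provides the tools needed to operate on large data sets efficiently.

2. Download the ResNet50 model:

ResNet50 is a deep convolutional neural network (CNN) used for image classification and object detection.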

Insert image description here

3. Use a Python script to store the image feature vectors in the database:

import csv
import torchvision.transforms as transforms
from PIL import Image
import pandas as pd
import torch
import torchvision.models as models
from glob import glob
from pathlib import Path
from torchvision.models import ResNet50_Weights
import tcvectordb
from tcvectordb.model.document import Document, SearchParams, Filter
from tcvectordb.model.enum import FieldType, IndexType, MetricType, ReadConsistency
from tcvectordb.model.index import Index, VectorIndex, FilterIndex, HNSWParams

# disable/enable http request log print
tcvectordb.debug.DebugEnable = False

client = tcvectordb.VectorDBClient(url='http://xxx.tencentclb.com:20000', username='root', key='xxxx', read_consistency=ReadConsistency.EVENTUAL_CONSISTENCY, timeout=30)

print(client)

# Test Image Path
TEST_IMAGE_PATH = './images/*.jpg'

# Initialize model
# model = models.resnet50(pretrained=True)
model = models.resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)

# Set model to eval mode
model = model.eval()
model = torch.nn.Sequential(*(list(model.children())[:-1]))

# Load image path
def load_image(x):
    if x.endswith('csv'):
        with open(x) as f:
            reader = csv.reader(f)
            next(reader)
            for item in reader:
                yield item[1]
    else:
        for item in glob(x):
            yield item

# Embedding: extract a feature vector from an image
def extract_features(image_path):
    transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    # Read the image and make sure it is in RGB mode
    img = Image.open(image_path).convert('RGB')
    img = transform(img)
    # Add a batch dimension: (3, 224, 224) -> (1, 3, 224, 224)
    img = img.unsqueeze(0)
    # Inference only, so gradient tracking is not needed
    with torch.no_grad():
        feature = model(img)
    # Reshape the features to 2D: (1, 2048)
    feature = feature.view(feature.shape[0], -1)
    return feature.detach().numpy()

def display_multiple_embeddings(image_path_pattern):
    # Iterate lazily over all image paths that match the pattern
    image_paths = load_image(image_path_pattern)
    for index, img_path in enumerate(image_paths):
        print(img_path)
        feature = extract_features(img_path)
        # Convert the features to a pandas DataFrame
        d = pd.DataFrame(feature)
        # Store the vector under a string id derived from its position
        upsert_data(str(index), img_path, d)

def create_db_and_collection():
    database = 'ai'
    coll_name = 'ai_images'
    coll_alias = 'ai_images_alias'

    # Create the database
    db = client.create_database(database)

    # Create the Collection

    # Step 1: design the indexes (not the structure of the Collection)
    # 1. [Important] Do not index the text field that corresponds to the vector;
    #    it wastes a lot of memory and serves no purpose.
    # 2. [Required indexes] The primary key id and the vector field are currently
    #    fixed and required; see the example below.
    # 3. [Other indexes] Fields used as query conditions need an index, otherwise they
    #    cannot be filtered on at query time. Fields that are never filtered need no
    #    index, which would only waste memory.
    # 4. The vector database supports dynamic schemas: any field can be written
    #    without defining it in advance, similar to MongoDB.
    # 5. In this example each document holds {id, vector, imageUrl}: id is the globally
    #    unique primary key, vector gets the vector index (2048 dimensions, matching the
    #    ResNet50 features), and imageUrl gets a filter index so it can be used in queries.
    index = Index()
    index.add(VectorIndex('vector', 2048, IndexType.HNSW, MetricType.COSINE, HNSWParams(m=16, efconstruction=200)))
    index.add(FilterIndex('id', FieldType.String, IndexType.PRIMARY_KEY))
    index.add(FilterIndex('imageUrl', FieldType.String, IndexType.FILTER))

    # Step 2: create the Collection
    db.create_collection(
        name=coll_name,
        shard=3,
        replicas=0,
        description='ai images collection',
        index=index,
        embedding=None,
        timeout=20
    )

    # Set the alias of the Collection
    db.set_alias(coll_name, coll_alias)

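def upsert_data(index, img_path, d):
    # Get the Collection object
    db = client.database('ai')
    coll = db.collection('ai_images')

    # Write data with upsert; there may be some delay
    # 1. Dynamic schema: besides the required id and vector fields, any other field may be written.
    # 2. upsert overwrites: if the document id already exists, the new data replaces the old
    #    (the old document is deleted, then the new one is inserted).

    document_list = [
        Document(id=index,
                 # Flatten the (1, 2048) feature DataFrame into a plain list of floats
                 vector=d.values.flatten().tolist(),
                 imageUrl=img_path)
    ]
    coll.upsert(documents=document_list)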

if __name__ == '__main__':
    create_db_and_collection()
    display_multiple_embeddings(TEST_IMAGE_PATH)

Insert image description here

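4. Query API logic:

def query_data(self, vector):
    # Get the Collection object (the database and collection created above)
    db = self._client.database('ai')
    coll = db.collection('ai_images')

    # Batch similarity search: find the Top K most similar results for each query vector
    res = coll.search(
        vectors=[vector],  # query vectors, at most 20
        params=SearchParams(ef=200),  # required for HNSW indexes; a larger ef raises recall but slows the search
        retrieve_vector=False,  # whether to return the vector field
        limit=10  # Top K
    )
    # The result is a 2D array: one group of results per query vector passed to search.
    # print_object is a printing helper that is not shown in this excerpt.
    print_object(res)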

The API returns information about each image, such as its size, type, resolution, and time. The most important value is the score:

Insert image description here
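
As a minimal sketch of how a caller might consume such a response, assume the search result is a list with one inner list per query vector, and that each hit is a dict-like record carrying a `score` and the `imageUrl` field written earlier. The helper name `top_hits`, the threshold value, and the sample data are hypothetical. For the COSINE metric configured on the index, a higher score is assumed to mean greater similarity.

```python
from typing import Any, Dict, List


def top_hits(results: List[List[Dict[str, Any]]], min_score: float = 0.8) -> List[List[str]]:
    """For each query vector, return the imageUrl of every hit scoring at least min_score."""
    filtered = []
    for hits in results:  # one group of hits per query vector
        # Sort defensively by score (highest first), then apply the threshold
        ranked = sorted(hits, key=lambda h: h["score"], reverse=True)
        filtered.append([h["imageUrl"] for h in ranked if h["score"] >= min_score])
    return filtered


# Made-up response for a single query vector (assumed structure)
sample = [[
    {"id": "0001", "score": 0.97, "imageUrl": "images/cat_1.jpg"},
    {"id": "0002", "score": 0.61, "imageUrl": "images/dog_3.jpg"},
    {"id": "0003", "score": 0.88, "imageUrl": "images/cat_7.jpg"},
]]

print(top_hits(sample))
```

A suitable threshold depends on the embedding model and the data, so it should be tuned on real queries rather than taken from this sketch.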

Front-end page display logic:

<template>
  <scroll-view
      class="media-view"
      scroll-y
      @scrolltolower="loadMore"
  >
    <unicloud-db
      ref="mediaUdb"
      v-slot:default="{data, loading, error, pagination}"
      :collection="collection"
      orderby="createDate desc"
      loadtime="manual"
      :page-size="50"
      @load="onMediaListLoad"
    >
      <view v-if="(loading || processing) && pagination.current === 1" class="loading">
        <uni-icons class="icon" type="spinner-cycle" size="30" color="#000"></uni-icons>
      </view>
      <view class="items" v-else-if="mediaList.length">
        <view
            class="media-item"
            :class="{active: mediaItem.active, selected: mediaItem.selected}"
            v-for="(mediaItem, index) in mediaList"
            @click="onSelect(index)"
            :key="mediaItem._id"
        >
          <view class="image" v-if="mediaItem.type === 'image'">
            <image :src="mediaItem.isUploading ? mediaItem.src: mediaItem.thumb.listCover" mode="aspectFill" class="img"></image>
          </view>
          <view class="image" v-if="mediaItem.type === 'video'">
            <video
                class="v"
                :src="mediaItem.src"
                :controls="false"
                :show-center-play-btn="false"
                v-if="mediaItem.isUploading || /^cloud:\/\//.test(mediaItem._src)"
            ></video>
            <image v-else :src="mediaItem.thumb.listCover" mode="aspectFill" class="img"></image>
          </view>
          <view class="mask" v-if="mediaItem.isUploading">
            <view class="progress">
              <view class="inner" :style="{width: mediaItem.progress + '%'}"></view>
            </view>
            <view class="tip">{{mediaItem.tip}}</view>
          </view>
        </view>
      </view>
      <view class="media-library-isnull" v-else>
        <uni-icons type="images" size="60" color="#ccc"></uni-icons>
        <view class="text">The media library is empty. Upload resources?</view>
        <button
            type="primary"
            size="mini"
            @click="$emit('onUploadMedia');"
        >Upload media resources</button>
      </view>
    </unicloud-db>

  </scroll-view>
</template>

The page looks as follows. Searching with a single image finds similar images, which is very practical.

Insert image description here

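Summary:

By combining the PyTorch framework in "High-Performance Application Service HAI" with Tencent Cloud VectorDB, we implemented a text/image retrieval task, which means searching a large-scale text/image database for the results most similar to a given query. The text/image features used for retrieval are stored in the vector database, whose high-performance index storage enables efficient similarity computation and returns the text/image results that match the query.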
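
To make the similarity computation concrete, here is a minimal pure-Python sketch of cosine-similarity retrieval. It is a brute-force scan over a tiny in-memory dictionary, not what the vector database does internally: the database replaces the scan with an approximate index such as the HNSW index configured earlier. The function names, the three-dimensional vectors, and the file names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], store: Dict[str, List[float]], k: int = 2) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and return the k best matches."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy feature store: real image embeddings would have far more dimensions
features = {
    "cat_1.jpg": [0.9, 0.1, 0.0],
    "cat_2.jpg": [0.8, 0.2, 0.1],
    "car_1.jpg": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

for name, score in top_k(query, features):
    print(f"{name}: {score:.4f}")
```

The brute-force scan costs time proportional to the number of stored vectors, which is why an index becomes necessary at large scale.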


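VIII. Practical case reference: AI customer service for the "car insurance" vertical:

Most companies that provide internet applications have online customer service positions. In the past, customer service departments had to recruit large numbers of professionals and train them internally for some time before they could start work, which often caused the following problems: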

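No. | Category | Description
1 | Human agents running at high load | Human agents cannot cope with the flood of visitors at peak times, which leads to slow responses, long queues, and insufficiently professional service
2 | Risk of core data leakage | Human agents can access a wide range of customer data, so some sensitive business data is at risk of exposure and may be leaked
3 | 24/7 service | Customer service staff often still have to work after hours or while on leave, resulting in long working hours
4 | "Serverless-style" business service | When business is busy, many people must be hired to support growth; in slack periods, staff must be cut to keep company spending under control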

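The second lab manual, "Future Dialogue: Create Your Own Personal Knowledge Universe with HAI", includes item 5, "Quickly deploying the ChatGLM2-6B-int4 local model with High-Performance Application Service HAI and fine-tuning based on P-Tuning v2". This is useful for our existing business: we prepare our own training set, fine-tune the ChatGLM2-6B model (based on P-Tuning v2), and create a dedicated knowledge base for the enterprise's vertical domain.

The following is the training set used for the test. Taking the QA pairs of a recent project as an example, we collected this list: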

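[
  {"content": "车保赔巨王卡", "summary": "车保赔巨王卡 is a comprehensive insurance service launched by xx company, mainly serving all vehicles in xx city. The coverage period is only one year, from January 1, 2026 to December 30, 2026."},
  {"content": "车保赔巨王卡 coverage period", "summary": "The coverage period is only one year, from January 1, 2026 to December 30, 2026."},
  {"content": "How to file a claim with 车保赔巨王卡", "summary": "Follow the official account 'XX Car Insurance' and enter 'claim' to carry out the claim process."}
]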
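
Some fine-tuning scripts expect one JSON object per line instead of a single JSON list. Whether that applies to the script used here is an assumption to verify against the manual. If it does, the conversion can be sketched as below, checking that every record has the `content` and `summary` fields shown above. The function names and the output file name `train.json` are hypothetical.

```python
import json
from typing import Dict, List

REQUIRED_KEYS = ("content", "summary")


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Validate QA records and serialize them as one JSON object per line."""
    lines = []
    for i, rec in enumerate(records):
        missing = [key for key in REQUIRED_KEYS if not rec.get(key)]
        if missing:
            raise ValueError(f"record {i} is missing fields: {missing}")
        # ensure_ascii=False keeps non-ASCII text readable in the output
        lines.append(json.dumps({key: rec[key] for key in REQUIRED_KEYS}, ensure_ascii=False))
    return "\n".join(lines) + "\n"


def write_dataset(path: str, records: List[Dict[str, str]]) -> None:
    """Write the records to disk in JSON Lines form (not called in this sketch)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_jsonl(records))


qa_pairs = [
    {"content": "车保赔巨王卡 coverage period",
     "summary": "The coverage period is only one year, from January 1, 2026 to December 30, 2026."},
    {"content": "How to file a claim with 车保赔巨王卡",
     "summary": "Follow the official account 'XX Car Insurance' and enter 'claim'."},
]

print(to_jsonl(qa_pairs), end="")
```

Calling `write_dataset("train.json", qa_pairs)` would then produce the file. Validating before training is cheap compared with discovering a malformed record partway through a long training run.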

Run the "sh train_chat.sh" command to start training:

Insert image description here

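The model starts training. The larger the dataset, the longer it takes. With the three-sample training and validation sets used in this test, training takes about one hour.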

Insert image description here

The comparison before and after deploying the ChatGLM2-6B-int4 local model and fine-tuning it based on P-Tuning v2 is shown below:
Insert image description here

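After fine-tuning based on P-Tuning v2, the answers are visibly closer to the business vertical. However, it is not realistic to have customers use this interface directly. Below, we integrate the model into our customer service system through its API to see whether it can "simulate" a human agent and thereby reduce the cost of customer service staff.

Following the manual, enter these commands to start the API service: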


cd ./ChatGLM2-6B
python api.py

Send a POST request with Postman. As shown below, the request succeeds.
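
The same request can also be sent from Python. The sketch below assumes the `api.py` shipped with ChatGLM2-6B is listening at `http://127.0.0.1:8000` and accepts a JSON body with `prompt` and `history` fields, returning a JSON object that contains `response` and `history`. Verify the address and field names against your own deployment. The helper names `build_request` and `ask` are hypothetical.

```python
import json
import urllib.request
from typing import List, Optional, Tuple

API_URL = "http://127.0.0.1:8000"  # assumed address of the api.py service


def build_request(prompt: str, history: List[List[str]], url: str = API_URL) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps({"prompt": prompt, "history": history}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, history: Optional[List[List[str]]] = None,
        url: str = API_URL, timeout: float = 60.0) -> Tuple[str, List[List[str]]]:
    """Send the prompt and return (answer, updated history); needs a running service."""
    request = build_request(prompt, history or [], url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    return payload["response"], payload["history"]


# Inspect the request that would be sent (no network access happens here)
req = build_request("How do I file a claim?", [])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the returned history back into the next `ask` call is what would let a customer service front end keep a multi-turn conversation going.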

Insert image description here

The customer service system we developed can then be connected to it. The relevant Vue code is as follows:

<template>
	<view>
		<scroll-view scroll-with-animation scroll-y="true"  @touchmove="hideKey"
		style="width: 750rpx;" :style="{'height':srcollHeight}" :scroll-top="go" >
			<view id="okk" scroll-with-animation >
			<view  class="flex-column-start" v-for="(x,i) in msgList" :key="i">
				<view v-if="x.my" class="flex justify-end padding-right one-show  align-start  padding-top" >
					<view class="flex justify-end"  style="width: 400rpx;">
						<view class="margin-left padding-chat bg-cyan" style="border-radius: 35rpx;">
							<text   style="word-break: break-all;">{{x.msg}}</text>
						</view>
					</view>
				</view>
				<view v-if="!x.my" class="flex-row-start margin-left margin-top one-show" >
					<view class="chat-img flex-row-center">
						<image style="height: 75rpx;width: 75rpx;" src="../../static/image/ke.png" mode="aspectFit"></image>
					</view>
					<view  class="flex"  style="width: 500rpx;">
						<view class="margin-left padding-chat flex-column-start" style="border-radius: 35rpx;background-color: #f9f9f9;">
							<text  style="word-break: break-all;" >{{x.msg}}</text>
							<view class="flex-column-start" v-if="x.type==1" style="color: #2fa39b;">
								<text style="color: #838383;font-size: 22rpx;margin-top: 15rpx;">Here are some frequently asked questions. Tap one to view it:</text>
								<text @click="answer(index)" style="margin-top: 30rpx;" 
								v-for="(item,index) in x.questionList" :key="index" >{{item}}</text>
							</view>
						</view>
					</view>
				</view>
		</view>
		<view v-show="msgLoad" class="flex-row-start margin-left margin-top">
			<view class="chat-img flex-row-center">
				<image style="height: 75rpx;width: 75rpx;" src="../../static/image/robt.png" mode="aspectFit"></image>
			</view>
			<view  class="flex"  style="width: 500rpx;">
				<view class="margin-left padding-chat flex-column-start" 
				style="border-radius: 35rpx;background-color: #f9f9f9;">
					<view class="cuIcon-loading turn-load" style="font-size: 35rpx;color: #3e9982;">
						
					</view>
				</view>
			</view>	
		</view>
		<view style="height: 120rpx;">
		</view>
		</view>	
		</scroll-view>
		<view class="flex-column-center" style="position: fixed;bottom: -180px;"
		:animation="animationData" >		
			<view class="bottom-dh-char flex-row-around" style="font-size: 55rpx;">
				 <input  v-model="msg"  class="dh-input" type="text" style="background-color: #f0f0f0;" 
				 @confirm="sendMsg" confirm-type="search" placeholder-class="my-neirong-sm"
				 placeholder="Say something..." /> 
				 <view @click="sendMsg" class="cu-tag bg-cyan round">
				 	Send message
				 </view>
				<text @click="ckAdd" class="cuIcon-roundaddfill text-brown"></text>
			</view>
		</view>
	</view>
</template>

Locally, a proxy forwards requests to the ChatGLM2-6B server, which completes the integration into the customer service system under development.

Insert image description here

The following is the log output from the ChatGLM2-6B server.

Insert image description here

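Summary:

By quickly deploying the ChatGLM2-6B-int4 local model and fine-tuning it based on P-Tuning v2, an enterprise can build a private customer service knowledge system for its vertical domain. The system helps understand users' questions and provides accurate, timely answers and solutions. Such an AI-based automated customer service system aims to improve customer satisfaction and reduce enterprise costs.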

Insert image description here

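At present it is mainly used in scenarios such as customer inquiries. It can help enterprises greatly improve service efficiency and quality and reduce labor costs, and it also helps them cope better with increasingly complex customer needs and market changes.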

Insert image description here

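Of course, AI customer service still has its limitations and cannot completely replace human agents. Collaboration between humans and conversational AI will therefore be the direction of future customer service, with both working together to serve users better.

Compared with traditional customer service, the applied AI capabilities based on "High-Performance Application Service HAI" can use natural-language prompts to develop automated bots. Backed by large language models (LLMs), they make problem solving more intelligent and efficient and speed up the effective handling of issues.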


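Some readers may be encountering the concept of a GPU for the first time. Next, I will explain what a GPU is, what its characteristics are, and why it is so popular in fields such as AI, deep learning training, and image processing.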

IX. What is a GPU? How does a GPU differ from a CPU? What are the application scenarios of GPUs?

1. What is a GPU?

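GPU stands for Graphics Processing Unit, a processor designed specifically for computer graphics and images. It accelerates graphics rendering and processing, improving graphics performance and quality. Compared with a CPU, a GPU has more processing units and greater parallel processing capability, so it can process large amounts of graphics and image data faster.

The main functions of a GPU include graphics rendering, image processing, and compute acceleration.

  • In fields such as games, animation, and visual effects, the GPU is an essential component for high-quality graphics and images.
  • In fields such as scientific computing and deep learning, the GPU can also serve as a compute accelerator, greatly improving computing speed and efficiency.

A GPU works by using many processing units to handle graphics, image, and compute tasks in parallel, which improves processing speed and efficiency. These processing units are distributed across different compute cores and compute units and can process multiple tasks at the same time. In addition, GPUs use technologies such as high-speed caches and video memory to optimize data storage and access, further improving performance.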
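
The data-parallel idea can be illustrated without a GPU. The sketch below is only an analogy in plain Python: it applies the same operation to independent chunks of a pixel array using a thread pool and checks that the result matches a sequential pass. The function names, chunk size, and pixel values are hypothetical, and Python threads will not deliver GPU-like speedups. The point is that each element can be computed independently of the others, which is the property a GPU exploits with its many processing units.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def brighten(pixels: List[int], delta: int = 40) -> List[int]:
    """Sequential version: add delta to every pixel, clamped to the range 0..255."""
    return [min(255, max(0, p + delta)) for p in pixels]


def brighten_parallel(pixels: List[int], delta: int = 40,
                      chunk_size: int = 4, workers: int = 4) -> List[int]:
    """Split the pixels into chunks and process the chunks concurrently."""
    chunks = [pixels[i:i + chunk_size] for i in range(0, len(pixels), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the same order as the input chunks
        results = list(pool.map(lambda chunk: brighten(chunk, delta), chunks))
    return [p for chunk in results for p in chunk]


image = [0, 10, 100, 200, 230, 250, 255, 128, 64, 32]

# No pixel depends on another, so both versions must agree
assert brighten_parallel(image) == brighten(image)
print(brighten_parallel(image))
```

Workloads in which every output depends on the previous one do not split up this way, which is one reason such sequential logic stays on the CPU.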


2. Different characteristics of CPU and GPU:

Insert image description here

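3. Is a graphics card a GPU?

What is commonly called a graphics card refers to a device on which a GPU is installed.

As the figure shows, besides the GPU a graphics card also includes video memory, a power supply module, a bus, fans, the graphics card BIOS, peripheral devices, and other components. The graphics card converts the data sent by the CPU into image signals and controls the image output of the monitor.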

Insert image description here

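As can be seen, in scenarios that require heavy image processing or computation, a GPU can complete tasks more efficiently than a CPU. Modern graphics cards are therefore widely used to accelerate computation in machine learning, deep learning, and AI, and are even used in scientific computing, astronomy, geology, meteorology, quantum science, and many other fields.

4. Different structural composition: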

Insert image description here

5. The difference in plain language:

Insert image description here

6. Nvidia product matrix:

Insert image description here


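X. Considerations on the company's path to cost reduction and efficiency improvement in AIGC business:

AIGC is a new kind of artificial intelligence technology. Its full name is Artificial Intelligence Generated Content, that is, content generated by artificial intelligence. At this stage, AIGC mostly appears as single-model applications, mainly divided into text generation, image generation, video generation, and audio generation, with text generation serving as the foundation for the other kinds.

Using "Tencent Cloud High-Performance Application Service HAI", I worked through cases of AI painting, AI deep learning, and AI LLM models. Deployment is simple and maintenance is convenient, which reduces workload and addresses cumbersome steps, low efficiency, and time costs, while also improving overall system performance and user experience.

The following are the points I personally found most efficiency-boosting during the trial: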

Insert image description here
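Meanwhile, when trying out AIGC applications, "Tencent Cloud High-Performance Application Service HAI" can greatly increase the speed of content generation and save time and resources, and it can easily handle large-scale content generation needs.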

Insert image description here

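No. | Category | Description
1 | Improve production efficiency | "Tencent Cloud High-Performance Application Service HAI" greatly improves production efficiency and further optimizes production processes.
2 | Reduce operating costs | "Tencent Cloud High-Performance Application Service HAI" can reduce an enterprise's operating costs. It helps the enterprise make more accurate production decisions, which lowers production costs, and improves data processing capability and response speed.
3 | Optimize resource utilization | "Tencent Cloud High-Performance Application Service HAI" helps enterprises plan production and resource allocation better, improving the efficiency of resource utilization.

Of course, it is not that AI replaces people. Rather, people who can use AI conversational models and AI painting tools will replace those who cannot harness AI tools and stick to traditional ways of working. With "Tencent Cloud High-Performance Application Service HAI", an enterprise can aim for "one person doing the work of a team" and "supporting 2-3 times the previous business volume".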

Insert image description here

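Meanwhile, while working through the manuals above and researching needs inside my own company, I also did some SWOT thinking about "Tencent Cloud High-Performance Application Service HAI" in practical applications: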

Insert image description here


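XI. Future outlook for other AI scenarios in the company's business:

In the new wave of AIGC technology, putting "Tencent Cloud High-Performance Application Service HAI" into practice and promoting its adoption within the company raises the following questions: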

Insert image description here

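  • What can we gain by applying "Tencent Cloud High-Performance Application Service HAI" to our business?
  • How can we switch quickly and smoothly from the traditional system to "Tencent Cloud High-Performance Application Service HAI"?
  • From the perspective of machine learning algorithm design, what impact and changes will it bring?
  • Across the AIGC ecosystem, with its many technical routes and architecture choices, how do we determine that "Tencent Cloud High-Performance Application Service HAI" is a suitable path for our own scenarios?

The company has passed through its start-up and ramp-up phases, quickly attracted customers in the industry, and gained a certain business volume. Going forward, on the basis of the existing business, it wants to improve its market competitiveness and implement its cost-down principles, and it hopes that more AIGC toolchains will help it carry out its AI strategy.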

Insert image description here

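In fact, the cases above using the AIGC tools Stable Diffusion, PyTorch, and ChatGLM2-6B show that an AIGC toolset can speed up business processing compared with the traditional manual way of working.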


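XII. Summary:

High-Performance Application Service (Hyper Application Inventor, HAI) is a GPU application service product for AI and scientific computing that provides plug-and-play computing power and common environments. It helps small and medium-sized enterprises and developers quickly deploy high-performance applications such as LLM, AI painting, and data science, and natively integrates supporting development tools and components. This greatly improves development productivity at the application layer, reduces operating costs, improves product quality, and optimizes resource utilization.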

Insert image description here

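No. | Category | Description
1 | Easy to use | Simplifies the configuration of infrastructure such as compute, network, and storage, greatly reducing the complexity of operating and managing cloud services.
2 | Fast deployment of application environments | Supports rapid deployment of multiple AI environments, such as ChatGLM-6B and StableDiffusion, so that users can focus on business and application innovation.
3 | High flexibility | Users can log in to instances and flexibly configure AI models and instance environments, for internal development, business testing, or providing services externally.
4 | Multiple login methods | In addition to traditional connection methods, supports one-click startup through jupyterlab, WebUI, and other methods, offering login options better suited to the usage scenario.
5 | Rich variety of computing power | Provides a range of computing power packages, with more types to be added in the future.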

Insert image description here

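Thanks to the continuous iteration and breakthroughs of artificial intelligence technology, High-Performance Application Service HAI came into being. Applied AI can drive business functions across industries, such as marketing and sales, product and R&D, and customer operations, helping enterprises enhance customer experience, improve employee productivity and creativity, and optimize business processes. Tencent Cloud is committed to making generative AI accessible to all and empowering thousands of industries to keep innovating.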

Origin blog.csdn.net/wanmeijuhao/article/details/134769960