How far away are we from the next era of Ultra HD?


Low-latency interactive live streams, endlessly scrolling short videos, 1080p movies and TV series... Now that ultra-high-definition video is readily available, our tolerance for low-definition, stuttering video keeps shrinking.

According to the "2022 China Online Audiovisual Development Research Report", as of December 2021 the number of online video (including short video) users in China had reached 975 million, an increase of 47.94 million from December 2020, accounting for 94.5% of all Internet users.

Behind these numbers lies the combined pressure of storage, bandwidth, and computing power costs.

If you want to pursue the ultimate audio-visual experience when watching an ultra-high-definition movie, then the video requires 16 times the computing power, 12 times the storage, and 10 times the bandwidth.

What if 100 people watch it at the same time?

This is where we urgently need a real-time live media processing platform that offers low cost, a high compression rate, and some enhancement capabilities, along with the trump card behind it: the encoding and decoding solution.

1

You can have your cake and eat it too

According to Agora's data analysis, users watching in high definition stay in a channel 10.3% longer than users watching in standard definition. High-definition images make viewers more willing to stay on the platform and increase user stickiness.

But high-definition video does not come for free, and the cost pressure behind it should not be underestimated. To cope with the continuous growth of video traffic, video standards organizations have kept iterating on video encoding technology. Starting from MPEG-2, the compression rate of video coding standards has improved by about 50% every 10 years. Take H.266, launched in 2021, as an example: its compression rate is 50% better than that of H.265, but its encoding compute cost has increased by a factor of 15.
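To make the trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes that "compression rate improved by 50%" means half the bitrate at the same quality, and it treats MPEG-2, H.264, H.265, and H.266 as one generation apart each; both are simplifying assumptions, not figures from the report. The function and constant names are made up for illustration.

```python
# Assumption: a "50% better compression rate" is read as
# "half the bitrate at the same visual quality".
BITRATE_FACTOR_PER_GENERATION = 0.5

# Figure quoted in the article: H.266 encoding costs 15x the compute of H.265.
H266_ENCODE_COST_VS_H265 = 15


def relative_bitrate(generations: int) -> float:
    """Bitrate relative to the starting codec after N codec generations."""
    return BITRATE_FACTOR_PER_GENERATION ** generations


# Illustrative generation list, one step per codec standard.
codecs = ["MPEG-2", "H.264", "H.265", "H.266"]
for steps, name in enumerate(codecs):
    print(f"{name}: {relative_bitrate(steps):.3f}x the MPEG-2 bitrate")

# The last step in isolation: bandwidth halves while encode compute grows 15x.
print(f"H.266 vs H.265: {relative_bitrate(1):.1f}x bitrate, "
      f"{H266_ENCODE_COST_VS_H265}x encode compute")
```

Under these assumptions, each generation halves storage and bandwidth, but the latest step is paid for with a much larger jump in encoding compute, which is the pressure the rest of this article addresses.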

Faced with next-generation encoding that costs more than 10 times as much, traditional CPU capabilities alone can no longer cope, and the slowdown of Moore's Law makes a sudden leap in CPU performance unlikely. If the CPU alone can't do it, what about bringing in GPUs and AI?


According to the public financial reports of relevant companies, video transcoding and bandwidth costs have come to account for 10% of annual revenue.

AI is indeed a good helper. Across the full video transcoding and streaming workflow, AI can take over content review, understanding, editing, and transcoding. However, while AI improves the image quality of video encoding and decoding, the computing power it requires should not be underestimated.

The high cost of GPUs is prohibitive. Enterprises dare not stock up on a large number of GPU cards at once, not to mention that GPU transcoding cannot achieve the same high compression rate as CPUs.

Given these pain points, neither a CPU-only nor a GPU-only architecture can fully meet the need, and a comparison of the two shows no clear winner.

So the question is, is there a way to combine the two without increasing costs?

There really is.

We all know that video encoding hardware platforms are flourishing, including CPUs, GPUs, proprietary chips, and even FPGAs... But for video transcoding (especially transcoding of hot, high-traffic data), the CPU is still the first choice, precisely because it has two irreplaceable advantages: 1. High flexibility; 2. High reusability.

So, if AI is embedded into transcoding, can the entire transcoding solution be implemented on the CPU?

In the fourth-generation Intel® Xeon® Scalable processor released at the beginning of this year, Intel made a major innovation: several built-in hardware accelerators that boost performance in different scenarios. Among them, the AI acceleration of AMX (Intel® Advanced Matrix Extensions) fills the gap in CPU-based coding and makes full-link intelligent coding possible.


On Intel's fourth-generation Xeon®, each physical core has its own built-in AMX acceleration unit.

So, who says you can’t have your cake and eat it too?

2

Ranked first for four consecutive years, how did Tencent Cloud do it?

As the saying goes, true knowledge comes from practice, and Tencent Cloud's pursuit of best practice is a good example.

As 4K/8K video gradually enters households everywhere, consumers' viewing habits are shifting toward high definition and ultra-high definition. For a leading high-definition video service provider like Tencent Cloud, technology choices have become very important.

In ultra-fast high-definition transcoding, Tencent Cloud supports 40%+ of the media processing services on the entire network.

In terms of technology selection, it was the irreplaceable advantages of the CPU that led Tencent Cloud to pass over hardware solutions and switch to pure CPU encoder processing. So how does the fourth-generation Xeon® help Tencent Cloud with 4K/8K ultra-high-definition decoding?

Let’s start with cost reduction

Super-resolution, computing power, and upgrades

As mentioned earlier, the high flexibility of the CPU makes upgrades almost cost-free. Through algorithm design, a pure CPU encoder can achieve higher compression rates than hardware solutions, and a software solution is easier to upgrade. For example, suppose the original hardware chip supports 8K H.265 encoding. To support H.266 encoding later, the hardware has to be redesigned, while the software only needs a code upgrade. The system can keep iterating to support the latest capabilities.

The pure CPU solution uses general-purpose computing power. When no 8K transcoding is running, these resources can easily be released for general-purpose CPU workloads. When encoding 4K/8K, full-link intelligent encoding works out of the box and lets developers focus on algorithm innovation without worrying about details such as deployment.

Process merging to reduce operation and maintenance costs: Because super-resolution requires very high computing power, it needs help from the GPU. But migrating this demanding AI load to the GPU completely separates pre-processing from decoding and encoding. It is like decoding in one room, sending the frames to another room for pre-processing, and then bringing them back for encoding. This not only lengthens the process but also places a heavy burden on operation and maintenance, and repeatedly moving data back and forth adds latency. CPU full-link intelligent coding brings this step onto the CPU, successfully reducing operation and maintenance costs.


Thanks to the flexibility of software, Tencent Cloud's 8K real-time transcoding system can support all mainstream video codec standards. In the MSU evaluations, with O264 and V265 in 2021 and in H.264, H.265, and AV1 in 2022 and 2023, Tencent Cloud came out far ahead.

Fine control

AMX, INC (Intel® Neural Compressor), and accuracy

The high BF16 and INT8 computing power is indeed very helpful for migrating AI from GPU to CPU, but how is accuracy ensured? Intel® Neural Compressor (INC) has built-in calibration algorithms aimed specifically at preserving accuracy. As a developer, you only need to provide three things: the model, the dataset, and the accuracy requirement.
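The idea of "model in, dataset in, accuracy requirement in" can be illustrated with a toy sketch in plain Python. This is not INC's API or its tuning algorithm: the linear "model", the tiny dataset, and every function name below are hypothetical, chosen only to show how an accuracy goal can gate whether an INT8 version is accepted.

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization: returns (integer codes, scale)."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map INT8 codes back to approximate float values."""
    return [c * scale for c in codes]


def accuracy(weights, dataset):
    """Fraction of samples a toy linear classifier labels correctly."""
    correct = 0
    for features, label in dataset:
        score = sum(w * x for w, x in zip(weights, features))
        correct += (score > 0) == label
    return correct / len(dataset)


def quantize_with_accuracy_goal(weights, dataset, max_relative_loss):
    """Accept the INT8 weights only if accuracy stays within the allowed loss."""
    baseline = accuracy(weights, dataset)
    codes, scale = quantize_int8(weights)
    candidate = dequantize(codes, scale)
    quantized = accuracy(candidate, dataset)
    accepted = quantized >= baseline * (1 - max_relative_loss)
    # Fall back to the original float weights when the goal is missed.
    return (candidate if accepted else weights), accepted, baseline, quantized


# The three inputs: a model, a dataset, and an accuracy requirement.
model_weights = [0.8, -0.5, 0.3]
eval_dataset = [
    ([1.0, 0.2, 0.1], True),
    ([0.1, 1.0, 0.0], False),
    ([0.5, 0.5, 1.0], True),
    ([0.0, 0.3, 0.2], False),
]
result, accepted, base_acc, int8_acc = quantize_with_accuracy_goal(
    model_weights, eval_dataset, max_relative_loss=0.01
)
print(f"accepted={accepted} baseline={base_acc:.2f} int8={int8_acc:.2f}")
```

A real tool would search over many quantization settings and operators rather than make a single accept-or-fall-back decision, but the contract is the same: the developer states the acceptable accuracy loss and the tool does the rest.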

In addition, during pre-processing, intelligent coding on the fourth-generation Xeon® binds work to specific CPU cores to finely control the overall transcoding process. For example, operations such as decoding, watermarking, resolution conversion, and encoding are all assigned to designated CPUs, and interdependent operations are kept on the same CPU wherever possible.

AI inference capabilities have been greatly improved: video pre-processing such as image quality enhancement requires powerful computing support. In a practical case from Intel and Tencent Cloud covering two scenarios, video enhancement and object detection, AI inference performance optimized with fourth-generation Xeon® AMX improved by 1.86 and 1.95 times respectively compared with the previous-generation platform.
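As a minimal sketch of the core-binding idea, the snippet below keeps all stages of one stream on the same core and spreads streams across cores round-robin. It is an illustration under assumptions, not the Intel or Tencent Cloud implementation: the stage names, stream IDs, and helper functions are hypothetical, and `os.sched_setaffinity` is only available on Linux.

```python
import os

# Stage names taken from the operations listed above.
STAGES = ("decode", "watermark", "scale", "encode")


def plan_core_binding(stream_ids, cores):
    """Map each (stream, stage) pair to a core.

    All stages of one stream share a core, so interdependent
    operations stay together; streams are spread round-robin.
    """
    if not cores:
        raise ValueError("need at least one core")
    plan = {}
    for index, stream in enumerate(stream_ids):
        core = cores[index % len(cores)]
        for stage in STAGES:
            plan[(stream, stage)] = core
    return plan


def pin_current_process(core):
    """Pin the calling process to one core; return False where unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # e.g. macOS or Windows
    os.sched_setaffinity(0, {core})
    return True


# Three hypothetical streams spread over two cores.
plan = plan_core_binding(["stream-a", "stream-b", "stream-c"], cores=[0, 1])
for (stream, stage), core in sorted(plan.items()):
    print(f"{stream:9s} {stage:10s} -> core {core}")
```

A worker process handling one stream would call `pin_current_process(plan[(stream, "decode")])` once at startup. Keeping dependent stages on one core is a simple way to avoid moving frames between caches; a production system would also have to account for NUMA topology and load balancing.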


At the same time, accuracy loss is controlled within an acceptable range, which also enables users to implement full-link intelligent coding on the CPU, significantly reducing deployment costs and operation and maintenance costs.

3

"Core" inspires intellectual change and builds together

The human eye always longs for the clearest and most realistic images and videos, and people’s pursuit of clarity is never-ending. No matter how fast artificial intelligence brings technological advancement, digitalization and cloud computing should be necessary solutions for enterprises to cope with continuous changes.

At the 2023 Tencent Global Digital Ecosystem Conference on September 7, Intel, as an in-depth partner, will hold a special sub-forum with the theme of "Core" Intellectual Change and Joint Construction. (Time: 14:30; Location: 1F CC105C)

In the Intel sub-forum, you can learn about the many new achievements of Intel and Tencent's all-round, in-depth cooperation over the past 20 years in artificial intelligence, big data, scientific computing, audio and video, and more, as well as the latest progress in building energy-efficient, highly reliable, and easily scalable next-generation smart information infrastructure and in promoting the deep integration of the digital economy and the real economy.

At the same time, Intel will also share its latest product and technology roadmap, including Intel AI large-model solutions supported by advanced hardware and optimized software, such as the fourth-generation Intel® Xeon® Scalable processors and Habana® Gaudi®2, and Intel's cloud-edge integrated intelligent network solution.

In addition, at this conference, Intel will also set up a special exhibition area to display a total of 15 advanced solutions through four major areas: cloud and AI product solutions, cloud-to-end solutions, conference room solutions and edge product solutions.


Standing at a new milestone in industrial digitalization, how do you view the infinite imagination that artificial intelligence, cloud computing, and big data will bring to the future?


Let’s wait and see on September 7th.


Origin blog.csdn.net/vn9PLgZvnPs1522s82g/article/details/132726210