Lao Huang drops the "super GPU" bombshell: exascale AI supercomputing performance soars, and the more you buy, the more you save



Xinzhiyuan report

Editor: Editorial Department
[Xinzhiyuan Introduction] Nvidia has gone big again, this time stealing the show outright with the GH200 "super GPU."

At today's COMPUTEX keynote, Nvidia CEO Huang Renxun announced to the world that we have reached the tipping point of generative AI: from now on, every corner of the world will have computing needs. Nvidia, whose market value had just surged by $200 billion, is ready for this moment.

To open, Lao Huang strode onto the stage in his black leather jacket and declared, "Hi everyone! We're back!" Then he brought out the big gun, the "super GPU" GH200, and announced that Google Cloud, Meta, and Microsoft will be among the first to get it.

Reportedly, more than 3,500 people packed the venue for the impassioned two-hour speech. Back after four years away, Lao Huang's Chinese is still fluent.

"Super Chip" GH200


It has to be said that the highlight of this keynote was still the GPU; after all, AI's "iPhone moment" has arrived. Holding a chip in each hand, Lao Huang announced that the GH200 "super chip" has gone into full production.

This "super GPU" uses NVLink-C2C interconnect technology to pair the Arm-based, energy-efficient Grace CPU with the high-performance NVIDIA H100 Tensor Core GPU, delivering a total bandwidth of up to 900 GB/s.

More than 400 system configurations are on the way, powered by different combinations of Nvidia's latest CPU, GPU, and DPU architectures, including Grace, Hopper, Ada Lovelace, and BlueField, all created to meet the surging demand for generative AI.

On top of that, Lao Huang announced something even bigger: a supercomputer built from 256 GH200s is coming.
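Before getting to that supercomputer, a rough sense of scale for the 900 GB/s figure. The sketch below is a back-of-the-envelope comparison only; the 70B-parameter model size and the ~64 GB/s PCIe Gen5 x16 figure are illustrative assumptions, not numbers from the keynote.

```python
# Back-of-the-envelope: time to move a large model's FP16 weights into GPU-accessible memory.
# Assumptions (illustrative, not from the article): a 70B-parameter model in FP16 (~140 GB),
# PCIe Gen5 x16 at ~64 GB/s per direction, NVLink-C2C at the 900 GB/s quoted for GH200.

model_bytes = 70e9 * 2  # 70B parameters * 2 bytes (FP16) ~= 140 GB

for name, gb_per_s in [("NVLink-C2C (GH200)", 900), ("PCIe Gen5 x16", 64)]:
    seconds = model_bytes / (gb_per_s * 1e9)
    print(f"{name:>20}: ~{seconds:.2f} s to move ~140 GB of weights")
```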

Supercomputer DGX GH200, coming this year

According to Nvidia, the new DGX GH200 AI supercomputing platform is designed for large-scale generative AI workloads. Built from 256 Grace Hopper superchips, this supercomputer delivers an extraordinary 1 exaflop of AI performance and 144 TB of shared memory, nearly 500 times more than the previous-generation DGX A100. In GPT-3 training, for example, it can be 2.2 times faster than a previous-generation DGX H100 cluster. The behemoth also contains 150 miles of optical fiber and more than 2,000 fans.

Nvidia is already working with three giants: Google, Meta, and Microsoft. Driven by the explosive growth of generative AI, they want ever more powerful, better-performing systems. By using Nvidia's custom NVLink Switch chips, the DGX GH200 bypasses the limitations of standard cluster interconnects such as InfiniBand and Ethernet, and is designed to deliver maximum throughput and massive scalability for the largest workloads.

Nvidia also said it is building its own large-scale AI supercomputer, NVIDIA Helios, expected to come online this year. It will link four DGX GH200 systems over an NVIDIA Quantum-2 InfiniBand network to boost data throughput for training large AI models.

In the past, data centers were huge and CPU-centric, and algorithm iteration took a long time. Now, with Grace Hopper, the same process can be done in days or even hours. It is going to revolutionize the entire industry! (Wait, isn't PaLM 540B parameters?)
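As a quick sanity check on the 144 TB headline figure, assuming the commonly cited per-superchip memory of roughly 480 GB of LPDDR5X on the Grace side plus 96 GB of HBM3 on the Hopper side (these per-chip numbers are not in the article), the arithmetic lines up:

```python
# Sanity check of the DGX GH200 shared-memory figure.
# Assumption (not from the article): each GH200 superchip contributes roughly
# 480 GB of LPDDR5X (Grace) + 96 GB of HBM3 (Hopper) to the unified memory pool.

superchips = 256
per_chip_gb = 480 + 96                       # assumed GB per superchip
total_tb = superchips * per_chip_gb / 1024   # convert to TB

print(f"{superchips} x {per_chip_gb} GB = ~{total_tb:.0f} TB of shared memory")
# -> 256 x 576 GB = ~144 TB, matching Nvidia's headline number
```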

Lao Huang: The more you buy, the more money you save!


As the current flagship, this 65-pound H100 system, worth $200,000, is the world's first computer equipped with a Transformer Engine, and it is currently the most expensive computer in the world. Lao Huang quipped that with a product like this, the more you buy, the more you save.

Next, Lao Huang brought up the IBM System/360 of 1964 to underline how central the CPU once was, then confidently declared: "60 years later, we now have the data center. Today, the data center is the computer." As he put it, a new computing model is being created.

Why are GPUs better than CPUs? Lao Huang broke it down by configuration: for a cost of $10 million, you can build a data center with 960 CPUs, but it needs 11 GWh of power to process 1x of LLM (large language model) workload. For the same money, you can instead build a data center with 48 GPUs that consumes only 3.2 GWh and processes 44x the LLM workload.

That configuration is already impressive, but it is not the end. For extreme performance, you can raise the number of GPUs to 172 at the same power consumption; compute throughput then reaches 150 times that of the CPU data center, though the budget also rises to $34 million. And if you just want to finish the job at hand (1x LLM), Lao Huang can cut your costs too: a data center with 2 GPUs costs only $400,000 and consumes just 0.13 GWh.

The audience applauded, and Lao Huang pulled out his mantra, "The more you buy, the more you save," repeating it three times. What is the strategy behind this? Lao Huang gave a formula.
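Whatever that formula was, the comparison is easier to read re-tabulated. The sketch below simply restates the numbers quoted on stage and divides out cost and energy per 1x of LLM workload; the derived ratios are plain arithmetic, not figures from the keynote.

```python
# Re-tabulating the keynote's CPU-vs-GPU data center comparison.
# Each tuple: (label, cost in $M, power in GWh, relative LLM workload processed).
configs = [
    ("960 CPUs", 10.0, 11.00,   1),
    ("48 GPUs",  10.0,  3.20,  44),
    ("172 GPUs", 34.0, 11.00, 150),  # iso-power with the CPU data center
    ("2 GPUs",    0.4,  0.13,   1),  # iso-workload, the minimal configuration
]

print(f"{'config':<10}{'$M per 1x':>12}{'GWh per 1x':>12}")
for label, cost_m, gwh, work in configs:
    # Lower is better: dollars and energy spent per 1x of LLM workload
    print(f"{label:<10}{cost_m / work:>12.3f}{gwh / work:>12.3f}")
```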

MGX: Modular Architecture

At the same time, Huang launched NVIDIA MGX, a reference architecture that lets system manufacturers quickly and cost-effectively build more than 100 server variants. The specification is said to cut development costs by as much as three-quarters and shorten development time by two-thirds, to just six months. With MGX, system makers start from a basic architecture optimized for accelerated computing and then choose their own GPU, DPU, and CPU. MGX can also be integrated easily into cloud and enterprise data centers.

Beyond the hardware, MGX is backed by NVIDIA's full software stack, which lets developers and enterprises build and accelerate AI, HPC, and other applications. This includes NVIDIA AI Enterprise, the software layer of the NVIDIA AI platform, with more than 100 frameworks, pretrained models, and development tools to accelerate AI and data science and fully support enterprise AI development and deployment.
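To give a feel for where "more than 100 server variants" comes from, here is a purely illustrative sketch. The option lists are made up around the architectures named above; they are not an official MGX configuration matrix.

```python
# Illustrative only: how a modular chassis spec multiplies into many server variants.
# The option lists are examples, not an official NVIDIA MGX catalogue.
from itertools import product

cpus         = ["Grace", "Grace Hopper", "x86"]
accelerators = ["H100 (Hopper)", "L40 (Ada Lovelace)", "none"]
dpus         = ["BlueField-3", "none"]
form_factors = ["1U", "2U", "4U"]
cooling      = ["air", "liquid"]

variants = list(product(cpus, accelerators, dpus, form_factors, cooling))
print(f"{len(variants)} combinations from a handful of module choices")  # 3*3*2*3*2 = 108
```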

Bringing AI into games: NPC characters you can voice-chat with in real time

Another highlight of the keynote was the new custom AI model foundry service, Avatar Cloud Engine (ACE) for Games. On stage, Lao Huang held an RTX 4060 Ti in his right hand and a computer in his left, demonstrating Cyberpunk 2077 running with real-time ray tracing. He then showed a ramen shop scene dripping with cyberpunk style, in which the player presses a button and speaks in their own voice, and the shop owner, Jin, answers.

Jin is an NPC, but his responses are generated in real time by generative AI from the player's voice input, complete with realistic facial animation and a voice that matches the tone of the conversation and his backstory. This lifelike character is rendered with NVIDIA ACE, a real-time AI model toolset. Lao Huang said the characters' lines in this game are not pre-scripted, even though Jin is a typical quest-giver NPC. From the video, the avatar's conversation still sounds a bit stilted, though not too bad.
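Conceptually, the demo chains speech recognition, a language model conditioned on the character's backstory, speech synthesis, and facial animation. Below is a minimal, hypothetical sketch of that loop; every function is a stand-in (NVIDIA has described ACE as drawing on components such as Riva for speech and Omniverse Audio2Face for animation, but no real ACE API is used here).

```python
# Hypothetical sketch of an ACE-style NPC dialogue turn. Every stage is a stub
# standing in for the pipeline NVIDIA describes: speech-to-text, an LLM conditioned
# on the character's backstory, text-to-speech, and audio-driven facial animation.

JIN_BACKSTORY = "Jin runs a ramen shop in a cyberpunk city and hands out quests."

def speech_to_text(player_audio: bytes) -> str:
    return "Hey Jin, heard anything interesting lately?"              # stub ASR

def generate_reply(backstory: str, player_text: str) -> str:
    return "Plenty. Word is someone's stirring up trouble downtown."  # stub LLM

def text_to_speech(text: str) -> bytes:
    return text.encode()                                              # stub TTS

def animate_face(npc_audio: bytes) -> list[str]:
    return ["viseme_frame"] * 8                                       # stub lip-sync frames

def npc_dialogue_turn(player_audio: bytes):
    text = speech_to_text(player_audio)
    reply = generate_reply(JIN_BACKSTORY, text)
    audio = text_to_speech(reply)
    frames = animate_face(audio)
    return reply, audio, frames

print(npc_dialogue_turn(b"<mic capture>")[0])
```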

Those without AI expertise will be left behind

For 40 years, we have created the PC, the internet, mobile, and the cloud, and now comes the era of artificial intelligence. What will you create? Whatever it is, chase it the way we did. Run, don't walk. Either you are running for food, or you are running from being food.
On May 27, Huang Renxun delivered the commencement speech at National Taiwan University. Now the eyes of the whole world are on him, and with Nvidia's valuation vaulting toward a trillion dollars, his words carry even more weight. Huang said that every company and every individual should become familiar with artificial intelligence, or risk being left behind.

He stressed that agile companies will use AI to lift their position, and such companies will not go under. Many people worry that AI will take their jobs, but the ones who will really take your job are the people who have mastered AI. In that speech he predicted: from every angle, the AI boom is a chance for the computer industry to be reborn. Over the next decade, our industry will replace a trillion dollars' worth of traditional computers with new AI computers. From today's keynote, we seem to have glimpsed the outline of that future.




