AI Daily | Google releases Astra to counter GPT-4o, ByteDance releases 9 self-developed large models, Tencent open-sources its Hunyuan text-to-image large model...

Article recommendations

GPT-4o is out and users tested it right away. Did OpenAI exaggerate?

Countdown to the OpenAI livestream: GPT-5 is confirmed absent. From GPT-3.5 to GPT-5, understand the big differences in AI evolution in one article!

Hot topics in this issue

Google holds I/O 2024: Project Astra is released to counter GPT-4o, Gemini series models are updated

Co-founder and AI pioneer Ilya Sutskever leaves OpenAI

US and China to hold AI security talks to prevent "miscalculation and accidental conflict"

ByteDance officially launches its self-developed Doubao large model series, "99.3% cheaper than the industry"

Upgraded to match Sora: Tencent open-sources its Hunyuan text-to-image large model

...

Google I/O 2024: Project Astra is released to counter GPT-4o, Gemini series models are updated

At Google I/O 2024, Google shared how it is using AI to build more useful products and features. The announcements included:

  • Gemini series model updates:

Gemini 1.5 Pro upgrade: Expands the context window to 2 million tokens and, through advances in data and algorithms, improves code generation, logical reasoning and planning, multi-turn dialogue, and audio and image understanding. The upgraded model can follow increasingly complex and detailed instructions, including behavioral instructions that specify roles, formats and styles.

Gemini 1.5 Flash released: 1.5 Flash is the newest member of the Gemini model family and the fastest Gemini model in the API. It is optimized for large-scale, high-volume, high-frequency tasks and is more cost-effective to serve.

Gemini Advanced: With the introduction of Gemini 1.5 Pro, subscribers can work with multiple large documents and make complex plans. Gemini Live will also launch for Gemini Advanced subscribers, offering more natural conversational interaction.

  • Release of Project Astra, an AI assistant with visual memory:

It can process text, video and audio in real time, answer questions about what it takes in and explain it, or generate creative output. It can also recognize and interpret diagrams or program code on a whiteboard.

  • Veo, a video generation model positioned against Sora, is launched:

Veo can generate high-quality 1080p video longer than a minute in a variety of cinematic and visual styles. It accurately captures the nuance and tone of a prompt and offers an unprecedented level of creative control, understanding prompts for a variety of cinematic effects, such as time-lapses or aerial shots of landscapes.

  • Google Search launches AI Overviews:

Powered by the multi-step reasoning capabilities of a customized Gemini model, AI Overviews will help with increasingly complex questions. Instead of breaking a question into multiple searches, you can ask your most complex question in one go, with all the nuances and caveats you have in mind.

  • The Gemma family adds new members:

PaliGemma, Google's first open vision-language model, is optimized for image captioning, visual question answering and other image labeling tasks.

Gemma 2, the next-generation open model due for release in June this year, outperforms some models more than twice its size and can run efficiently on a GPU or a single TPU host in Vertex AI.

Learn more:

https://blog.google/inside-google/message-ceo/google-io-2024-keynote-sundar-pichai/

Co-founder and AI pioneer Ilya Sutskever leaves OpenAI

Ilya Sutskever, co-founder of OpenAI and co-author of the seminal AlexNet paper, is leaving the company after nearly 10 years to pursue a new project of "personal significance" to him. Jakub Pachocki will take over as chief scientist. Pachocki has worked at OpenAI for more than seven years and is described by CEO Sam Altman as one of the most brilliant thinkers of his generation. According to Altman, he has led many of the company's major projects.

In November 2023, Sutskever took part in the temporary ouster of Altman, who had been criticized for aggressive commercialization and the associated safety risks. However, an investigation found the dismissal was unwarranted. Sutskever apologized, helped reinstate Altman, and then left the board. Hours after Sutskever resigned, AI safety researcher Jan Leike also announced his departure. Leike and Sutskever co-led the Superalignment team, which OpenAI established in the summer of 2023 with the goal of building an automated alignment researcher at roughly human level and using it to iteratively align superintelligence.

Learn more:

https://the-decoder.com/co-founder-and-ai-pioneer-ilya-sutskever-leaves-openai/

ByteDance officially launches its self-developed Doubao large model series, "99.3% cheaper than the industry"

At the 2024 Spring Volcano Engine FORCE conference held today, ByteDance launched its self-developed Doubao large model series. The family covers nine models, including the Doubao general model pro and lite, the Doubao role-playing model, the Doubao speech synthesis model, the Doubao voice cloning model, the Doubao speech recognition model, the Doubao text-to-image model, and the Doubao Function Call model, demonstrating ByteDance's accumulated expertise and capacity for innovation in artificial intelligence. "Only heavy usage can polish a good model and significantly reduce the unit cost of model inference. The price of Doubao's main model in the enterprise market is only 0.0008 yuan per thousand tokens; 0.8 li (0.0008 yuan) can process more than 1,500 Chinese characters, which is 99.3% cheaper than the industry." Tan Dai said that the shift from pricing in fen (0.01 yuan) to pricing in li (0.001 yuan) will help companies accelerate business innovation at lower cost.
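The pricing figures quoted above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes that 1,000 tokens correspond to roughly 1,500 Chinese characters (our reading of the quote) and that "99.3% cheaper" is measured against a single industry reference price. The function names are hypothetical and not part of any Volcano Engine API.

```python
# Figures quoted in the article.
PRICE_YUAN_PER_1K_TOKENS = 0.0008  # Doubao main model, enterprise price
DISCOUNT_VS_INDUSTRY = 0.993       # "99.3% cheaper than the industry"

# Assumption: 1,000 tokens cover about 1,500 Chinese characters.
CHARS_PER_1K_TOKENS = 1500


def implied_industry_price(price: float, discount: float) -> float:
    """Reference price that `price` undercuts by the fraction `discount`."""
    return price / (1.0 - discount)


def cost_yuan(num_chars: int) -> float:
    """Approximate cost in yuan of processing `num_chars` Chinese characters."""
    thousands_of_tokens = num_chars / CHARS_PER_1K_TOKENS
    return thousands_of_tokens * PRICE_YUAN_PER_1K_TOKENS


if __name__ == "__main__":
    industry = implied_industry_price(PRICE_YUAN_PER_1K_TOKENS, DISCOUNT_VS_INDUSTRY)
    print(f"Implied industry price: {industry:.4f} yuan per 1,000 tokens")
    print(f"Cost of 1,000,000 characters: {cost_yuan(1_000_000):.4f} yuan")
```

Under these assumptions, the implied industry reference price is about 0.114 yuan per thousand tokens, and processing one million Chinese characters costs a little over half a yuan.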

Learn more:

https://mp.weixin.qq.com/s/WPs7Gt3Dt_SqkN1PJXsmmw

Upgraded to match Sora: Tencent open-sources its Hunyuan text-to-image large model

Tencent announced that its Hunyuan text-to-image large model has been upgraded and open-sourced. It has been released on Hugging Face and GitHub, including the model weights, inference code, and model algorithms, and is free for commercial use by enterprises and individual developers. The upgraded model adopts the same DiT architecture as Sora. Tencent said that Hunyuan DiT is the first Chinese-English bilingual DiT architecture. Hunyuan DiT is a text-to-image generation model based on the diffusion transformer, with fine-grained understanding of both Chinese and English. It can hold multi-turn conversations with users to generate and refine images based on context. It is also described as the industry's first open-source, Chinese-native DiT text-to-image model, supporting bilingual Chinese and English input and understanding, with 1.5 billion parameters.

Learn more:

https://www.ithome.com/0/767/876.htm


Origin my.oschina.net/u/7032067/blog/11149645