Open Source Daily | Angular v18; Inference optimization under large model price war; Mistral AI targets the US market with open source models; Silicon Valley has its own Lu Xun

Welcome to the open source daily newspaper produced by the OSCHINA editorial department, which is updated every day.

# 2024.5.29

Today's highlights

Angular v18 officially released

"OpenHarmony Device Unified Interconnect Technology Standard" Released

According to reports, in addition to the unified object model, the standard defines a series of other specifications covering access and control interfaces, screen projection, file sharing, and compliance testing. The access and control interface specification defines in detail, from a system-architecture perspective, the processes and interfaces for device discovery and distribution, registration and login, security authentication, and management and control. The screen projection and file sharing specifications define interfaces for transmitting, encrypting, and sharing files, video, and other data between multimedia devices. The compliance testing specification strictly defines the technical requirements and the test methods and steps for each indicator in every specification, to ensure conformance and consistency with the standard.

Greenplum's GitHub repository "404"

TiDB 8.1 LTS released

TiDB 8.1 LTS enhances the stability and operation and maintenance capabilities of large clusters through a series of innovative features, especially for multi-tenant applications and SaaS-type user scenarios.

Open source Llama3-V released: a rival to GPT-4V, built for $500

Llama3-V, billed as an equivalent of GPT-4V, is here. It is a multimodal model based on Llama3 that was built for only $500.

On nearly every metric, Llama3-V performs on par with closed-source models 100 times its size, such as GPT-4V, Gemini Ultra, and Claude Opus. The only exception is MMMU (Massive Multi-discipline Multimodal Understanding and Reasoning), where Llama3-V is slightly behind.

The architecture of Llama3-V combines a vision model and a language model, and is driven by Llama3 8B and siglip-so400m.

A YC-backed open source "ChatGPT for code"

Bloop is regarded as a code-focused ChatGPT and received YC investment in the summer of 2021. It is an AI code search engine written in Rust that supports macOS, Linux, and Windows. It currently has 8.6K stars on GitHub.

Bloop features include:

1. Interpret code
2. Write code based on context
3. Search and locate code using natural language
4. Fix problems
5. Multi-language
6. Detect and deduplicate code

Bloop currently supports 10+ programming languages and supports synchronization of local and GitHub repositories.

GitHub address: https://github.com/BloopAI/bloop

Today's observation

Social observation

Breaking through the ceiling of open source voice TTS

ChatTTS is a speech generation model designed specifically for dialogue scenarios. It is mainly used for LLM assistant dialogue tasks, conversational speech, and video introductions. It not only synthesizes speech from mixed Chinese and English text but, more importantly, its timbre is so convincing that it is hard to tell apart from a real voice.

GitHub: github.com/2noise/ChatTTS

- Weibo GitHubDaily

In 2024, AI boyfriends and girlfriends will see explosive growth

CB Insights lists 6 trends:
1. AI companion startup Character AI trails closely behind ChatGPT in mobile usage.
   - AI companions may be the second most important consumer AI application scenario.
2. More than half of Character AI’s 4 million users are under the age of 24.
   - Generation Z is highly accepting of AI companions, and this acceptance will grow as the capabilities of large language models (LLMs) improve.
3. In the age of smartphones, Gen Z spends less time face-to-face with friends.
4. More and more people feel lonely.
5. More and more American adults don’t have a spouse or partner.
6. Young people are having less and less sex.

-Weibo Baoyuxp

Yann LeCun’s convolutional neural network was a beacon that strengthened belief

As for contributions to AI: without the 20 years of persistence by LeCun and the other members of the "big three", where would the later AI revolution have come from? During that "long night", LeCun's convolutional neural network was a beacon that strengthened belief. Later, Meta's Llama series of models and its open source philosophy also benefited many people who wanted to join this AI wave. Does Musk really want to compare himself with LeCun on this front?

-Weibo Chen Xiaoming in the Bay Area

Behind the price reduction of large models, the competitive logic of domestic large models has changed

The price cuts on large model APIs have sounded the alarm on involution in the industry. Simply piling on parameters and computing power while slashing prices is not the best path to healthy development; only differentiation offers a way forward. As in every industry, the transition from chaos to order is often marked by brutal price wars. Now, after the frenzied "Battle of 100 Models", price wars have begun, and the consequences of homogeneous competition are gradually emerging.

- We-media Liu Kuang

Inference optimization under large model price war

At the level of model architecture, inference compute should be optimized from a systems perspective, drawing on work such as MLA, Dense-MoE, Google's MoD, Medusa, and SplitWise, and then working backwards to find a suitable model architecture. Unfortunately, most makeshift teams probably only know how to copy, or use leaked data to climb the leaderboards, while claiming to surpass GPT-N every day.

- WeChat zartbot

Media Watch

How “human-like” is artificial intelligence now?

Fei-Fei Li, a professor in the Department of Computer Science at Stanford University, also recently wrote in Time magazine that "sentience", the ability to have subjective experience, is a crucial step on the road to general intelligence. Current large models do not feel the way humans do: a model can say "my toes hurt" even though it has no toes at all, because it is just a mathematical model encoded on silicon chips.

"We have not yet achieved sentient AI, and larger language models will not get us there. Reproducing this phenomenon in AI systems will require a better understanding of how sentience arises in embodied biological systems," she said.

- Xinhua News Agency

Mistral AI targets US market with open source model

France's Mistral AI, a European developer of generative artificial intelligence tools, has set its sights on the U.S. market. This month the startup hired former Foursquare chief revenue officer Marjorie Janiewicz as its first U.S. general manager, Bloomberg reported on Tuesday (May 28).

Janiewicz said in the report that Mistral AI aims to capitalize on the growing demand from enterprises seeking alternatives to artificial intelligence models and services provided by large tech companies such as OpenAI and Google.

The company's push into the U.S. market is reportedly gaining momentum. The startup plans to hire more employees and is already gaining traction among businesses that want more choice and flexibility in their AI solutions.

Mistral's open source approach (the underlying code is publicly shared and customizable) is seen as a more secure and versatile alternative to the closed systems offered by competitors.

-Bianews

More efficient scaling technology: Why is the MoE architecture favored by large model manufacturers?

After a year of breakneck growth in 2023, large models quickly hit a bottleneck. Under the scaling law (as the number of parameters increases, model performance also improves), the capabilities of large models seem to have no upper limit. However, the data and computing power available for training them are limited. Against this background, the industry has had to explore more efficient model architectures, and the emergence of the MoE (Mixture of Experts) architecture has given it hope.

- 21st Century Business Herald
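
To make the MoE idea in the item above more concrete, here is a minimal sketch in plain Python. It assumes a top-k router, which is a common MoE design but is not something the article itself specifies. The names `moe_forward` and `make_expert`, the toy gate weights, and the scaling "experts" are all hypothetical and chosen purely for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical toy expert: multiplies every input element by `scale`.
    A real expert would be a feed-forward network with its own weights."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input `x` to the `top_k` best-scoring experts and mix their outputs.

    gate_weights: one weight row per expert (assumed linear router).
    Returns (mixed_output, indices_of_chosen_experts).
    """
    # Router logits: one dot product per expert.
    logits = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Keep only the top_k highest-scoring experts; the rest are never run.
    order = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)
    chosen = order[:top_k]
    # Renormalize gate probabilities over the chosen experts only.
    probs = softmax([logits[i] for i in chosen])
    out = [0.0] * len(x)
    for p, i in zip(probs, chosen):
        y = experts[i](x)
        out = [o + p * v for o, v in zip(out, y)]
    return out, chosen


if __name__ == "__main__":
    experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
    gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
    output, used = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
    print("experts used:", used)
    print("output:", output)
```

In this sketch only 2 of the 4 experts run for a given input, so total parameter count can grow with the number of experts while the compute per input stays roughly constant. That is the sense in which MoE is described as a more efficient way to scale.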

Silicon Valley has its own Lu Xun! AI heavyweight LeCun keeps firing away, criticizing everyone from Musk to OpenAI

Although Llama was originally developed by FAIR, led by LeCun, it has now been handed over to the GenAI department, which focuses on technology and product development, while FAIR focuses on the longer-term goal: developing new AI architectures and methods capable of reaching human-level intelligence.

Many people think that Yann LeCun simply likes to make controversial remarks, but the more important context is the set of open problems surrounding the future development of AI. As one of the best-known names in artificial intelligence, LeCun is somewhat obligated to step up and bring some clarity to this contentious field.

LeCun said that achieving AGI is not a product design problem or even a technology development problem; to a large extent, it is a scientific problem.

If this statement is true, then obviously we need more "Lu Xuns" like him on the road to exploring AGI.

-Pinwan

Do large models show gender bias by defaulting to men as "successful people"?

Professor Qiu Xipeng used data to show the progress large models have made over the past two years. Value alignment is an important part of training large models, and the target is the set of values humans give them, which includes gender issues. However, word clouds built through word analysis also reveal the stereotypes embedded in AIGC (generative artificial intelligence). For example, words describing men center on "world", "discovery", "life", and "simplicity", while words describing women center on "country", "husband", "challenge", and "mother".

Qiu Xipeng said: "In actual research, gender correction of corpus requires a very large investment. Model alignment needs to be carried out, and the model output is adjusted to produce more positive content through methods such as human preference modeling and value alignment."

- Jining News Network

Today's recommendation

Open source projects

vuejs/language-tools

https://github.com/vuejs/language-tools

Volar is a VS Code extension for Vue and the official IDE/TypeScript support tooling for Vue. In addition to integrating Vetur-style features such as highlighting and syntax hints, it also includes some unique features of its own.

Daily blog

Applying large-scale graphs with tens of billions of edges in advertising scenarios

This article describes using search recommendation items to fill the weak supply of food-delivery search ads and improve traffic monetization efficiency. We propose an evolution route for multi-scenario heterogeneous large graphs in food delivery, along with online modeling technology for heterogeneous large graphs, to address the multi-channel and real-time challenges of the food-delivery search and recommendation business. The results were published in a paper at the CIKM 2023 conference. Together with the machine learning platform, we built GraphET, a large-scale graph training and online inference engine, to meet the needs of multiple business deployments with nearly tens of billions of edges and complex graph structures.

Event comments

The world's first open source massively parallel database: Greenplum's GitHub repo suddenly returns "404". Is Broadcom going closed source and charging for it?

The well-known open source MPP database Greenplum has changed the access permissions on its source code repository to read-only and cleared all of the original branches, tags, PRs, issues, and other information. The banner notice indicates that the repository was archived on May 25.

Review

Greenplum is a well-known open source MPP database, so its source repository becoming read-only and being cleared of related information may be a major blow to the open source community that relies on it. Community members and contributors may feel uncertain about the project's future, raising questions about Greenplum's direction and maintenance.

For users and enterprises using Greenplum Database, this change may affect their business continuity and technology selection. Changes to Greenplum could have ripple effects on other open source projects, especially those that depend on Greenplum or are maintained by the same company.

If this really is a consequence of Broadcom's acquisition of VMware, it will be another example of an open source project turning commercial, which may cause dissatisfaction and resistance in the open source community. The incident also highlights the tension between open source projects and commercial interests. If Greenplum does become closed source, it could have a negative impact on open source culture and reduce community trust and participation in open source projects.

Research shows that AI engineers are paid far more than their peers

AI engineer salary survey data for the first quarter of 2024, released by Levels.fyi, shows a significant gap between the salaries of software engineers who specialize in AI and those of non-AI software engineers.

Review

The high salary of AI engineers reflects the strong market demand for AI professional skills. As AI technology continues to develop and be applied, companies are willing to pay a premium for talents with these skills. As more companies get involved in the AI field, competition for AI engineers is intensifying. To attract and retain top talent, companies have to offer more competitive compensation.

From entry-level to senior levels, AI engineers are paid more than non-AI engineers, indicating that AI skills are highly valued at all stages of career development. As the AI talent market matures, companies may adjust recruitment and compensation strategies to bring salaries closer to market standards and narrow the pay gap between AI and non-AI positions.

High salaries may motivate more students and professionals to devote themselves to learning and career development in the field of AI, thereby affecting the direction of education and career planning. Enterprises need to recognize the importance of AI talents and consider how to attract, develop and retain these talents in strategic planning to remain competitive.

OpenAI's former safety lead joins rival Anthropic

Earlier this month, Jan Leike, OpenAI's former safety lead and head of the Superalignment team, announced his resignation from OpenAI and publicly criticized the company's approach to safety. Now, Leike has posted that he has joined OpenAI competitor Anthropic and will lead a new "superalignment" team.

Review

Given Leike's background as OpenAI's safety lead and head of the Superalignment team, his arrival may strengthen Anthropic's research and development in AI safety, intensifying competition in that field. Meanwhile, Leike chose to join a competitor after publicly criticizing OpenAI's safety practices, a dynamic that could affect the morale and loyalty of other employees.

This move may have a certain impact on OpenAI's reputation. At the same time, OpenAI needs to consider how to maintain its leadership in the AI field after the brain drain.

Additionally, Leike’s actions and remarks may increase public attention on AI ethics and governance issues. As AI technology develops, ensuring its safety and ethics becomes increasingly important. Leike's new role at Anthropic will likely drive the company's innovation in AI safety and ethics, which is a positive sign for the industry as a whole.

The incident may be indicative of the intensity of the battle for talent within the AI industry, as well as the strategic adjustments among companies to gain a competitive advantage.

Voice of open source

Media opinion

It’s not easy to use AI even if you pay for it, because it doesn’t have an “operating system”

Despite the current frenzy over technology and pricing, only a limited number of companies can make good use of AI capabilities. At this stage, an obvious pain point in deploying large models is that they have a price but little real market.

-Geek Park

Musk's latest interview: worried about the values of artificial intelligence; in the AI era, will no one need to work while everyone has a high income?

The tendency toward excessive political correctness worries me about the future of artificial intelligence. I think this trend is very dangerous.

For xAI, our goal is to pursue the absolute truth, even if the truth is unpopular.

-Financial ThinkTank

Why does AI art always look kind of bad?

However, as time goes by, more and more people are discovering that there is a void behind AI artworks. In AI's paintings we cannot see Vermeer's delicate depiction of quiet life, nor Picasso's distillation and movement from realism to abstraction.

-Aifan'er

User point of view

The world's first open source massively parallel database: Greenplum's GitHub repo suddenly returns "404". Is Broadcom going closed source and charging for it?

Viewpoint 1: Broadcom really is the paraquat of the industry. If it had actually swallowed Qualcomm back then, I can’t imagine what the mobile market would look like now.

Viewpoint 2: Broadcom’s boss, Hock Tan, does financial-capital-style mergers and acquisitions. VMware is open source, so why bother commercializing this open source database product?

Viewpoint 3: VMware is not open source; it is only free for individual users.

Viewpoint 4: It would be quite a show if one day the Spring framework started charging enterprises too.

Viewpoint 5: Haha, I have never dared to use this database.

Viewpoint 6: The landlord changes, and the house gets demolished right away.

Tencent App Store and Microsoft Store announce a partnership; Windows can run mobile applications directly

Viewpoint 1: Amazon: If you want to replace someone, just say so.

Viewpoint 2: The experience of using WSA is too bad.

Viewpoint 3: A Tencent-branded Android emulator.

Why JavaScript, Python and Java remain the first choice for developers

Viewpoint 1: The ecosystem plays a decisive role.

Viewpoint 2: If the language is not easy to use, there will be no ecosystem.

Viewpoint 3: "Moreover, Java ranks at the top not just because of its historical strength. Java receives major feature and performance updates every six months, and minor improvements, bug fixes, and security updates." However, many people don't particularly care about the new features it provides, and many companies are still stuck on the age-old Java 8.

[Java ORM framework comparison] Part 13: adding the qdbc framework to the comparison

Viewpoint 1: mybatis-mp is very easy to use and is a new ORM framework worth using.

Viewpoint 2: Each has its own pros and cons. Secondary development based on MyBatis can indeed save a lot of adaptation work, such as giving priority to Solon.

Viewpoint 3: MyBatis XML is really long and tedious. It’s already 2024, and you still have to set the result mapping manually.

Viewpoint 4: In fact, you don’t need to set it.

Viewpoint 5: mybatis-mp can do both ORM and XML, and table joins are trivial.

Viewpoint 6: It is recommended to add jOOQ.

Viewpoint 7: After looking into it, it feels too heavy and requires generating a lot of things.

---END---
