Costs cut by 90%: how the overseas social platform Typing built its big data practice on Databend

Typing (Input Technology), founded in 2022, is a company that operates social platforms for Southeast Asia, Latin America, the Middle East, and other overseas markets. Its platform is similar to Soul and Momo in China, offering live video streaming, voice chat rooms, short videos, life sharing, text chat, and other social features. It has more than one million registered users and hundreds of thousands of daily active users, who can meet interesting people, make new friends, and build their own communities on the platform.

Characteristics of Typing's business scenarios

Today, social platforms have become an essential part of life. People make friends, share and exchange information on social platforms, and this information contains rich data on user behavior and preferences. Big data technology enables these massive amounts of data to be effectively mined and analyzed, thereby providing technical support and decision-making support for the development of social platforms and user experience.

For a social company like Typing, the importance of data is self-evident. Data can yield a great deal of business value:

1. Building user profiles. A user profile is a model of a user based on behavioral data and personal information. By analyzing data such as whom users follow, their friend relationships, and their interests and hobbies, Typing can build accurate profiles of the users on its platform. These profiles help the platform understand users' needs and behavioral tendencies, so it can offer more personalized and accurate services and recommendations and improve user experience and satisfaction.

2. Content recommendation and personalized push. Typing's platform hosts a wide variety of content, including audio, video, text, and images, and users often struggle to find the content and people they care about. With big data analysis, Typing can study users' historical behavior to understand their interests and preferences, and then recommend and push content tailored to each user. Personalized push increases user activity and stickiness, and strengthens users' loyalty to the platform.

3. Mining social relationships. Relationships between people are the core of a social platform like Typing, and understanding them helps Typing discover users' interests and needs. With big data analysis, Typing can examine friend relationships, interactions, and other data to identify interest groups and social networks among users, and so provide more accurate and relevant social recommendations. Social relationship mining can also guide strategy, for example through churn prediction and relationship maintenance, to improve user retention and activity.

Technical challenges faced by Typing

As a startup, Typing has an R&D team of only about 15 people, with no dedicated big data team or AI recommendation team. Yet the company has a strong need for refined operations, which requires knowing its users and the entire platform inside out. Deriving valuable analysis and insight from data has therefore become indispensable. To reach this goal, Typing's technical team explored many options, including the big data solutions of Alibaba Cloud and Huoshan Engine (Volcano Engine). In Typing's view, however, these solutions were very complicated, from documentation through integration, and their time and labor costs were too high for a startup to get off the ground.

Typing also tried the open source ClickHouse, but it required dedicated data developers to handle the intermediate data cleaning and ETL work. Lacking that manpower, Typing was ultimately unable to put it into production.

Why choose Databend?

Wu Yunpeng, the leader of Typing's technical team, first came across Databend at an open source event during a conference. After a series of in-depth discussions, he was drawn to the following features of Databend:

  • Separation of storage and compute: Databend fully separates storage from compute, so users can easily scale up or down according to the needs of the application. Databend is also designed entirely around object storage, which removes the disk capacity limits of traditional databases;

  • High-performance queries: Databend's architecture and vectorized query engine enable instant analysis of massive data sets and reduce latency to the sub-second level. It uses data-level parallelism (vectorized query execution) and instruction-level parallelism (SIMD) to deliver strong analytical performance. On the TPC-H benchmark, Databend is 1.3 times faster than a mainstream overseas next-generation cloud-native database with integrated storage and compute across three dimensions (data import, cold run, and hot run), and 2-3 times faster than traditional databases with integrated storage and compute;

  • Seamless integration with the mainstream data ecosystem and tools: Databend Cloud provides SDKs for Java, Go, Python, Node.js, Rust, and other languages, and supports tools such as Kafka, dbt, Flink CDC, Airbyte, DataX, and Debezium. This solved the compatibility problem with Typing's existing technology stack, covers its needs in data transformation, business intelligence, ad hoc analysis, and data applications, and helps users quickly explore the potential value of their data;

  • Low cost: Databend Cloud's economical, elastic compute clusters, combined with highly compressed and performance-optimized object storage, can reduce costs by up to 90%. Startups like Typing no longer have to spend huge sums on data processing;

  • Easy to use: Databend Cloud is a one-stop SaaS service. Its data pipeline and task management features simplify data import, so users can work with it out of the box with no operations or maintenance. In addition, Databend requires no index building, no manual tuning, and no manual partitioning or sharding of data; all of this is handled when the data is loaded into a table.

Deployment plan

These features matched Typing's needs for a big data platform, so Typing chose Databend as its main big data analysis tool. After planning, preparation, and compatibility assessment, the big data computing workload was successfully migrated to Databend Cloud. Typing's data currently comes mainly from an AWS Aurora database, and developers synchronize it once a day on a T+1 basis. First, the databend-py SDK is used to export dozens of tables from the Aurora database to S3; the data in S3 is then imported directly into Databend Cloud. Thanks to Databend's open source philosophy and its contributions to Superset, Databend integrates easily with the open source dashboard tool Superset. The data computed in Databend Cloud is then passed to Superset for visualization.
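The load step of this daily job can be sketched as a small helper that generates one COPY INTO statement per table for the previous day's export. This is only an illustration under stated assumptions: the stage name `aurora_export`, the `<table>/<date>/` folder layout, the table names, and the Parquet file format are all hypothetical, since the article does not describe how Typing organizes its S3 exports. Running the statements through a Databend client is left out.

```python
from datetime import date, timedelta


def build_copy_statement(table: str, stage: str, day: date) -> str:
    """Build a COPY INTO statement that loads one day's export for one table.

    Assumed (hypothetical) layout: @<stage>/<table>/<YYYY-MM-DD>/ holding Parquet files.
    """
    path = f"@{stage}/{table}/{day.isoformat()}/"
    return f"COPY INTO {table} FROM {path} FILE_FORMAT = (TYPE = PARQUET)"


def t_plus_one_statements(tables, stage, today):
    """T+1: the job running on `today` loads the data exported for yesterday."""
    yesterday = today - timedelta(days=1)
    return [build_copy_statement(t, stage, yesterday) for t in tables]


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    for stmt in t_plus_one_statements(
        ["users", "gift_records"], "aurora_export", date(2024, 5, 9)
    ):
        print(stmt)
```

Generating the statements separately from executing them keeps the date logic easy to check before anything touches the warehouse.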

In this scenario, Databend mainly serves the operational data dashboards. Typing starts synchronizing at 8 a.m. every day, with a data volume of about 2-3 TB, and data import and computation finish before the workday begins at 10 a.m. Once at work, Typing's technical staff can build visual dashboards in Superset for the operations and product teams.
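For a rough sense of scale, the figures above imply an average throughput that can be worked out directly. The sketch below assumes that the full 2-3 TB moves within the entire two-hour window from 8 a.m. to 10 a.m. and uses decimal units (1 TB = 1,000,000 MB); the article gives no finer breakdown, so the result is an estimate rather than a measurement.

```python
def required_throughput_mb_s(volume_tb: float, window_hours: float) -> float:
    """Average throughput in MB/s needed to move volume_tb within window_hours."""
    megabytes = volume_tb * 1_000_000  # decimal units: 1 TB = 10^6 MB
    seconds = window_hours * 3600
    return megabytes / seconds


if __name__ == "__main__":
    for tb in (2, 3):
        rate = required_throughput_mb_s(tb, 2)
        print(f"{tb} TB in 2 h -> about {rate:.0f} MB/s")
```

Under these assumptions the pipeline has to sustain roughly 278 to 417 MB/s on average across import and computation.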

Databend has a second use at Typing. Historical user behavior data from the database (such as consumption records, voice room activity, and gift sending) is processed in Databend Cloud across the full user base to compute user group labels. These labels are then imported into the business servers, where application code uses them to differentiate users and deliver more personalized push.
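The idea of deriving group labels from behavior records can be illustrated with a small in-memory sketch. Everything specific here is hypothetical: the event types, the label names, and the thresholds are invented for illustration, and in Typing's setup this kind of aggregation runs inside Databend Cloud rather than in application code.

```python
from collections import defaultdict


def label_users(events, spend_threshold=100.0, room_threshold=5):
    """Assign group labels from (user_id, event_type, amount) records.

    Labels and thresholds are illustrative assumptions, not Typing's rules.
    """
    spend = defaultdict(float)  # total spent on consumption and gifts
    rooms = defaultdict(int)    # number of voice room visits
    users = set()
    for user_id, event_type, amount in events:
        users.add(user_id)
        if event_type in ("consumption", "gift"):
            spend[user_id] += amount
        elif event_type == "voice_room":
            rooms[user_id] += 1

    labels = {}
    for user_id in users:
        tags = set()
        if spend[user_id] >= spend_threshold:
            tags.add("high_spender")
        if rooms[user_id] >= room_threshold:
            tags.add("voice_room_regular")
        # Users who match no rule fall into a default group.
        labels[user_id] = tags or {"casual"}
    return labels


if __name__ == "__main__":
    sample = [
        ("u1", "gift", 80.0),
        ("u1", "consumption", 30.0),
        ("u2", "voice_room", 0.0),
    ]
    print(label_users(sample))
```

The resulting mapping from user to labels is the kind of output that would be imported into the business servers to drive differentiated push.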

Project benefits

Half a year has passed since the deployment was completed in November last year. Databend Cloud has handled the challenges of Typing's big data analysis very well, exceeding Typing's expectations in query speed, result accuracy, and cost.

  • After migrating to Databend Cloud, Typing's data costs fell by 90% while queries became faster. The largest remaining cost is synchronizing data from AWS Aurora to Databend Cloud, and Typing is working with Databend to reduce it by replacing the synchronization mechanism;

  • Typing's operations team often writes SQL to define metrics and view data dashboards. Because Databend provides a unified SQL interface, it matches the database habits the product and R&D teams already had, which saves adaptation costs. The operations team reports that the new data dashboards are very easy to pick up: whatever query they write, results come back quickly, and the whole process is smooth and stable;

  • Throughout the engagement, Databend has provided dedicated engineer support, and urgent problems are reported and fixed within hours or days. For Typing, this saves dedicated data development manpower and effectively makes Databend's engineers part of its data team, something that was unimaginable with some cloud providers' services in the past.

Future exploration

Typing is now starting a new round of exploration with Databend, and its trust in Databend makes it want to extend the product to more uses. Typing plans to synchronize event-tracking data from its business servers to Databend Cloud. Because tracking data captures more user behavior, it is more valuable for business decisions than database data, and it will be used to support more time-sensitive business logic. Server-side tracking data is more time-sensitive and is synchronized approximately every 15 minutes, which calls for near-real-time synchronization. After weighing cost against timeliness, Databend proposed an incremental synchronization solution that can run as often as hourly.
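The article does not describe how Databend's incremental synchronization works. As a generic illustration of the idea only, the sketch below uses a common watermark pattern: each run picks up the events newer than the last synchronized timestamp and then advances that timestamp. The event shape (a dict with a `ts` field) and the function name are hypothetical.

```python
from datetime import datetime


def next_batch(events, watermark):
    """Return events strictly newer than the watermark, plus the new watermark.

    Generic watermark pattern for illustration; not Databend's actual mechanism.
    """
    fresh = sorted(
        (e for e in events if e["ts"] > watermark), key=lambda e: e["ts"]
    )
    # If nothing new arrived, the watermark stays where it was.
    new_watermark = fresh[-1]["ts"] if fresh else watermark
    return fresh, new_watermark


if __name__ == "__main__":
    events = [
        {"ts": datetime(2024, 5, 9, 8, 0), "action": "enter_room"},
        {"ts": datetime(2024, 5, 9, 8, 20), "action": "send_gift"},
        {"ts": datetime(2024, 5, 9, 8, 40), "action": "leave_room"},
    ]
    batch, mark = next_batch(events, datetime(2024, 5, 9, 8, 10))
    print(len(batch), mark)
```

How often such a job runs (every 15 minutes or every hour) is then a scheduling decision that trades cost against timeliness, which is the trade-off described above.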

Throughout its cooperation with Typing, Databend has not only helped solve many existing technical problems but has also, in a spirit of open collaboration, explored more scenarios together with Typing to provide reliable data support for the growth of its social platform business.

Origin my.oschina.net/u/5489811/blog/11105696