Databend Open Source Weekly Issue 137

Databend is a modern cloud data warehouse, designed for flexibility and efficiency to support your large-scale analytics needs. It is free and open source. Try the cloud service now: https://app.databend.cn

What's On In Databend

Explore what's new in Databend this week and discover a Databend that better fits your needs.

Support for matching against inverted indexes in queries

An inverted index is the most commonly used data structure in document retrieval systems. For full-text search, it stores a mapping from each word to its locations in a document or a set of documents.
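To make the data structure concrete, here is a minimal Python sketch of an inverted index that maps each term to the documents and positions where it occurs. It is an illustration only and says nothing about how Databend implements its index: the function names (tokenize, build_inverted_index, match), the whitespace tokenizer, and the sample documents are all assumptions made for this example.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a real tokenizer does far more."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to {doc_id: [positions of the term in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(position)
    return index


def match(index, term):
    """Return the sorted ids of documents that contain the term."""
    return sorted(index.get(term.lower(), {}))


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "tea culture has a long history",
    2: "the palace museum is a treasure of ancient architecture",
    3: "tea is a way of life",
}
index = build_inverted_index(docs)
print(match(index, "tea"))  # [1, 3]
print(index["a"])           # {1: [3], 2: [4], 3: [2]}
```

A lookup only touches the entry for the queried term, which is why this layout suits full-text search better than scanning every document.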

Databend now supports using match in queries to search against an inverted index.

SELECT id, score(), content FROM t WHERE match(content, '中国') ORDER BY score()
----
21 1.1967509 中国的古代诗词充满了深邃的意境和独特的韵味,是中华文化的重要组成部分。
24 1.1967509 中国的传统节日蕴含着丰富的文化内涵,是传承和弘扬中华文化的重要途径。
28 1.3336442 中国的饮食文化博大精深,各地的美食各具特色,让人流连忘返。
12 1.4319203 北京的故宫是中国古代建筑的瑰宝,吸引了无数游客前来参观。
15 1.5059 中国的茶文化源远流长,品茶已经成为一种生活方式。

If you'd like to learn more, feel free to contact the Databend team or check out the resources listed below.

Code Corner

Let’s explore code snippets or projects in Databend and the surrounding ecosystem.

Combine TASK and STREAM to capture and track user activity in real time

A stream (STREAM) in Databend is a dynamic, real-time representation of changes to a table. You create streams to capture and track modifications to the associated tables for ongoing analysis. Tasks (TASK) encapsulate specific SQL statements designed to run at predetermined intervals, when triggered by specific events, or as part of a broader task sequence.

When creating a task, you can design it according to the workflow below.

The following example shows how to combine TASK and STREAM to capture and track user activity in real time. It periodically synchronizes the user_activity_profiles table with the data in activities_stream, so that user_activity_profiles always accurately reflects the latest user activity.

-- Define a task in Databend
CREATE TASK user_activity_task 
WAREHOUSE = 'default'
SCHEDULE = 1 MINUTE
-- Trigger task when new data arrives in activities_stream
WHEN stream_status('activities_stream') AS 
    -- Insert new records into user_activity_profiles
    INSERT INTO user_activity_profiles
    SELECT
        -- Join activities_stream with user_profiles based on user_id
        a.user_id, p.username, p.location, a.activity, a.timestamp
    FROM
        activities_stream AS a
        JOIN user_profiles AS p
            ON a.user_id = p.user_id
    -- Include only rows where the action is 'INSERT'
    WHERE a.change$action = 'INSERT';
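The control flow of the task above can be pictured with a small Python toy model. This is a sketch under simplifying assumptions, not Databend's implementation: the Table and Stream classes and the run_task function are hypothetical, the stream only tracks appended rows, and the join is a plain dictionary lookup.

```python
class Table:
    """Append-only table that keeps every inserted row in order."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)


class Stream:
    """Toy stand-in for a stream: remembers how far the table was consumed."""

    def __init__(self, table):
        self.table = table
        self.offset = len(table.rows)  # only later changes are visible

    def has_data(self):
        # Plays the role of stream_status('activities_stream') in the SQL.
        return self.offset < len(self.table.rows)

    def consume(self):
        # Return the unconsumed rows and advance the offset past them.
        changes = self.table.rows[self.offset:]
        self.offset = len(self.table.rows)
        return changes


def run_task(stream, user_profiles, target):
    """One scheduled run: skip if the stream is empty, else join and insert."""
    if not stream.has_data():
        return 0
    inserted = 0
    for activity in stream.consume():
        profile = user_profiles.get(activity["user_id"])
        if profile is None:  # inner join: drop activities without a profile
            continue
        target.insert({**activity, **profile})
        inserted += 1
    return inserted


# Hypothetical data: one known user, two new activities.
activities = Table()
user_activity_profiles = Table()
user_profiles = {1: {"username": "alice", "location": "Beijing"}}
stream = Stream(activities)
activities.insert({"user_id": 1, "activity": "login"})
activities.insert({"user_id": 2, "activity": "logout"})

print(run_task(stream, user_profiles, user_activity_profiles))  # 1
print(run_task(stream, user_profiles, user_activity_profiles))  # 0
```

The second run returns 0 because the first run already consumed the pending changes, which mirrors how the WHEN condition keeps the task from doing work when the stream holds no new data.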

If you are interested, you are welcome to read the following documentation to learn how to complete this task with Databend Cloud.

Highlights

Here are some noteworthy items; you may find something of interest.

  • Added support for show views and desc view.
  • Added the is_error, is_not_error, and error_or functions.
  • task_history now supports pagination.
  • Added support for the PRQL query language.

What's Up Next

We are always open to cutting-edge technologies and innovative ideas, and we welcome you to join the community and bring new energy to Databend.

Support for the CHANGES clause

The CHANGES clause allows querying the change tracking metadata of a table or view over a specified interval, without having to create a stream with an explicit transactional offset.

Multiple queries can be combined to retrieve change tracking metadata between different transactional start and end points.

SELECT ...
FROM ...
   CHANGES ( INFORMATION => { DEFAULT | APPEND_ONLY } )
   AT ( { TIMESTAMP => <timestamp> | OFFSET => <time_difference> | STREAM => '<name>' } )
   [ END( { TIMESTAMP => <timestamp> | OFFSET => <time_difference> } ) ]
[ ... ]

Issue #15028 | Feature: support CHANGES clause

If you are interested in this topic, you can try to solve some of the open issues or take part in discussions and PR reviews. Alternatively, you can click https://link.databend.rs/im-feeling-lucky to pick a random issue. Good luck!

New Contributors

Meet new people in the community. Databend is a better place because of you.

Changelog

Check out the changelog for Databend's daily builds to stay up to date on the latest developments.

Address: https://github.com/datafuselabs/databend/releases

Contributors

A big thank you to the contributors for their great work this week.

Connect With Us

Databend is an open source, flexible, low-cost, next-generation data warehouse built on object storage that also supports real-time analytics. We look forward to your interest and to exploring cloud-native data warehouse solutions together, to build a new generation of open source Data Cloud.


Origin my.oschina.net/u/5489811/blog/11049092