New in Milvus: Range Search gives you precise control over search results

The Range Search feature was born out of the community.

One day, a user building a recommendation system asked in the community whether Milvus could return only those results whose vector distance falls within a certain range. This was not an isolated request: developers running similarity queries often have to apply a second round of filtering to the results.

To solve this problem, Milvus has introduced a new feature: Range Search. This article explains what Range Search is, when to use it, and how it works under the hood.

01.What is Range Search?

As the name suggests, Range Search restricts a search to a range of distances. Unlike KNN Search, which returns the Top-K most similar results, Range Search returns up to Top-K results whose vector distance to the query falls within a specified interval.

So when should you choose Range Search over KNN (Top-K) Search?

The most typical use case for Range Search is a recommendation system. In product recommendation, for example, a good system should return items that are somewhat similar to the product the user clicked, but not too similar. Recommendations that are either too similar or too dissimilar are unsatisfying.

Before Range Search was available, users building recommendation systems had to run a KNN Search first and then filter the results a second time outside Milvus. Now they can call Range Search directly and get the results they need in a single request.

Range Search adds two new parameters:

  • radius - the outer boundary of similarity, that is, the least similar a result may be

  • range_filter - the inner boundary of similarity, that is, the most similar a result may be
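To make the two boundaries concrete, here is a small brute-force sketch in plain Python for the L2 metric. It is not Milvus code: the function names `l2_distance` and `range_search_l2` are hypothetical, and the exact boundary handling (range_filter inclusive, radius exclusive) is an assumption made for illustration.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def range_search_l2(query, vectors, radius, range_filter=0.0, limit=10):
    """Brute-force illustration: keep hits with range_filter <= d < radius.

    The inclusive/exclusive choice at the boundaries is an assumption
    of this sketch, not something stated in the article.
    """
    hits = []
    for idx, vec in enumerate(vectors):
        d = l2_distance(query, vec)
        if range_filter <= d < radius:
            hits.append((idx, d))
    hits.sort(key=lambda hit: hit[1])  # smaller L2 distance = more similar
    return hits[:limit]

vectors = [[0.0, 0.0], [1.5, 0.0], [0.0, 1.2], [3.0, 0.0]]
hits = range_search_l2([0.0, 0.0], vectors, radius=2.0, range_filter=1.0)
print(hits)
```

The identical vector (too similar) and the distant one (too dissimilar) are both dropped, which is the "similar, but not too similar" behaviour a recommendation system wants.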

02.Technical implementation details of Range Search

Next, we dive into the architecture and algorithms of the Range Search function, discussing its advantages, limitations, and the integration of Range Search with third-party algorithm libraries.

Range Search reuses the existing search process, so the upper-layer data path is almost identical to that of a regular search. These are the steps taken when a search request is received:

  • The SDK receives a query request from the user; the search parameters contain radius and, optionally, range_filter;

  • After receiving the query request, proxy generates a SearchTask and passes it to querynode;

  • After receiving the SearchTask, querynode calls the Search interface of segcore through cgo;

  • segcore parses the parameters in search_param and, if radius is present, calls knowhere::RangeSearch;

  • knowhere then calls the range_search function of the corresponding third-party library according to the index type.
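The last two steps can be pictured as a simple dispatch. The sketch below is a toy model in Python, not the actual segcore or knowhere code (which is C++); every function name and the `RANGE_SEARCH_BACKENDS` registry are hypothetical stand-ins.

```python
def hnsw_range_search(query, radius):
    # Hypothetical stand-in for a third-party library's range_search call.
    return f"HNSW range_search(radius={radius})"

def ivf_flat_range_search(query, radius):
    # Hypothetical stand-in for another library's range_search call.
    return f"IVF_FLAT range_search(radius={radius})"

# Hypothetical registry: index type -> range search implementation.
RANGE_SEARCH_BACKENDS = {
    "HNSW": hnsw_range_search,
    "IVF_FLAT": ivf_flat_range_search,
}

def knn_search(index_type, query, params):
    # Stand-in for the regular Top-K search path.
    return f"{index_type} knn search"

def search(index_type, query, search_params):
    """Choose the search path based on whether `radius` is present."""
    params = search_params.get("params", {})
    if "radius" not in params:
        return knn_search(index_type, query, params)
    backend = RANGE_SEARCH_BACKENDS.get(index_type)
    if backend is None:
        raise ValueError(f"Range Search is not supported for {index_type}")
    return backend(query, params["radius"])

print(search("HNSW", [0.1, 0.2], {"params": {"ef": 32, "radius": 2.0}}))
```

The point of the sketch is only that the presence of radius selects the path, and the index type selects the implementation.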

Currently, the indexes in all third-party libraries support only one-sided Range Search: they accept a single parameter, radius, and return unsorted results. The following table outlines the Range Search strategies for different index types:

For binary indexes, both HAMMING and JACCARD support Range Search. SUBSTRUCTURE and SUPERSTRUCTURE do not, because they return true/false, which does not fit the semantics of Range Search. All float indexes support Range Search with L2, IP, and COSINE.
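Because the underlying libraries take only radius and return unsorted hits, the inner boundary and the ordering have to be applied afterwards. The following is a minimal sketch of that idea for an L2-style metric, assuming a post-processing step of filter, sort, and truncate; the function names are hypothetical and this is not knowhere's actual implementation.

```python
def one_sided_range_search(distances, radius):
    """Stand-in for a library call: unsorted (id, distance) pairs
    with distance < radius (L2-style, smaller = more similar)."""
    return [(i, d) for i, d in enumerate(distances) if d < radius]

def finalize_range_results(hits, range_filter=None, limit=10):
    """Apply the inner boundary, sort by distance, and keep `limit` hits."""
    if range_filter is not None:
        # Drop hits that are closer than the inner boundary.
        hits = [(i, d) for i, d in hits if d >= range_filter]
    hits = sorted(hits, key=lambda hit: hit[1])
    return hits[:limit]

distances = [2.5, 0.4, 1.7, 1.1, 0.9]
raw = one_sided_range_search(distances, radius=2.0)
final = finalize_range_results(raw, range_filter=1.0, limit=2)
print(final)
```

For IP or COSINE, where larger values mean more similar, the comparisons and the sort order would be reversed.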

The following table shows all index types and metric types that support Range Search:

03.How to use Range Search

To use Range Search, you only need to modify the search parameters in the search request. This section walks through the details and ends with Python sample code.

Before starting

Please make sure Milvus is installed and running.

Please make sure a collection has been created and an index has been built on it.

Range Search Parameters

  • radius: required parameter. Its presence determines whether the search request performs a Range Search or a KNN Search.

  • range_filter: optional parameter. If it is set, the results are filtered a second time against this inner boundary.

With these two parameters, you can tune the behavior of Range Search to suit different scenarios and needs. The following is sample code:

# Assumes `collection`, `vectors`, `nq` and `TOPK` are already defined.
default_index = {
    "index_type": "HNSW",
    "metric_type": "L2",
    "params": {"M": 48, "efConstruction": 500},
}
collection.create_index("float_vector", default_index)
# Supplying radius selects Range Search; for L2, range_filter < radius.
search_params = {
    "metric_type": "L2",
    "params": {"ef": 32, "range_filter": 1.0, "radius": 2.0},
}
res = collection.search(vectors[:nq], "float_vector", search_params, limit=TOPK)

04.Parameter check

The following table lists the legal value ranges of radius corresponding to all metric types:

Because the legal range of radius differs from one metric type to another, Milvus does not validate radius on its own. It only checks that radius and range_filter are consistent with each other:

  • For L2/Hamming/Jaccard, range_filter < radius

  • For IP/Cosine, range_filter > radius
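The two rules above can be expressed as a small client-side check. This is a sketch under the assumption that you want to validate parameters before sending a request; `check_range_params` is a hypothetical helper, not a Milvus or pymilvus function.

```python
DISTANCE_METRICS = {"L2", "HAMMING", "JACCARD"}  # smaller value = more similar
SIMILARITY_METRICS = {"IP", "COSINE"}            # larger value = more similar

def check_range_params(metric_type, radius, range_filter=None):
    """Raise ValueError if radius and range_filter are inconsistent."""
    metric = metric_type.upper()
    if metric not in DISTANCE_METRICS | SIMILARITY_METRICS:
        raise ValueError(f"Range Search is not supported for metric {metric_type}")
    if range_filter is None:
        return  # radius alone is not range-checked, matching the text above
    if metric in DISTANCE_METRICS and not range_filter < radius:
        raise ValueError(f"{metric}: range_filter must be less than radius")
    if metric in SIMILARITY_METRICS and not range_filter > radius:
        raise ValueError(f"{metric}: range_filter must be greater than radius")

check_range_params("L2", radius=2.0, range_filter=1.0)   # passes
check_range_params("IP", radius=0.5, range_filter=0.8)   # passes
```

Calling `check_range_params("L2", radius=1.0, range_filter=2.0)` raises a `ValueError`, because for distance metrics the inner boundary must be smaller than the outer one.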

05.Summary

Range Search in Milvus is not limited to recommendation engines; it is also useful for tasks such as content matching, anomaly detection, and NLP search. With the radius and range_filter parameters, users can tailor queries precisely to the needs of each use case.

Range Search is now officially available in Zilliz Cloud Beta! To try it, upgrade your Zilliz Cloud (https://zilliz.com.cn/cloud) cluster to the Beta version or download Milvus 2.3.x (https://milvus.io/docs/install_cluster-milvusoperator.md). If you run into problems or have suggestions while using Range Search, please send us your feedback!


Origin my.oschina.net/u/4209276/blog/10143103