ElasticSearch RestHighLevelClient Tutorial (3): Delete && Delete By Query

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/paditang/article/details/79172837

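Introduction

Deleting documents is an essential part of working with ES. According to the official API documentation, there are two ways to delete: the first deletes a single document directly by index, type, and id; the second deletes by query, i.e. the Delete By Query API.

With the first approach, the id uniquely identifies the document, so if the document exists, exactly that document is deleted.

The second approach behaves as if the matching documents were first searched for and then deleted one by one by document ID. The query conditions therefore deserve care: verify the range of results the query returns, otherwise many documents may be deleted by mistake.

With RestHighLevelClient, the first API works without problems. For the second, a DeleteByQueryRequest is provided, but there is no corresponding method to execute the request. (If one exists, corrections are welcome!) The only option is to do it in two steps: search, then delete. Two requests from the client are certainly slower than a single Delete By Query, but for now this workaround is the only way with this client.

Another option is to use RestClient, assemble the JSON body freely, and send the HTTP request directly.

Source: https://discuss.elastic.co/t/delete-by-query-with-new-java-rest-api/107578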

Walkthrough

Preparing the data

/PUT http://{{host}}:{{port}}/delete_demo
{
    "mappings":{
        "demo":{
            "properties":{
                "content":{
                    "type":"text",
                    "fields":{
                        "keyword":{
                            "type":"keyword"
                        }
                    }
                }
            }
        }
    }
}
/POST http://{{host}}:{{port}}/_bulk
{"index":{"_index":"delete_demo","_type":"demo"}}
{"content":"test1"}
{"index":{"_index":"delete_demo","_type":"demo"}}
{"content":"test1"}
{"index":{"_index":"delete_demo","_type":"demo"}}
{"content":"test1 add"}
{"index":{"_index":"delete_demo","_type":"demo"}}
{"content":"test2"}

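Note: in a bulk request every line must end with a newline, including the last one. In other words, the body must end with a trailing newline.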

{
    "took": 7,
    "errors": false,
    "items": [
        {
            "index": {
                "_index": "delete_demo",
                "_type": "demo",
                "_id": "AWExGSdW00f4t28WAPen",
                "_version": 1,
                "result": "created",
                "_shards": {
                    "total": 2,
                    "successful": 1,
                    "failed": 0
                },
                "created": true,
                "status": 201
            }
        },
        {
            "index": {
                "_index": "delete_demo",
                "_type": "demo",
                "_id": "AWExGSdW00f4t28WAPeo",
                "_version": 1,
                "result": "created",
                "_shards": {
                    "total": 2,
                    "successful": 1,
                    "failed": 0
                },
                "created": true,
                "status": 201
            }
        },
        {
            "index": {
                "_index": "delete_demo",
                "_type": "demo",
                "_id": "AWExGSdW00f4t28WAPep",
                "_version": 1,
                "result": "created",
                "_shards": {
                    "total": 2,
                    "successful": 1,
                    "failed": 0
                },
                "created": true,
                "status": 201
            }
        },
        {
            "index": {
                "_index": "delete_demo",
                "_type": "demo",
                "_id": "AWExGSdW00f4t28WAPeq",
                "_version": 1,
                "result": "created",
                "_shards": {
                    "total": 2,
                    "successful": 1,
                    "failed": 0
                },
                "created": true,
                "status": 201
            }
        }
    ]
}
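
The newline rule for the bulk body is easy to get wrong when the body is assembled in code. As a minimal sketch (the class `BulkBodyBuilder` and its `build` method are hypothetical helper names, not part of the article or of any Elasticsearch client), the following plain-Java helper terminates every line, including the last one, with a newline:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only: builds the newline-delimited
// body expected by the _bulk endpoint from a list of JSON source documents.
final class BulkBodyBuilder {

    private BulkBodyBuilder() {
    }

    // Emits one action line and one source line per document. Every line,
    // including the final one, is terminated by '\n'.
    static String build(String index, String type, List<String> sources) {
        StringBuilder body = new StringBuilder();
        for (String source : sources) {
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_type\":\"").append(type).append("\"}}\n");
            body.append(source).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String body = build("delete_demo", "demo",
                Arrays.asList("{\"content\":\"test1\"}", "{\"content\":\"test2\"}"));
        System.out.print(body);
    }
}
```

The helper assumes the index and type names need no JSON escaping, which holds for the names used in this article.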

Delete by ID

API format
/DELETE http://{{host}}:{{port}}/delete_demo/demo/AWExGSdW00f4t28WAPen
Java client

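import java.io.IOException;

import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.Before;
import org.junit.Test;
import org.springframework.beans.factory.annotation.Autowired;

public class ElkDaoTest extends BaseTest {

    @Autowired
    private RestHighLevelClient rhlClient;

    private String index;

    private String type;

    private String id;

    @Before
    public void prepare() {
        index = "delete_demo";
        type = "demo";
        id = "AWExGSdW00f4t28WAPeo";
    }

    @Test
    public void delete() throws IOException {
        // Delete a single document addressed by index, type, and id.
        DeleteRequest deleteRequest = new DeleteRequest(index, type, id);
        // Let an IOException fail the test instead of swallowing it.
        DeleteResponse response = rhlClient.delete(deleteRequest);
        System.out.println(response);
    }
}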

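The deletion succeeds here as well.

For how to set up and use rhlClient, see the earlier post in this series, ElasticSearch Rest High Level Client Tutorial (1): General Operations.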

Delete By Query

Via the REST API

First, restore the data so that the index again holds the four documents.

/POST http://{{host}}:{{port}}/delete_demo/demo/_delete_by_query
{
    "query":{
        "match":{
            "content":"test1"
        }
    }
}
{
    "took": 14,
    "timed_out": false,
    "total": 3,
    "deleted": 3,
    "batches": 1,
    "version_conflicts": 0,
    "noops": 0,
    "retries": {
        "bulk": 0,
        "search": 0
    },
    "throttled_millis": 0,
    "requests_per_second": -1,
    "throttled_until_millis": 0,
    "failures": []
}
/GET http://{{host}}:{{port}}/delete_demo/demo/_search
{
    "took": 0,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 1,
        "hits": [
            {
                "_index": "delete_demo",
                "_type": "demo",
                "_id": "AWExKDse00f4t28WAafF",
                "_score": 1,
                "_source": {
                    "content": "test2"
                }
            }
        ]
    }
}

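The result shows that three documents were deleted, namely test1, test1, and test1 add, leaving only test2. Clearly, every document matched by the query was deleted.

With a term query, deletion likewise follows whatever the query matches.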

/POST http://{{host}}:{{port}}/delete_demo/demo/_delete_by_query
{
    "query":{
        "term":{
            "content.keyword":"test1"
        }
    }
}
{
    "took": 6,
    "timed_out": false,
    "total": 2,
    "deleted": 2,
    "batches": 1,
    "version_conflicts": 0,
    "noops": 0,
    "retries": {
        "bulk": 0,
        "search": 0
    },
    "throttled_millis": 0,
    "requests_per_second": -1,
    "throttled_until_millis": 0,
    "failures": []
}

This confirms that Delete By Query is effectively a search followed by deletion of the hits.
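
The difference between the two counts above (three deleted with match, two with term) comes from how each query matches. The following is a simplified simulation in plain Java, not Elasticsearch code: it assumes the text field is split into lowercase tokens on whitespace (roughly what the standard analyzer does for this sample data) and that the keyword sub-field stores the whole value unchanged. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Simplified simulation for illustration only; it does not call Elasticsearch.
final class MatchVsTermDemo {

    private MatchVsTermDemo() {
    }

    // "match" on the analyzed text field: a document is a hit when any of
    // its tokens equals any token of the (also analyzed) query text.
    static List<String> matchHits(List<String> docs, String queryText) {
        List<String> queryTokens = tokenize(queryText);
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            List<String> docTokens = tokenize(doc);
            for (String token : queryTokens) {
                if (docTokens.contains(token)) {
                    hits.add(doc);
                    break;
                }
            }
        }
        return hits;
    }

    // "term" on the keyword sub-field: the whole stored value must be equal.
    static List<String> termHits(List<String> docs, String value) {
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            if (doc.equals(value)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    // Assumed stand-in for the analyzer: lowercase, then split on whitespace.
    private static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("test1", "test1", "test1 add", "test2");
        System.out.println(matchHits(docs, "test1")); // [test1, test1, test1 add]
        System.out.println(termHits(docs, "test1"));  // [test1, test1]
    }
}
```

Whichever documents the query selects are the ones that get deleted, which is why checking the query with a plain search first is a cheap safeguard.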

Java client
  1. Using RestHighLevelClient

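    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.TimeUnit;

    import org.elasticsearch.action.bulk.BulkRequest;
    import org.elasticsearch.action.delete.DeleteRequest;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.common.unit.TimeValue;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.index.query.TermQueryBuilder;
    import org.elasticsearch.search.SearchHit;
    import org.elasticsearch.search.SearchHits;
    import org.elasticsearch.search.builder.SearchSourceBuilder;
    import org.junit.Before;
    import org.junit.Test;
    import org.springframework.beans.factory.annotation.Autowired;

    public class ElkDaoTest extends BaseTest {

        @Autowired
        private RestHighLevelClient rhlClient;

        private String index;

        private String type;

        private String deleteText;

        @Before
        public void prepare() {
            index = "delete_demo";
            type = "demo";
            deleteText = "test1";
        }

        @Test
        public void delete() throws IOException {
            // Step 1: search for the documents that should be deleted.
            SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
            sourceBuilder.timeout(new TimeValue(2, TimeUnit.SECONDS));
            TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("content.keyword", deleteText);
            sourceBuilder.query(termQueryBuilder);
            SearchRequest searchRequest = new SearchRequest(index);
            searchRequest.types(type);
            searchRequest.source(sourceBuilder);
            SearchResponse response = rhlClient.search(searchRequest);

            // Collect the ids of the hits returned by the search.
            SearchHits hits = response.getHits();
            List<String> docIds = new ArrayList<>(hits.getHits().length);
            for (SearchHit hit : hits) {
                docIds.add(hit.getId());
            }
            if (docIds.isEmpty()) {
                // Nothing matched; an empty BulkRequest would be rejected by request validation.
                return;
            }

            // Step 2: delete all collected ids in a single bulk request.
            BulkRequest bulkRequest = new BulkRequest();
            for (String id : docIds) {
                bulkRequest.add(new DeleteRequest(index, type, id));
            }
            rhlClient.bulk(bulkRequest);
        }
    }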

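    After restoring the data and running the code above, a search shows that only the two documents test1 add and test2 remain, so the search-then-delete approach works. The search output is not repeated here. Keep in mind that a search returns only the first 10 hits by default, so for larger result sets the search has to be sized or paged accordingly before the bulk request is built.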

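  2. Using RestClient

    As mentioned earlier in this series, rhlClient is a wrapper around RestClient, and some of its functionality is still being completed and is not yet implemented in Java. In that case, simply use restClient to call the ES service directly over HTTP.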

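    import java.io.IOException;
    import java.util.Collections;

    import org.apache.http.HttpEntity;
    import org.apache.http.entity.ContentType;
    import org.apache.http.nio.entity.NStringEntity;
    import org.elasticsearch.action.index.IndexRequest;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.common.xcontent.XContentBuilder;
    import org.elasticsearch.common.xcontent.json.JsonXContent;
    import org.junit.Before;
    import org.junit.Test;
    import org.springframework.beans.factory.annotation.Autowired;

    public class ElkDaoTest extends BaseTest {

        @Autowired
        private RestClient restClient;

        private String index;

        private String type;

        private String deleteText;

        @Before
        public void prepare() {
            index = "delete_demo";
            type = "demo";
            deleteText = "test1";
        }

        @Test
        public void delete() throws IOException {
            String endPoint = "/" + index + "/" + type + "/_delete_by_query";
            String source = generateQueryString();
            HttpEntity entity = new NStringEntity(source, ContentType.APPLICATION_JSON);
            // A single POST; Elasticsearch performs the search and the deletion on the server side.
            restClient.performRequest("POST", endPoint, Collections.<String, String>emptyMap(), entity);
        }

        // Builds the JSON body {"query":{"term":{"content.keyword":<deleteText>}}}.
        public String generateQueryString() throws IOException {
            XContentBuilder builder = JsonXContent.contentBuilder()
                    .startObject()
                        .startObject("query")
                            .startObject("term")
                                .field("content.keyword", deleteText)
                            .endObject()
                        .endObject()
                    .endObject();
            // IndexRequest is used here only as a convenient way to serialize the builder to a string.
            IndexRequest indexRequest = new IndexRequest();
            indexRequest.source(builder);
            return indexRequest.source().utf8ToString();
        }
    }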

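    After running this, the two test1 documents are deleted as before, so the feature works. The advantage is that only one HTTP request is needed instead of two, which saves time.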
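
The request sent above consists of just an endpoint and a small JSON body. As a rough sketch that avoids any Elasticsearch classes (the class `DeleteByQueryParts` and its methods are hypothetical names introduced for illustration), both pieces can also be assembled with plain Java and then handed to restClient.performRequest in the same way as in the test above:

```java
// Hypothetical helper for illustration only: assembles the endpoint and the
// JSON body of a _delete_by_query request without Elasticsearch classes.
final class DeleteByQueryParts {

    private DeleteByQueryParts() {
    }

    // Endpoint in the index/type form used throughout this article.
    static String endpoint(String index, String type) {
        return "/" + index + "/" + type + "/_delete_by_query";
    }

    // Body of the form {"query":{"term":{"<field>":"<value>"}}}.
    static String termQueryBody(String field, String value) {
        return "{\"query\":{\"term\":{\"" + escape(field) + "\":\"" + escape(value) + "\"}}}";
    }

    // Minimal JSON string escaping: quotes, backslashes, and control characters.
    private static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(endpoint("delete_demo", "demo"));
        System.out.println(termQueryBody("content.keyword", "test1"));
    }
}
```

Hand-built JSON is only reasonable for bodies this small; for anything more involved, a builder such as the XContentBuilder used above is the safer choice.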

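Summary

As far as deletion is concerned, what RestHighLevelClient offers is not yet complete, so it has to be combined with the flexibility of RestClient to get the functionality we want.

Other articles in this series:

ElasticSearch Rest High Level Client Tutorial (1): General Operations

ElasticSearch RestHighLevelClient Tutorial (2): Index Operations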

Reposted from blog.csdn.net/paditang/article/details/79172837