Python MongoDB: group documents and take the latest record from each group

Requirement:

{'md5':'a0ba1541fff92960f81363fb097ed948','version':1}
{'md5':'a0ba1541fff92960f81363fb097ed948','version':2}
{'md5':'a868628e80565f7085e4099c83be536d','version':1}

Group the documents by md5 and find the one with the largest version in each group.

Expected result:

{'md5':'a0ba1541fff92960f81363fb097ed948','version':2}
{'md5':'a868628e80565f7085e4099c83be536d','version':1}
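
As a reference for what the query has to produce, the same selection can be sketched in plain Python without MongoDB. The helper name latest_per_group is hypothetical and not part of PyMongo; the sketch only assumes that each record is a dict with md5 and version fields.

```python
def latest_per_group(docs, key="md5", order="version"):
    """Return, for each distinct `key` value, the record with the largest `order` value."""
    best = {}
    for doc in docs:
        current = best.get(doc[key])
        # Keep the record if the group is new or this record has a higher version.
        if current is None or doc[order] > current[order]:
            best[doc[key]] = doc
    return list(best.values())


docs = [
    {'md5': 'a0ba1541fff92960f81363fb097ed948', 'version': 1},
    {'md5': 'a0ba1541fff92960f81363fb097ed948', 'version': 2},
    {'md5': 'a868628e80565f7085e4099c83be536d', 'version': 1},
]

for item in latest_per_group(docs):
    print(item)
```

This is handy as a check on small data sets; for a real collection the grouping should happen inside MongoDB so that the documents do not all have to be transferred to the client.
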
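Code:

from bson.code import Code
from pymongo import MongoClient

# Connects to mongodb://localhost:27017 by default.
client = MongoClient()
# Assumes database 'test' and collection 'temp'.
collection = client['test']['temp']

# Map: emit every document keyed by its md5.
mapper = Code("""
    function () {
        emit(this.md5, this);
    }
""")

# Reduce: of all documents sharing an md5, keep the one with the highest version.
reducer = Code("""
    function (key, values) {
        var temp = null;
        values.forEach(function (current) {
            if (!temp || current.version > temp.version) temp = current;
        });
        return Object.assign({}, temp);
    }
""")

# Collection.map_reduce exists in PyMongo 3.x; it was removed in PyMongo 4.0.
# Results are written to the 'myresults' collection as {'_id': <md5>, 'value': <document>}.
result = collection.map_reduce(mapper, reducer, "myresults")
for item in result.find():
    print(item)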
 
 

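The map function iterates over the documents and emits each one with its md5 as the key. When an md5 has only one document, the reduce step is skipped for that key and the emitted document becomes the result directly (see the official MongoDB documentation).

For the remaining keys the reducer is called for each md5. In the reducer above, key is the md5 value and values is a JavaScript Array whose elements are the emitted documents. The function returns the final result for that group, which is the document we need. MongoDB may call the reducer more than once for the same key, so its return value must have the same shape as the emitted values; that holds here because it returns one of the documents.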
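
The flow described above can be imitated in plain Python to make the two phases visible. This is only a sketch of the semantics, not how MongoDB executes map-reduce internally. The names mapper, reducer and simulate_map_reduce are hypothetical, and the sketch reduces each key in a single call, whereas MongoDB may reduce a key in several passes.

```python
from collections import defaultdict


def mapper(doc):
    # Mirrors emit(this.md5, this): the key is the md5, the value is the whole document.
    return doc['md5'], doc


def reducer(key, values):
    # Mirrors the JavaScript reducer: keep the document with the largest version.
    temp = None
    for current in values:
        if temp is None or current['version'] > temp['version']:
            temp = current
    return dict(temp)


def simulate_map_reduce(docs):
    # Map phase: collect the emitted values under their keys.
    emitted = defaultdict(list)
    for doc in docs:
        key, value = mapper(doc)
        emitted[key].append(value)

    # Reduce phase: a key with a single emitted value skips the reducer.
    results = []
    reduced_keys = []
    for key, values in emitted.items():
        if len(values) == 1:
            value = values[0]
        else:
            value = reducer(key, values)
            reduced_keys.append(key)
        # Output documents have the shape {'_id': key, 'value': result}.
        results.append({'_id': key, 'value': value})
    return results, reduced_keys


docs = [
    {'md5': 'a0ba1541fff92960f81363fb097ed948', 'version': 1},
    {'md5': 'a0ba1541fff92960f81363fb097ed948', 'version': 2},
    {'md5': 'a868628e80565f7085e4099c83be536d', 'version': 1},
]

results, reduced_keys = simulate_map_reduce(docs)
for item in results:
    print(item)
print(reduced_keys)
```

With the sample data, only the first md5 reaches the reducer, because it is the only key with more than one emitted value.
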






Reposted from blog.csdn.net/weixin_42195514/article/details/80294414