Requirement:
{'md5':'a0ba1541fff92960f81363fb097ed948','version':1}
{'md5':'a0ba1541fff92960f81363fb097ed948','version':2}
{'md5':'a868628e80565f7085e4099c83be536d','version':1}
For each md5 group, find the document with the largest version.
Expected result:
{'md5':'a0ba1541fff92960f81363fb097ed948','version':2}
{'md5':'a868628e80565f7085e4099c83be536d','version':1}
Code:
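from bson.code import Code
from pymongo import MongoClient

# Collection.map_reduce was removed in PyMongo 4.0, so this requires PyMongo 3.x.
client = MongoClient()
# MongoClient has no connect() method; this assumes collection 'temp' in database 'test'.
collection = client['test']['temp']

# Map: emit every document under its md5.
mapper = Code("""
function () {
    emit(this.md5, this);
}
""")

# Reduce: keep the document with the highest version for each md5.
reducer = Code("""
function (key, values) {
    var temp = null;
    values.forEach(function (current) {
        if (!temp || current.version > temp.version) temp = current;
    });
    return Object.assign({}, temp);
}
""")

# Results are written to the 'myresults' collection.
result = collection.map_reduce(mapper, reducer, "myresults")
for item in result.find():
    print(item)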
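The map function iterates over the documents and emits each one with its md5 as the key.
When an md5 has only one document, the reduce step is skipped for that key and the
emitted document is used as the result directly (see the official documentation).
For the remaining keys the reducer is called once per md5: in the reducer function, key is
the md5 value and values is a JavaScript Array whose elements are the documents emitted
for that md5. The function returns the final result for that group, which is the
document we want, the one with the largest version.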
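
The grouping and reducing steps can be traced without a MongoDB server. The following is a minimal pure-Python sketch of that flow, not MongoDB's actual implementation: the helper name map_reduce and the in-memory docs list are assumptions made for illustration, while mapper and reducer mirror the JavaScript functions. Note that MongoDB may call the reduce function more than once for the same key and feed earlier reduce output back in as a value; a "keep the maximum version" reducer is safe under that behavior because it returns an object of the same shape as the emitted values.

```python
from collections import defaultdict

# Sample documents from the requirement.
docs = [
    {"md5": "a0ba1541fff92960f81363fb097ed948", "version": 1},
    {"md5": "a0ba1541fff92960f81363fb097ed948", "version": 2},
    {"md5": "a868628e80565f7085e4099c83be536d", "version": 1},
]


def mapper(doc):
    # Mirrors emit(this.md5, this): the key is the md5, the value is the document.
    return doc["md5"], doc


def reducer(key, values):
    # Mirrors the JavaScript reducer: keep the document with the highest version.
    temp = None
    for current in values:
        if temp is None or current["version"] > temp["version"]:
            temp = current
    return dict(temp)  # shallow copy, like Object.assign({}, temp)


def map_reduce(documents):
    # Hypothetical helper: group emitted values by key, then reduce each group.
    groups = defaultdict(list)
    for doc in documents:
        key, value = mapper(doc)
        groups[key].append(value)

    results = {}
    for key, values in groups.items():
        # A key with a single emitted value skips the reduce step.
        results[key] = values[0] if len(values) == 1 else reducer(key, values)
    return results


latest = map_reduce(docs)
for key, value in latest.items():
    print(key, value["version"])
```

Here latest maps each md5 to its chosen document, which plays the role of the _id and value fields of the documents written to the output collection.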