Installing and Using MongoDB

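1. Installing MongoDB

    Download the installer: https://www.mongodb.com/download-center/community

    Start the database: change to the bin directory that contains mongod and run mongod --dbpath d:\mongodb\data (d:\mongodb\data is the folder where the data files will be stored; it must already exist)

    Connect to the database: add D:\MongoDB\mongodb-win32-i386-2.6.9\bin to the PATH environment variable, then run mongo 127.0.0.1:27017 at the cmd prompt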

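2. Common MongoDB commands

    show dbs              list all databases

    use dbname            switch to the database dbname; if it does not exist, it is created when data is first written to it

    db                    show the name of the current database (same as db.getName())

    db.dropDatabase()     drop the current database

    show collections      list the collections in the current database (a collection is the counterpart of a table)

    db.collection.drop()  drop a collection

use test                                         # switch to the database test; it is created on the first write
db.user.insert({"name":"zack","age":23})         # insert one document into the user collection of test; the collection is created if it does not exist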


Help commands
1. db.help()

 db.adminCommand(nameOrDocument) - switches to 'admin' db, and runs command [ just calls db.runCommand(...) ]
 db.auth(username, password)
 db.cloneDatabase(fromhost)
 db.commandHelp(name) returns the help for the command
 db.copyDatabase(fromdb, todb, fromhost)
 db.createCollection(name, { size : ..., capped : ..., max : ... } )
 db.createUser(userDocument)
 db.currentOp() displays currently executing operations in the db
 db.dropDatabase()
 db.eval(func, args) run code server-side
 db.fsyncLock() flush data to disk and lock server for backups
 db.fsyncUnlock() unlocks server following a db.fsyncLock()
 db.getCollection(cname) same as db['cname'] or db.cname
 db.getCollectionInfos()
 db.getCollectionNames()
 db.getLastError() - just returns the err msg string
 db.getLastErrorObj() - return full status object
 db.getMongo() get the server connection object
 db.getMongo().setSlaveOk() allow queries on a replication slave server
 db.getName()
 db.getPrevError()
 db.getProfilingLevel() - deprecated
 db.getProfilingStatus() - returns if profiling is on and slow threshold
 db.getReplicationInfo()
 db.getSiblingDB(name) get the db at the same server as this one
 db.getWriteConcern() - returns the write concern used for any operations on this db, inherited from server object if set
 db.hostInfo() get details about the server's host
 db.isMaster() check replica primary status
 db.killOp(opid) kills the current operation in the db
 db.listCommands() lists all the db commands
 db.loadServerScripts() loads all the scripts in db.system.js
 db.logout()
 db.printCollectionStats()
 db.printReplicationInfo()
 db.printShardingStatus()
 db.printSlaveReplicationInfo()
 db.dropUser(username)
 db.repairDatabase()
 db.resetError()
 db.runCommand(cmdObj) run a database command.  if cmdObj is a string, turns it into { cmdObj : 1 }
 db.serverStatus()
 db.setProfilingLevel(level,<slowms>) 0=off 1=slow 2=all
 db.setWriteConcern( <write concern doc> ) - sets the write concern for writes to the db
 db.unsetWriteConcern( <write concern doc> ) - unsets the write concern for writes to the db
 db.setVerboseShell(flag) display extra information in shell output
 db.shutdownServer()
 db.stats()
 db.version() current version of the server

2. db.test.help()        lists the methods available on the collection test of the current database (not on a database named test):

 db.test.find().help() - show DBCursor help
 db.test.count()
 db.test.copyTo(newColl) - duplicates collection by copying all documents to newColl; no indexes are copied.
 db.test.convertToCapped(maxBytes) - calls {convertToCapped:'test', size:maxBytes}} command
 db.test.dataSize()
 db.test.distinct( key ) - e.g. db.test.distinct( 'x' )
 db.test.drop() drop the collection
 db.test.dropIndex(index) - e.g. db.test.dropIndex( "indexName" ) or db.test.dropIndex( { "indexKey" : 1 } )
 db.test.dropIndexes()
 db.test.ensureIndex(keypattern[,options]) - options is an object with these possible fields: name, unique, dropDups
 db.test.reIndex()
 db.test.find([query],[fields]) - query is an optional query filter. fields is optional set of fields to return.
                                               e.g. db.test.find( {x:77} , {name:1, x:1} )
 db.test.find(...).count()
 db.test.find(...).limit(n)
 db.test.find(...).skip(n)
 db.test.find(...).sort(...)
 db.test.findOne([query])
 db.test.findAndModify( { update : ... , remove : bool [, query: {}, sort: {}, 'new': false] } )
 db.test.getDB() get DB object associated with collection
 db.test.getPlanCache() get query plan cache associated with collection
 db.test.getIndexes()
 db.test.group( { key : ..., initial: ..., reduce : ...[, cond: ...] } )
 db.test.insert(obj)
 db.test.mapReduce( mapFunction , reduceFunction , <optional params> )
 db.test.aggregate( [pipeline], <optional params> ) - performs an aggregation on a collection; returns a cursor
 db.test.remove(query)
 db.test.renameCollection( newName , <dropTarget> ) renames the collection.
 db.test.runCommand( name , <options> ) runs a db command with the given name where the first param is the collection name
 db.test.save(obj)
 db.test.stats()
 db.test.storageSize() - includes free space allocated to this collection
 db.test.totalIndexSize() - size in bytes of all the indexes
 db.test.totalSize() - storage allocated for all data and indexes
 db.test.update(query, object[, upsert_bool, multi_bool]) - instead of two flags, you can pass an object with fields: upsert, multi
 db.test.validate( <full> ) - SLOW
 db.test.getShardVersion() - only for use with sharding
 db.test.getShardDistribution() - prints statistics about data distribution in the cluster
 db.test.getSplitKeysForChunks( <maxChunkSize> ) - calculates split points over all chunks and returns splitter function
 db.test.getWriteConcern() - returns the write concern used for any operations on this collection, inherited from server/db if set
 db.test.setWriteConcern( <write concern doc> ) - sets the write concern for writes to the collection


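3. db.test.user.help()  lists the methods available on the collection named test.user in the current database (not on the user collection of a database named test):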

db.test.user.find().help() - show DBCursor help
db.test.user.count()
db.test.user.copyTo(newColl) - duplicates collection by copying all documents to newColl; no indexes are copied.
db.test.user.convertToCapped(maxBytes) - calls {convertToCapped:'test.user', size:maxBytes}} command
db.test.user.dataSize()
db.test.user.distinct( key ) - e.g. db.test.user.distinct( 'x' )
db.test.user.drop() drop the collection
db.test.user.dropIndex(index) - e.g. db.test.user.dropIndex( "indexName" ) or db.test.user.dropIndex( { "indexKey" : 1 } )
db.test.user.dropIndexes()
db.test.user.ensureIndex(keypattern[,options]) - options is an object with these possible fields: name, unique, dropDups
db.test.user.reIndex()
db.test.user.find([query],[fields]) - query is an optional query filter. fields is optional set of fields to return.
                                              e.g. db.test.user.find( {x:77} , {name:1, x:1} )
db.test.user.find(...).count()
db.test.user.find(...).limit(n)
db.test.user.find(...).skip(n)
db.test.user.find(...).sort(...)
db.test.user.findOne([query])
db.test.user.findAndModify( { update : ... , remove : bool [, query: {}, sort: {}, 'new': false] } )
db.test.user.getDB() get DB object associated with collection
db.test.user.getPlanCache() get query plan cache associated with collection
db.test.user.getIndexes()
db.test.user.group( { key : ..., initial: ..., reduce : ...[, cond: ...] } )
db.test.user.insert(obj)
db.test.user.mapReduce( mapFunction , reduceFunction , <optional params> )
db.test.user.aggregate( [pipeline], <optional params> ) - performs an aggregation on a collection; returns a cursor
db.test.user.remove(query)
db.test.user.renameCollection( newName , <dropTarget> ) renames the collection.
db.test.user.runCommand( name , <options> ) runs a db command with the given name where the first param is the collection name
db.test.user.save(obj)

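    Backup and restore commands:

        Back up a whole database

            mongodump -d dbname -o d:\mongodb\backup    back up the database dbname to d:\mongodb\backup (a dbname folder is created under that path)

            mongorestore -d newdbname d:\mongodb\backup\dbname    restore from the folder d:\mongodb\backup\dbname into the database newdbname

        Back up a single collection of a database

            mongodump -d dbname -c collection -o d:\mongodb\backup    back up the collection named collection of dbname to d:\mongodb\backup

            mongorestore -d newdb -c new_collection d:\mongodb\backup\dbname\collection.bson    restore a single collection from its .bson backup file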

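 1. Backing up data with mongodump

        mongodump -h IP --port PORT -u USERNAME -p PASSWORD -d DATABASE -c COLLECTION -o OUTPUT_DIR

        Options:
        -h      host (IP address) of the database server
        --port  port of the database server
        -u      user name
        -p      password
        -d      database name
        -c      collection name
        -o      output directory for the dump
        -q      query filter that selects which documents to dump (used together with -c)

        Dump a specific database

        mongodump -d SERVERLOG -o /data/mongobak/SERVERLOG.bak/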
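The option list above maps directly onto an argument list, which is convenient when a backup is scripted. The following is only a minimal sketch: build_mongodump_args is a hypothetical helper name, it covers just the options listed above, and it builds the command without running it.

```python
import shlex

def build_mongodump_args(db, out_dir, host=None, port=None,
                         username=None, password=None, collection=None):
    """Return the mongodump argument list for the options described above."""
    args = ["mongodump"]
    if host:
        args += ["-h", host]
    if port:
        args += ["--port", str(port)]
    if username:
        args += ["-u", username]
    if password:
        args += ["-p", password]
    args += ["-d", db]
    if collection:
        args += ["-c", collection]
    args += ["-o", out_dir]
    return args

# Same command as the SERVERLOG example above.
dump_args = build_mongodump_args("SERVERLOG", "/data/mongobak/SERVERLOG.bak/")
print(" ".join(shlex.quote(a) for a in dump_args))
# To actually run it: subprocess.run(dump_args, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with passwords or paths that contain spaces.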



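  2. Restoring a database with mongorestore

        Common command format

        mongorestore -h IP --port PORT -u USERNAME -p PASSWORD -d DATABASE --drop BACKUP_PATH

        --drop: drop each collection from the target database before restoring it from the backup.

        Restore all databases into MongoDB

        mongorestore /data/mongobak/    # path that holds the dumps of all databases

      Export and import commands

        mongoexport -d dbname -c collection -o d:\mongodb\backup\collection.json    export (-o takes a file name, not a folder)

        mongoimport -d dbname -c collection d:\mongodb\backup\collection.json    import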

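3. Exporting with mongoexport (a collection, or selected fields of a collection)

        Common command format

        mongoexport -h IP --port PORT -u USERNAME -p PASSWORD -d DATABASE -c COLLECTION -f FIELDS -q QUERY --csv -o FILE

        Key options:
        -f     fields to export, comma-separated; -f uid,name,age exports the uid, name and age fields
        -q     export only the documents that match a query; -q '{ "uid" : "100" }' exports the documents whose uid is "100"
        --csv  write CSV instead of JSON. This is handy because most relational databases can read CSV. Newer versions of the tool (3.0 and later) use --type=csv instead.



        Export a whole collection

        mongoexport -h dbhost -d dbname -c collectionname -f collectionKey -o dbdirectory
        -h: address of the MongoDB server
        -d: database to export from
        -c: collection to export
        -f: fields to export (for JSON output, all fields if omitted)
        -o: name of the output file
Export selected fields of a collection (IR_SITENAME, DATE, IR_AUTHORS)

        mongoexport -h 127.0.0.1:27017 -d OTT_DB -c trsdata1 --type=csv -f IR_SITENAME,DATE,IR_AUTHORS -o E:\data\dump\trsdata.csv

        mongoexport --db OTT_DB --collection trsdata1 --type=csv -f IR_SITENAME,DATE,IR_AUTHORS --out E:\data\dump\trsdata2.csv


Export documents that match a condition

        mongoexport -d SERVERLOG -c users -q '{"uid":{"$gt":1}}' -o /data/mongobak/SERVERLOG.bak/users.json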
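Hand-writing the -q argument is error-prone because the query has to be valid JSON inside shell quotes. One way around this, sketched here with the hypothetical helper build_mongoexport_args, is to keep the query as a Python dict and let json.dumps produce the string. The helper only builds the argument list; it does not run mongoexport.

```python
import json

def build_mongoexport_args(db, collection, out_file,
                           query=None, fields=None, csv=False):
    """Return a mongoexport argument list; the query dict is serialised as JSON."""
    args = ["mongoexport", "-d", db, "-c", collection]
    if fields:
        args += ["-f", ",".join(fields)]
    if query is not None:
        args += ["-q", json.dumps(query)]
    if csv:
        args.append("--type=csv")
    args += ["-o", out_file]
    return args

# Same export as the conditional example above: users with uid greater than 1.
export_args = build_mongoexport_args(
    "SERVERLOG", "users", "/data/mongobak/SERVERLOG.bak/users.json",
    query={"uid": {"$gt": 1}})
print(export_args)
```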

 

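  4. Importing with mongoimport (a collection, or selected fields of a collection)

        Common command format

        Restore a whole-collection export that is not CSV
        mongoimport -h IP --port PORT -u USERNAME -p PASSWORD -d DATABASE -c COLLECTION --upsert --drop FILE

        --upsert: insert new documents or update existing ones (newer versions of the tool spell this --mode=upsert)

        Restore an export that contains only some fields
        mongoimport -h IP --port PORT -u USERNAME -p PASSWORD -d DATABASE -c COLLECTION --upsertFields FIELDS --drop FILE

        --upsertFields: comma-separated fields used to find the existing document to update; they should be indexed.

        Restore an exported CSV file
        mongoimport -h IP --port PORT -u USERNAME -p PASSWORD -d DATABASE -c COLLECTION --type TYPE --headerline --upsert --drop FILE

        --type: type of the file to import (default json)
        --headerline: use the first line of the file as the field names

        For example, import the trsdata2.csv file exported above into the collection trsdata2

        mongoimport -h 127.0.0.1 --port 27017  -d OTT_DB -c trsdata2 --type csv --headerline --upsert --drop E:\data\dump\trsdata2.csv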

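    2.1 Create, read, update, delete (CRUD)

        Documentation: https://docs.mongodb.com/manual/crud/

 Common query operators

$eq   equal to
$ne   not equal to
$gt   greater than
$gte  greater than or equal to
$lt   less than
$lte  less than or equal to
$in   matches any value in the given list
$nin  matches none of the values in the given list

# Operators start with $
$set       # set (update) a field
$unset     # remove a field
$inc       # increment: {$inc: {money: 10}} | decrement: {$inc: {money: -10}}
$exists    # whether a field exists
$and
$or
$push      # append an element to the end of an array; the field is created if it does not exist
$addToSet  # add an element to an array only if it is not already present
$pop       # remove the first or last element of an array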
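To make the effect of the update operators concrete, here is a small pure-Python emulation that applies them to a dict. It is an illustration under simplifying assumptions (top-level fields only, hypothetical function name apply_update), not how MongoDB implements them.

```python
import copy

def apply_update(doc, update):
    """Return a copy of doc with a small subset of update operators applied."""
    result = copy.deepcopy(doc)
    for field, value in update.get("$set", {}).items():
        result[field] = value                      # set or overwrite the field
    for field in update.get("$unset", {}):
        result.pop(field, None)                    # remove the field if present
    for field, amount in update.get("$inc", {}).items():
        result[field] = result.get(field, 0) + amount
    for field, value in update.get("$push", {}).items():
        result.setdefault(field, []).append(value)  # creates the array if missing
    for field, value in update.get("$addToSet", {}).items():
        items = result.setdefault(field, [])
        if value not in items:                     # only add values not yet present
            items.append(value)
    for field, direction in update.get("$pop", {}).items():
        items = result.get(field, [])
        if items:
            items.pop(-1 if direction == 1 else 0)  # 1 removes the last, -1 the first
    return result

user = {"name": "zack", "money": 100, "tags": ["a", "b"]}
print(apply_update(user, {"$inc": {"money": -10}}))
print(apply_update(user, {"$addToSet": {"tags": "a"}, "$push": {"langs": "python"}}))
print(apply_update(user, {"$pop": {"tags": -1}, "$unset": {"money": ""}}))
```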

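    Insert: insert(), insertOne(), insertMany()

        db.user.insert({"name":"xiaoming","age":24})

    Delete: remove(), deleteOne(), deleteMany()

        db.user.remove({"name":"zack0"})

        db.user.remove({age:{$gt:99}})    delete everyone older than 99

    Update: update(), updateOne(), updateMany()

        db.user.update({"name":"zack1"},{"name":"jack"})    updates only the first document matching name=zack1; without $set the whole document is replaced by {"name":"jack"} (only _id is kept)

        db.user.update({"name":"jack"},{$set:{"name":"jack1"}},{multi:true})    updates every document with name=jack, setting name to jack1

        (the first argument, e.g. {"name":"zack1"}, is the query; the second, e.g. {"name":"jack"}, is the update)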

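    Query: find

db.user.find()            find all documents
db.user.findOne()         find one document
db.user.find().pretty()   pretty-print the results
db.user.find().count()    count the matching documents
db.user.find().limit(10)  return at most 10 documents
db.users.find().skip(3).limit(5)    same as LIMIT 3,5 in MySQL
db.user.find().sort({"id":-1})  sort by id, descending
db.user.find().sort({"id":1})  sort by id, ascending


db.user.find({"age": {$lt: 20}})       age less than 20
db.users.find({"age": {$gt: 25, $lt:30 }})  age greater than 25 and less than 30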

# and / or
db.users.find({"age": {$lt: 30}, "_id": {$lt: 3 }})  age less than 30 and _id less than 3
db.users.find({$or: [{"age": {$gt: 30}}, {"username": "tom"}]})  age greater than 30, or username is tom
db.users.find({ $or: [{"username": "mengday"}, {"age": {$lt: 20}}], "_id": {$lt: 4}})  same as (_id less than 4) and (username = mengday or age less than 20)
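How top-level keys combine with $or is easy to misread: all top-level conditions are ANDed together, and $or is just one of those conditions. The pure-Python sketch below emulates that rule for scalar top-level fields. The function matches and the sample users are made up for illustration, and real MongoDB matching covers far more (arrays, nested fields, type ordering).

```python
import operator

COMPARISONS = {
    "$eq": operator.eq, "$ne": operator.ne,
    "$gt": operator.gt, "$gte": operator.ge,
    "$lt": operator.lt, "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}

def matches(doc, query):
    """Return True if doc satisfies every top-level condition in query."""
    for key, condition in query.items():
        if key == "$or":
            if not any(matches(doc, sub) for sub in condition):
                return False
        elif key == "$and":
            if not all(matches(doc, sub) for sub in condition):
                return False
        elif isinstance(condition, dict):
            for op, operand in condition.items():
                if key not in doc:
                    # A missing field only satisfies the negative operators.
                    if op not in ("$ne", "$nin"):
                        return False
                elif not COMPARISONS[op](doc[key], operand):
                    return False
        elif doc.get(key) != condition:
            return False
    return True

users = [  # hypothetical sample data
    {"_id": 1, "username": "mengday", "age": 35},
    {"_id": 2, "username": "tom", "age": 18},
    {"_id": 3, "username": "xiaoming", "age": 40},
    {"_id": 5, "username": "mengday", "age": 19},
]
query = {"$or": [{"username": "mengday"}, {"age": {"$lt": 20}}], "_id": {"$lt": 4}}
matched_ids = [u["_id"] for u in users if matches(u, query)]
print(matched_ids)  # [1, 2]
```

The document with _id 5 satisfies the $or part but is still rejected, because the _id condition is ANDed with it.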

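db.users.find({"age": {$in: [18, 28]}})               # age is 18 or 28
db.food.find({"fruit": {$all: ["apple", "cherry"]}})  # the fruit array contains both "apple" and "cherry"

# Regular expressions
db.users.find({"username": /^xiao.*ng$/})  # starts with xiao and ends with ng (repeating the username key in one query object would keep only the last condition)
db.users.find({"username": {$regex:/sunday/, $options:"i"}})    # case-insensitive
db.users.find({name:/^B.*/});  name starts with B
db.users.find({name: {$not: /^B.*/}});  name does not start with B


# Slicing arrays
 db.food.find({}, {"fruit": {$slice: 2}})  first two elements
 db.food.find({}, {"fruit": {$slice: -2}})  last two elements
 db.food.find({}, {"fruit": {$slice: [1, 3]}})  skip one element, then return the next three (indexes 1 to 3)
 
 db.user.find({age:{$exists:true}});  # documents that have an age field
 db.user.find({age:null})               # documents whose age is null or that have no age field
 db.c2.find({age:{$exists:true,$eq:null}})  # documents that have an age field whose value is null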

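    Distinct values:

        db.user.distinct("age")    returns a list of all age values with duplicates removed

    Aggregation:

    Filter first, then group:

      db.user.aggregate([{$match:{"age":25}},{$group:{_id:"$name",total:{$sum:"$age"}}}])    keep documents with age 25, group them by name, and sum age within each group (output field total)

    Group first, then filter:

    db.user.aggregate([{$group:{_id:"$name",total:{$sum:"$age"}}},{$match:{"_id":"zack9766"}}])    group first, then filter; after $group the name is stored in _id, so $match has to test _id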
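The order of the stages matters because each stage only sees the output of the previous one. The sketch below emulates the two pipelines in plain Python to show the difference. The helpers match_stage and group_sum_stage and the sample data are hypothetical; they cover only equality matching and $sum.

```python
from collections import defaultdict

def match_stage(docs, condition):
    """Keep documents whose fields equal every value in condition."""
    return [d for d in docs if all(d.get(k) == v for k, v in condition.items())]

def group_sum_stage(docs, key_field, sum_field):
    """Group by key_field and sum sum_field; output has only _id and total."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key_field]] += d[sum_field]
    return [{"_id": key, "total": total} for key, total in totals.items()]

users = [  # hypothetical sample data
    {"name": "zack", "age": 25},
    {"name": "zack", "age": 25},
    {"name": "tom", "age": 25},
    {"name": "tom", "age": 30},
]

# Filter first, then group: only the age-25 documents are summed.
filtered_then_grouped = group_sum_stage(match_stage(users, {"age": 25}), "name", "age")
print(filtered_then_grouped)

# Group first, then filter: every document is summed, then one group is kept.
grouped_then_filtered = match_stage(group_sum_stage(users, "name", "age"), {"_id": "tom"})
print(grouped_then_filtered)

# Filtering on name after grouping finds nothing, since the groups have no name field.
print(match_stage(group_sum_stage(users, "name", "age"), {"name": "tom"}))
```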

3. Using MongoDB from Python

    3.1 Installing pymongo

        Python connects to and uses MongoDB through the pymongo package. Install it with: pip install pymongo

        Install a specific version: pip install pymongo==3.5.1

        Upgrade: pip install --upgrade pymongo

    3.2 Using pymongo

        Official documentation: http://api.mongodb.com/python/current/

        Two ways to connect to MongoDB (after import pymongo, or from pymongo import MongoClient):

　　　　　　　　client = pymongo.MongoClient(host="127.0.0.1",port=27017)

　　　　　　　　client = MongoClient('mongodb://127.0.0.1:27017/')
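When the server requires a user name and password, as with the -u and -p options of the command-line tools, the credentials go into the connection URI and any special characters in them have to be percent-encoded. The following is a sketch only: build_mongo_uri is a hypothetical helper and the credentials are made up. The resulting string is what would be passed to MongoClient.

```python
from urllib.parse import quote_plus

def build_mongo_uri(host="127.0.0.1", port=27017, username=None, password=None):
    """Build a mongodb:// URI; credentials are percent-encoded when given."""
    if username is None:
        return "mongodb://{}:{}/".format(host, port)
    return "mongodb://{}:{}@{}:{}/".format(
        quote_plus(username), quote_plus(password or ""), host, port)

print(build_mongo_uri())
print(build_mongo_uri(username="admin", password="p@ss/word"))
# Usage with pymongo (assumes a server is running): MongoClient(build_mongo_uri())
```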

        Two ways to select a database (here the test database):

　　　　　　　　db = client.test

　　　　　　　　db = client["test"]

        Two ways to select a collection (here the user collection):

　　　　　　　　collection = db.user

　　　　　　　　collection = db["user"]

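        Inserting data:

            user1 = {"id":1,"name":"zack","age":27}

            result = collection.insert(user1)    inserts one document into the user collection; MongoDB automatically adds an _id field of type ObjectId as the unique identifier, and insert returns that _id

            result = collection.insert([user1,user2])    inserts two documents and returns a list of their ObjectIds

            (In PyMongo 3.x insert() is deprecated; the recommended methods are insert_one() and insert_many() for one document and for several. insert() was removed in PyMongo 4.)

        Querying data:

            result = collection.find_one()    returns one document as a dict (None if nothing matches)

            results = collection.find()    returns a Cursor; iterating over it yields one dict per document

results = collection.find({'age': 20})              # age equals 20
results = collection.find({'age': {'$gt': 20}})        # age greater than 20
results = collection.find({'name': {'$regex': '^Za.*'}})  # every student whose name starts with Za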

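# Common filter operators: https://docs.mongodb.com/manual/reference/operator/query/
$lt      less than                     {'age': {'$lt': 20}}
$gt      greater than                  {'age': {'$gt': 20}}
$lte     less than or equal to         {'age': {'$lte': 20}}
$gte     greater than or equal to      {'age': {'$gte': 20}}
$ne      not equal to                  {'age': {'$ne': 20}}
$in      in the given list             {'age': {'$in': [20, 23]}}
$nin     not in the given list         {'age': {'$nin': [20, 23]}}

$regex   matches a regular expression  {'name': {'$regex': '^M.*'}}                        name starts with M
$exists  whether the field exists      {'name': {'$exists': True}}                         the name field exists
$type    type check                    {'age': {'$type': 'int'}}                           age is of type int
$mod     modulo                        {'age': {'$mod': [5, 0]}}                           age modulo 5 equals 0
$text    text search                   {'$text': {'$search': 'Mike'}}                      text-indexed fields contain the string Mike
$where   JavaScript condition          {'$where': 'obj.fans_count == obj.follows_count'}   follower count equals following count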

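        Counting:

            results= collection.find({"age":{"$gt": 20}}).count()    number of students older than 20 (Cursor.count() is deprecated since PyMongo 3.7; collection.count_documents({"age":{"$gt": 20}}) is the replacement)

        Sorting:

            results= collection.find({"age":{"$gt": 20}}).sort("name", pymongo.ASCENDING)    ascending

            results= collection.find({"age":{"$gt": 20}}).sort("name", pymongo.DESCENDING)    descending

        Skip and limit:

          results= collection.find({"age":{"$gt": 20}}).sort("name", pymongo.ASCENDING).skip(2).limit(3)    # skip two documents and take three, i.e. return the 3rd, 4th and 5th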
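skip and limit are most often used for paging, where the skip value is derived from a page number. As a rough sketch (page_window is a hypothetical helper, and the name list stands in for an already sorted query result), skip(n).limit(m) behaves like the list slice [n:n+m]:

```python
def page_window(page, page_size):
    """Return the (skip, limit) pair for a 1-based page number."""
    if not (page >= 1 and page_size >= 1):
        raise ValueError("page and page_size must be positive")
    return (page - 1) * page_size, page_size

names = ["amy", "bob", "cat", "dan", "eve", "fay"]  # stand-in for sorted results

skip, limit = 2, 3                      # same values as skip(2).limit(3) above
print(names[skip:skip + limit])         # ['cat', 'dan', 'eve']

skip, limit = page_window(2, 3)         # second page with three items per page
print(skip, limit)                      # 3 3
print(names[skip:skip + limit])         # ['dan', 'eve', 'fay']
```

With pymongo the pair would be used as collection.find(...).sort(...).skip(skip).limit(limit).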

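        Updating data:

          result = collection.update({"name":"zack"},{"$set": {"age": 37}})    select the documents named zack and update the age; only one document is updated (update() is deprecated in PyMongo 3.x)

          result = collection.update_one({"name":"zack"},{"$set": {"age": 37}})    updates only one document

          result = collection.update_many({"name":"zack"},{"$set": {"age": 37}})    updates every matching document

        Deleting:

            result = collection.remove({"name":"zack2"})    select the documents named zack2 and delete all of them (remove() is deprecated in PyMongo 3.x)

            result = collection.delete_one({"name":"zack2"})    deletes one document

            result = collection.delete_many({"name":"zack2"})    deletes every matching document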

Reference blog posts:

    Using MongoDB: https://www.jianshu.com/p/4ecde929b17d

    Using pymongo: https://juejin.im/post/5addbd0e518825671f2f62ee