MySQL补充——索引，流程控制，数据备份，python操作mysql，SQLAlchemy

一、索引

　　索引，是数据库中专门用于帮助用户快速查询数据的一种数据结构。类似于字典中的目录，查找字典内容时可以根据目录查找到数据的存放位置，然后直接获取即可。

1、索引种类

普通索引：仅加速查询
唯一索引：加速查询 + 列值唯一（可以有null）
主键索引：加速查询 + 列值唯一 +　表中只有一个（不可以有null）
组合索引：多列值组成一个索引，专门用于组合搜索，其效率大于索引合并
全文索引：对文本的内容进行分词，进行搜索

索引合并，使用多个单列索引组合搜索，组合索引效率 > 索引合并
覆盖索引，select的数据列只用从索引中就能够取得，不必读取数据行，换句话说查询列要被所建的索引覆盖

普通索引

创建表+索引

create table in1(
    nid int not null auto_increment primary key,
    name varchar(32) not null,
    email varchar(64) not null,
    extra text,
    index ix_name (name)
)

创建表之后，另外创建普通索引

create index index_name on table_name(column_name)

删除普通索引

drop index_name on table_name;

查看索引

show index from table_name;

注意：对于创建索引时如果是BLOB 和 TEXT 类型，必须指定length。

create index ix_extra on in1(extra(32));

扫描二维码关注公众号，回复： 5039812 查看本文章

唯一索引

创建表+唯一索引

create table in1(
    nid int not null auto_increment primary key,
    name varchar(32) not null,
    email varchar(64) not null,
    extra text,
    unique ix_name (name)
)

创建表之后，另外创建唯一索引

create unique index 索引名 on 表名(列名)

删除唯一索引

drop unique index 索引名 on 表名

主键索引

创建表+创建主键

create table in1(
    nid int not null auto_increment primary key,
    name varchar(32) not null,
    email varchar(64) not null,
    extra text
)

OR

create table in1(
    nid int not null auto_increment,
    name varchar(32) not null,
    email varchar(64) not null,
    extra text,
    primary key(ni1)
)

创建完表之后，创建主键

alter table 表名 add primary key(列名);

删除主键

alter table 表名 drop primary key;
alter table 表名  modify  列名 int, drop primary key;

组合索引

组合索引是将n个列组合成一个索引

其应用场景为：频繁的同时使用n列来进行查询，如：where n1 = 'alex' and n2 = 666。

-- 创建表

create table in3(
    nid int not null auto_increment primary key,
    name varchar(32) not null,
    email varchar(64) not null,
    extra text
)

-- 创建组合索引

create index ix_name_email on in3(name,email);

如上创建组合索引之后，查询：（最左匹配原则）

name and email -- 使用索引
name -- 使用索引
email -- 不使用索引

注意：对于同时搜索n个条件时，组合索引的性能好于多个单一索引合并。

2、正确使用索引及注意事项

以下情况不会走索引：

- like '%xx'
    select * from tb1 where name like '%cn';
- 使用函数
    select * from tb1 where reverse(name) = 'wupeiqi';
- or
    select * from tb1 where nid = 1 or email = '[email protected]';
    特别的：当or条件中有未建立索引的列才失效，以下会走索引
            select * from tb1 where nid = 1 or name = 'seven';
            select * from tb1 where nid = 1 or email = '[email protected]' and name = 'alex'
- 类型不一致
    如果列是字符串类型，传入条件是必须用引号引起来，不然...
    select * from tb1 where name = 999;
- !=
    select * from tb1 where name != 'alex'
    特别的：如果是主键，则还是会走索引
        select * from tb1 where nid != 123
- >
    select * from tb1 where name > 'alex'
    特别的：如果是主键或索引是整数类型，则还是会走索引
        select * from tb1 where nid > 123
        select * from tb1 where num > 123
- order by
    select email from tb1 order by name desc;
    当根据索引排序时候，选择的映射如果不是索引，则不走索引
    特别的：如果对主键排序，则还是走索引：
        select * from tb1 order by nid desc;
 
- 组合索引最左前缀
    如果组合索引为：(name,email)
    name and email       -- 使用索引
    name                 -- 使用索引
    email                -- 不使用索引

注意事项：

-- 避免使用select *
-- count(1)或count(列) 代替 count(*)
-- 创建表时尽量时 char 代替 varchar
-- 表的字段顺序固定长度的字段优先
-- 组合索引代替多个单列索引（经常使用多个条件查询时）
-- 尽量使用短索引
-- 使用连接（JOIN）来代替子查询(Sub-Queries)
-- 连表时注意条件类型需一致
-- 索引散列值（重复少）不适合建索引，例：性别不适合

3、limit分页？

无论是否有索引，limit分页是一个值得关注的问题

select 
        * 
    from 
        tb1 
    where 
        nid < (select nid from (select nid from tb1 where nid < 当前页最小值 order by nid desc limit 每页数据 *【页码-当前页】) A order by A.nid asc limit 1)  
    order by 
        nid desc 
    limit 10;



    select 
        * 
    from 
        tb1 
    where 
        nid < (select nid from (select nid from tb1 where nid < 970  order by nid desc limit 40) A order by A.nid asc limit 1)  
    order by 
        nid desc 
    limit 10;


上一页：

    select 
        * 
    from 
        tb1 
    where 
        nid < (select nid from (select nid from tb1 where nid > 当前页最大值 order by nid asc limit 每页数据 *【当前页-页码】) A order by A.nid asc limit 1)  
    order by 
        nid desc 
    limit 10;


    select 
        * 
    from 
        tb1 
    where 
        nid < (select nid from (select nid from tb1 where nid > 980 order by nid asc limit 20) A order by A.nid desc limit 1)  
    order by 
        nid desc 
    limit 10;

4、执行计划

explain + 查询SQL ------ 用于显示SQL执行信息参数，根据参考信息可以进行SQL优化

mysql> explain select * from tb2;
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
|  1 | SIMPLE      | tb2   | ALL  | NULL          | NULL | NULL    | NULL |    2 | NULL  |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)

详细：


    id 查询顺序标识
            如：mysql> explain select * from (select nid,name from tb1 where nid < 10) as B;
            +----+-------------+------------+-------+---------------+---------+---------+------+------+-------------+
            | id | select_type | table      | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
            +----+-------------+------------+-------+---------------+---------+---------+------+------+-------------+
            |  1 | PRIMARY     | <derived2> | ALL   | NULL          | NULL    | NULL    | NULL |    9 | NULL        |
            |  2 | DERIVED     | tb1        | range | PRIMARY       | PRIMARY | 8       | NULL |    9 | Using where |
            +----+-------------+------------+-------+---------------+---------+---------+------+------+-------------+
            特别的：如果使用union连接其值可能为null


    select_type
        查询类型
            SIMPLE          简单查询
            PRIMARY         最外层查询
            SUBQUERY        映射为子查询
            DERIVED         子查询
            UNION           联合
            UNION RESULT    使用联合的结果
            ...
    table
        正在访问的表名


    type
        查询时的访问方式，性能：all < index < range < index_merge < ref_or_null < ref < eq_ref < system/const
            ALL             全表扫描，对于数据表从头到尾找一遍
                            select * from tb1;
                            特别的：如果有limit限制，则找到之后就不在继续向下扫描
                                   select * from tb1 where email = '[email protected]'
                                   select * from tb1 where email = '[email protected]' limit 1;
                                   虽然上述两个语句都会进行全表扫描，第二句使用了limit，则找到一个后就不再继续扫描。

            INDEX           全索引扫描，对索引从头到尾找一遍
                            select nid from tb1;

            RANGE          对索引列进行范围查找
                            select *  from tb1 where name < 'alex';
                            PS:
                                between and
                                in
                                >   >=  <   <=  操作
                                注意：!= 和 > 符号


            INDEX_MERGE     合并索引，使用多个单列索引搜索
                            select *  from tb1 where name = 'alex' or nid in (11,22,33);

            REF             根据索引查找一个或多个值
                            select *  from tb1 where name = 'seven';

            EQ_REF          连接时使用primary key 或 unique类型
                            select tb2.nid,tb1.name from tb2 left join tb1 on tb2.nid = tb1.nid;



            CONST           常量
                            表最多有一个匹配行,因为仅有一行,在这行的列值可被优化器剩余部分认为是常数,const表很快,因为它们只读取一次。
                            select nid from tb1 where nid = 2 ;

            SYSTEM          系统
                            表仅有一行(=系统表)。这是const联接类型的一个特例。
                            select * from (select nid from tb1 where nid = 1) as A;
    possible_keys
        可能使用的索引

    key
        真实使用的

    key_len
        MySQL中使用索引字节长度

    rows
        mysql估计为了找到所需的行而要读取的行数 ------ 只是预估值

    extra
        该列包含MySQL解决查询的详细信息
        “Using index”
            此值表示mysql将使用覆盖索引，以避免访问表。不要把覆盖索引和index访问类型弄混了。
        “Using where”
            这意味着mysql服务器将在存储引擎检索行后再进行过滤，许多where条件里涉及索引中的列，当（并且如果）它读取索引时，就能被存储引擎检验，
　　　　　　　因此不是所有带where子句的查询都会显示“Using where”。有时“Using where”的出现就是一个暗示：查询可受益于不同的索引。
        “Using temporary”
            这意味着mysql在对查询结果排序时会使用一个临时表。
        “Using filesort”
            这意味着mysql会对结果使用一个外部索引排序，而不是按索引次序从表里读取行。mysql有两种文件排序算法，
            这两种排序方式都可以在内存或者磁盘上完成，explain不会告诉你mysql将使用哪一种文件排序，也不会告诉你排序会在内存里还是磁盘上完成。
        “Range checked for each record(index map: N)”
            这个意味着没有好用的索引，新的索引将在联接的每一行上重新估算，N是显示在possible_keys列中索引的位图，并且是冗余的。

慢查询优化的基本步骤：

0.先运行看看是否真的很慢，注意设置SQL_NO_CACHE

1.where条件单表查，锁定最小返回记录表。这句话的意思是把查询语句的where都应用到表中返回的记录数最小的表开始查起，单表每个字段分别查询，看哪个字段的区分度最高

2.explain查看执行计划，是否与1预期一致（从锁定记录较少的表开始查询）

3.order by limit 形式的sql语句让排序的表优先查

4.了解业务方使用场景

5.加索引时参照建索引的几大原则

6.观察结果，不符合预期继续从0分析

5、慢日志查询

a、配置MySQL自动记录慢日志

slow_query_log = OFF                           是否开启慢日志记录
long_query_time = 2                            时间限制，超过此时间，则记录
slow_query_log_file = /usr/slow.log            日志文件
log_queries_not_using_indexes = OFF     　　　　为使用索引的搜索是否记录

查看当前配置信息：show variables like '%query%'
修改当前配置：set global 变量名 = 值

b、查看MySQL慢日志

mysqldumpslow -s at -a  /usr/local/var/mysql/MacBook-Pro-3-slow.log

"""
--verbose    版本
--debug      调试
--help       帮助
 
-v           版本
-d           调试模式
-s ORDER     排序方式
             what to sort by (al, at, ar, c, l, r, t), 'at' is default
              al: average lock time
              ar: average rows sent
              at: average query time
               c: count
               l: lock time
               r: rows sent
               t: query time
-r           反转顺序，默认文件倒序拍。reverse the sort order (largest last instead of first)
-t NUM       显示前N条just show the top n queries
-a           不要将SQL中数字转换成N，字符串转换成S。don't abstract all numbers to N and strings to 'S'
-n NUM       abstract numbers with at least n digits within names
-g PATTERN   正则匹配；grep: only consider stmts that include this string
-h HOSTNAME  mysql机器名或者IP；hostname of db server for *-slow.log filename (can be wildcard),
             default is '*', i.e. match all
-i NAME      name of server instance (if using mysql.server startup script)
-l           总时间中不减去锁定时间；don't subtract lock time from total time
"""

二、流程控制

1、IF条件语句

delimiter \\
CREATE PROCEDURE proc_if ()
BEGIN
    
    declare i int default 0;
    if i = 1 THEN
        SELECT 1;
    ELSEIF i = 2 THEN
        SELECT 2;
    ELSE
        SELECT 7;
    END IF;

END\\
delimiter ;

2、循环语句

while循环

delimiter \\
CREATE PROCEDURE proc_while ()
BEGIN

    DECLARE num INT ;
    SET num = 0 ;
    WHILE num < 10 DO
        SELECT
            num ;
        SET num = num + 1 ;
    END WHILE ;

END\\
delimiter ;

repeat循环

delimiter \\
CREATE PROCEDURE proc_repeat ()
BEGIN

    DECLARE i INT ;
    SET i = 0 ;
    repeat
        select i;
        set i = i + 1;
        until i >= 5
    end repeat;

END\\
delimiter ;

loop循环？

BEGIN
    
    declare i int default 0;
    loop_label: loop
        
        set i=i+1;
        if i<8 then
            iterate loop_label;
        end if;
        if i>=10 then
            leave loop_label;
        end if;
        select i;
    end loop loop_label;

END

三、数据备份

1、三种备份方式

物理备份：直接复制数据库文件，适用于大型数据库环境，但不能恢复到异构环境中，如windows。
逻辑备份：备份建库，建表，插入数据等操作所执行的SQL语句，适用于中小型数据库，效率相对较低。
导出表：直接将表导入到文本文件中。

2、用mysqldump实现逻辑备份

#语法：
# mysqldump -h 服务器 -u用户名 -p密码 数据库名 > 备份文件.sql

#示例：
#单库备份
mysqldump -uroot -p123 db1 > db1.sql
mysqldump -uroot -p123 db1 table1 table2 > db1-table1-table2.sql

#多库备份
mysqldump -uroot -p123 --databases db1 db2 mysql db3 > db1_db2_mysql_db3.sql

#备份所有库
mysqldump -uroot -p123 --all-databases > all.sql

恢复逻辑备份

#方法一：
[root@egon backup]# mysql -uroot -p123 < /backup/all.sql

#方法二：
mysql> use db1;
mysql> SET SQL_LOG_BIN=0;
mysql> source /root/db1.sql

#注：如果备份/恢复单个库时，可以修改sql文件
DROP database if exists school;
create database school;
use school;

3、实现自动化备份

备份计划：
1. 什么时间 2:00
2. 对哪些数据库备份
3. 备份文件放的位置

备份脚本：
[root@egon ~]# vim /mysql_back.sql
#!/bin/bash
back_dir=/backup
back_file=`date +%F`_all.sql
user=root
pass=123

if [ ! -d /backup ];then
mkdir -p /backup
fi

# 备份并截断日志
mysqldump -u${user} -p${pass} --events --all-databases > ${back_dir}/${back_file}
mysql -u${user} -p${pass} -e 'flush logs'

# 只保留最近一周的备份
cd $back_dir
find . -mtime +7 -exec rm -rf {} \;


手动测试：
[root@egon ~]# chmod a+x /mysql_back.sql 
[root@egon ~]# chattr +i /mysql_back.sql
[root@egon ~]# /mysql_back.sql

配置cron：
[root@egon ~]# crontab -l
2 * * * /mysql_back.sql

4、表的导入和导出

SELECT... INTO OUTFILE 导出文本文件
示例：
mysql> SELECT * FROM school.student1
INTO OUTFILE 'student1.txt'
FIELDS TERMINATED BY ',' //定义字段分隔符
OPTIONALLY ENCLOSED BY '”' //定义字符串使用什么符号括起来
LINES TERMINATED BY '\n' ; //定义换行符


mysql 命令导出文本文件
示例:
# mysql -u root -p123 -e 'select * from student1.school' > /tmp/student1.txt
# mysql -u root -p123 --xml -e 'select * from student1.school' > /tmp/student1.xml
# mysql -u root -p123 --html -e 'select * from student1.school' > /tmp/student1.html

LOAD DATA INFILE 导入文本文件
mysql> DELETE FROM student1;
mysql> LOAD DATA INFILE '/tmp/student1.txt'
INTO TABLE school.student1
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '”'
LINES TERMINATED BY '\n';

#可能会报错
mysql> select * from db1.emp into outfile 'C:\\db1.emp.txt' fields terminated by ',' lines terminated by '\r\n';
ERROR 1238 (HY000): Variable 'secure_file_priv' is a read only variable


#数据库最关键的是数据，一旦数据库权限泄露，那么通过上述语句就可以轻松将数据导出到文件中然后下载拿走，因而mysql对此作了限制，只能将文件导出到指定目录
在配置文件中
[mysqld]
secure_file_priv='C:\\' #只能将数据导出到C:\\下

重启mysql
重新执行上述语句

5、数据库迁移

务必保证在相同版本之间迁移
# mysqldump -h 源IP -uroot -p123 --databases db1 | mysql -h 目标IP -uroot -p456

四、PyMySQL模块

1、连接、执行sql、关闭（游标）

import pymysql

user=input('用户名: ').strip()
pwd=input('密码: ').strip()

#链接
conn=pymysql.connect(host='localhost',user='root',password='123',database='egon',charset='utf8')
#游标
cursor=conn.cursor() #执行完毕返回的结果集默认以元组显示
#cursor=conn.cursor(cursor=pymysql.cursors.DictCursor)


#执行sql语句
sql='select * from userinfo where name="%s" and password="%s"' %(user,pwd) #注意%s需要加引号
print(sql)
res=cursor.execute(sql) #执行sql语句，返回sql查询成功的记录数目
print(res)

cursor.close()
conn.close()

if res:
    print('登录成功')
else:
    print('登录失败')

2、execute()之sql注入

注意：符号 -- 会注释掉它之后的sql

根本原理：就根据程序的字符串拼接name='%s'，我们输入一个xxx' -- haha,用我们输入的xxx加'在程序中拼接成一个判断条件name='xxx' -- haha'

在一条sql语句中如果遇到select * from t1 where id > 3 -- and name='egon';则--之后的条件被注释掉了

#1、sql注入之用户存在，绕过密码
egon” -- 任意字符

#2、sql注入之用户不存在，绕过用户与密码
xxx” or 1=1 -- 任意字符

解决方法：

原来是我们对sql进行字符串拼接
　　sql="select * from userinfo where name='%s' and password='%s'" %(user,pwd)
　　print(sql)
　　res=cursor.execute(sql)

改写为（execute帮我们做字符串拼接，我们无需且一定不能再为%s加引号了）
　　sql="select * from userinfo where name=%s and password=%s" 　　　 #！！！注意%s需要去掉引号，因为pymysql会自动为我们加上
　　res=cursor.execute(sql,[user,pwd]) 　　　　　　　　　　　　　　　　 #pymysql模块自动帮我们解决sql注入的问题，只要我们按照pymysql的规矩来。

3、增、删、改：conn.commit()

import pymysql
#链接
conn=pymysql.connect(host='localhost',user='root',password='123',database='egon')
#游标
cursor=conn.cursor()

#执行sql语句
#part1
# sql='insert into userinfo(name,password) values("root","123456");'
# res=cursor.execute(sql) #执行sql语句，返回sql影响成功的行数
# print(res)

#part2
# sql='insert into userinfo(name,password) values(%s,%s);'
# res=cursor.execute(sql,("root","123456")) #执行sql语句，返回sql影响成功的行数
# print(res)

#part3
sql='insert into userinfo(name,password) values(%s,%s);'
res=cursor.executemany(sql,[("root","123456"),("lhf","12356"),("eee","156")]) #执行sql语句，返回sql影响成功的行数
print(res)

conn.commit() #提交后才发现表中插入记录成功
cursor.close()
conn.close()

4、查：fetchone，fetchmany，fetchall

import pymysql
#链接
conn=pymysql.connect(host='localhost',user='root',password='123',database='egon')
#游标
cursor=conn.cursor()

#执行sql语句
sql='select * from userinfo;'
rows=cursor.execute(sql) #执行sql语句，返回sql影响成功的行数rows,将结果放入一个集合，等待被查询

# cursor.scroll(3,mode='absolute') # 相对绝对位置移动
# cursor.scroll(3,mode='relative') # 相对当前位置移动
res1=cursor.fetchone()
res2=cursor.fetchone()
res3=cursor.fetchone()
res4=cursor.fetchmany(2)
res5=cursor.fetchall()
print(res1)
print(res2)
print(res3)
print(res4)
print(res5)
print('%s rows in set (0.00 sec)' %rows)



conn.commit() #提交后才发现表中插入记录成功
cursor.close()
conn.close()

'''
(1, 'root', '123456')
(2, 'root', '123456')
(3, 'root', '123456')
((4, 'root', '123456'), (5, 'root', '123456'))
((6, 'root', '123456'), (7, 'lhf', '12356'), (8, 'eee', '156'))
rows in set (0.00 sec)
'''

5、获取插入的最后一条数据的自增ID

import pymysql
conn=pymysql.connect(host='localhost',user='root',password='123',database='egon')
cursor=conn.cursor()

sql='insert into userinfo(name,password) values("xxx","123");'
rows=cursor.execute(sql)
print(cursor.lastrowid) #在插入语句后查看

conn.commit()

cursor.close()
conn.close()