Basic knowledge of MySQL

What are the three database normal forms?

Reprinted at: https://zhuanlan.zhihu.com/p/72197799

A well-designed relational database follows a set of design rules, and these rules are called normal forms. The normal forms are defined in increasing levels of strictness: the higher the normal form, the less redundancy the database contains. Pursuing a higher normal form to reduce redundancy can, however, lower the efficiency of reading and writing data. In that case it is necessary to denormalize, trading space for time. Relational databases currently have six normal forms: First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), Boyce-Codd Normal Form (BCNF), Fourth Normal Form (4NF) and Fifth Normal Form (5NF, also known as the perfect normal form). 1NF imposes the minimum requirements; 2NF adds further requirements on top of 1NF, and the remaining normal forms follow by analogy. Generally speaking, a database only needs to satisfy 3NF, so only the first three normal forms are covered here.

1. First normal form: ensure the atomicity of each column

In other words, no column can be split any further. For example, look at the following table:

[Image: table with a single address column]
The address column can be split into province, city, and detailed address. After splitting:
[Image: table with separate province, city, and address columns]
2. Second normal form: on top of the first normal form, every non-primary-key column must depend on the entire primary key, not on just part of it

For example, look at the following table:
[Image: example table]
The above table satisfies the first normal form, that is, no field can be subdivided further. However, the credits column depends only on the course column rather than on the whole primary key, so the second normal form is not met. This causes data redundancy as well as update, insert, and delete anomalies. After modification:
[Image: tables after modification]
3. Third normal form: on top of the second normal form, non-primary-key columns depend only on the primary key, not on other non-primary-key columns

The third normal form builds on the second. By definition, a table is in third normal form if no non-key field has a transitive functional dependency on a candidate key. A transitive functional dependency means that if there is a chain of determination "A -> B -> C", then C transitively depends on A. In other words, every field in the table must correspond directly to the primary key without going through an intermediate field. Put simply, the primary key alone determines the value of each field.

For example, look at the following table:

[Image: example table]
The third normal form is somewhat similar to the second. In this table structure, "name", "age", and "college" are directly related to the primary key "student ID", but "college location" and "college phone number" are not: they are directly related to "college" instead. A table designed this way suffers from the same data redundancy and update, insert, and delete anomalies as a violation of the second normal form. After the change:

[Image: tables after the change]
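
Since the original figures are not available, the 3NF split described above can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table names, column names, and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- College attributes depend on the college, not on the student,
-- so they live in their own table (hypothetical schema).
CREATE TABLE college (
    name     TEXT PRIMARY KEY,
    location TEXT,
    phone    TEXT
);
-- Every remaining column depends directly on student_id.
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT,
    age        INTEGER,
    college    TEXT REFERENCES college(name)
);
INSERT INTO college VALUES ('Engineering', 'North Campus', '555-0100');
INSERT INTO student VALUES (1, 'Ann', 20, 'Engineering');
INSERT INTO student VALUES (2, 'Bob', 21, 'Engineering');
""")

# The phone number is stored once, so a single UPDATE changes it
# for every student of that college (no update anomaly).
conn.execute("UPDATE college SET phone = '555-0199' WHERE name = 'Engineering'")

rows = conn.execute("""
    SELECT s.student_id, s.name, c.phone
    FROM student s JOIN college c ON s.college = c.name
    ORDER BY s.student_id
""").fetchall()
print(rows)
```

In the unnormalized design the phone number would be repeated on every student row, and the same change would have to touch all of them.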

What are the MySQL tables related to permissions?

The MySQL server controls user access to databases through permission tables. The permission tables are stored in the mysql database and initialized by the mysql_install_db script. These permission tables are user, db, tables_priv, columns_priv and host.

The following describes the structure and content of these tables:

  • user permission table: records the user accounts allowed to connect to the server; the permissions stored here are global.
  • db permission table: records the operation permissions of each account on each database.
  • tables_priv permission table: records table-level operation permissions.
  • columns_priv permission table: records column-level operation permissions.
  • host permission table: works together with the db permission table to give finer control over database-level permissions for a given host. This table is not affected by the GRANT and REVOKE statements. (The host table was removed as of MySQL 5.6, so it exists only in older versions.)

What data types does MySQL have?

Category       Type           Description
Integer        tinyint        Very small integer (8 bits)
               smallint       Small integer (16 bits)
               mediumint      Medium-sized integer (24 bits)
               int (integer)  Normal-sized integer (32 bits)
Decimal        float          Single-precision floating-point number
               double         Double-precision floating-point number
               decimal(m,d)   Exact fixed-point number, stored in a packed format
Date and time  year           YYYY; 1901 to 2155
               time           HH:MM:SS; -838:59:59 to 838:59:59
               date           YYYY-MM-DD; 1000-01-01 to 9999-12-31
               datetime       YYYY-MM-DD HH:MM:SS; 1000-01-01 00:00:00 to 9999-12-31 23:59:59
               timestamp      YYYY-MM-DD HH:MM:SS; 1970-01-01 00:00:01 UTC to 2038-01-19 03:14:07 UTC
Text, binary   CHAR(M)        M is an integer between 0 and 255
               VARCHAR(M)     M is an integer between 0 and 65535
               TINYBLOB       Allowed length 0 to 255 bytes
               BLOB           Allowed length 0 to 65535 bytes
               MEDIUMBLOB     Allowed length 0 to 16777215 bytes
               LONGBLOB       Allowed length 0 to 4294967295 bytes
               TINYTEXT       Allowed length 0 to 255 bytes
               TEXT           Allowed length 0 to 65535 bytes
               MEDIUMTEXT     Allowed length 0 to 16777215 bytes
               LONGTEXT       Allowed length 0 to 4294967295 bytes
               VARBINARY(M)   Variable-length byte string with a length of 0 to M bytes
               BINARY(M)      Fixed-length byte string of M bytes
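
Most of the limits listed above follow from simple arithmetic on bit widths. The short Python sketch below (the helper name signed_range is made up for illustration) derives the signed integer ranges, the 24-bit MEDIUMBLOB/MEDIUMTEXT limit, and the upper bound of timestamp, which is the largest signed 32-bit number of seconds since the Unix epoch.

```python
from datetime import datetime, timezone

def signed_range(bits):
    """Return (min, max) of a two's-complement integer with the given width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

for name, bits in [("tinyint", 8), ("smallint", 16), ("mediumint", 24), ("int", 32)]:
    print(name, signed_range(bits))

# MEDIUMBLOB / MEDIUMTEXT use a 24-bit length, so the maximum is 2^24 - 1 bytes.
medium_max = 2 ** 24 - 1
print(medium_max)

# Largest signed 32-bit second count, converted to a UTC date and time.
timestamp_max = datetime.fromtimestamp(2 ** 31 - 1, tz=timezone.utc)
print(timestamp_max.isoformat())
```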

What are super keys, candidate keys, primary keys, and foreign keys?

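Super key: a set of attributes that uniquely identifies a tuple in a relation is called a super key of the relation schema.
Candidate key: a super key that contains no redundant attributes. In other words, if any attribute is removed from a candidate key, the result is no longer a super key.
Primary key: the candidate key that the user chooses to identify tuples.
Foreign key: if attribute K of relation schema R is the primary key of another schema, then K is called a foreign key in R.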

For example:

Two tables:
student information (student ID, ID number, gender, age, height, weight, dormitory number)
dormitory information (dormitory number, building number)

  • Super key: any attribute set that contains "student ID" or "ID number" is a super key, for example R1 (student ID, gender), R2 (ID number, height), R3 (student ID, ID number), and so on.
  • Candidate key: a super key with no redundant attributes. For example, (student ID) and (ID number) are candidate keys. In R1 the student ID alone already identifies a tuple uniquely, and the presence of the gender attribute makes no difference, so R1 is a super key but not a candidate key.
  • Primary key: the key that the user selects from the candidate keys. For example, if the student ID is chosen as the primary key, then the ID number cannot also be the primary key.
  • Foreign key: the dormitory number is a foreign key of the student information table.
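
The definitions above can be checked mechanically on sample data. The following Python sketch uses made-up rows and column names for the student information table; note that keys are really a property of the schema, so a check on sample rows can only show that a set of attributes is not a key, or that it behaves like one for this particular data.

```python
from itertools import combinations

# Hypothetical sample rows for the student information table.
COLUMNS = ("student_id", "id_number", "gender", "dorm")
ROWS = [
    ("S1", "ID-A", "F", "D1"),
    ("S2", "ID-B", "M", "D1"),
    ("S3", "ID-C", "F", "D2"),
    ("S4", "ID-D", "F", "D1"),
]

def is_super_key(attrs):
    """True if the combined values of attrs are distinct across all rows."""
    idx = [COLUMNS.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in ROWS]
    return len(set(projected)) == len(projected)

def is_candidate_key(attrs):
    """A super key is a candidate key if no proper subset is a super key."""
    if not is_super_key(attrs):
        return False
    return not any(
        is_super_key(subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

print(is_super_key(("student_id", "gender")))      # super key
print(is_candidate_key(("student_id", "gender")))  # but gender is redundant
print(is_candidate_key(("student_id",)))
```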

What are the main types of SQL statements?

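  • Data Definition Language (DDL)

    Mainly CREATE, DROP, ALTER and similar operations, that is, operations on logical structures, including table structures, views, and indexes.

  • Data Query Language (DQL): SELECT

    This one is easy to understand: it covers query operations, which use the SELECT keyword. Simple queries, join queries, and so on all belong to DQL.

  • Data Manipulation Language (DML)

    Mainly INSERT, UPDATE, DELETE and similar operations, that is, operations on the data itself. Together with the queries of DQL, DML makes up the create, read, update, and delete operations that most junior programmers use every day. Queries are a special case and are classified separately as DQL.

  • Data Control Language (DCL)

    Mainly GRANT, REVOKE, COMMIT, ROLLBACK and similar operations, that is, operations that affect database security and integrity; it can be understood simply as permission control. (COMMIT and ROLLBACK are often classified separately as transaction control language, TCL.)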
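
As a rough illustration of these categories, the sketch below runs one statement of each kind through Python's built-in sqlite3 module, used here as a stand-in for MySQL. SQLite has no GRANT or REVOKE, so the permission-control part is not shown; the account table and its values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")

# DML: change the data.
conn.execute("INSERT INTO account (id, balance) VALUES (1, 100), (2, 50)")
conn.execute("UPDATE account SET balance = balance + 10 WHERE id = 1")
conn.execute("DELETE FROM account WHERE id = 2")
conn.commit()

# DQL: read the data.
rows = conn.execute("SELECT id, balance FROM account").fetchall()
print(rows)

# Transaction control: an uncommitted change can be rolled back.
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.rollback()
after_rollback = conn.execute(
    "SELECT balance FROM account WHERE id = 1"
).fetchone()[0]
print(after_rollback)
```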

What kinds of constraints can SQL fields have?

  • NOT NULL: the field must not be empty (NULL).
  • UNIQUE: the field values cannot be repeated. A table may have multiple UNIQUE constraints.
  • PRIMARY KEY: the field values also cannot be repeated, but a table may have only one primary key.
  • FOREIGN KEY: prevents actions that would break the link between tables, and also prevents illegal data from being inserted into the foreign key column, because each value must be one of the values in the table it points to.
  • CHECK: restricts the range of values of the field. (MySQL versions before 8.0.16 parse CHECK constraints but do not enforce them.)
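
The following sketch shows each constraint rejecting a bad row. It uses Python's built-in sqlite3 module as a stand-in for MySQL; the dept and employee tables, their columns, and the helper violates are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE dept (id INTEGER PRIMARY KEY);
CREATE TABLE employee (
    id      INTEGER PRIMARY KEY,
    email   TEXT NOT NULL UNIQUE,
    age     INTEGER CHECK (age >= 18),
    dept_id INTEGER REFERENCES dept(id)
);
INSERT INTO dept VALUES (1);
INSERT INTO employee VALUES (1, 'a@example.com', 30, 1);
""")

def violates(sql):
    """Run one statement and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

results = {
    "NOT NULL":    violates("INSERT INTO employee VALUES (2, NULL, 30, 1)"),
    "UNIQUE":      violates("INSERT INTO employee VALUES (2, 'a@example.com', 30, 1)"),
    "PRIMARY KEY": violates("INSERT INTO employee VALUES (1, 'c@example.com', 30, 1)"),
    "FOREIGN KEY": violates("INSERT INTO employee VALUES (2, 'b@example.com', 30, 99)"),
    "CHECK":       violates("INSERT INTO employee VALUES (2, 'b@example.com', 10, 1)"),
}
valid_rejected = violates("INSERT INTO employee VALUES (2, 'b@example.com', 30, 1)")
print(results, valid_rejected)
```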

What kinds of SQL join queries are there?

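  • Inner join (INNER JOIN)
    Inner joins fall into three categories:
    1. Equi-join: ON A.id=B.id
    2. Non-equi-join: ON A.id > B.id
    3. Self-join: SELECT * FROM A T1 INNER JOIN A T2 ON T1.id=T2.pid
    
  • Outer join (LEFT JOIN/RIGHT JOIN)
    Left outer join: LEFT OUTER JOIN. The left table is the driving table: its rows are fetched first and matched against the right table using the condition after ON; columns without a match are filled with NULL. It can be abbreviated to LEFT JOIN.
    Right outer join: RIGHT OUTER JOIN. The right table is the driving table: its rows are fetched first and matched against the left table using the condition after ON; columns without a match are filled with NULL. It can be abbreviated to RIGHT JOIN.
    
  • Union query (UNION and UNION ALL)
    SELECT * FROM A UNION SELECT * FROM B UNION ...
    
    1. A union combines several result sets into one, with the result before UNION as the base. Note that the queries being combined must return the same number of columns; identical rows are merged.
    2. With UNION ALL, duplicate rows are not merged.
    3. UNION ALL is more efficient than UNION, because it skips the deduplication step.
    
  • FULL JOIN
    MySQL does not support full joins; the same result can be obtained by combining LEFT JOIN and RIGHT JOIN with UNION.
    
    SELECT * FROM A LEFT JOIN B ON A.id=B.id UNION SELECT * FROM A RIGHT JOIN B ON A.id=B.id
    
  • Cross join (CROSS JOIN)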
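
The join types above can be compared on two tiny tables. This sketch uses Python's built-in sqlite3 module as a stand-in for MySQL, with hypothetical tables a and b. Because older SQLite versions lack RIGHT JOIN, the full join is emulated with a second LEFT JOIN in which the tables are swapped, which is equivalent to the RIGHT JOIN form shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER);
CREATE TABLE b (id INTEGER);
INSERT INTO a VALUES (1), (2);
INSERT INTO b VALUES (2), (3);
""")

def rows(sql):
    """Run a query and return all result rows."""
    return conn.execute(sql).fetchall()

inner = rows("SELECT a.id, b.id FROM a INNER JOIN b ON a.id = b.id")
left = rows("SELECT a.id, b.id FROM a LEFT JOIN b ON a.id = b.id")
union = rows("SELECT id FROM a UNION SELECT id FROM b")          # duplicates merged
union_all = rows("SELECT id FROM a UNION ALL SELECT id FROM b")  # duplicates kept
full = rows(
    "SELECT a.id, b.id FROM a LEFT JOIN b ON a.id = b.id "
    "UNION "
    "SELECT a.id, b.id FROM b LEFT JOIN a ON a.id = b.id"
)
cross = rows("SELECT a.id, b.id FROM a CROSS JOIN b")  # every pairing of rows

for name, result in [("inner", inner), ("left", left), ("union", union),
                     ("union_all", union_all), ("full", full), ("cross", cross)]:
    print(name, result)
```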

What are SQL subqueries?

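Condition: the result of one SQL query is used as a condition or as a data source of another query.
Nesting: several SQL statements are nested, and the inner query is called a subquery.

Three cases:

The subquery returns a single row and a single column: the result is one value, and the outer query uses operators such as =, <, >
-- Who is the employee with the highest salary?
select * from employee where salary = (select max(salary) from employee);

The subquery returns multiple rows and a single column: the result resembles an array, and the outer query uses the in operator
-- Which employees belong to a department listed in the dept table?
select * from employee where dept_id in (select id from dept);

The subquery returns multiple rows and multiple columns: the result resembles a virtual table. It cannot be compared against in a where condition like a single value; instead it is used in the from clause as a derived table
-- 1) Find the employees who joined after 2011
-- 2) Match all departments against that virtual table and find the employees whose department ID is equal
select * from dept d, (select * from employee where join_date > '2011-1-1') e where e.dept_id = d.id;
-- Using a table join:
select d.*, e.* from dept d inner join employee e on d.id = e.dept_id where e.join_date > '2011-1-1';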
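
A runnable sketch of the three cases, using Python's built-in sqlite3 module as a stand-in for MySQL. The dept and employee tables follow the queries above, but the name columns and the sample rows are assumptions, and dates are stored as ISO-format text so that string comparison orders them correctly in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employee (
    id INTEGER PRIMARY KEY, name TEXT, salary INTEGER,
    join_date TEXT, dept_id INTEGER
);
INSERT INTO dept VALUES (1, 'Dev'), (2, 'Ops');
INSERT INTO employee VALUES
    (1, 'Ann', 900, '2010-05-01', 1),
    (2, 'Bob', 700, '2012-03-15', 1),
    (3, 'Cid', 800, '2013-07-01', 2);
""")

# Single row, single column: the subquery yields one value, compared with =.
top = conn.execute(
    "SELECT name FROM employee "
    "WHERE salary = (SELECT MAX(salary) FROM employee)"
).fetchall()

# Multiple rows, single column: the subquery yields a list, tested with IN.
in_dev = conn.execute(
    "SELECT name FROM employee "
    "WHERE dept_id IN (SELECT id FROM dept WHERE name = 'Dev') ORDER BY id"
).fetchall()

# Multiple rows, multiple columns: the subquery is a derived table in FROM.
recent = conn.execute(
    "SELECT d.name, e.name FROM dept d "
    "JOIN (SELECT * FROM employee WHERE join_date > '2011-01-01') e "
    "ON e.dept_id = d.id ORDER BY e.id"
).fetchall()

print(top, in_dev, recent)
```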

What is the difference between IN and EXISTS in MySQL?

https://cloud.tencent.com/developer/article/1144244
https://www.jianshu.com/p/f212527d76ff

select * from A where id in (select id from B);

select * from A where exists (select 1 from B where A.id=B.id);
  1. How the IN() statement works

    The IN() subquery is executed only once: it fetches all the id values from table B and caches them. Then each id of table A is checked against the cached ids of table B; if a match is found, the row of table A is added to the result set. This continues until all rows of table A have been traversed.

    As a result, IN() is not a good fit when table B is large, because all of table B's data is traversed.

    For example: if table A has 10,000 records and table B has 1,000,000 records, up to 10,000*1,000,000 comparisons may be needed in the worst case, which is very inefficient.

  2. How the EXISTS() statement works

    EXISTS loops over the rows of the outer table one by one. For each row it evaluates the conditional subquery inside EXISTS. If the subquery returns any rows (no matter how many), the condition is true and the current outer row is added to the result; if the subquery returns no rows, the current outer row is discarded. The EXISTS condition therefore behaves like a boolean: true when the subquery returns a result set, false when it does not.

    When table B is larger than table A, EXISTS() is a good fit, because it avoids all that traversal and only needs to run one query per outer row.

    For example: if table A has 10,000 records and table B has 1,000,000 records, EXISTS() runs 10,000 times, each time checking whether the id of the current row of table A has a match in table B. Note that this is a simplified model; the query optimizer may execute either form differently.
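
The simplified model described above can be written out in plain Python to see where the difference in work comes from. This is only a sketch of that mental model, not of MySQL's actual executor: in_model assumes the cached ids are scanned linearly, and exists_model assumes an index on B.id, represented here by a set lookup. The function names and data sizes are made up.

```python
def in_model(a_ids, b_ids):
    """IN: collect B's ids once, then compare each A id against that list."""
    cached = list(b_ids)          # the subquery runs a single time
    comparisons = 0
    result = []
    for a in a_ids:
        for b in cached:          # linear scan, no index assumed
            comparisons += 1
            if a == b:
                result.append(a)
                break
    return result, comparisons

def exists_model(a_ids, b_index):
    """EXISTS: for each A row, run one indexed probe into B."""
    probes = 0
    result = []
    for a in a_ids:
        probes += 1
        if a in b_index:          # stands in for an index lookup on B.id
            result.append(a)
    return result, probes

a_ids = range(100)                # small outer table A
b_ids = range(50, 10_050)         # much larger table B
in_rows, in_cost = in_model(a_ids, b_ids)
ex_rows, ex_cost = exists_model(a_ids, set(b_ids))
print(in_rows == ex_rows, in_cost, ex_cost)
```

Both models return the same rows; only the amount of work differs, and it grows with the size of B in the first model but with the size of A in the second.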

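What is the difference between varchar and char? How to choose?

Characteristics of char:

1. char is a fixed-length string type; its length is fixed.
2. If the inserted data is shorter than the fixed length, it is padded with spaces.
3. Because the length is fixed, access can be faster than with varchar (by as much as 50% according to some claims). For the same reason it occupies extra space, so it trades space for time.
4. A char column can hold at most 255 characters, regardless of the character encoding.

Characteristics of varchar:

1. varchar is a variable-length string type; its length varies.
2. Data is stored at the length at which it was inserted.
3. In terms of access, varchar is the opposite of char: access is slower because the length is not fixed, but no extra space is occupied, so it trades time for space.
4. A varchar column can hold at most 65,532 bytes of data; the actual character limit depends on the character set and on the 65,535-byte row size limit.

How to choose?

1. For data that changes frequently, CHAR is better than VARCHAR, because CHAR is less prone to fragmentation.
2. For very short columns, CHAR is more storage-efficient than VARCHAR.
3. Allocate only the space that is needed; longer columns consume more memory when sorting.
4. Avoid the TEXT/BLOB types where possible; queries on them may use temporary tables, which causes serious performance overhead.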

What is the difference between int(10), char(10) and varchar(10) in MySQL?

  • The 10 in int(10) is the display width of the data, not the size of the stored data;
  • char(10) stores a fixed length of 10 characters; values shorter than 10 characters are padded with spaces, so it occupies more storage space;
  • varchar(10) stores up to 10 characters of variable length and uses only as much space as the characters actually stored. Spaces are stored as characters too. This differs from char(10), where the padding spaces are only filler and do not count as characters.
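
The padding behavior described above can be mimicked in a few lines of Python. This is a simplified model of MySQL's default behavior, not its storage code: the function names are made up, and rejecting over-long values with an exception is a simplification of what the server does in strict mode.

```python
def store_char(value, n):
    """CHAR(n): right-pad with spaces to exactly n characters."""
    if len(value) > n:
        raise ValueError("value too long for CHAR(%d)" % n)
    return value.ljust(n)

def read_char(stored):
    """On retrieval the padding (and any trailing spaces) is stripped."""
    return stored.rstrip(" ")

def store_varchar(value, n):
    """VARCHAR(n): keep the value as is, trailing spaces included."""
    if len(value) > n:
        raise ValueError("value too long for VARCHAR(%d)" % n)
    return value

stored = store_char("ab ", 10)
print(len(stored))                      # always the full column width
print(repr(read_char(stored)))          # the trailing space is lost
print(repr(store_varchar("ab ", 10)))   # the trailing space is kept
```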

What is the difference between delete, drop and truncate in MySQL?

All three indicate deletion, but there are some differences between the three:

  • Type: DELETE belongs to DML; TRUNCATE and DROP belong to DDL.
  • Rollback: DELETE can be rolled back; TRUNCATE and DROP cannot.
  • What is deleted: DELETE keeps the table structure and removes all or some of the table's rows; TRUNCATE keeps the table structure and removes all of the table's data; DROP removes the table from the database, together with all its rows, indexes, and permissions.
  • Speed: DELETE is slow because it removes rows one by one; TRUNCATE is fast; DROP is the fastest.

Therefore, use DROP when a table is no longer needed, DELETE when you want to remove some of the rows, and TRUNCATE when you want to keep the table but remove all of its data.

How many logging formats does the MySQL binlog have? What are the differences?

There are three formats: statement, row and mixed.

  • In statement mode, every SQL statement that modifies data is recorded in the binlog. The changes to individual rows are not recorded, which reduces the binlog volume, saves I/O, and improves performance. Because the effect of a statement depends on its context, the relevant context information must be saved as well, and some statements, such as those that use certain functions, cannot be recorded in a way that replicates correctly.
  • In row mode, the context of the SQL statement is not recorded; only which rows were modified is saved. The unit of recording is the change to each row, so practically everything can be recorded. However, some operations (such as alter table) change a large number of rows, so this mode stores a lot of information and the log volume can become very large.
  • Mixed mode is a compromise: ordinary operations are recorded as statements, and row format is used when statement format cannot be.

In addition, newer versions of MySQL have optimized row mode: when the table structure changes, the statement is recorded instead of every row change.

What kind of logs does MySQL have?

https://database.51cto.com/art/201806/576300.htm

What is the difference between binary log and redo log?

Since the redo log also records transactions, how does it differ from the binary log introduced earlier? First, the binary log records changes for the whole MySQL server, covering all storage engines such as InnoDB, MyISAM, and Heap. The redo log of the InnoDB storage engine records only the transaction log of InnoDB itself.

Second, the recorded content is different. Whether the binary log format is set to STATEMENT, ROW, or MIXED, it records the specific operations of a transaction, that is, it is a logical log. The redo log of the InnoDB storage engine records the physical changes made to each page.

Finally, the time of writing is different. The binary log is written only once, when the transaction commits, no matter how large the transaction is. Redo log entries, in contrast, are written to the redo log file continuously while the transaction is in progress.

Origin blog.csdn.net/weixin_44533129/article/details/112768901