索引 - 索引实战 - 《MySQL必会小册》

第一章
第二章

原文地址：https://mp.weixin.qq.com/s/3xkLTe7r388lRq-SBQllXw

第一章

举例，业务场景，用户表，表结构为：

t_user(
  uid primary key,
  login_name unique,
  passwd,
  login_time,
  age,
  …
);

聚集索引(clustered index)：聚集索引决定数据在磁盘上的物理排序，一个表只能有一个聚集索引，一般用primary key来约束。

举例：t_user场景中，uid上的索引。

非聚集索引(non-clustered index)：它并不决定数据在磁盘上的物理排序，索引上只包含被建立索引的数据，以及一个行定位符row-locator，这个行定位符，可以理解为一个聚集索引物理排序的指针，通过这个指针，可以找到行数据。

举例，查找年轻MM的业务需求：
select uid from t_user where age > 18 and age < 26;
age上建立的索引，就是非聚集索引。

联合索引：多个字段上建立的索引，能够加速复核查询条件的检索
举例，登录业务需求：
select uid, login_time from t_user where
login_name=? and passwd=?
可以建立(login_name, passwd)的联合索引。

联合索引能够满足最左侧查询需求，例如(a, b, c)三列的联合索引，能够加速a | (a, b) | (a, b, c) 三组查询需求。

这也就是为何不建立(passwd, login_name)这样联合索引的原因，业务上几乎没有passwd的单条件查询需求，而有很多login_name的单条件查询需求。

提问：
select uid, login_time from t_user where
passwd=? and login_name=?
能否命中(login_name, passwd)这个联合索引？
回答：可以，最左侧查询需求，并不是指SQL语句的写法必须满足索引的顺序（这是很多朋友的误解）

索引覆盖：被查询的列，数据能从索引中取得，而不用通过行定位符row-locator再到row上获取，即“被查询列要被所建的索引覆盖”，这能够加速查询速度。

举例，登录业务需求：
select uid, login_time from t_user where
login_name=? and passwd=?
可以建立(login_name, passwd, login_time)的联合索引，由于login_time已经建立在索引中了，被查询的uid和login_time就不用去row上获取数据了，从而加速查询。

末了多说一句，登录这个业务场景，login_name具备唯一性，建这个单列索引就好。

作业：
假设订单有三种状态：0已下单，1已支付，2已完成
业务需求，查询未完成的订单，哪个SQL更快呢？

select from order where status*!=2
select from order where status=0 *or status=1
select from order where status *IN (0,1)
select from order where status=0
union
select from order where stauts=1

第二章

假设订单业务表结构为：
order(oid, date, uid, status, money, time, …)
其中：

oid，订单ID，主键
date，下单日期，有普通索引，管理后台经常按照date查询
uid，用户ID，有普通索引，用户查询自己订单
status，订单状态，有普通索引，管理后台经常按照status查询
money/time，订单金额/时间，被查询字段，无索引
…

假设订单有三种状态：0已下单，1已支付，2已完成
业务需求，查询未完成的订单，哪个SQL更快呢？
select * from order where status!=2
select * from order where status=0 or status=1
select * from order where status IN (0,1)
select from order where status=0
union all
select from order where status=1

结论：方案1最慢，方案2，3，4都能命中索引

但是…

一：union all 肯定是能够命中索引的
select from order where status=0
union all
select from order where status=1
说明：

直接告诉MySQL怎么做，MySQL耗费的CPU最少
程序员并不经常这么写SQL(union all)

二：简单的in能够命中索引
select * from order where status in (0,1)
说明：
让MySQL思考，查询优化耗费的cpu比union all多，但可以忽略不计
程序员最常这么写SQL(in)，这个例子，最建议这么写

三：对于or，新版的MySQL能够命中索引
select * from order where status=0 or status=1
说明：
让MySQL思考，查询优化耗费的cpu比in多，别把负担交给MySQL
不建议程序员频繁用or，不是所有的or都命中索引
对于老版本的MySQL，建议查询分析下

四、对于!=，负向查询肯定不能命中索引
select * from order where status!=2
说明：
全表扫描，效率最低，所有方案中最慢
禁止使用负向查询

五、其他方案
select * from order where status < 2
这个具体的例子中，确实快，但是：
这个例子只举了3个状态，实际业务不止这3个状态，并且状态的“值”正好满足偏序关系，万一是查其他状态呢，SQL不宜依赖于枚举的值，方案不通用
这个SQL可读性差，可理解性差，可维护性差，强烈不推荐

六、作业
这样的查询能够命中索引么？
select * from order where uid in (
select uid from order where status=0
)
select * from order where status in (0, 1) order by date desc
select * from order where status=0 or date <= CURDATE()