1. 分区表的意义
- 1. 当查询或更新一个分区上的大部分数据时,对分区进行索引扫描代价很大,然而,在分区上使用顺序扫描能提升性能。
- 2. 当需要删除一个分区数据时,通过drop table删除一个分区,远比delete删除数据高效,特别适用于日志数据场景。
- 3. 由于一个表只能存储在一个表空间,使用分区后,可以将分区放到不同的表空间上,例如可以将系统很少访问的分区放到廉价的存储设备上,也可以将系统常访问的分区存储在高速存储上。
2.传统分区表
2.1 传统分区表说明:
- 1. 传统分区表是通过继承和触发器的方式实现的,需要定义父表、定义子表、定义子表约束、创建子表索引、创建分区插入、删除、修改、修改函数和触发器等,可以说是在普通表基础上手工实现的分区表。
2.2 继承表:
- 1. 创建一张日志模型表(父表)
- create table tb1_log(id int4,create_date date,log_type text);
- insert into tb1_log values (1,'2017-08-26',null);
- 2. 创建一张子表:
- create table tb1_log_sql(sql text) inherits (tb1_log);
- insert into tb1_log_sql values(2,'2017-08-27',null,'select 2');
- 3. 此时查看父表数据,发现也有子表的数据:
- select * from tb1_log;
- 4. 虽然查询父表时连子表的数据也打印了出来,但是没有显示子表的字段,这个时候我们就需要通过OID找到父表对应的子表:
- select a.*,b.relname from tb1_log a,pg_class b where a.tableoid=b.oid;
- 5. 如果只想查看父表的数据,在父表前面加上only即可:
- select * from only tb1_log;
2.3 创建分区表:
- ①    创建分区表步骤说明:
- # 步骤一:创建父表,如果父表上定义了约束,子表会继承,因此除非是全局约束,否则不应该在父表上定义约束,另外,子表不应该写入数据。
- # 步骤二:通过inherits方式创建继承表,也称之为子表或分区,子表的字段定义应该和父表保持一致。
- # 步骤三:给所有子表创建约束,只有满足约束条件的数据才能写入对应分区,注意分区约束值范围不要有重叠。
- # 步骤四:给所有子表创建索引,由于继承操作不会继承继承父表的索引,因此索引需要手工创建。
- # 步骤五:在父表上定义insert、delete、update触发器,将SQL分发到对应分区,这步可选,因为应用可以根据分区规则定位到对应分区进行DML操作。
- ②    创建父表:
- create table log_ins(id serial,user_id int4,create_time timestamp(0) without time zone);
- ③    创建13个子表:
- create table log_ins_history(check (create_time < '2017-01-01')) inherits (log_ins);
- create table log_ins_201701(check (create_time >= '2017-01-01' and create_time < '2017-02-01')) inherits (log_ins);
- create table log_ins_201702(check (create_time >= '2017-02-01' and create_time < '2017-03-01')) inherits (log_ins);
- create table log_ins_201703(check (create_time >= '2017-03-01' and create_time < '2017-04-01')) inherits (log_ins);
- create table log_ins_201704(check (create_time >= '2017-04-01' and create_time < '2017-05-01')) inherits (log_ins);
- create table log_ins_201705(check (create_time >= '2017-05-01' and create_time < '2017-06-01')) inherits (log_ins);
- create table log_ins_201706(check (create_time >= '2017-06-01' and create_time < '2017-07-01')) inherits (log_ins);
- create table log_ins_201707(check (create_time >= '2017-07-01' and create_time < '2017-08-01')) inherits (log_ins);
- create table log_ins_201708(check (create_time >= '2017-08-01' and create_time < '2017-09-01')) inherits (log_ins);
- create table log_ins_201709(check (create_time >= '2017-09-01' and create_time < '2017-10-01')) inherits (log_ins);
- create table log_ins_201710(check (create_time >= '2017-10-01' and create_time < '2017-11-01')) inherits (log_ins);
- create table log_ins_201711(check (create_time >= '2017-11-01' and create_time < '2017-12-01')) inherits (log_ins);
- create table log_ins_201712(check (create_time >= '2017-12-01' and create_time < '2018-01-01')) inherits (log_ins);
- ④    给子表创建索引:
- create index idx_his_ctime on log_ins_history using btree (create_time);
- create index idx_his_201701_ctime on log_ins_201701 using btree (create_time);
- create index idx_his_201702_ctime on log_ins_201702 using btree (create_time);
- create index idx_his_201703_ctime on log_ins_201703 using btree (create_time);
- create index idx_his_201704_ctime on log_ins_201704 using btree (create_time);
- create index idx_his_201705_ctime on log_ins_201705 using btree (create_time);
- create index idx_his_201706_ctime on log_ins_201706 using btree (create_time);
- create index idx_his_201707_ctime on log_ins_201707 using btree (create_time);
- create index idx_his_201708_ctime on log_ins_201708 using btree (create_time);
- create index idx_his_201709_ctime on log_ins_201709 using btree (create_time);
- create index idx_his_201710_ctime on log_ins_201710 using btree (create_time);
- create index idx_his_201711_ctime on log_ins_201711 using btree (create_time);
- create index idx_his_201712_ctime on log_ins_201712 using btree (create_time);
- ⑤    创建触发器函数,设置数据插入父表时的路由规则:
- create or replace function log_ins_insert_trigger()
-     returns trigger
-     language plpgsql
- as $function$
- begin
-     if (new.create_time<'2017-01-01') then
-         insert into log_ins_history values (new.*);
-     elsif (new.create_time>='2017-01-01' and new.create_time<'2017-02-01') then
-         insert into log_ins_201701 values (new.*);
-     elsif (new.create_time>='2017-02-01' and new.create_time<'2017-03-01') then
-         insert into log_ins_201702 values (new.*);
-     elsif (new.create_time>='2017-03-01' and new.create_time<'2017-04-01') then
-         insert into log_ins_201703 values (new.*);
-     elsif (new.create_time>='2017-04-01' and new.create_time<'2017-05-01') then
-         insert into log_ins_201704 values (new.*);
-     elsif (new.create_time>='2017-05-01' and new.create_time<'2017-06-01') then
-         insert into log_ins_201705 values (new.*);
-     elsif (new.create_time>='2017-06-01' and new.create_time<'2017-07-01') then
-         insert into log_ins_201706 values (new.*);
-     elsif (new.create_time>='2017-07-01' and new.create_time<'2017-08-01') then
-         insert into log_ins_201707 values (new.*);
-     elsif (new.create_time>='2017-08-01' and new.create_time<'2017-09-01') then
-         insert into log_ins_201708 values (new.*);
-     elsif (new.create_time>='2017-09-01' and new.create_time<'2017-10-01') then
-         insert into log_ins_201709 values (new.*);
-     elsif (new.create_time>='2017-10-01' and new.create_time<'2017-11-01') then
-         insert into log_ins_201710 values (new.*);
-     elsif (new.create_time>='2017-11-01' and new.create_time<'2017-12-01') then
-         insert into log_ins_201711 values (new.*);
-     elsif (new.create_time>='2017-12-01' and new.create_time<'2018-01-01') then
-         insert into log_ins_201712 values (new.*);
-     else
-         raise exception 'create_time out of range. fix the log_ins_insert_trigger() function!';
-     end if;
-     return null;
- end;
- $function$;
- ⑥    创建并调用触发器函数:
- create trigger insert_log_ins_trigger before insert on log_ins for each row execute procedure log_ins_insert_trigger();
2.4 使用分区表:
- ①    在父表log_ins中插入测试数据,并验证数据是否插入对应分区:
- # 插入数据:
- insert into log_ins(user_id,create_time) select round(100000000*random()),generate_series('2016-12-01'::date,'2017-12-01'::date,'1 minute');
- # 查看父表数据,发现父表中没有数据:
- postgres=# select count(*) from only log_ins;
-  count 
- -------
-      0
- postgres=# select count(*) from log_ins;
-  count  
- --------
-  525601
- # 查看子表数据:
- postgres=# select min(create_time),max(create_time) from log_ins_201701;
-          min         |         max         
- ---------------------+---------------------
-  2017-01-01 00:00:00 | 2017-01-31 23:59:00
- # 查看子表大小,由此可见数据都已经插入到了子表中:
- postgres=# \dt+ log_ins*
- ②    查询数据时,查询子表和查询父表哪个性能高?我们可以通过执行计划分别查看:
- # 查询父表:
- explain analyze select * from log_ins where create_time > '2017-01-01' and create_time < '2017-01-02';
- # 查询子表:
- explain analyze select * from log_ins_201701 where create_time > '2017-01-01' and create_time < '2017-01-02';

2.5 constraint_exclusion 参数:
- 1. constraint_exclusion参数用来控制优化器是否根据表上的约束来优化查询:
- # on所有表都通过约束优化查询。
- # off所有表都不通过约束优化查询。
- # partition只对继承表和union all子查询通过检索约束来优化查询。
- 2. 开启此参数和不开启此参数进行测试:
- # 关闭此参数:
- set constraint_exclusion=off;
- explain analyze select * from log_ins where create_time > '2017-01-01' and create_time < '2017-01-02';
- # 开启此参数:
- set constraint_exclusion=partition;
- explain analyze select * from log_ins where create_time > '2017-01-01' and create_time < '2017-01-02';
2.6 添加分区表:
- 1. 方法一:
- # 在分区表历史范围到期之前,就需要扩分区:
- create table log_ins_201801(check (create_time >= '2018-01-01' and create_time < '2018-02-01')) inherits (log_ins);
- # 为扩展的分区添加索引:
- create index idx_his_201801_ctime on log_ins_201801 using btree (create_time);
- # 刷新触发器函数:
- 略
- 2. 方法二:(建议)
- # 以上添加分区的方法比较直接,创建分区时就将分区继承到父表,如果中间步骤有错可能对生产系统带来影响,所以建议将以上操作分解为如下几个步骤:
- # 创建分区:
- create table log_ins_201802(like log_ins including all);
- # 添加约束:
- alter table log_ins_201802 add constraint log_ins_201802_create_time_check check (create_time >= '2018-02-01' and create_time < '2018-03-01');
- # 刷新触发器函数:
- 略
- # 将分区继承到父表:
- alter table log_ins_201802 inherit log_ins;
3.内置分区表
3.1 内置分区表说明:
-   内置分区表不需要预先在父表上定义insert、delete、update触发器,对父表的DML操作会自动路由到相应分区,相比传统分区表大幅度降低了维护成本,目前仅支持范围和列表分区,本次就以创建范围分区为例。
3.2 创建内置分区表:
- 1. 创建主表:
- create table log_par (
-     id serial,
-     user_id int4,
-     create_time timestamp(0) without time zone
-     ) partition by range(create_time);
- # 创建主表时可选择分区方式,range范围分区或list列表分区,并指定地段或表达式作为分区键。此表指定了分区策略为范围分区,分区键为create_time字段。
- 2. 创建分区,并设置分区键取值范围:
- create table log_par_his partition of log_par for values from ('2016-12-01') to ('2017-01-01');
- create table log_par_his_201701 partition of log_par for values from ('2017-01-01') to ('2017-02-01');
- create table log_par_his_201702 partition of log_par for values from ('2017-02-01') to ('2017-03-01');
- create table log_par_his_201703 partition of log_par for values from ('2017-03-01') to ('2017-04-01');
- create table log_par_his_201704 partition of log_par for values from ('2017-04-01') to ('2017-05-01');
- create table log_par_his_201705 partition of log_par for values from ('2017-05-01') to ('2017-06-01');
- create table log_par_his_201706 partition of log_par for values from ('2017-06-01') to ('2017-07-01');
- create table log_par_his_201707 partition of log_par for values from ('2017-07-01') to ('2017-08-01');
- create table log_par_his_201708 partition of log_par for values from ('2017-08-01') to ('2017-09-01');
- create table log_par_his_201709 partition of log_par for values from ('2017-09-01') to ('2017-10-01');
- create table log_par_his_201710 partition of log_par for values from ('2017-10-01') to ('2017-11-01');
- create table log_par_his_201711 partition of log_par for values from ('2017-11-01') to ('2017-12-01');
- create table log_par_his_201712 partition of log_par for values from ('2017-12-01') to ('2018-01-01');
- 3. 为所有分区创建索引:
- create index idx_log_par_his_ctime on log_par_his using btree(create_time);
- create index idx_log_par_201701_ctime on log_par_his_201701 using btree(create_time);
- create index idx_log_par_201702_ctime on log_par_his_201702 using btree(create_time);
- create index idx_log_par_201703_ctime on log_par_his_201703 using btree(create_time);
- create index idx_log_par_201704_ctime on log_par_his_201704 using btree(create_time);
- create index idx_log_par_201705_ctime on log_par_his_201705 using btree(create_time);
- create index idx_log_par_201706_ctime on log_par_his_201706 using btree(create_time);
- create index idx_log_par_201707_ctime on log_par_his_201707 using btree(create_time);
- create index idx_log_par_201708_ctime on log_par_his_201708 using btree(create_time);
- create index idx_log_par_201709_ctime on log_par_his_201709 using btree(create_time);
- create index idx_log_par_201710_ctime on log_par_his_201710 using btree(create_time);
- create index idx_log_par_201711_ctime on log_par_his_201711 using btree(create_time);
- create index idx_log_par_201712_ctime on log_par_his_201712 using btree(create_time);
- 4. 插入数据:
- insert into log_par(user_id,create_time)
- select round(100000000*random()),generate_series('2016-12-01'::date,
- '2017-12-01'::date,'1 minute');
- 5. 进行验证,此时发现父表中没有任何数据:
- \dt+ log_par*
3.3 添加分区:
- 1. 添加分区:
- create table log_par_his_201801 partition of log_par for values from ('2018-01-01') to ('2018-02-01');
- 2.     创建索引:
- create index idx_log_par_201801_ctime on log_par_his_201801 using btree(create_time);
3.4 删除分区:
- 1. 方法一:直接使用drop命令进行删除:
- drop table log_par_his_201801;
- 2. 方法二:使用解绑方式:
- # 进行解绑:
- alter table log_par detach partition log_par_his_201801 ;
- # 如果想恢复,可以通过链接分区方式进行恢复:
- alter table log_par attach partition log_par_his_201801 for values from ('2018-01-01') to  ('2018-02-01');