1. 分区表的意义

  1. 1. 当查询或更新一个分区上的大部分数据时,对分区进行索引扫描代价很大,然而,在分区上使用顺序扫描能提升性能。
  2. 2. 当需要删除一个分区数据时,通过drop table删除一个分区,远比delete删除数据高效,特别适用于日志数据场景。
  3. 3. 由于一个表只能存储在一个表空间,使用分区后,可以将分区放到不同的表空间上,例如可以将系统很少访问的分区放到廉价的存储设备上,也可以将系统常访问的分区存储在高速存储上。

2.传统分区表

2.1 传统分区表说明:

  1. 1. 传统分区表是通过继承和触发器的方式实现的,需要定义父表、定义子表、定义子表约束、创建子表索引、创建分区插入、删除、修改、修改函数和触发器等,可以说是在普通表基础上手工实现的分区表。

2.2 继承表:

  1. 1. 创建一张日志模型表(父表)
  2. create table tb1_log(id int4,create_date date,log_type text);
  3. insert into tb1_log values (1,'2017-08-26',null);
  4. 2. 创建一张子表:
  5. create table tb1_log_sql(sql text) inherits (tb1_log);
  6. insert into tb1_log_sql values(2,'2017-08-27',null,'select 2');
  7. 3. 此时查看父表数据,发现也有子表的数据:
  8. select * from tb1_log;
  9. 4. 虽然查询父表时连子表的数据也打印了出来,但是没有显示子表的字段,这个时候我们就需要通过OID找到父表对应的子表:
  10. select a.*,b.relname from tb1_log a,pg_class b where a.tableoid=b.oid;
  11. 5. 如果只想查看父表的数据,在父表前面加上only即可:
  12. select * from only tb1_log;

2.3 创建分区表:

  1. 创建分区表步骤说明:
  2. # 步骤一:创建父表,如果父表上定义了约束,子表会继承,因此除非是全局约束,否则不应该在父表上定义约束,另外,子表不应该写入数据。
  3. # 步骤二:通过inherits方式创建继承表,也称之为子表或分区,子表的字段定义应该和父表保持一致。
  4. # 步骤三:给所有子表创建约束,只有满足约束条件的数据才能写入对应分区,注意分区约束值范围不要有重叠。
  5. # 步骤四:给所有子表创建索引,由于继承操作不会继承继承父表的索引,因此索引需要手工创建。
  6. # 步骤五:在父表上定义insert、delete、update触发器,将SQL分发到对应分区,这步可选,因为应用可以根据分区规则定位到对应分区进行DML操作。
  7. 创建父表:
  8. create table log_ins(id serial,user_id int4,create_time timestamp(0) without time zone);
  9. 创建13个子表:
  10. create table log_ins_history(check (create_time < '2017-01-01')) inherits (log_ins);
  11. create table log_ins_201701(check (create_time >= '2017-01-01' and create_time < '2017-02-01')) inherits (log_ins);
  12. create table log_ins_201702(check (create_time >= '2017-02-01' and create_time < '2017-03-01')) inherits (log_ins);
  13. create table log_ins_201703(check (create_time >= '2017-03-01' and create_time < '2017-04-01')) inherits (log_ins);
  14. create table log_ins_201704(check (create_time >= '2017-04-01' and create_time < '2017-05-01')) inherits (log_ins);
  15. create table log_ins_201705(check (create_time >= '2017-05-01' and create_time < '2017-06-01')) inherits (log_ins);
  16. create table log_ins_201706(check (create_time >= '2017-06-01' and create_time < '2017-07-01')) inherits (log_ins);
  17. create table log_ins_201707(check (create_time >= '2017-07-01' and create_time < '2017-08-01')) inherits (log_ins);
  18. create table log_ins_201708(check (create_time >= '2017-08-01' and create_time < '2017-09-01')) inherits (log_ins);
  19. create table log_ins_201709(check (create_time >= '2017-09-01' and create_time < '2017-10-01')) inherits (log_ins);
  20. create table log_ins_201710(check (create_time >= '2017-10-01' and create_time < '2017-11-01')) inherits (log_ins);
  21. create table log_ins_201711(check (create_time >= '2017-11-01' and create_time < '2017-12-01')) inherits (log_ins);
  22. create table log_ins_201712(check (create_time >= '2017-12-01' and create_time < '2018-01-01')) inherits (log_ins);
  23. 给子表创建索引:
  24. create index idx_his_ctime on log_ins_history using btree (create_time);
  25. create index idx_his_201701_ctime on log_ins_201701 using btree (create_time);
  26. create index idx_his_201702_ctime on log_ins_201702 using btree (create_time);
  27. create index idx_his_201703_ctime on log_ins_201703 using btree (create_time);
  28. create index idx_his_201704_ctime on log_ins_201704 using btree (create_time);
  29. create index idx_his_201705_ctime on log_ins_201705 using btree (create_time);
  30. create index idx_his_201706_ctime on log_ins_201706 using btree (create_time);
  31. create index idx_his_201707_ctime on log_ins_201707 using btree (create_time);
  32. create index idx_his_201708_ctime on log_ins_201708 using btree (create_time);
  33. create index idx_his_201709_ctime on log_ins_201709 using btree (create_time);
  34. create index idx_his_201710_ctime on log_ins_201710 using btree (create_time);
  35. create index idx_his_201711_ctime on log_ins_201711 using btree (create_time);
  36. create index idx_his_201712_ctime on log_ins_201712 using btree (create_time);
  37. 创建触发器函数,设置数据插入父表时的路由规则:
  38. create or replace function log_ins_insert_trigger()
  39. returns trigger
  40. language plpgsql
  41. as $function$
  42. begin
  43. if (new.create_time<'2017-01-01') then
  44. insert into log_ins_history values (new.*);
  45. elsif (new.create_time>='2017-01-01' and new.create_time<'2017-02-01') then
  46. insert into log_ins_201701 values (new.*);
  47. elsif (new.create_time>='2017-02-01' and new.create_time<'2017-03-01') then
  48. insert into log_ins_201702 values (new.*);
  49. elsif (new.create_time>='2017-03-01' and new.create_time<'2017-04-01') then
  50. insert into log_ins_201703 values (new.*);
  51. elsif (new.create_time>='2017-04-01' and new.create_time<'2017-05-01') then
  52. insert into log_ins_201704 values (new.*);
  53. elsif (new.create_time>='2017-05-01' and new.create_time<'2017-06-01') then
  54. insert into log_ins_201705 values (new.*);
  55. elsif (new.create_time>='2017-06-01' and new.create_time<'2017-07-01') then
  56. insert into log_ins_201706 values (new.*);
  57. elsif (new.create_time>='2017-07-01' and new.create_time<'2017-08-01') then
  58. insert into log_ins_201707 values (new.*);
  59. elsif (new.create_time>='2017-08-01' and new.create_time<'2017-09-01') then
  60. insert into log_ins_201708 values (new.*);
  61. elsif (new.create_time>='2017-09-01' and new.create_time<'2017-10-01') then
  62. insert into log_ins_201709 values (new.*);
  63. elsif (new.create_time>='2017-10-01' and new.create_time<'2017-11-01') then
  64. insert into log_ins_201710 values (new.*);
  65. elsif (new.create_time>='2017-11-01' and new.create_time<'2017-12-01') then
  66. insert into log_ins_201711 values (new.*);
  67. elsif (new.create_time>='2017-12-01' and new.create_time<'2018-01-01') then
  68. insert into log_ins_201712 values (new.*);
  69. else
  70. raise exception 'create_time out of range. fix the log_ins_insert_trigger() function!';
  71. end if;
  72. return null;
  73. end;
  74. $function$;
  75. 创建并调用触发器函数:
  76. create trigger insert_log_ins_trigger before insert on log_ins for each row execute procedure log_ins_insert_trigger();

2.4 使用分区表:

  1. 在父表log_ins中插入测试数据,并验证数据是否插入对应分区:
  2. # 插入数据:
  3. insert into log_ins(user_id,create_time) select round(100000000*random()),generate_series('2016-12-01'::date,'2017-12-01'::date,'1 minute');
  4. # 查看父表数据,发现父表中没有数据:
  5. postgres=# select count(*) from only log_ins;
  6. count
  7. -------
  8. 0
  9. postgres=# select count(*) from log_ins;
  10. count
  11. --------
  12. 525601
  13. # 查看子表数据:
  14. postgres=# select min(create_time),max(create_time) from log_ins_201701;
  15. min | max
  16. ---------------------+---------------------
  17. 2017-01-01 00:00:00 | 2017-01-31 23:59:00
  18. # 查看子表大小,由此可见数据都已经插入到了子表中:
  19. postgres=# \dt+ log_ins*
  20. 查询数据时,查询子表和查询父表哪个性能高?我们可以通过执行计划分别查看:
  21. # 查询父表:
  22. explain analyze select * from log_ins where create_time > '2017-01-01' and create_time < '2017-01-02';
  23. # 查询子表:
  24. explain analyze select * from log_ins_201701 where create_time > '2017-01-01' and create_time < '2017-01-02';

image.png

2.5 constraint_exclusion 参数:

  1. 1. constraint_exclusion参数用来控制优化器是否根据表上的约束来优化查询:
  2. # on所有表都通过约束优化查询。
  3. # off所有表都不通过约束优化查询。
  4. # partition只对继承表和union all子查询通过检索约束来优化查询。
  5. 2. 开启此参数和不开启此参数进行测试:
  6. # 关闭此参数:
  7. set constraint_exclusion=off;
  8. explain analyze select * from log_ins where create_time > '2017-01-01' and create_time < '2017-01-02';
  9. # 开启此参数:
  10. set constraint_exclusion=partition;
  11. explain analyze select * from log_ins where create_time > '2017-01-01' and create_time < '2017-01-02';

2.6 添加分区表:

  1. 1. 方法一:
  2. # 在分区表历史范围到期之前,就需要扩分区:
  3. create table log_ins_201801(check (create_time >= '2018-01-01' and create_time < '2018-02-01')) inherits (log_ins);
  4. # 为扩展的分区添加索引:
  5. create index idx_his_201801_ctime on log_ins_201801 using btree (create_time);
  6. # 刷新触发器函数:
  7. 2. 方法二:(建议)
  8. # 以上添加分区的方法比较直接,创建分区时就将分区继承到父表,如果中间步骤有错可能对生产系统带来影响,所以建议将以上操作分解为如下几个步骤:
  9. # 创建分区:
  10. create table log_ins_201802(like log_ins including all);
  11. # 添加约束:
  12. alter table log_ins_201802 add constraint log_ins_201802_create_time_check check (create_time >= '2018-02-01' and create_time < '2018-03-01');
  13. # 刷新触发器函数:
  14. # 将分区继承到父表:
  15. alter table log_ins_201802 inherit log_ins;

3.内置分区表

3.1 内置分区表说明:

  1. 内置分区表不需要预先在父表上定义insertdeleteupdate触发器,对父表的DML操作会自动路由到相应分区,相比传统分区表大幅度降低了维护成本,目前仅支持范围和列表分区,本次就以创建范围分区为例。

3.2 创建内置分区表:

  1. 1. 创建主表:
  2. create table log_par (
  3. id serial,
  4. user_id int4,
  5. create_time timestamp(0) without time zone
  6. ) partition by range(create_time);
  7. # 创建主表时可选择分区方式,range范围分区或list列表分区,并指定地段或表达式作为分区键。此表指定了分区策略为范围分区,分区键为create_time字段。
  8. 2. 创建分区,并设置分区键取值范围:
  9. create table log_par_his partition of log_par for values from ('2016-12-01') to ('2017-01-01');
  10. create table log_par_his_201701 partition of log_par for values from ('2017-01-01') to ('2017-02-01');
  11. create table log_par_his_201702 partition of log_par for values from ('2017-02-01') to ('2017-03-01');
  12. create table log_par_his_201703 partition of log_par for values from ('2017-03-01') to ('2017-04-01');
  13. create table log_par_his_201704 partition of log_par for values from ('2017-04-01') to ('2017-05-01');
  14. create table log_par_his_201705 partition of log_par for values from ('2017-05-01') to ('2017-06-01');
  15. create table log_par_his_201706 partition of log_par for values from ('2017-06-01') to ('2017-07-01');
  16. create table log_par_his_201707 partition of log_par for values from ('2017-07-01') to ('2017-08-01');
  17. create table log_par_his_201708 partition of log_par for values from ('2017-08-01') to ('2017-09-01');
  18. create table log_par_his_201709 partition of log_par for values from ('2017-09-01') to ('2017-10-01');
  19. create table log_par_his_201710 partition of log_par for values from ('2017-10-01') to ('2017-11-01');
  20. create table log_par_his_201711 partition of log_par for values from ('2017-11-01') to ('2017-12-01');
  21. create table log_par_his_201712 partition of log_par for values from ('2017-12-01') to ('2018-01-01');
  22. 3. 为所有分区创建索引:
  23. create index idx_log_par_his_ctime on log_par_his using btree(create_time);
  24. create index idx_log_par_201701_ctime on log_par_his_201701 using btree(create_time);
  25. create index idx_log_par_201702_ctime on log_par_his_201702 using btree(create_time);
  26. create index idx_log_par_201703_ctime on log_par_his_201703 using btree(create_time);
  27. create index idx_log_par_201704_ctime on log_par_his_201704 using btree(create_time);
  28. create index idx_log_par_201705_ctime on log_par_his_201705 using btree(create_time);
  29. create index idx_log_par_201706_ctime on log_par_his_201706 using btree(create_time);
  30. create index idx_log_par_201707_ctime on log_par_his_201707 using btree(create_time);
  31. create index idx_log_par_201708_ctime on log_par_his_201708 using btree(create_time);
  32. create index idx_log_par_201709_ctime on log_par_his_201709 using btree(create_time);
  33. create index idx_log_par_201710_ctime on log_par_his_201710 using btree(create_time);
  34. create index idx_log_par_201711_ctime on log_par_his_201711 using btree(create_time);
  35. create index idx_log_par_201712_ctime on log_par_his_201712 using btree(create_time);
  36. 4. 插入数据:
  37. insert into log_par(user_id,create_time)
  38. select round(100000000*random()),generate_series('2016-12-01'::date,
  39. '2017-12-01'::date,'1 minute');
  40. 5. 进行验证,此时发现父表中没有任何数据:
  41. \dt+ log_par*

3.3 添加分区:

  1. 1. 添加分区:
  2. create table log_par_his_201801 partition of log_par for values from ('2018-01-01') to ('2018-02-01');
  3. 2. 创建索引:
  4. create index idx_log_par_201801_ctime on log_par_his_201801 using btree(create_time);

3.4 删除分区:

  1. 1. 方法一:直接使用drop命令进行删除:
  2. drop table log_par_his_201801;
  3. 2. 方法二:使用解绑方式:
  4. # 进行解绑:
  5. alter table log_par detach partition log_par_his_201801 ;
  6. # 如果想恢复,可以通过链接分区方式进行恢复:
  7. alter table log_par attach partition log_par_his_201801 for values from ('2018-01-01') to ('2018-02-01');