现象

记录一个生产问题,DBA在执行下面这个GRANT操作的时候,偶发的会出现这个报错

  1. [guqi@intel175 ~]$ psql -p 51300
  2. psql (14devel)
  3. Type "help" for help.
  4. guqi=# grant select on all tables in schema public to guqi ;
  5. ERROR: tuple concurrently updated
  6. guqi=# \q

分析

GRANT操作不会更新表内的数据,应该只是会更新系统表内的行。换言之,当执行GRANT的时候,有些数据表在系统表内的行被别人更新了。设置下errverbose,发现是在更新pg_classrelacl属性

  1. [guqi@intel175 ~]$ psql -p 51300
  2. psql (14devel)
  3. Type "help" for help.
  4. guqi=#\errverbose
  5. guqi=# grant select on all tables in schema public to guqi ;
  6. ^CCancel request sent
  7. ERROR: canceling statement due to user request
  8. CONTEXT: while updating tuple (6,6) in relation "pg_class"
  9. guqi=# select * from pg_class where ctid = '(6,6)';
  10. oid | relname | relnamespace | reltype | reloftype | relowner | relam | relfilenode | reltablespace | relpages | reltuples
  11. | relallvisible | reltoastrelid | relhasindex | relisshared | relpersistence | relkind | relnatts | relchecks | relhasrules |
  12. relhastriggers | relhassubclass | relrowsecurity | relforcerowsecurity | relispopulated | relreplident | relispartition | relr
  13. ewrite | relfrozenxid | relminmxid | relacl | reloptions | relpartbound
  14. -------+---------+--------------+---------+-----------+----------+-------+-------------+---------------+----------+-----------
  15. +---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+-------------+-
  16. ---------------+----------------+----------------+---------------------+----------------+--------------+----------------+-----
  17. -------+--------------+------------+--------+------------+--------------
  18. 16458 | tb1 | 2200 | 16460 | 0 | 10 | 2 | 16462 | 0 | 0 | -1
  19. | 0 | 0 | f | f | p | r | 2 | 0 | f |
  20. f | f | f | f | t | d | f |
  21. 0 | 580 | 1 | | |
  22. (1 row)

同时查看pg_stat_activity,发现有TRUNCATE在运行,与内核开发的小伙伴沟通了下,truncate在内部会重建表,重建会导致表的物理存储发生变化。基于这一点,我测试了下

  1. # 窗口1
  2. guqi=# begin;
  3. BEGIN
  4. guqi=*# truncate tb1 ;
  5. TRUNCATE TABLE
  6. -- 暂不提交
  7. # 窗口2
  8. guqi=# grant select on all tables in schema public to guqi ;
  9. -- 操作会处于锁等状态
  10. # 窗口1
  11. guqi=*# commit;
  12. COMMIT
  13. # 窗口2
  14. guqi=# grant select on all tables in schema public to guqi ;
  15. ERROR: tuple concurrently updated

按照上述顺序执行,100%触发,而且通过跟踪tb1pg_class内的ctid值,发现TRUNCATE前后确实变了,那么基本原因就是这个了。

解决的话,把GRANTTRUNCATE分开执行,不要并发。