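When Redis is used as a cache, losing data is acceptable; when it is used as a database, it is not. Redis keeps its data in memory, and memory contents are lost on power failure, so Redis needs a way to persist data to disk.

Databases generally persist data in one of two ways: snapshots or logs. Redis offers both:

  • Snapshotting: RDB
  • Logging (append-only file): AOF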

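## Issues to consider in database persistence

### Data loss during persistence

A simplified view of how a database persists a write:
  1. The client sends a write command to the database (the data is in the client's memory).
  2. The database receives the write command (the data is in the database server's memory).
  3. The database calls write(2) to ask the operating system to write the data to disk (the data is in the kernel's buffer).
  4. The operating system transfers the data to the disk controller (the data is in the disk cache).
  5. The disk controller writes the data to the physical medium (the data is on disk).

  • If only a failure of the database process is considered, the data is safe once it has been written from the process's memory to the kernel, because the kernel takes over and transfers it to the disk controller.

  • If power failures and similar events are considered, three things matter: how often the database calls write(2) to push data into the kernel buffer, how often the kernel flushes its buffer to the disk controller, and how often the disk controller writes its cache to the physical medium.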

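==> write(2) copies the data from the process's memory into the kernel buffer, but the database cannot control how long that call takes.
==> By default Linux commits buffered writes to the disk controller after about 30 seconds; in other words, the kernel buffer is flushed roughly every 30 seconds. Calling fsync(2) forces the kernel to flush the buffer immediately. Each fsync call initiates a write whenever data is pending in the kernel buffer, and it blocks the calling process until that write completes. On Linux it also blocks other threads that are writing to the same file.

==> write(2) protects against data loss caused by a crash of the database process.
==> fsync(2) flushes the kernel buffer promptly so that data reaches the disk in time, which protects against data loss from power failure and similar faults.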
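
The difference between the two system calls can be sketched in a few lines of Python. This is only an illustration, not how Redis is implemented: the helper name `durable_append` and the file name `demo.log` are made up for the example, and `os.write` / `os.fsync` are Python's thin wrappers around write(2) and fsync(2).

```python
import os
import tempfile


def durable_append(path: str, payload: bytes) -> int:
    """Append payload to path, then force it out of the kernel buffer.

    Returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        # write(2): once this returns, the bytes sit in the kernel buffer.
        # They survive a crash of this process, but not a power loss.
        written = 0
        while written < len(payload):
            written += os.write(fd, payload[written:])
        # fsync(2): ask the kernel to flush the file's buffered data to the
        # storage device; the call blocks until that flush has completed.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "demo.log")  # hypothetical file name
        durable_append(log_path, b"set foo 1\n")
        durable_append(log_path, b"set bar 2\n")
        with open(log_path, "rb") as fh:
            print(fh.read())
```

Dropping the `os.fsync` call makes each append cheaper, at the cost of leaving the flush timing to the operating system.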

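### Data corruption during persistence

Both SQL and NoSQL databases store their data on disk in some structured format. If a failure occurs in the middle of a write, can the on-disk data still be read back correctly?

Databases generally handle corruption in one of three ways:

  1. Users recover from a replica, or use tools provided by the database that try to rebuild valid data where possible.
  2. The database keeps an operation log (a journal) so that it can return to a consistent state after a failure.
  3. The database never modifies data that has already been written and works in append-only mode, so this kind of corruption cannot occur.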

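## Persistence in Redis

### Snapshotting: RDB

With RDB, Redis saves a point-in-time snapshot of the dataset to disk. The snapshot contains the full dataset as of that moment.

To save a snapshot, Redis calls fork() to create a child process that writes the data to disk asynchronously, so the parent can keep serving clients. After fork(), changes made by the parent or the child are visible only to the process that made them. Linux implements this with copy-on-write: memory pages are shared until one side modifies them, so the full dataset does not have to be copied, which saves memory and improves efficiency.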
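
The fork-based approach can be tried out with a small Python sketch. It assumes a Unix-like system (os.fork is not available on Windows), uses JSON in place of the real RDB format, and the function name `snapshot_in_child` is invented for the example. The point is only that the child sees the data as it was at fork() time, even though the parent keeps changing it.

```python
import json
import os
import tempfile


def snapshot_in_child(data: dict, path: str) -> int:
    """Fork a child that writes `data` to `path` as JSON; return the child's PID."""
    pid = os.fork()
    if pid == 0:
        # Child process: it sees the parent's memory as it was at fork() time.
        status = 0
        try:
            with open(path, "w") as fh:
                json.dump(data, fh)
        except Exception:
            status = 1
        os._exit(status)  # leave immediately, skipping the parent's cleanup handlers
    return pid


if __name__ == "__main__":
    store = {"foo": "1", "bar": "2"}
    with tempfile.TemporaryDirectory() as tmp:
        dump_path = os.path.join(tmp, "dump.json")  # hypothetical file name
        child = snapshot_in_child(store, dump_path)
        store["foo"] = "changed after fork"  # the parent keeps serving writes
        os.waitpid(child, 0)
        with open(dump_path) as fh:
            print(json.load(fh))  # the snapshot still holds foo == "1"
```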

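When does Redis take an RDB snapshot?

  1. On command. SAVE persists the data in the foreground and blocks the server while it runs (it is typically called just before a planned shutdown for maintenance). BGSAVE persists the data asynchronously in the background.
  2. On a schedule set in the configuration file. With save points such as save 900 1, Redis saves the data in the background automatically. Setting save "" disables RDB.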
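
The save-point rule (a snapshot is due once both the elapsed time and the number of changes of any one save point have been reached) can be written as a small predicate. This is a sketch of the rule as the configuration comments describe it, not Redis source code; `should_snapshot` and `DEFAULT_SAVE_POINTS` are names chosen for the example.

```python
from typing import Iterable, Tuple

SavePoint = Tuple[int, int]  # (seconds, changes), as in "save <seconds> <changes>"

# The three save points used as the example in this article.
DEFAULT_SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(
    save_points: Iterable[SavePoint],
    seconds_since_last_save: int,
    changes_since_last_save: int,
) -> bool:
    """Return True if any save point has both of its thresholds met."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


if __name__ == "__main__":
    print(should_snapshot(DEFAULT_SAVE_POINTS, 60, 10000))  # busy server
    print(should_snapshot(DEFAULT_SAVE_POINTS, 120, 50))    # not enough time or changes yet
    print(should_snapshot([], 10_000, 10_000))              # no save points: RDB disabled
```

An empty list of save points never triggers a snapshot, which corresponds to the save "" setting.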

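How does RDB deal with data loss and data corruption?

  • Data loss: RDB persists data only at the configured points in time. If a failure happens between two snapshots, the writes since the last snapshot are lost; with the save points used in this article that can be up to about 900 s of data.
  • Data corruption:
    • Transactions: Redis guarantees that a MULTI/EXEC transaction is either written to the snapshot in full or not at all, so a snapshot never contains a partial transaction.
    • File integrity: when an RDB save starts, Redis forks a child process that first writes the dataset to a temporary file. Only after that file has been written completely does it replace the old RDB file, so the structure of the RDB file on disk is never left broken.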
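
The "write to a temporary file, then replace the old file" pattern looks roughly like this in Python. It is a generic sketch of the technique rather than Redis's own code; the function name `atomic_write` and the temporary file naming are assumptions made for the example. `os.replace` performs an atomic rename when source and target are on the same filesystem, which is why the temporary file is created in the target's directory.

```python
import os
import tempfile


def atomic_write(path: str, payload: bytes) -> None:
    """Replace the file at `path` with `payload` without exposing a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".rdb")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())  # make sure the new content is on disk first
        os.replace(tmp_path, path)  # readers see either the old or the new file
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "dump.rdb")
        atomic_write(target, b"snapshot 1")
        atomic_write(target, b"snapshot 2")
        with open(target, "rb") as fh:
            print(fh.read())
        print(os.listdir(tmp))
```

If the process dies while the temporary file is being written, the previous dump.rdb is untouched and only a stray temporary file is left behind.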

The way RDB handles data loss and data corruption also exposes its weaknesses:

  • Data can be lost, so RDB is not suitable when 100% data durability is required.
  • Each new RDB file overwrites the previous one, so Redis keeps no history of snapshots. To retain snapshots from different points in time, copy the RDB file out manually and name each backup after its snapshot time.

RDB-related settings in redis.conf:

```powershell
################################ SNAPSHOTTING ################################
#
# Save the DB on disk:
#
#   save <seconds> <changes>
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   In the example below the behaviour will be to save:
#   after 900 sec (15 min) if at least 1 key changed
#   after 300 sec (5 min) if at least 10 keys changed
#   after 60 sec if at least 10000 keys changed
#
#   Note: you can disable saving completely by commenting out all "save" lines.
#
#   It is also possible to remove all the previously configured save
#   points by adding a save directive with a single empty string argument
#   like in the following example:
#
#   save ""

# Save if at least 1 write happened within 900 s
save 900 1
# Save if at least 10 writes happened within 300 s
save 300 10
# Save if at least 10000 writes happened within 60 s
save 60 10000

# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
# (Annotation: writes are rejected when a background save fails.)
stop-writes-on-bgsave-error yes

# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
# (Annotation: enables compression.)
rdbcompression yes

# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
rdbchecksum yes

# The filename where to dump the DB
# (Annotation: name of the RDB file.)
dbfilename dump.rdb

# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
# (Annotation: directory in which the RDB file is saved.)
dir ./
```
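
Because each new snapshot overwrites dump.rdb, keeping a history means copying the file away under a time-based name. A minimal sketch of such a backup step follows; the function name `backup_rdb`, the `dump-<timestamp>.rdb` naming scheme, and the backup directory are all assumptions made for the example, and the copy should be taken while no save is in progress.

```python
import os
import shutil
import tempfile
from datetime import datetime


def backup_rdb(rdb_path: str, backup_dir: str, taken_at: datetime) -> str:
    """Copy the RDB file into backup_dir under a timestamped name; return the new path."""
    os.makedirs(backup_dir, exist_ok=True)
    name = "dump-" + taken_at.strftime("%Y%m%dT%H%M%S") + ".rdb"
    target = os.path.join(backup_dir, name)
    shutil.copy2(rdb_path, target)  # copy2 also preserves the modification time
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rdb = os.path.join(tmp, "dump.rdb")
        with open(rdb, "wb") as fh:
            fh.write(b"placeholder snapshot bytes")  # stand-in for a real RDB file
        print(backup_rdb(rdb, os.path.join(tmp, "backups"), datetime.now()))
```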

<a name="dTnav"></a>
### Log: Append Only File (AOF)

RDB persistence can lose some data. The log-based approach, the Append Only File (AOF), records all of the user's operations instead: every write command that modifies the in-memory dataset is logged.

AOF is disabled by default. Enable it at startup with ./redis-server --appendonly yes, or set appendonly yes in the configuration file.

Example: with AOF enabled, the following operations<br />![image.png](https://cdn.nlark.com/yuque/0/2022/png/1636836/1651585087881-45ae5bbe-4309-45de-8b99-f83b12d349f2.png#clientId=u12d99601-6d58-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=276&id=ueb24aa69&margin=%5Bobject%20Object%5D&name=image.png&originHeight=293&originWidth=402&originalType=binary&ratio=1&rotation=0&showTitle=false&size=15997&status=done&style=none&taskId=uf12bcb3e-c2ed-4725-ad66-a42f80f292f&title=&width=378)

produce the AOF content below (the # annotations are explanatory and not part of the file):

```powershell
*2 # this command has 2 arguments
$6 # the next argument is 6 bytes long
SELECT
$1
0
*3
$3
set
$3
foo
$4
name
*3
$3
set
$3
bar
$3
age
*3
$3
set
$3
foo
$1
1
*3
$3
set
$3
bar
$1
2
*2
$4
incr
$3
age
*2
$4
incr
$3
age
```
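
The entries in the AOF use the Redis protocol's array-of-bulk-strings layout: `*<number of arguments>`, then for each argument `$<byte length>` followed by the argument itself, with every line terminated by CRLF. A small encoder makes the layout concrete. The function name `encode_command` is chosen for this example; it is a sketch of the format, not code taken from Redis.

```python
from typing import Sequence


def encode_command(args: Sequence[str]) -> bytes:
    """Encode one command as an array of bulk strings, as it appears in the AOF."""
    parts = [b"*%d\r\n" % len(args)]         # *<number of arguments>
    for arg in args:
        raw = arg.encode("utf-8")
        parts.append(b"$%d\r\n" % len(raw))  # $<byte length of the argument>
        parts.append(raw + b"\r\n")          # the argument itself
    return b"".join(parts)


if __name__ == "__main__":
    for command in (["SELECT", "0"], ["set", "foo", "name"], ["incr", "age"]):
        print(encode_command(command))
```

Lengths are counted in bytes, not characters, so multi-byte UTF-8 values get a larger `$` prefix than their character count suggests.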

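AOF works by appending every operation that modifies data to the appendonly.aof file, so the file keeps growing. To keep its size in check, AOF supports a rewrite operation, which can be run asynchronously in the background with the BGREWRITEAOF command or triggered automatically through the configuration file.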
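Before Redis 4.0, a rewrite dropped commands that cancel each other out, merged repeated commands, and wrote the result to a temporary AOF file.

Since Redis 4.0, a rewrite first writes the dataset in RDB format to the temporary AOF file (if the RDB preamble is enabled) and then appends the incremental commands after it.

When the rewrite finishes, Redis calls fsync to sync the temporary file to disk and then replaces the old AOF file with it.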
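
The effect of a rewrite, a much shorter log that rebuilds the same dataset, can be illustrated with a toy compaction function. This is a deliberately simplified sketch: it handles only set, incr and del on a single database, replays the old log to find the final state, and the name `compact` is invented for the example. It should not be read as the procedure Redis itself follows.

```python
from typing import Dict, List


def compact(commands: List[List[str]]) -> List[List[str]]:
    """Return a minimal list of set commands that rebuilds the same final state."""
    state: Dict[str, str] = {}
    for cmd in commands:
        op = cmd[0].lower()
        if op == "set":
            state[cmd[1]] = cmd[2]
        elif op == "incr":
            # A missing key counts as 0 before the increment.
            state[cmd[1]] = str(int(state.get(cmd[1], "0")) + 1)
        elif op == "del":
            state.pop(cmd[1], None)
        else:
            raise ValueError("unsupported command in this sketch: " + cmd[0])
    return [["set", key, value] for key, value in state.items()]


if __name__ == "__main__":
    log = [
        ["set", "foo", "name"],
        ["set", "bar", "age"],
        ["set", "foo", "1"],
        ["set", "bar", "2"],
        ["incr", "age"],
        ["incr", "age"],
    ]
    for command in compact(log):
        print(command)
```

For the six commands from the AOF example in this article, the compacted log holds three commands, one per surviving key.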

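AOF appends user operations to a file, but the data still has to travel from memory to the kernel and then to disk. How does AOF deal with data loss and data corruption?

  • Data loss: the appendfsync setting offers three modes for flushing the AOF to disk:
    • appendfsync no: Redis never calls fsync(2) itself and leaves it to the operating system to decide when to flush the kernel buffer to disk.
    • appendfsync everysec: Redis calls fsync(2) once per second to flush the kernel buffer to disk.
    • appendfsync always: Redis calls fsync(2) after every command to flush the kernel buffer to disk.

Of the three, appendfsync no is the fastest but the least safe: whatever is still sitting in the kernel buffer can be lost, which with default Linux settings may be up to about 30 seconds of writes. appendfsync always is the safest but also the slowest. appendfsync everysec sits in between in both speed and safety and loses at most about 1 s of data.

  • Data corruption: AOF only ever appends and never modifies data that has already been written. If the AOF file does get damaged, it can be repaired with redis-check-aof.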
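
The three appendfsync policies differ only in when fsync(2) is called after an append. The sketch below models that decision in Python. It is a simplified illustration: the class name `AppendLog` and its `fsync_calls` counter are invented for the example, and the everysec branch syncs inline on the first append after one second has elapsed instead of using a background timer.

```python
import os
import tempfile
import time


class AppendLog:
    """Append-only log with a configurable fsync policy: always, everysec or no."""

    def __init__(self, path: str, policy: str = "everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: " + policy)
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self._policy = policy
        self._clock = clock
        self._last_sync = clock()
        self.fsync_calls = 0  # exposed so the effect of each policy can be observed

    def append(self, record: bytes) -> None:
        written = 0
        while written < len(record):
            written += os.write(self._fd, record[written:])  # data reaches the kernel buffer
        now = self._clock()
        if self._policy == "always":
            self._sync(now)
        elif self._policy == "everysec" and now - self._last_sync >= 1.0:
            self._sync(now)
        # policy "no": never fsync here, the operating system flushes when it wants

    def _sync(self, now: float) -> None:
        os.fsync(self._fd)
        self._last_sync = now
        self.fsync_calls += 1

    def close(self) -> None:
        os.close(self._fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for policy in ("always", "everysec", "no"):
            log = AppendLog(os.path.join(tmp, policy + ".aof"), policy)
            for _ in range(3):
                log.append(b"*2\r\n$4\r\nincr\r\n$3\r\nage\r\n")
            print(policy, log.fsync_calls)
            log.close()
```

The `clock` parameter exists so the everysec behaviour can be exercised with a fake clock instead of real waiting.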
############################## APPEND ONLY MODE ###############################

# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.

# (Annotation: enables AOF.)
appendonly yes

# The name of the append only file (default: "appendonly.aof")

appendfilename "appendonly.aof"

# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".

# appendfsync always
appendfsync everysec
# appendfsync no

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.

no-appendfsync-on-rewrite no

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.

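# (Annotation: growth over the base size, in percent, that triggers a rewrite.)
auto-aof-rewrite-percentage 100
# (Annotation: minimum AOF size before an automatic rewrite is considered.)
auto-aof-rewrite-min-size 64mb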

# An AOF file may be found to be truncated at the end during the Redis
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
aof-load-truncated yes

# When rewriting the AOF file, Redis is able to use an RDB preamble in the
# AOF file for faster rewrites and recoveries. When this option is turned
# on the rewritten AOF file is composed of two different stanzas:
#
#   [RDB file][AOF tail]
#
# When loading Redis recognizes that the AOF file starts with the "REDIS"
# string and loads the prefixed RDB file, and continues loading the AOF
# tail.
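# (Annotation: during an AOF rewrite the dataset is first written in RDB format,
# which takes advantage of RDB's faster data recovery.)
aof-use-rdb-preamble yes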
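
The automatic rewrite rule described in the comments for auto-aof-rewrite-percentage and auto-aof-rewrite-min-size can be read as a simple predicate: the AOF must be larger than the minimum size, and it must have grown by at least the given percentage over its size after the last rewrite. The function below is a sketch of that reading; the name `should_rewrite` and its parameter names are assumptions for the example, and the exact boundary handling in Redis may differ.

```python
def should_rewrite(
    current_size: int,
    base_size: int,
    percentage: int = 100,
    min_size: int = 64 * 1024 * 1024,
) -> bool:
    """Decide whether an automatic AOF rewrite is due.

    current_size: size of the AOF now, in bytes
    base_size:    size of the AOF after the latest rewrite (or at startup)
    """
    if percentage == 0:
        return False  # a percentage of zero disables automatic rewrites
    if current_size <= min_size:
        return False  # the file is still too small to be worth rewriting
    base = base_size if base_size > 0 else 1
    growth = current_size * 100 // base - 100  # growth over the base, in percent
    return growth >= percentage


if __name__ == "__main__":
    mb = 1024 * 1024
    print(should_rewrite(128 * mb, 64 * mb))  # doubled since the last rewrite
    print(should_rewrite(100 * mb, 64 * mb))  # grew by roughly 56% only
    print(should_rewrite(10 * mb, 1 * mb))    # large growth, but below the minimum size
```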