数据库系统概论:恢复系统 Recovery System
Recovery System
Failure Classification
Transaction failure:
- Logical errors
- System errors
System crash
Disk failure
Recovery Algorithms
- Actions taken during normal transaction processing to ensure enough information exists to recover from failures
- Actions taken after a failure to recover the database contents to a state that ensures atomicity, consistency and durability
Storage Structure
- Volatile storage:
- Does not survive system crashes
- Main memory, cache memory
- Nonvolatile storage:
- Survives system crashes
- Disk, non-volatile RAM
- Stable storage:
- Survives all failures
- Approximated by maintaining multiple copies on distinct nonvolatile media
Undo And Redo
undo(Ti) :从 Ti 的最后一条日志往回( backward )撤回,将所有数据项恢复到还没有执行 Ti 的样子
undo(Ti) —— restores the value of all data items updated by Ti to their old values, going backwards from the last log record for Ti
redo(Ti) :从 Ti 的第一条日志往前( forward )执行,将所有数据项还原成 Ti 执行后的样子。
Repeating History
假设一个事务被 Undo 了且这个事务的中止记录已经被写入日志,在这之后错误发生了。
Suppose that transaction Ti was undone earlier and the
record was written to the log, and then a failure occurs:
当从错误中恢复的时候,会对 Ti 执行 redo 操作
On recovery from failure transaction Ti is redone
这样的恢复操作会将事务 Ti 的所有操作都再做一遍,包括他的恢复操作。这就被称为重复历史
Such a redo redoes all the original actions of transaction Ti including the steps that restored old values —— known as repeating history
看似浪费,却简化了回收的过程
Seems wasteful, but simplifies recovery greatly
Checkpoints
Redoing/undoing all transactions recorded in the log can be very slow
通过定时设置检查点简化恢复过程
Streamline recovery procedure by periodically performing checkpointing
checkpointing : 设置检查点
设置检查点步骤如下:
-
将当前驻留在主存中的所有日志记录输出到稳态存储 (Stable storage)
-
将所有修改过的缓冲区块输出到磁盘
-
将日志记录
写入稳态存储中,其中 L 是设置检查点时所有活动的事务列表 Write a log record < checkpoint L> onto stable storage where L is a
list of all transactions active at the time of checkpoint. -
在设置检查点时,所有更新都将停止
在恢复期间,我们只需要考虑在检查点之前启动的最近的事务Ti,以及在检查点之后启动的事务Ti。
During recovery we need to consider only the most recent transaction Ti that started before the checkpoint, and transactions that started after Ti.
-
从日志末尾往回扫描,找到最近的检查点
Scan backwards from end of log to find the most recent
record -
只有在 L 中或在检查点后的事务才需要执行 Redo/Undo
Only transactions that are in L or started after the checkpoint need to be redone or undone
-
那些在检查点之前的事务,无论成功与否,都已经输出到了稳态存储器中。(所以无需 Redo/Undo)
Transactions that committed or aborted before the checkpoint already have all their updates output to stable storage.
undo操作可能需要日志的一些早期部分 继续向后扫描,直到为 L 中的每个事务 Ti 找到一条
Some earlier part of the log may be needed for undo operations
- Continue scanning backwards till a record
is found for every transaction Ti in L. - Parts of log prior to earliest
record above are not needed for recovery, and can be erased whenever desired.
Example of checkpoint
在执行 的时候发生了系统故障,从检查点 checkpoint 开始回复,T2、T3 需要再执行一遍(Redo), T4 需要撤销(undo).

Immediate DB Modification Recovery
立即的数据库更新

Recovery Algorithm
恢复策略
Log
Logging ( during normal operation )
at transaction start - <Ti, Xj, V1, V2> for each update, and
at transaction end
Transaction rollback (during normal operation)
Let Ti be the transaction to be rolled back
Scan log backwards from the end, and for each log record of Ti of the form <Ti, Xj, V1, V2> :
- Perform the undo by writing V1 to Xj,
- Write a log record <Ti , Xj, V1> , called compensation(补偿) log records
- Once the record
is found stop the scan and write the log record
Recovery from failure
phase 阶段
Two phases:
-
Redo phase: replay updates of all transactions, whether they committed, aborted, or are incomplete.
-
Find last
record, and set undo-list to L. -
Scan forward from above
record -
将所有数据项都设置为修改后的值
Whenever a record <Ti, Xj, V1, V2> or <Ti, Xj, V2> is found, redo it by writing V2 to Xj
-
若遇到了一个事务的开始标志,则将这个事务加入 undo-list 中
Whenever a log record
is found, add Ti to undo-list -
若遇到了一个事务的结束标志(commit 或 abort),就将这个事务从 undo-list 中移除
Whenever a log record
or is found, remove Ti from undo-list
-
-
-
Undo phase: undo all incomplete transactions.
Scan log backwards from end
- Whenever a log record <Ti, Xj, V1, V2> is found where Ti is in undo-list perform same actions as for transaction rollback:
- 将该数据项设置为原本的值
- 写入日志 <Ti, Xj, V1>
- Whenever a log record
is found where Ti is in undo-list - Write a log record
- Remove Ti from undo-list
- Ti 的 Undo 已经执行完成
- Write a log record
- Stop when undo-list is empty
- Whenever a log record <Ti, Xj, V1, V2> is found where Ti is in undo-list perform same actions as for transaction rollback: