数据库系统概论:Normalization 范型

Decomposition 分解

Avoid the repetition-of-information problem.

Lossless decomposition 无损分解

let relation R1R_1 and R2R_2 form a decomposition of relation RR. That is

R=R1R2R = R_1 \cup R_2

If

ΠR1(r1)ΠR2(r2)=r\Pi_{R_1}(r_1) \bowtie \Pi_{R_2}(r_2) = r

(\bowtie is natural join)

The decomposition is a lossless decomposition.

Read the chapter Functional Dependencies

Represent by functional dependencies is:

R1R2R1R1R2R2R_1 \cap R_2 \rightarrow R_1 \\ R_1 \cap R_2 \rightarrow R_2

The above functional dependecies are a suffcient condition for lossless join decomposition; the dependencies are a necessary condition only if all constraints are functional dependencies.

Notation BBCB \rightarrow BC is a shorthand notation for B{B,C}B\rightarrow \{B, C\}

Functional Dependencies

函数依赖

Legal instance: An instance of a relation that satisfies all such real-world constraints is called a legal instance of the relation.

Legal instance of database: one where all the relation instances are lagal instances.

Definition

If

αRβR\alpha \subset R \\ \beta \subset R \\

functional dependencies

αβ\alpha \rightarrow \beta

holds on RR if and only if for any legal relations r(R)r(R) , whenever any two tuples t1t_1 and t2t_2 :

t1[α]=t2[α]t1[β]=t2[β]t_1[\alpha] = t_2[\alpha] \rightarrow t_1[\beta] = t_2[\beta]

Closure

闭包

If ABA\rightarrow B andd BCB \rightarrow C, then we can infer that ACA \rightarrow C

F+F^+ (Closure of F) : The set of all functional dependencies logically implied by F

Keys and Functional Dependencies

Review:

KK is a superkey(超码) of RR if:

  • KRK \subset R
  • KK are sufficient(充足的) to identify a unique tuple of each possible relation r(R)r(R)

KK is a candidate key(候选码) if

  • KK is superkey
  • KK is minimal

KK is primary key(主码) if

  • KK is a candidate key
  • KK is selected to be primary key by the designer.

K is a superkey for relation schema R if and only if KRK \rightarrow R

K is a candiate key for R if and only if:

  1. KRK\rightarrow R
  2. for no αK,αR\alpha \subset K,\alpha \rightarrow RK 的所有子集都不能单独决定 R

If we want to express constraints that cannot be expressed using superkeys, we can use functional dependencies. For example, we expect $ \alpha \rightarrow \beta$, would not expect αR\alpha \rightarrow R

Use

  • To test relations to see if they are legal under a given set of functional dependencies.

    r satisfies F

  • To specify constraints on the set of legal relations

    F holds on R

Dependency Prevervation ??

A decomposition that makes it computationally hard to enforce functional dependency is said to be NOT dependency preserving.

Trivial Functional Dependencies

A functional dependency is trivial if it is satisfied by all instances of a relation.

αββα\alpha \rightarrow \beta \\ \beta\subset \alpha

Example:

ID, name &\rightarrow& ID \\ name &\rightarrow& name

Normal Forms

参考资料:

[1] Microsoft https://docs.microsoft.com/en-us/office/troubleshoot/access/database-normalization-description

[2] 昨夜听风 https://blog.csdn.net/u014458048/article/details/56678698

First Normal Form

  • Eliminate repeating groups in individual tables.
  • Create a separate table for each set of related data.
  • Identify each set of related data with a primary key.

简单来说,就是解决一个表中拥有重复作用属性的问题,应当创建一个中间表并使用一个外键来维护他们的一对多或多对多关系。也就是规定了每个属性的原子性。

Second Normal Form

  • Create separate tables for sets of values that apply to multiple records.
  • Relate these tables with a foreign key.
  • Records should not depend on anything other than a table’s primary key

为应用于多条记录的值创建一个单独的表,并使用外键来关联这些表。

一条记录只能依赖于他的主键或其子集,不能依赖于其他任何属性。

Third Normal Form

  • Eliminate fields that do not depend on the key.

所有属性都必须直接依赖于主键,即为不能出现非主键列 A 依赖于非主键列 B,非主键列 B 依赖于主键的情况。

举例:

Employee(emp_id,emp_name,emp_age,dept_id,dept_name,dept_info),当员工表中 emp_id 能够唯一确定员工员工信息,但是 dept_name 可由 dept_id 唯一确定

F+F^+ 中存在:

αβ\alpha \rightarrow \beta

至少满足以下一个条件:

  • αβ\alpha \rightarrow \beta is trivial
  • α\alpha is a superkey for R
  • Each attribute A in βα\beta - \alpha in contained in a candidate key for R (右边减去左边的所有属性都包含在候选码中)

Boyce Codd Normal Form

一个关系中的所有函数依赖的闭包中所有依赖 αβ\alpha \rightarrow \beta 满足以下至少一个条件

  • αβ\alpha \rightarrow \beta is trivial
  • α\alpha is a superkey for R

Canonical Cover

We can reduce the effort spent in checking for violations by testing a simplified set of functional dependencies that has the same closure as the given set.

This simplified set is termed the canonical cover

Extraneous Attributes: An attribute of a functional dependency in F is extraneous if **we can remove it without changing F+F^+ **

  1. Removing an attribute from the left side of a functional dependency could make it a stronger constraint

    ABCAB \rightarrow C and remove BB, get ACA \rightarrow C, is a stronger result which imply ABCAB \rightarrow C

    要看具体的函数依赖,能否让替换 ABCAB \rightarrow C 安全,不影响其他函数依赖。

  2. Removing an attribute from the right side of a functional dependency could make it a weaker constraint.

    ABCDAB \rightarrow CD and remove CC, we get the possibly weaker result ABDAB \rightarrow D. It may be weaker because using just ABDAB \rightarrow D, we can no longer inferABCAB \rightarrow C.

    要看具体的函数依赖,能否让替换 ABCDAB \rightarrow CD 安全,即就算替换了也仍然能推出 ABDAB \rightarrow D

image-20211229122935434