:::success In this tutorial we review Chapter 07. :::

Question 1

Show that there can be more than one canonical cover for a given set of functional dependencies, using the following set of dependencies: X → Y Z, Y → X Z, and Z → X Y.

Answer

  1. R = (X, Y, Z), F = {X > YZ, Y > XZ, Z > XY}
  2. X > YZ, Y > XZ, Z > XY
  3. X > Y, X > Z, Y > X, Y > Z, Z > X, Z > Y
  4. 1. (X > Y, Y > Z, X > Z), (Y > X, Z > X, Z > Y) => (X > Z, Z > X)
  5. 2. (Y > Z, Z > X, Y > X), (X > Z, X > Y, Z > Y) => (Y > X, X > Y)

Correct Answer: :::danger Consider the first functional dependency.
Section 01: Z is extraneous
We can verify that Z is extraneous in X -> YZ and delete it.
Subseqently, we can similarly check that X is extraneous in Y -> XZ and delete it,
and that Y is extraneous in Z -> X Y and delete it,
resulting in a canonical cover { X -> Y, Y -> Z, Z -> X }

Section 02: Y is extraneous
However, we can also verify that Y is extraneous in X -> Y Z and delete it.
Subsequently, we can similarly check that Z is extraneous in Y -> X Z and delete it,
and that X is extraneous in Z -> X Y and delete it,
resulting in a canonical cover { X -> Z, Y -> X, Z -> Y} :::

Question 2

Consider the schema R = (A, B, C, D, E, G) and the set F of functional dependencies:

  1. AB CD
  2. B D
  3. DE B
  4. DEG AB
  5. AC DE

R is not in BCNF for many reasons, one of which arises from the functional dependency AB → CD.

  • Explain why AB → CD shows that R is not in BCNF and then use the BCNF decomposition algorithm starting with AB → CD to generate a BCNF decomposition of R.
  • Once that is done, determine whether your result is or is not dependency preserving, and explain your reasoning.

Answer:

Find Candidate Keys
Q: Explain why AB → CD shows that R is not in BCNF and then use the BCNF decomposition algorithm starting with AB → CD to generate a BCNF decomposition of R. :::danger Ans:
A B is not a superkey for R, nor is A B -> C D trivial. We start using A B -> C D and replace R with:
R1 = (A, B, C, D)
R2 = (A, B, E, G)

It turns out that neither of those is in BCNF. In R1 = {A, B, C, D}
R1 is not in BCNF because B -> D holds on R1 and is nontrivial, and B is not a superkey for R1.
So we replace R1 with:

  • R3 = ( B, D )
  • R4 = ( A, B, C )

R3 is in BCNF because all two-attribute schemas are in BCNF. So it is part of our answer.

If R4 is not in BCNF, it must be because of a functional dependency with just one attribute on the left side (anything else is either trivial or has a superkey on the left side).
It is then easy to see that R4 is also in BCNF and therefore part of the answer.

Back now to R2 = {A, B, E, G} , which is not in BCNF because AB > E is non-trivial(非平凡的) and the left side is not a superkey.
The reason this dependency holds on R2 is that AB > CD so AB > C so AB > AC and then,
since AC > DE. We get AB > DE,
then, AB > E. So we decompose R2 to:

  • R5 = (A, B, E)
  • R6 = (A, B, G)

Both of which are in BCNF. The answer is R3, R4, R5, R6 :::

Q: Determine whether your result is or is not dependency preserving 依賴保持, and explain your reasoning. :::danger Ans:
This is not dependency preserving. It suffices to show one dependency that is not preserved.
Consider DE > B. DE determines only itself and B, and there is no schema in the decomposition with all of B, D, E. :::

Question 3

Consider the schema R = (A, B, C, D, E, G,H)and the set F of functional dependencies:

  1. AB CD
  2. D C
  3. DE B
  4. DEH AB
  5. AC DC

Use the 3NF decomposition algorithm to generate a 3NF decomposition of R, and show your work.
This means:
a. A list of all candidate keys
b. A canonical cover for F
c. The steps of the algorithm, with explanation
d. The final decomposition

Answer 03

a. A list of all candidate keys

  1. Starting with AB -> CD, we can conclude: AB > C and AB > D
  2. Since AB > A, AB > C => C > DC
  3. Since AC > DC => A > D
  4. Since D > C => D > D, A > C
  5. Since DE > B and DEH > AB
  6. DEH > AB => DEH > A, DEH > B, DEH > CD
  7. where alpha is any subset of {A, B, C, D, E, F, G, H}
  8. Therefore, the candidate keys are DEH, AB, C.

:::danger R = (A, B, C, D, E, G,H)

Candidate Keys:
AB > CD => AB > C , AB > D
D > C => AB > C, Since we have AB > ABCD
DE > B
DEH > AB => DEH > A, DEH > B >> D EH > ABCDEH
AC > DC => AC > D, AC > C >> AC EH > ABCDEH

Observe that E, G, H do not appear on the right side of any functional dependency, and so they must
all be a part of any candidate key.
(1) AB EGH > ABCDEGH
(2) D EGH > ABCDEGH
(3) AC EGH > ABCDEGH

The candidate keys are:
AB EGH,
AC EGH,
D EGH :::

b. A canonical cover for F

  1. The given set of FDs F is:
  2. AB CD => AB > C, AB > D
  3. D C
  4. DE B
  5. DEH AB => DEH > A, DEH > B
  6. AC DC

:::danger b. Canonical Cover

  • C is extraneous on the right side of AC > DC.
    AC > D, D > C => AC > D
  • B is extraneous on the right side of DEH > AB
    DEH > A, DEH > B, DEH > DE, DE > B => DEH > A
  • C is extraneous on the right side of AB > CD
    AB > C, AB > D, D > C => AB > D

So we now the canonical cover have:

  • AB > D
  • D > C
  • DE > B
  • DEH > A
  • AC > D :::

c. The steps of the algorithm, with explanation :::danger c. So our initial design is
(A,B,D), (C,D), (B,D,E), (A, D, E, H), (A, C, D)

We remove (C,D) because it is contained in (A, C, D) and we add a schema consisting of one of the candidate keys. Because the candidate keys are (AB EGH, AC EGH, D EGH)
We arbitrarily choose (D, E, G, H) :::

d. The final decomposition :::danger d. (A, B, D), (B, D, E), (A, D, E, H), (A, C, D), (D, E, G, H) :::

❌ Question 4

List the three design goals for relational databases, and explain why each is desirable.

Answer:

:::success Three design goals are lossless-join decompositions, dependency-preserving decompositions, and minimization of repetition of information.

They are desirable so we can maintain an accurate database, check the correctness of updates quickly, and use the smallest amount of space possible. :::

❌ Question 5

In designing a relational database, why might we choose a non-BCNF design?

Answer:

:::success BCNF is not always dependency preserving.
Therefore, we may want to choose another normal form (specifically, 3NF) in order to make checking dependencies easier during updates.
This would avoid joins to check dependencies and increase system performance. :::

Question 6

Normalize the following schema, with given constraints, to 4NF.

  1. books(accessionno, isbn, title, author, publisher)
  2. users(userid, name, deptid, deptname)
  3. accessionno isbn
  4. isbn title
  5. isbn publisher
  6. isbn author
  7. userid name
  8. userid deptid
  9. deptid deptname

Answer:

In books , we see that

  1. isbn -> -> title, publisher, author

and yet, isbn is not a super key. Thus, we break books into

  1. book_accnno(accessionno, isbn)
  2. book_details(isbn, title, publisher, author)

After this, we still have

  1. isbn -> -> author

But neither is isbn a primary key of book_details , nor are the attributes of book_details equal to {isbn} U {author} . Therefore, we decompose book_details again into

  1. books_details_new(isbn, title, publisher)
  2. books_authors(isbn, author)

On the other hand, in users ,

  1. deptid -> deptname

and yet, deptid is not a super key. Hence, we break users into

  1. users(userid, name, deptid)
  2. departments(deptid, deptname)

We verify that there are no further functional or multivalued dependencies that cause violation of 4NF, so the final set of relations are:

  1. books_accnno(accessionno, isbn)
  2. books_details_new(isbn, title, publisher)
  3. books_authors(isbn, author)
  4. users(userid, name, deptid)
  5. departments(deptid, deptname)

Reference