Prior to beginning work on this interactive assignment, read chapter 6 of your textbook as well as the Absolute Beginner’s Guide to Database Normalization (Links to an external site.), Database Normalization (Links to an external site.) and Normalization: How far is far enough (Links to an external site.) online resources, and review any relevant information in this week’s lecture.
Normalization is an important step towards translating models and diagrams into relational databases. In this exercise, you will need to normalize the Grade_Report table below and explain the steps needed to do so.
For this assessment, consider the following entity and its attributes and normalize the entity Grade_Report to 3NF.
Entity: Grade_Report (StudentID, StudentName, Phone, DOB, Address, Email_Address, CourseNumber, CourseName, Grade, DepartmentID, DepartmentName)
Explain the steps needed to normalize this entity to 1NF, 2NF and 3NF. List the resultant Normalized entities in the same format as the original Grade_Report.
Once the entity is normalized to 3NF, describe any issues this entity might have that can be solved by normalization. Explain the functional dependencies that exist between the attributes. Provide the steps required to break up the Grade_Report entity into multiple entities that each have a single theme.
Your response must be a minimum of 250 words.
Assignment Due: June/07
Prior to beginning work on this assignment review Chapter 6 in your textbook, read the Absolute Beginner’s Guide to Database Normalization (Links to an external site.) and Database Normalization (Links to an external site.) online resources, and review any relevant information in this week’s lecture. These sources will be used to support your work in the assignment.
You developed and revised your ERD in Week Two, bringing you one step closer to creating the physical database in the Structured Query Language (SQL) server in your virtual lab environment. However, before doing so, you need to check the conceptual model you created and ensure that it meets the First, Second and Third normal forms.
For this assignment, explain the purpose of the normalization process including a description for each of the five normal forms (1NF, 2NF, 3NF, 4NF, and 5NF) and providing examples for each. Cite the sources for all examples. You will then translate the ERD model you created and revised in Week Two into 3NF.
Within your paper explain how each table in your ERD meets 1NF, 2NF, and 3NF. If the tables do not meet the forms, you need to resolve any issues and normalize the database in order to meet 1NF, 2NF, 3NF. If issues were encountered and had to be resolved, describe the rules that were violated and how you resolved them.
In the previous weeks, we have learned that a database (DB) is a collection of inter-related tables with each table composed of several attributes. However, a DB might exhibit several inefficiencies and data anomalies when initially built. This week will explore the normalization process, which helps to eliminate these data anomalies and minimize any redundancies.
No matter how long you spend in the industry designing DBs you will probably never encounter a customer who knows anything about the concept of normalization. Software packages like Microsoft Access allow people with limited knowledge of DB normalization and/or programming to develop simple DBs and create applications that interact with the DBs.
However, it is important to study normalization as is represents an important step for developing an efficient DB. A poorly designed relational database will require much more design effort when you have any future changes. Thus, it is critical to take time and design your tables well. When the underlying table structure is poor, modifying the database can become a very challenging task, taking much more time than it would if the tables are well designed and properly related to one another. An additional factor to consider is that the performance of a poorly designed relational DB can be significantly slower than a well designed one. For example, queries will run much more efficiently when the DB is well designed.
The study of normal forms and the processes of normalization are not trivial and will require some effort to understand. Normalization can be thought of as a process of repeated decomposition. At each level or stage, we are decomposing a table into two or more tables to minimize redundancy and/or remove anomalies.
As a guideline for your database, achieving the third normal form or Boyce-Codd normal form, is a good initial goal. Sometimes the logical organization of the data will force you to accept the third normal form (commonly referred to as 3NF) instead of fourth or fifth normal form (4NF or 5NF). It is also important to remember that normalization is more of an art than a science. It takes some experience through trial and error on your first few DBs to get an intuitive idea of what works well and what does not.
You might ask why not try to get to the fifth normal form or domain/key normal form? While some level of additional efficiencies might be achieved by normalizing your tables to the fifth normal form, it does not justify the loss of performance that will result from breaking up your tables even further. The domain/key normal form on the other hand is of purely theoretical importance, and has not been used as a practical design objective (Harrington, 2012).
Harrington also argues that the gains achieved in the fifth normal form do not justify the effort:
Fifth normal form (5NF), however, is a complex set of criteria that is extremely difficult to work with. It is, for example, very difficult to verify that a relation is in 5NF. Most practitioners do not bother with 5NF, knowing that if their relations are in 3NF (or 4NF if the situation warrants), then their designs are generally problem free. (2012).
There are some hazards with the normalization process that should be considered. First is the possibility of data loss, and second, is the possible loss of dependencies. You should be checking to ensure that you neither lose data nor lose functional dependencies when you normalize your tables.
Referential integrity is a property of data which, when satisfied, requires every value of one attribute (column) of a relation (table) to exist as a value of another attribute in a different (or the same) relation (Coronel & Morris, 2015). Let us assume we have two tables. One table with a primary key in a row that refers to one or more rows of data in the second table through a foreign key contained in the rows of the second table. Let us also suppose that for some reason a user deletes the row containing the Primary Key. This would orphan the corresponding records in the second table, meaning the data in the second table would be essentially inaccessible though logical means.
Clearly, having orphaned records is not desirable, since they cannot be accessed. A table with orphaned records is often assumed by the user to be data loss as the records they expect to see are no longer accessible by their applications. Referential integrity is a collection of rules in the database management system (DBMS) that ensure the proper relationship between the two tables.
Implementing appropriate referential integrity would avoid the scenario described above because deleting the row with critical referential data would be prevented by the integrity rules set in the DB. In most DBs you can implement strategies to deal with referential integrity problems. Most common are strategies or mechanisms to cascade updates for related fields and cascade the deletion of related fields. The result of these strategies is that there is a greatly reduced risk of future orphaned rows or suddenly inaccessible data.
Coronel, C., & Morris, S. (2015). Database systems: Design, implementation, & management (11th ed.). Stamford, CT: Cengage Learning.
Harrington, L. J. (2012) Relational Database Design Clearly Explained )2nd ed.). The Morgan Kaufmann Series in Data Management Systems.