In the early 1970s, Edgar F. (Ted) Codd of IBM developed the relational model and defined the first, second, and third normal forms. Normalizing a database means restructuring it so that its tables conform to these normal forms. Normalization is a common technique for analyzing relational data structures, and understanding its basic concepts is important when data modeling. Normalization applies to physical database models.
The benefits of having a properly normalized data model and database design are that it:
The process of normalization involves three main steps. These are:
1. Remove repeating groups (first normal form). Remove from an entity those elements which keep the same value between occurrences (or records) while the rest of the elements change.
2. Remove elements which are only partially dependent on the key of the entity (second normal form).
3. Remove elements which are dependent on (i.e. are identified by) a key other than that of the entity (third normal form).
The result of the normalization process should be that each entity contains only those elements which are properly identified by the keys of that entity.
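One way to make "properly identified by the keys" concrete is to test sample data for dependencies between elements. The following is a minimal Python sketch. The helper `determines`, the field names, and the sample rows are all hypothetical and are not taken from the original text. Sample data can only disprove a dependency, never prove it, so a True result is a hint rather than a guarantee.

```python
def determines(rows, key_fields, field):
    """Return True if, within rows, each combination of key_fields
    is always paired with a single value of field."""
    seen = {}
    for row in rows:
        key = tuple(row[k] for k in key_fields)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, row[field]) != row[field]:
            return False
    return True


# Hypothetical order lines keyed by (order_number, product_number).
rows = [
    {"order_number": 1001, "product_number": "P1", "product_name": "Widget", "quantity": 3},
    {"order_number": 1001, "product_number": "P2", "product_name": "Gadget", "quantity": 1},
    {"order_number": 1002, "product_number": "P1", "product_name": "Widget", "quantity": 5},
]

# product_name is identified by only part of the key: a partial dependency.
name_partial = determines(rows, ["product_number"], "product_name")
# quantity needs the whole key, so it belongs in this entity.
qty_partial = determines(rows, ["product_number"], "quantity")
qty_full = determines(rows, ["order_number", "product_number"], "quantity")
```

Here `name_partial` being true while `qty_partial` is false suggests that the product name should move to an entity keyed by product number alone.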
For example, consider the information which might appear on an order form. It could start with an entity which contains all the data items such as in this example:
Un-normalized Form
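As a rough illustration of such an un-normalized entity, the Python sketch below models one order with its lines. Every field name and value here is an assumption made for illustration, not data from the original order form.

```python
# Hypothetical un-normalized order: one record holding every data item.
unnormalized_order = {
    "order_number": 1001,
    "order_date": "2024-03-01",
    "customer_number": "C042",
    "customer_name": "Acme Ltd",
    "customer_address": "1 High Street",
    # Repeating group: one entry per line of the order form.
    "lines": [
        {"product_number": "P1", "product_name": "Widget", "quantity": 3},
        {"product_number": "P2", "product_name": "Gadget", "quantity": 1},
    ],
}

# Flattening the record into rows copies the order and customer
# details onto every line, which is the redundancy to be removed.
header = {k: v for k, v in unnormalized_order.items() if k != "lines"}
flat_rows = [{**header, **line} for line in unnormalized_order["lines"]]
```

Each row in `flat_rows` carries the same customer name and address, so a change of address would have to be applied once per order line.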
First, remove repeating groups. The customer details are the same for each product ordered (i.e. for each line of the order), so they are separated from the ORDER information that differs on each line.
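This first step can be sketched in Python as a split of flat rows into two collections. The field names, the sample values, and the names `orders` and `order_lines` are hypothetical choices made for illustration.

```python
# Hypothetical flat rows: order and customer details repeat on each line.
flat_rows = [
    {"order_number": 1001, "customer_number": "C042", "customer_name": "Acme Ltd",
     "product_number": "P1", "product_name": "Widget", "quantity": 3},
    {"order_number": 1001, "customer_number": "C042", "customer_name": "Acme Ltd",
     "product_number": "P2", "product_name": "Gadget", "quantity": 1},
]

ORDER_FIELDS = ("order_number", "customer_number", "customer_name")
LINE_FIELDS = ("order_number", "product_number", "product_name", "quantity")

# One entry per order number: the details that stay the same on every line.
orders = {}
for row in flat_rows:
    orders[row["order_number"]] = {f: row[f] for f in ORDER_FIELDS}

# One entry per line: the details that change, plus the order number
# so that each line can still be tied back to its order.
order_lines = [{f: row[f] for f in LINE_FIELDS} for row in flat_rows]
```

After the split, the customer details are stored once per order, and each order line is identified by the compound key of order number and product number.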
Next, remove attributes that depend on only part of a compound primary key. The product name depends only on the product number (not on the order number), so it is removed into a separate entity.
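A minimal Python sketch of this second step follows, assuming order lines keyed by order number and product number. The sample rows and the names `products` and `order_lines_2nf` are hypothetical.

```python
# Hypothetical order lines with a compound key (order_number, product_number).
order_lines = [
    {"order_number": 1001, "product_number": "P1", "product_name": "Widget", "quantity": 3},
    {"order_number": 1001, "product_number": "P2", "product_name": "Gadget", "quantity": 1},
    {"order_number": 1002, "product_number": "P1", "product_name": "Widget", "quantity": 5},
]

# product_name depends on product_number alone, so it moves to its
# own collection keyed by product_number.
products = {
    row["product_number"]: {
        "product_number": row["product_number"],
        "product_name": row["product_name"],
    }
    for row in order_lines
}

# The order lines keep only what depends on the whole compound key.
order_lines_2nf = [
    {k: v for k, v in row.items() if k != "product_name"} for row in order_lines
]
```

The name "Widget" is now stored once rather than once per order line that mentions product P1.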
Finally, remove attributes that depend on a key other than the primary (or compound) key. The customer details depend only on the customer number, so they are removed into a separate entity.
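One possible end result of the three steps is sketched below using Python's standard `sqlite3` module. The table names, column names, and sample rows are assumptions made for illustration. The original form may well carry other data items.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_number  TEXT PRIMARY KEY,
    customer_name    TEXT NOT NULL,
    customer_address TEXT NOT NULL
);
CREATE TABLE orders (
    order_number    INTEGER PRIMARY KEY,
    order_date      TEXT NOT NULL,
    customer_number TEXT NOT NULL REFERENCES customers(customer_number)
);
CREATE TABLE products (
    product_number TEXT PRIMARY KEY,
    product_name   TEXT NOT NULL
);
CREATE TABLE order_lines (
    order_number   INTEGER NOT NULL REFERENCES orders(order_number),
    product_number TEXT NOT NULL REFERENCES products(product_number),
    quantity       INTEGER NOT NULL,
    PRIMARY KEY (order_number, product_number)
);
INSERT INTO customers VALUES ('C042', 'Acme Ltd', '1 High Street');
INSERT INTO orders VALUES (1001, '2024-03-01', 'C042');
INSERT INTO products VALUES ('P1', 'Widget'), ('P2', 'Gadget');
INSERT INTO order_lines VALUES (1001, 'P1', 3), (1001, 'P2', 1);
""")

# Joining the four tables rebuilds the lines of the original order form.
order_form = conn.execute("""
SELECT o.order_number, c.customer_name, p.product_name, l.quantity
FROM order_lines AS l
JOIN orders    AS o ON o.order_number    = l.order_number
JOIN customers AS c ON c.customer_number = o.customer_number
JOIN products  AS p ON p.product_number  = l.product_number
ORDER BY l.product_number
""").fetchall()
```

Each fact is stored in exactly one table, and the join shows that no information from the order form was lost in the split.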