1 1 12 Normalization

You are here: Visual LANSA Logical Modeler > 1. Introduction to Data Modeling > 1.1 What is Data Modeling? > 1.1.12 Normalization

1.1.12 Normalization

In 1972, Ted Codd from IBM developed the relational model and identified normal forms which were called first, second, and third. When normalizing a database, you are ensuring that it has been broken into its normal forms. Normalization is a common technique for analyzing relational data structures. It is important to understand the basic concept of normalization when data modeling. Normalization applies to physical database models.

The benefits of having a properly normalized data model and database design are that it:

Reduces data redundancy.

Produces a database structure that is easily understood by users and system developers.

Produces a data model that is flexible in the face of changing business requirements.

Produces a database structure that encourages structured, and therefore simpler, program development.

The process of normalization involves three main steps. These are:

First Normal Form	Remove repeating groups. Remove from an entity those elements which maintain the same value between occurrences (or records) while the rest of the elements change.
Second Normal Form	Remove elements which are only partially dependent on the key of the entity.
Third Normal Form	Remove elements which are dependent on (i.e. are identified by) a key other than that of the entity.

The result of the normalization process should be that each entity contains only those elements which are properly identified by the keys of that entity.

For example, consider the information which might appear on an order form. It could start with an entity which contains all the data items such as in this example:

Un-normalized Form

ORDER

Order Number, Customer Number, Customer Name, Customer Address, Product Number, Product Name, Quantity Ordered

First Normal Form

First remove Repeating Groups. The customer details will be the same for each product ordered (i.e. for each line of the order). They are separated from the ORDER information which will be different on each line.

ORDER

Order Number, Customer Number, Customer Name, Customer Address

The fields in bold are the unique or candidate key fields.

ORDERED PRODUCT

Order Number, Product Number, Product Name, Quantity Ordered

Second Normal Form

Next remove attributes only partially dependent on primary key (or part of compound key). The product name depends only on Product Number (not on Order Number) and so it is removed.

ORDER

Order Number, Customer Number, Customer Name, Customer Address

The fields in bold are the unique or candidate key fields.

ORDERED PRODUCT

Order Number, Product Number, Quantity Ordered

PRODUCT

Product Number, Product Name

Third Normal Form

Finally, remove attributes dependent on a key other than the primary (or compound key). The customer details depend only on the customer number and so are removed.

ORDER	Order Number, Customer Number	The fields in bold are the unique or candidate key fields.
ORDERED PRODUCT	Order Number, Product Number, Quantity Ordered
PRODUCT	Product Number, Product Name
CUSTOMER	Customer Number, Customer Name, Customer Address