
Autor: 27.10.2023
Database Normalization and the Three Normal Forms
This article provides a concise introduction to the topic of normalization, focusing on the so-called three normal forms, knowledge of which is essential for understanding the subject. Each of the three normal forms will be explained using simple examples. We assume that the audience has basic skills in relational databases.
SQL Developer
Unlock the power of data manipulation and retrieval with SQL, the language that's essential for modern data-driven careers. Embark on a journey to become a proficient SQL specialist through a hands-on learning experience. Dive into real-world skills with practical tasks and exercises that guide your learning. Learn more
First Normal Form (1NF)
The First Normal Form (1NF) is the initial level of data normalization, which requires that each cell must store a single, atomic value, not a set of values or data structures.
Let's look at the following table:

In this case, the product_name column contains a set of products, and the product_price column contains a set of prices. This does not meet the requirements of 1NF because the cells do not contain single values.
After 1NF normalization, the data is split into separate rows, where each row contains a single product and its price. The table may now look like this:

Now, each row contains a single product and its price, fulfilling the 1NF conditions.
Second Normal Form (2NF)
The Second Normal Form (2NF) builds upon 1NF and adds the requirement that each column in a table must depend on the primary key.
Let's examine the following "orders" table:

The table above satisfies 1NF but does not meet the 2NF criteria. It contains too much information. Customer and product data are dependent on the order identifier. The data is overly connected. We should separate customer and product data from the order. Here's how we can do it:
Customer data table

Product data table

Order table

Now, all data unrelated to the orders has been moved to other tables. The order table contains only foreign keys (links) to the customers and products tables.
Third Normal Form (3NF)
The Third Normal Form (3NF) goes beyond 2NF and adds the requirement that a non-key column should not depend on another non-key column.
Let's analyze the "products" table:

For 3NF, each non-key column must depend solely on the primary key, not on other non-key columns. In the products table, the price and pizza size are non-key columns, and the price depends on the size, which violates the 3NF. To meet this rule, you can create a separate "product_sizes" table where the price is associated only with the pizza size.
First, we'll establish the price's dependence on the size but in a separate "product_sizes" table:

Next, we'll reconstruct the products table:

Now, the products table contains a foreign key to the "product_sizes" table, allowing for price management based on pizza size while maintaining the product structure in the products table.
SQL Developer
Unlock the power of data manipulation and retrieval with SQL, the language that's essential for modern data-driven careers. Embark on a journey to become a proficient SQL specialist through a hands-on learning experience. Dive into real-world skills with practical tasks and exercises that guide your learning. Learn more
Summary
This article should provide you with an understanding of the fundamental principles of the three normal forms. They play a crucial role in designing an efficient and consistent database.