DB Basics – How to control Data Redundancy in a database system (RDBMS) by Normalization
Redundancy in Database systems occurs with various insert, update, and delete anomalies.
To avoid these anomalies in first step you need to make sure your database tables or relations are in good normal forms or normalized upto a certain level.
Normalization is a systematic way of ensuring that a database structure is suitable for general-purpose querying and free the anomalies discussed above. that could lead to a loss of data integrity.
Normal forms in a database or the concept of Normalization makes a Relation or Table free from insert/update/delete anomalies and saves space by removing duplicate data.
–> According to E. F. Codd the objectives of normalization were stated as follows:
1. To free the collection of relations from undesirable insertion, update and deletion dependencies.
2. To reduce the need for restructuring the collection of relations as new types of data are introduced, and thus increase the life span of application programs.
3. To make the relational model more informative to users.
4. To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by.
As of now there are total 8 normal forms, but to keep our data consistent & non-redundant the first 3 Normal Forms are sufficient.
–> Anomalies like: Let’s say you have a single table that stores Employee and Department details, thus:
1. Insert Anomaly: If you are inserting a detail of an Employee then his department detail will also be entered for every employee record, thus departments details will be repeated with multiple records, thus storing duplicate data for Departments.
2. Update Anomaly: While updating a department detail you have to update the same department for various employees, which may lead to inconsistent state if any record is left while updating or on any error.
3. Delete Anomaly: If a department is closed, then deleting department record will also delete the Employee records, thus missing records.
The process of normalization makes this EmployeeDepartment table to decompose or split into 2 or more tables and linked them by Foreign Keys, thus eliminating duplicate records, data redundancy and making data/records consistent across all relations/tables.
– 1st NF talks about atomic values and non-repeating groups.
– 2nd NF enforces that a non-Key attribute should belong to entire Key attribute.
– 3rd NF makes sure that there should be no transitive dependency between a non-Key and a Key attribute.
For details on these 3 NFs check my blog post on [Database Normalization].