DB Basics – Database Normalization | 1NF, 2NF, 3NF
In the field of Relational Database design, normalization is a systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain undesirable characteristics—insertion, update, and deletion anomalies—that could lead to a loss of data integrity.
According to E. F. Codd the objectives of normalization were stated as follows:
1. To free the collection of relations from undesirable insertion, update and deletion dependencies.
2. To reduce the need for restructuring the collection of relations as new types of data are introduced, and thus increase the life span of application programs.
3. To make the relational model more informative to users.
4. To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by.
E. F. Code the inventor of Relational Model, introduced the concept of normalization (1NF at 1970, 2-3NF at 1971, then with R. F. Boyce defined the BCFN in 1974).
C. Date, H. Darwin, R. Fagin, N. Lorentzos defined other higher forms upto 6NF by 2002.
As of now there are total 8 normal forms, as follows:
1. First normal form (1NF)
2. Second normal form (2NF)
3. Third normal form (3NF)
4. Boyce-Codd normal form (BCNF)
5. Fourth normal form (4NF)
6. Fifth normal form (5NF)
7. Domain/key normal form (DKNF)
8. Sixth normal form (6NF)
But to keep our data consistent & non-redundant the first 3 Normal Forms are sufficient.
1. The 1st Normal Form:
– There are no duplicate rows and each row should have a unique identifier (or Primary key). A table should be free from repeating groups.
– The values in each column of a table are atomic. Meaning a field value cannot be decomposed into smaller pieces or should not be divided into parts with more than one kind of data in it.
Like: A Person’s Name column could be further divided into First, Middle, Last Name columns.
2. The 2nd Normal Form:
– A table should be in 1st Normal Form.
– Any Candidate key (K) and any Attribute (A) that is not a constituent of a candidate key, A depends upon whole of K rather than just part of it.
Means all its non-prime attributes are functionally dependent on the whole of a candidate key.
In Simple terms, any non-key columns must be dependent on the entire primary key. In the case of a composite primary key, this means that a non-key column cannot depend on only part of the composite key.
3. The 3rd Normal Form:
– A table should be in 2nd Normal Form.
– Every non-prime attribute of R is non-transitively dependent (i.e. directly dependent) on every candidate/primary key of R.
– All columns should depend directly on the primary key. Tables violate the Third Normal Form when one column depends on another column, which in turn depends on the primary key (a transitive dependency).
>> Check & Subscribe my [YouTube videos] on SQL Server.
Subkeys and normalizationNormalization means following a procedure or set of rules to insure that a database is well designed. Most normalization rules are meant to eliminate redundant data (that is, unnecessary duplicate data) in the database. Subkeys always result in redundant data, so we need to eliminate them.• If there are no subkeys in any of the tables in your database, you have a well-designed model according to what is usually called third normal form, or 3NF. Actually, 3NF permits subkeys in some very exceptional circumstances that we won’t discuss here; the strict no-subkey form is formally known as Boyce-Codd normal form, or BCNF.• A super key always functionally determines all of the other attributes in a relation (as well as itself). This is a “good” FD. A “bad” FD happens when we have an attribute or set of attributes that are a super key for some of the other attributes in the relation, but not a super key for the entire relation. We call this set of attributes a subkey of the relation.• Some textbooks use the terms partial FDs and transitive FDs. Both of these are subkeys—the first where the subkey is part of a primary key, the second where is isn’t.More on Subkeys & Normalization: http://www.tomjewett.com/dbdesign/dbdesign.php?page=subkeys.php
Nice topic on NF: http://www.phlonx.com/resources/nf3/