
DB Basics – Database Normalization | 1NF, 2NF, 3NF

March 25, 2009

In the field of Relational Database design, normalization is a systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain undesirable characteristics—insertion, update, and deletion anomalies—that could lead to a loss of data integrity.

E. F. Codd stated the objectives of normalization as follows:

1. To free the collection of relations from undesirable insertion, update and deletion dependencies.

2. To reduce the need for restructuring the collection of relations as new types of data are introduced, and thus increase the life span of application programs.

3. To make the relational model more informative to users.

4. To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by.

E. F. Codd, the inventor of the Relational Model, introduced the concept of normalization (1NF in 1970, 2NF and 3NF in 1971, and then, with R. F. Boyce, BCNF in 1974).

C. Date, H. Darwen, R. Fagin, and N. Lorentzos defined the higher normal forms, up to 6NF, by 2002.

As of now there are 8 normal forms in total, as follows:
1. First normal form (1NF)
2. Second normal form (2NF)
3. Third normal form (3NF)
4. Boyce-Codd normal form (BCNF)
5. Fourth normal form (4NF)
6. Fifth normal form (5NF)
7. Domain/key normal form (DKNF)
8. Sixth normal form (6NF)

In practice, however, the first 3 Normal Forms are usually sufficient to keep data consistent and non-redundant.

1. The 1st Normal Form:

– There are no duplicate rows and each row should have a unique identifier (or Primary key). A table should be free from repeating groups.

– The values in each column of a table are atomic. This means a field value should not be decomposable into smaller meaningful pieces, and should not hold more than one kind of data.
For example: a Person's Name column could be further divided into First, Middle, and Last Name columns.
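As a rough illustration of the two rules above, here is a minimal sketch in Python. It uses the built-in sqlite3 module only as a convenient stand-in for SQL Server, and the table and column names (Person, PersonPhone, FirstName and so on) are hypothetical. A composite name is split into atomic columns and a repeating group of phone numbers is moved to a child table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalized input: a composite name and a repeating group of phone numbers.
raw_rows = [
    (1, "John A Smith", "555-0100, 555-0101"),
    (2, "Mary B Jones", "555-0200"),
]

conn.executescript("""
    CREATE TABLE Person (
        PersonID   INTEGER PRIMARY KEY,   -- unique identifier for each row
        FirstName  TEXT NOT NULL,
        MiddleName TEXT,
        LastName   TEXT NOT NULL
    );
    CREATE TABLE PersonPhone (
        PersonID INTEGER NOT NULL REFERENCES Person(PersonID),
        Phone    TEXT NOT NULL,
        PRIMARY KEY (PersonID, Phone)     -- rules out duplicate rows
    );
""")

for person_id, full_name, phones in raw_rows:
    # Split the name into atomic parts (assumes exactly three parts here).
    first, middle, last = full_name.split(" ")
    conn.execute("INSERT INTO Person VALUES (?, ?, ?, ?)",
                 (person_id, first, middle, last))
    # One row per phone number instead of a comma-separated list.
    for phone in phones.split(","):
        conn.execute("INSERT INTO PersonPhone VALUES (?, ?)",
                     (person_id, phone.strip()))

people = conn.execute(
    "SELECT FirstName, LastName FROM Person ORDER BY PersonID").fetchall()
phone_count = conn.execute(
    "SELECT COUNT(*) FROM PersonPhone WHERE PersonID = 1").fetchone()[0]
print(people, phone_count)
```

Real names do not always split into three parts, so the splitting logic is only there to keep the sketch short.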

2. The 2nd Normal Form:

– A table should be in 1st Normal Form.

– For any candidate key (K) and any attribute (A) that is not a constituent of a candidate key, A depends on the whole of K rather than on just part of it.

That is, all non-prime attributes are functionally dependent on the whole of every candidate key.

In simple terms, every non-key column must depend on the entire primary key. In the case of a composite primary key, this means that a non-key column cannot depend on only part of the composite key.
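To make the partial-dependency rule concrete, here is a small sketch in Python, again using the built-in sqlite3 module as a stand-in for SQL Server. The tables (OrderItemFlat, Product, OrderItem) and their data are made up for illustration. ProductName depends only on ProductID, which is just one part of the composite key, so it is moved to its own table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Violates 2NF: ProductName depends only on ProductID,
    -- which is just part of the composite key (OrderID, ProductID).
    CREATE TABLE OrderItemFlat (
        OrderID     INTEGER,
        ProductID   INTEGER,
        ProductName TEXT,
        Quantity    INTEGER,
        PRIMARY KEY (OrderID, ProductID)
    );
    INSERT INTO OrderItemFlat VALUES
        (1, 10, 'Keyboard', 2),
        (1, 20, 'Mouse',    1),
        (2, 10, 'Keyboard', 5);

    -- 2NF decomposition: the partially dependent column gets its own table.
    CREATE TABLE Product (
        ProductID   INTEGER PRIMARY KEY,
        ProductName TEXT NOT NULL
    );
    CREATE TABLE OrderItem (
        OrderID   INTEGER,
        ProductID INTEGER REFERENCES Product(ProductID),
        Quantity  INTEGER,
        PRIMARY KEY (OrderID, ProductID)
    );
    INSERT INTO Product   SELECT DISTINCT ProductID, ProductName FROM OrderItemFlat;
    INSERT INTO OrderItem SELECT OrderID, ProductID, Quantity FROM OrderItemFlat;
""")

product_rows = conn.execute("SELECT COUNT(*) FROM Product").fetchone()[0]

# Renaming a product now touches exactly one row instead of one row per order.
renamed = conn.execute(
    "UPDATE Product SET ProductName = 'Mechanical Keyboard' WHERE ProductID = 10"
).rowcount
print(product_rows, renamed)
```

In the flat table the same rename would have to update every order line for that product, which is the update anomaly that 2NF removes.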

3. The 3rd Normal Form:

– A table should be in 2nd Normal Form.

– Every non-prime attribute of a table (relation) R is non-transitively dependent (i.e. directly dependent) on every candidate key of R.

– All columns should depend directly on the primary key. Tables violate the Third Normal Form when one column depends on another column, which in turn depends on the primary key (a transitive dependency).
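A transitive dependency can be sketched the same way. The snippet below is only an illustration: it uses Python's built-in sqlite3 module in place of SQL Server, and the tables (EmployeeFlat, Employee, Department) are hypothetical. DeptName depends on DeptID, which in turn depends on the key EmpID, so the department columns are split out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Violates 3NF: EmpID -> DeptID -> DeptName (a transitive dependency).
    CREATE TABLE EmployeeFlat (
        EmpID    INTEGER PRIMARY KEY,
        EmpName  TEXT,
        DeptID   INTEGER,
        DeptName TEXT
    );
    INSERT INTO EmployeeFlat VALUES
        (1, 'Asha',  100, 'Sales'),
        (2, 'Brian', 100, 'Sales'),
        (3, 'Chen',  200, 'Support');

    -- 3NF decomposition: DeptName lives with the column it depends on directly.
    CREATE TABLE Department (
        DeptID   INTEGER PRIMARY KEY,
        DeptName TEXT NOT NULL
    );
    CREATE TABLE Employee (
        EmpID   INTEGER PRIMARY KEY,
        EmpName TEXT,
        DeptID  INTEGER REFERENCES Department(DeptID)
    );
    INSERT INTO Department SELECT DISTINCT DeptID, DeptName FROM EmployeeFlat;
    INSERT INTO Employee   SELECT EmpID, EmpName, DeptID FROM EmployeeFlat;
""")

# The original rows can still be rebuilt with a join, so no data is lost.
rebuilt = conn.execute("""
    SELECT e.EmpID, e.EmpName, e.DeptID, d.DeptName
    FROM Employee AS e
    JOIN Department AS d ON d.DeptID = e.DeptID
    ORDER BY e.EmpID
""").fetchall()
original = conn.execute("SELECT * FROM EmployeeFlat ORDER BY EmpID").fetchall()
dept_rows = conn.execute("SELECT COUNT(*) FROM Department").fetchone()[0]
print(rebuilt == original, dept_rows)
```

Each department name is now stored once, and joining the two tables gives back exactly the rows of the flat table.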



DB Basics – SQL Server JOINS and Types

March 12, 2009

The JOIN clause in SQL Server combines rows from two tables into a new result set, based on the relationship between them. The relationship is established by matching common columns from both tables in the ON clause, and the SELECT list returns only the required columns from both tables.

The JOIN clause is specified within the FROM clause. Extra predicates combined with AND in the ON clause, as well as the WHERE and HAVING clauses, can also be used to filter the rows selected by the JOIN clause.
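Here is a minimal sketch of how these pieces fit together. It runs the query through Python's built-in sqlite3 module rather than SQL Server (an assumption made so the example is self-contained), and the Customer and Orders tables are hypothetical. The query itself uses only standard syntax that T-SQL also accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, Amount INTEGER);
    INSERT INTO Customer VALUES (1, 'Asha'), (2, 'Brian'), (3, 'Chen');
    INSERT INTO Orders VALUES (101, 1, 50), (102, 1, 70), (103, 2, 20), (104, 2, 5);
""")

rows = conn.execute("""
    SELECT c.Name, SUM(o.Amount) AS Total      -- only the required columns
    FROM Customer AS c
    JOIN Orders   AS o
      ON o.CustomerID = c.CustomerID           -- relationship on the common column
    WHERE o.Amount >= 10                       -- filters the joined rows
    GROUP BY c.Name
    HAVING SUM(o.Amount) > 50                  -- filters the groups
    ORDER BY c.Name
""").fetchall()
print(rows)
```

Chen has no orders and so never appears in the joined rows. Brian is dropped by the HAVING clause because, after the WHERE filter, his remaining total is only 20.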

–> A JOIN table operator operates on two input tables. The three fundamental types of joins are CROSS JOIN, INNER JOIN, and OUTER JOINS. These three types of joins differ in how they apply their logical query processing phases; each type applies a different set of phases:

– A CROSS JOIN applies only one phase — Cartesian Product.

– An INNER JOIN applies two phases — Cartesian Product and Filter.

– An OUTER JOIN applies three phases — Cartesian Product, Filter, and Add Outer Rows.
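The three logical phases above can be imitated with plain Python lists. This is only a sketch of the logical model with made-up data, not of how the SQL Server engine physically executes a join.

```python
from itertools import product

customers = [(1, "Asha"), (2, "Brian"), (3, "Chen")]   # (CustomerID, Name)
orders = [(101, 1), (102, 1), (103, 2)]                # (OrderID, CustomerID)

# Phase 1: Cartesian product. On its own this is the result of a CROSS JOIN.
cross_join = list(product(customers, orders))

# Phase 2: filter with the ON predicate. Phases 1 and 2 give an INNER JOIN.
inner_join = [(c, o) for c, o in cross_join if c[0] == o[1]]

# Phase 3: add outer rows. Rows of the preserved (left) table that found
# no match are added back, with None standing in for NULL.
matched = {c for c, _ in inner_join}
left_outer_join = inner_join + [(c, None) for c in customers if c not in matched]

print(len(cross_join), len(inner_join), len(left_outer_join))
```

With 3 customers and 3 orders, the Cartesian product has 9 rows, the filter keeps 3, and the outer phase adds back the one customer without an order.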

–> [Figure: pictorial representation of the various types of JOINs you can create in T-SQL]


–> Joins can be categorized as:

1. CROSS JOINs: A Cross Join combines each row from the Left table with every row from the Right table, so the result contains (rows in Left) × (rows in Right) rows. Cross Joins are also called Cartesian products.

2. INNER JOIN: (the typical Join operation, which uses some comparison operator like = or <>). These include equi-joins and natural joins.
Inner Joins use a comparison operator to match rows from two tables based on the values in common columns from each table.

3. OUTER JOIN: Outer joins can be a Left, a Right, or Full Outer Join.
Outer joins are specified with one of the following sets of keywords when they are specified in the FROM clause:

3.a. LEFT JOIN or LEFT OUTER JOIN: The result set of a Left Outer Join includes all the rows from the left table specified in the LEFT OUTER clause, not just the ones in which the joined columns match. When a row in the left table has no matching rows in the right table, the associated result set row contains null values for all select list columns coming from the right table.

3.b. RIGHT JOIN or RIGHT OUTER JOIN: A Right Outer Join is the reverse of a Left Outer Join. All rows from the right table are returned. Null values are returned for the left table any time a right table row has no matching row in the left table.

3.c. FULL JOIN or FULL OUTER JOIN: A Full Outer Join returns all rows in both the Left and Right tables. Any time a row has no match in the other table, the select list columns from the other table contain null values. When there is a match between the tables, the entire result set row contains data values from the base tables.
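The join types above can be compared side by side on two tiny tables. The sketch below is an illustration, not SQL Server itself: it runs on Python's built-in sqlite3 module, and the tables LeftT and RightT are hypothetical. Older SQLite versions lack RIGHT and FULL OUTER JOIN, so those two are emulated here, with the equivalent T-SQL shown in comments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LeftT  (id INTEGER, lval TEXT);
    CREATE TABLE RightT (id INTEGER, rval TEXT);
    INSERT INTO LeftT  VALUES (1, 'L1'), (2, 'L2'), (3, 'L3');
    INSERT INTO RightT VALUES (2, 'R2'), (3, 'R3'), (4, 'R4');
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

cross = run("SELECT l.id, r.id FROM LeftT AS l CROSS JOIN RightT AS r")

inner = run("""
    SELECT l.id, l.lval, r.rval
    FROM LeftT AS l INNER JOIN RightT AS r ON l.id = r.id
    ORDER BY l.id
""")

left = run("""
    SELECT l.id, l.lval, r.rval
    FROM LeftT AS l LEFT OUTER JOIN RightT AS r ON l.id = r.id
    ORDER BY l.id
""")

# T-SQL: FROM LeftT AS l RIGHT OUTER JOIN RightT AS r ON l.id = r.id
# Emulated by swapping the two tables in a LEFT OUTER JOIN.
right = run("""
    SELECT r.id, l.lval, r.rval
    FROM RightT AS r LEFT OUTER JOIN LeftT AS l ON l.id = r.id
    ORDER BY r.id
""")

# T-SQL: FROM LeftT AS l FULL OUTER JOIN RightT AS r ON l.id = r.id
# Emulated as the left join plus the right-table rows that had no match.
full = run("""
    SELECT l.id, l.lval, r.rval
    FROM LeftT AS l LEFT OUTER JOIN RightT AS r ON l.id = r.id
    UNION ALL
    SELECT r.id, l.lval, r.rval
    FROM RightT AS r LEFT OUTER JOIN LeftT AS l ON l.id = r.id
    WHERE l.id IS NULL
    ORDER BY 1
""")

for name, rows in [("cross", cross), ("inner", inner), ("left", left),
                   ("right", right), ("full", full)]:
    print(name, len(rows), rows)
```

Only ids 2 and 3 exist in both tables, so the inner join returns two rows. The left and right joins each return three rows, and the full join returns four, with None (NULL) filling the columns of the side that has no match.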
