Open In Colab

Introduction to Normalization#

Database normalization is the process of structuring a relational database to reduce data redundancy and improve data integrity. Let’s review what each normal form requires:

First Normal Form (1NF)- Each table cell should contain a single value#

  • Each record needs to be unique (typically through a primary key)

  • No repeating groups or arrays

Second Normal Form (2NF)#

  • Must be in 1NF

  • All non-key attributes must be fully functionally dependent on the primary key

  • No partial dependencies (where an attribute depends on only part of the primary key)

Third Normal Form (3NF)#

  • Must be in 2NF

  • No transitive dependencies (where a non-key attribute depends on another non-key attribute)

  • Every non-key attribute must depend directly on the primary key

Exercise: Normalize the Movies Database to 3NF#

Starting Point: Unnormalized Table#

We have an unnormalized table containing information about movies with redundant data and mixed data types. The table includes data about movies, directors, actors, studios, and more.

Movie_ID

Title

Year

Director

Director_Birthdate

Lead_Actor

Lead_Actor_Gender

Lead_Actor_Age

Lead_Actor_Previous_Movies

Genre

Genre_Description

Runtime_Minutes

Box_Office_USD

Production_Studio

Studio_Founded

Studio_CEO

Sequel_To

Prequel_To

Country_of_Origin

Original_Language

Filming_Locations

Award_Wins

Award_Nominations

IMDB_Rating

Rotten_Tomatoes_Score

Budget_USD

Soundtrack_Composer

Screenplay_Writer

Release_Date

1

The Silence of the Lambs

1991

Jonathan Demme

February 22, 1944

Jodie Foster

Female

29

Taxi Driver, The Accused, Bugsy Malone

Thriller, Crime

Psychological horror thriller

118

272,742,922

Orion Pictures

1978

Gary Barber

Manhunter

Hannibal

United States

English

Pittsburgh, Pennsylvania

67

48

8.6

95%

19,000,000

Howard Shore

Ted Tally

February 14, 1991

2

Pulp Fiction

1994

Quentin Tarantino

March 27, 1963

Uma Thurman

Female

24

Henry & June, Mad Dog and Glory, Even Cowgirls Get the Blues

Crime, Drama

Stylized crime stories with nonlinear narratives

154

213,928,762

Miramax Films

1979

Bob Weinstein

None

None

United States

English

Los Angeles, California

24

68

8.9

92%

8,000,000

Various Artists

Quentin Tarantino

October 14, 1994

3

The Dark Knight

2008

Christopher Nolan

July 30, 1970

Christian Bale

Male

34

American Psycho, Batman Begins, The Prestige

Action, Crime, Drama

Superhero film with crime elements

152

1,005,973,645

Warner Bros.

1923

David Zaslav

Batman Begins

The Dark Knight Rises

United States, United Kingdom

English

Chicago, Illinois; Hong Kong, China

30

153

9.0

94%

185,000,000

Hans Zimmer

Jonathan Nolan, Christopher Nolan

July 18, 2008

4

Spirited Away

2001

Hayao Miyazaki

January 5, 1941

Rumi Hiiragi (voice)

Female

15

None

Animation, Adventure, Fantasy

Japanese animated fantasy film

125

383,397,782

Studio Ghibli

1985

Koji Hoshino

None

None

Japan

Japanese

Studio Ghibli animation studios, Tokyo

35

21

8.6

97%

19,000,000

Joe Hisaishi

Hayao Miyazaki

July 20, 2001

5

Black Panther

2018

Ryan Coogler

May 23, 1986

Letitia Wright

Female

25

Urban Hymn, The Commuter, Ready Player One

Action, Adventure

Superhero film exploring themes of heritage and identity

134

1,347,280,838

Marvel Studios

1993

Kevin Feige

Captain America: Civil War

Black Panther: Wakanda Forever

United States

English

Atlanta, Georgia; Busan, South Korea

110

231

7.3

96%

200,000,000

Ludwig Göransson

Ryan Coogler, Joe Robert Cole

February 16, 2018

6

The Godfather

1972

Francis Ford Coppola

April 7, 1939

Marlon Brando

Male

48

On the Waterfront, A Streetcar Named Desire, Last Tango in Paris

Crime, Drama

Epic mafia crime family saga

175

246,120,986

Paramount Pictures

1912

Brian Robbins

None

The Godfather Part II

United States

English, Italian, Latin

New York, Sicily

31

29

9.2

97%

6,000,000

Nino Rota

Mario Puzo, Francis Ford Coppola

March 24, 1972

7

Alien

1979

Ridley Scott

November 30, 1937

Sigourney Weaver

Female

30

Annie Hall

Horror, Sci-Fi

Science fiction horror film set in space

117

203,630,630

20th Century Fox

1935

Dana Walden

None

Aliens, Prometheus

United States, United Kingdom

English

Shepperton Studios, England

56

19

8.5

98%

11,000,000

Jerry Goldsmith

Dan O’Bannon

May 25, 1979

8

The Matrix

1999

Lana and Lilly Wachowski

June 21, 1965, December 29, 1967

Keanu Reeves

Male

35

Speed, The Devil’s Advocate, Point Break

Action, Sci-Fi

Cyberpunk action film about simulated reality

136

466,364,409

Warner Bros.

1923

David Zaslav

None

The Matrix Reloaded, The Matrix Revolutions

United States, Australia

English

Sydney, Australia; Oakland, California

42

77

8.7

88%

63,000,000

Don Davis

Lana and Lilly Wachowski

March 31, 1999

9

Little Women

2019

Greta Gerwig

August 4, 1983

Saoirse Ronan

Female

25

Lady Bird, Brooklyn, Atonement

Drama, Romance

Period drama about four sisters coming of age

135

218,805,280

Columbia Pictures

1924

Tony Vinciquerra

None

None

United States

English

Massachusetts

76

243

7.8

95%

40,000,000

Alexandre Desplat

Greta Gerwig

December 25, 2019

10

The Revenant

2015

Alejandro González Iñárritu

August 15, 1963

Leonardo DiCaprio

Male

41

Titanic, The Wolf of Wall Street, Inception

Adventure, Drama

Survival drama in the American frontier

156

532,950,503

20th Century Fox

1935

Dana Walden

None

None

United States

English

Canada, Argentina, United States

87

180

8.0

78%

135,000,000

Ryuichi Sakamoto

Mark L. Smith, Alejandro González Iñárritu

December 25, 2015

Task 1: Identify Violations of Normal Forms#

Examine the unnormalized movies table and identify at least three specific violations for each normal form (1NF, 2NF, and 3NF). Explain why each example violates the corresponding normal form.

Task 2: Create Tables for 1NF#

Convert the unnormalized table to First Normal Form by:

  1. Identifying repeating groups and creating separate records

  2. Ensuring each cell contains atomic values

  3. Defining a primary key for the table

Draw the resulting table(s) and explain your reasoning.

Task 3: Transform to 2NF#

Starting with your 1NF solution, move to Second Normal Form by:

  1. Identifying partial dependencies

  2. Creating separate tables to eliminate these dependencies

  3. Establishing relationships between tables using foreign keys

Draw the resulting tables and explain your approach.

Task 4: Achieve 3NF#

Finally, transform your 2NF solution to Third Normal Form by:

  1. Identifying transitive dependencies

  2. Creating additional tables as needed

  3. Updating relationships between tables

Draw the final set of tables that satisfy 3NF requirements.

Task 5: Diagram the Relationships#

Create an Entity-Relationship Diagram (ERD) for your 3NF solution showing:

  1. All tables (entities)

  2. Primary keys

  3. Foreign keys

  4. Relationships between tables (one-to-one, one-to-many, many-to-many)