Chapter 6: Data Modification with DML (Rolling Stone’s Greatest Albums)#
Databases Through Pop Culture | Brendan Shea, PhD#
This chapter shifts from reading data to writing it, using a Rolling Stone Greatest Albums dataset. You will work through INSERT, constraint violations, UPDATE, DELETE vs. DROP TABLE, soft deletes, and triggers — and then step outside SQL to look at shell scripts and ORMs as ways application code talks to databases.
Learning Outcomes#
By the end of this chapter, you will be able to:
Create tables that support data entry and enforce integrity with keys and constraints.
Insert single rows, multiple rows, and data selected from other tables.
Recognize and troubleshoot common constraint violations, including primary key, NOT NULL, and CHECK errors.
Update existing rows and explain when auto-incrementing keys simplify inserts.
Delete data safely and distinguish DELETE from DROP TABLE.
Choose appropriate foreign key delete behavior, including manual deletes, ON DELETE CASCADE, and ON DELETE SET NULL.
Implement soft-delete patterns when records should be hidden rather than physically removed.
Create triggers for automation, logging, and auditing.
Compare common ways applications interact with databases, including scripts, CLIs, and ORMs.
Keywords: SQL, INSERT INTO, UPDATE, DELETE, DROP TABLE, constraints, foreign keys, ON DELETE CASCADE, ON DELETE SET NULL, soft delete, triggers, logging, ORM
Brendan’s Lecture#
##Click here to launch my lecture
from IPython.display import YouTubeVideo
YouTubeVideo('nrJwS2Ch9qE', width=800, height=500)
Creating Tables for Rolling Stone’s Greatest Albums of All Time#
For this chapter, we’ll be creating and populating a database based on a small subset of the 2023 version of Rolling Stone Magazine’s “500 greatest albums of all time”.
Before we can start inserting data into our database, we need to create the necessary tables. In this case, we’ll create two tables: Artist and Album. Let’s define the structure of these tables and include some constraints to ensure data integrity.
%%sql
DROP TABLE IF EXISTS Artist;
DROP TABLE IF EXISTS Album;
CREATE TABLE Artist (
ArtistID INTEGER PRIMARY KEY AUTOINCREMENT,
Name VARCHAR(100) NOT NULL,
Country VARCHAR(50),
Founded INT
CHECK (Founded >= 1900 AND Founded <= 2023)
);
CREATE TABLE Album (
AlbumID INTEGER PRIMARY KEY AUTOINCREMENT,
Title VARCHAR(100) NOT NULL,
ArtistID INT,
ReleaseYear INT,
Genre VARCHAR(50),
Ranking,
FOREIGN KEY (ArtistID) REFERENCES Artist(ArtistID),
CHECK (ReleaseYear >= 1900 AND ReleaseYear <= 2023),
CHECK (Ranking >= 1 AND Ranking <= 500)
);
--enable foreign key contraints
PRAGMA foreign_keys = ON;
Let’s break down the key aspects of these table definitions:
The
Artisttable has anArtistIDcolumn as the primary key, which uniquely identifies each artist. TheNamecolumn is marked asNOT NULL, ensuring that every artist has a name. We also haveCountryandFoundedcolumns to store additional information about the artists.The
Albumtable has anAlbumIDcolumn as the primary key. TheTitlecolumn is marked asNOT NULLto ensure that every album has a title. We have a foreign keyArtistIDthat references theArtistIDcolumn in theArtisttable, establishing a relationship between albums and artists. TheReleaseYearandGenrecolumns provide additional details about each album.We have specified that the primary keys (
ArtistIDandAlbumID) are bothNOT NULLandAUTOINCREMENT. This means they cannot be left blank and that, if they are left blank, SQL will assign the next available integer to them.We include check constraints in both tables to enforce certain conditions. In the
Artisttable, we ensure that theFoundedyear is between 1900 and 2023. Similarly, in theAlbumtable, we check that theReleaseYearis within the same range.Finally, we enable foreign key constraints, which means that SQLite will ensure that attempts to update, delete, or insert data that causes problems for this will fail.
By creating these tables with appropriate constraints, we set up a solid foundation for our database. The constraints help maintain data integrity by preventing invalid or inconsistent data from being inserted into the tables.
Now that we have our tables ready, we can start inserting data into them using the INSERT INTO statement, which we’ll explore in the next section.
Using INSERT INTO (Single Row)#
Now that we have our Artist and Album tables created, let’s explore how to insert a single row of data into each table using the INSERT INTO statement.
The basic syntax for inserting a single row is as follows:
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...);
Let’s insert a single artist into the Artist table:
%%sql
INSERT OR IGNORE INTO Artist (ArtistID, Name, Country, Founded)
VALUES (90001, 'Marvin Gaye', 'United States', 1939);
In this example, we specify the table name Artist and list the columns we want to insert data into (ArtistID, Name, Country, Founded). We then provide the corresponding values for each column using the VALUES clause. The values are listed in the same order as the columns specified.
Now, let’s insert a single album into the Album table:
%%sql
INSERT OR IGNORE INTO Album (AlbumID, Title, ArtistID, ReleaseYear, Genre, Ranking)
VALUES (90001, 'What''s Going On', 90001, 1971, 'Soul', 1);
%sql SELECT * FROM Album;
| AlbumID | Title | ArtistID | ReleaseYear | Genre | Ranking |
|---|---|---|---|---|---|
| 90001 | What's Going On | 90001 | 1971 | Soul | 1 |
Similarly, we specify the Album table and list the columns we want to insert data into. We provide the corresponding values in the VALUES clause, ensuring that the ArtistID matches the ArtistID of the artist we inserted earlier.
It’s important to note that when inserting data, we need to provide values for all columns that are marked as NOT NULL or do not have a default value defined. If we omit a column that allows NULL values or has a default value, the database will automatically assign the appropriate value.
By using the INSERT INTO statement, we can easily add single rows of data to our tables. This is particularly useful when we have specific values for each column and want to insert them one at a time.
In the next section, we’ll explore how to insert multiple rows of data in a single statement, which is more efficient when dealing with larger datasets.
Using INSERT INTO (Multiple Rows)#
Inserting data one row at a time can be inefficient when you have a large number of records to insert. Fortunately, SQL allows you to insert multiple rows of data in a single INSERT INTO statement. This is achieved by specifying multiple sets of values in the VALUES clause.
Let’s insert the top 10 albums from Rolling Stone’s Greatest Albums of All Time list into our database. First, we’ll insert the artists:
%%sql
INSERT INTO Artist (ArtistID, Name, Country, Founded)
VALUES
(2, 'The Beach Boys', 'United States', 1961),
(3, 'Joni Mitchell', 'Canada', 1964),
(4, 'Stevie Wonder', 'United States', 1961),
(5, 'Nirvana', 'United States', 1987),
(6, 'Fleetwood Mac', 'United Kingdom', 1967),
(7, 'Prince', 'United States', 1975),
(8, 'Bob Dylan', 'United States', 1961),
(9, 'Lauryn Hill', 'United States', 1988),
(10, 'The Beatles', 'United Kingdom', 1960),
(11, 'Radiohead', 'United Kingdom', 1985),
(12, 'Kendrick Lamar', 'United States', 2003),
(13, 'Public Enemy', 'United States', 1985),
(14, 'The Rolling Stones', 'United Kingdom', 1962),
(15, 'Aretha Franklin', 'United States', 1956),
(16, 'Michael Jackson', 'United States', 1964),
(17, 'Kanye West', 'United States', 1996),
(19, 'The Clash', 'United Kingdom', 1976);
Now, let’s insert the corresponding albums:
%%sql
INSERT INTO Album (AlbumID, Title, ArtistID, ReleaseYear, Genre, Ranking)
VALUES
(2, 'Pet Sounds', 2, 1966, 'Rock',2),
(3, 'Blue', 3, 1971, 'Folk',3),
(4, 'Songs in the Key of Life', 4, 1976, 'Soul',4),
(5, 'Nevermind', 5, 1991, 'Grunge',5),
(6, 'Rumours', 6, 1977, 'Soft Rock',6),
(7, 'Purple Rain', 7, 1984, 'Pop',7),
(8, 'Blood on the Tracks', 8, 1975, 'Folk Rock',8),
(9, 'The Miseducation of Lauryn Hill', 9, 1998, 'Hip Hop',9),
(10, 'Abbey Road', 10, 1969, 'Rock',10),
(11, 'Revolver', 10, 1966, 'Rock',11),
(12, 'Thriller', 16, 1982, 'Pop',12),
(13, 'I Never Loved a Man the Way I Love You', 15, 1967, 'Soul',13),
(14, 'Exile on Main Street', 14, 1972, 'Rock',14),
(15, 'It Takes a Nation of Millions to Hold Us Back', 13, 1988, 'Hip Hop',15),
(16, 'London Calling', 19, 1979, 'Punk',16),
(17, 'My Beautiful Dark Twisted Fantasy', 17, 2010, 'Hip Hop',17),
(18, 'Highway 61 Revisited', 8, 1965, 'Folk Rock',18),
(19, 'To Pimp a Butterfly', 12, 2015, 'Hip Hop',19),
(20, 'Kid A', 11, 2000, 'Electronic',20);
In these examples, we use a single INSERT INTO statement for each table, but we provide multiple sets of values separated by commas. Each set of values represents a new row to be inserted into the table.
By inserting multiple rows at once, we can significantly reduce the number of statements required and improve the efficiency of our data insertion process.
It’s important to ensure that the number of values matches the number of columns specified and that the values adhere to any constraints defined on the table. If there are any violations, such as trying to insert a duplicate primary key value or violating a check constraint, the entire INSERT INTO statement will fail, and no rows will be inserted.
In the next section, we’ll explore what happens when constraints are violated and how to handle such situations.
What Happens When Constraints Are Violated?#
Constraints are essential for maintaining data integrity in a database. They enforce rules and restrictions on the data that can be inserted into tables. When attempting to insert data that violates these constraints, SQL will raise an error and prevent the insertion from occurring. Let’s explore some common constraint violations and their consequences.
Problems with Primary Keys#
A primary key uniquely identifies each row in a table. It ensures that no two rows have the same primary key value. If we attempt to insert a row with a primary key value that already exists in the table, SQL will raise a primary key constraint violation error.
For example, let’s try to insert an artist with an existing ArtistID:
import sqlite3
with sqlite3.connect('greatest.db') as conn:
try:
conn.execute("""
INSERT INTO Artist (ArtistID, Name, Country, Founded)
VALUES (1, 'Pink Floyd', 'United Kingdom', 1962);
""")
conn.commit()
except sqlite3.IntegrityError as exc:
print(f'Expected constraint error: {exc}')
Expected constraint error: UNIQUE constraint failed: Artist.ArtistID
Since we already have an artist with an artist_id of 1 this fails with an error message. This is because primary keys must be UNIQUE.
Violating CHECK Constraints#
CHECK constraints allow us to specify conditions that the data must satisfy before it can be inserted into a table. If we attempt to insert data that violates a CHECK constraint, SQL will raise a constraint violation error.
For example, let’s try to insert an album with an invalid ReleaseYear:
import sqlite3
with sqlite3.connect('greatest.db') as conn:
try:
conn.execute("""
INSERT INTO Album (AlbumID, Title, ArtistID, ReleaseYear, Genre)
VALUES (11, 'Future Album', 1, 2028, 'Rock');
""")
conn.commit()
except sqlite3.IntegrityError as exc:
print(f'Expected constraint error: {exc}')
Expected constraint error: CHECK constraint failed: ReleaseYear >= 1900 AND ReleaseYear <= 2023
In our Album table, we have a CHECK constraint that ensures the ReleaseYear is between 1900 and 2023. Since 2028 violates this constraint, the insertion will fail with an error message indicating a CHECK constraint violation.
Violating NOT NULL Constraints#
NOT NULL constraints ensure that a column cannot contain a NULL value. If we attempt to insert a row with a NULL value for a column that has a NOT NULL constraint, SQL will raise a constraint violation error.
For example, let’s try to insert an artist without specifying a Name:
import sqlite3
with sqlite3.connect('greatest.db') as conn:
try:
conn.execute("""
INSERT INTO Artist (ArtistID, Country, Founded)
VALUES (21, 'United States', 1980);
""")
conn.commit()
except sqlite3.IntegrityError as exc:
print(f'Expected constraint error: {exc}')
Expected constraint error: NOT NULL constraint failed: Artist.Name
Since the Name column in the Artist table has a NOT NULL constraint, this insertion will fail with an error message indicating a NOT NULL constraint violation.
When constraint violations occur, the entire INSERT INTO statement is rolled back, and no data is inserted into the table. This ensures that the database remains in a consistent state and maintains data integrity.
To handle constraint violations, you can:
Modify the data being inserted to satisfy the constraints.
Update the table structure or constraints to accommodate the data.
Catch and handle the specific error messages in your application code.
By understanding and properly handling constraint violations, you can ensure that only valid and consistent data is inserted into your database tables.
Auto-incrementing Primary Keys#
As we just saw, SQL will generally produce an error if you fail to provide values for a NOT NULL column. One exception to this is for integer primary keys with an AUTOINCREMENT option (such as ArtistId and AlbumId). If you leave this out of an insert statement, SQL will assign the *next available integer as a primary key.
Here’s an example:
%%sql
--Inserting without specifying primary key
INSERT INTO Artist (Name, Country, Founded)
VALUES ('Pink Floyd', 'United Kingdom', 1962);
%%sql
SELECT *
FROM Artist
WHERE Name = "Pink Floyd";
| ArtistID | Name | Country | Founded |
|---|---|---|---|
| 90002 | Pink Floyd | United Kingdom | 1962 |
Deleting Data#
When working with databases, there may be situations where you need to remove data from tables. In SQL, the DELETE statement is used to delete rows from a table based on specified conditions. It’s important to understand how to use the DELETE statement effectively and handle the deletion of related data to maintain data integrity.
Let’s start by inserting some sample data into our Artist and Album tables. We’ll insert a fake artist named “The Terrible Trio” and a few terrible albums associated with this artist.
First, let’s insert the fake artist into the Artist table:
%%sql
INSERT INTO Artist (Name, Country, Founded)
VALUES ('The Terrible Trio', 'Nowhere', 2020);
Now, let’s insert some terrible albums. We’ll use a subquery to find the ArtistID for the “Terrible Trio.”
%%sql
-- Inserting albums
INSERT INTO Album (Title, ArtistID, ReleaseYear, Genre)
VALUES
('Awful Anthems', (SELECT ArtistID FROM Artist WHERE Name = 'The Terrible Trio'), 2005, 'Noise'),
('Cringeworthy Chronicles', (SELECT ArtistID FROM Artist WHERE Name = 'The Terrible Trio'), 2010, 'Cacophony'),
('Disastrous Ditties', (SELECT ArtistID FROM Artist WHERE Name = 'The Terrible Trio'), 2015, 'Racket');
Let’s assure ourselves that these albums have been added.
%%sql
SELECT
al.Title,
al.AlbumID,
ar.Name AS Artist,
ar.ArtistID
FROM
Album al
JOIN Artist ar ON al.ArtistID = ar.ArtistID
WHERE ar.Name = 'The Terrible Trio';
| Title | AlbumID | Artist | ArtistID |
|---|---|---|---|
| Awful Anthems | 90002 | The Terrible Trio | 90003 |
| Cringeworthy Chronicles | 90003 | The Terrible Trio | 90003 |
| Disastrous Ditties | 90004 | The Terrible Trio | 90003 |
Basic DELETE Statement#
The basic syntax of the DELETE statement is as follows:
DELETE FROM table_name
WHERE condition;
table_name: The name of the table from which you want to delete rows.condition: Specifies the condition that determines which rows will be deleted. If omitted, all rows in the table will be deleted.
For example, to delete the album “Awful Anthems” from the Album table, you can use the following statement:
%%sql
DELETE FROM Album
WHERE Title = 'Awful Anthems';
%%sql
--Check that album is deleted
SELECT * FROM Album WHERE Title = 'Awful Anthems';
| AlbumID | Title | ArtistID | ReleaseYear | Genre | Ranking |
|---|
Manual Deletion#
The most straightforward to way to do this deletion is to:
Delete the related albums from the Album table first.
Second, delete the artist from the Artist table.
--Delete albums
DELETE FROM Album WHERE ArtistID = (SELECT ArtistID FROM Artist WHERE Name = 'The Terrible Trio');
--Delete artist
DELETE FROM Artist WHERE Name = 'The Terrible Trio';
ON DELETE CASCADE#
The ON DELETE CASCADE option is used to automatically delete related records from a child table when a record in the parent table is deleted. It ensures data consistency and maintains referential integrity by cascading the delete operation to the associated records.
When you define a foreign key constraint with ON DELETE CASCADE, deleting a record from the parent table will automatically delete all the related records in the child table that reference the deleted parent record.
To set up ON DELETE CASCADE, you include it in the foreign key constraint definition when creating the child table:
CREATE TABLE Artist(
-- define all your columms
...
FOREIGN KEY (ArtistID) REFERENCES Artist(ArtistID) ON DELETE CASCADE
)
By adding this line to the foreign key constraint, you instruct the database to cascade the delete operation from the parent table (Artist) to the child table (Album) when an artist is deleted.
With this option set, you can delete an artist and its related albums using a single DELETE statement:
DELETE FROM Artist WHERE Name = 'The Terrible Trio';
ON DELETE SET NULL#
The ON DELETE SET NULL option is used to automatically set the foreign key values in the child table to NULL when a record in the parent table is deleted. It allows you to maintain the child records even if the associated parent record is removed.
When you define a foreign key constraint with ON DELETE SET NULL, deleting a record from the parent table will set the foreign key values in the child table to NULL for the related records.
To set up ON DELETE SET NULL, you include it in the foreign key constraint definition when creating the child table:
CREATE TABLE Artist(
-- define all your columms
...
FOREIGN KEY (ArtistID) REFERENCES Artist(ArtistID) ON DELETE SET NULL
)
By adding this line to the foreign key constraint, you instruct the database to set the ArtistID values in the Album table to NULL when an artist is deleted from the Artist table.
Choosing Between ON DELETE CASCADE, ON DELETE SET NULL, and no action#
The choice between ON DELETE CASCADE and ON DELETE SET NULL depends on your application’s requirements and data integrity rules.
Use
ON DELETE CASCADEwhen you want to ensure that related records in the child table are automatically deleted when a parent record is deleted. This option maintains strict referential integrity and removes all associated data.Use
ON DELETE SET NULLwhen you want to keep the related records in the child table even if the parent record is deleted. This option allows you to maintain the child records but sets the foreign key values toNULL, indicating that they are no longer associated with a valid parent record.
It’s important to consider the implications of each option. ON DELETE CASCADE permanently deletes related data, which may not be desirable in all scenarios. ON DELETE SET NULL keeps the related records but with a NULL foreign key value, which may require additional handling in your application.
If you don’t specify either ON DELETE CASCADE or ON DELETE SET NULL, the default behavior is to restrict the deletion of a parent record if there are related records in the child table. In this case, you would need to manually delete the related records from the child table before deleting the parent record.
Deleting All Data versus Dropping a Table#
As we’ve seen, the DELETE FROM command is used to remove rows from a table. This operation can be selective or comprehensive. For example, if you only want to delete rows that meet certain criteria, you use a WHERE clause with your DELETE statement. Without a WHERE clause, DELETE will remove all rows in the table, but importantly, the table’s structure remains untouched. So, If you decide to remove all albums from the albums table, you simply omit the WHERE clause:
DELETE FROM Albums;
This action clears all data from the albums table but keeps its structure intact for future use. You can still add new albums to it or modify its structure later.
One significant aspect of DELETE operations is that they are logged row by row in the database’s transaction log. This means each row deletion is recorded, allowing for the possibility to undo the deletions if the operation is part of a transaction. However, this logging can make DELETE operations slower when dealing with a large number of rows.
The DROP TABLE command, in contrast, is much more drastic. When you execute a DROP TABLE statement, you remove the entire table from the database. This includes not just the data but the table’s structure, its columns, indexes, and any constraints defined on it.
For instance, if you decide that the artists table is no longer needed and you want to erase it entirely from the database, you would use:
DROP TABLE Artists;
Executing this command means the artists table is deleted. The table, along with all its data and structure, is permanently removed from the database. Unlike DELETE, DROP TABLE does not log individual row deletions because it doesn’t process each row; it removes the table as a whole. This makes DROP TABLE a fast operation but with the significant caveat that it is typically irreversible through standard SQL commands. Once a table is dropped, you cannot simply undo the action unless you have backups or specific database recovery tools in place.
UPDATE#
The UPDATE statement in SQL is used to modify existing data in a table. It allows you to change the values of one or more columns in one or more rows based on specified conditions. The basic syntax of the UPDATE statement is as follows:
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
table_name: The name of the table you want to update.column1, column2, ...: The columns you want to modify and their new values.condition: Optional. Specifies the condition that determines which rows will be updated. If omitted, all rows in the table will be updated.
Let’s look at a concrete example using our Artist and Album tables.
Suppose we want to update the name of the artist “Bob Dylan” to “Robert Zimmerman”.
%%sql
UPDATE Artist
SET Name = 'Robert Zimmerman'
WHERE Name = 'Bob Dylan';
%%sql
-- Confirm our change
SELECT * FROM Artist WHERE Name = 'Robert Zimmerman'
| ArtistID | Name | Country | Founded |
|---|---|---|---|
| 8 | Robert Zimmerman | United States | 1961 |
You can also update multiple columns in a single UPDATE statement by separating the column-value pairs with commas. For example, let’s update the Prince’s name to “?” and his County to “Minnesota, United States.”
%%sql
UPDATE Artist
SET Name = '?', Country = 'Minnesota, United States'
WHERE Name = 'Prince';
%%sql
-- Confirm our change
SELECT * FROM Artist WHERE Name = '?'
| ArtistID | Name | Country | Founded |
|---|---|---|---|
| 7 | ? | Minnesota, United States | 1975 |
It’s important to be cautious when using the UPDATE statement, especially if you omit the WHERE clause. Without a WHERE clause, the update will be applied to all rows in the table, which may lead to unintended changes.
To update multiple rows based on a condition, you can use a more complex WHERE clause:
%%sql
UPDATE Album
SET ReleaseYear = ReleaseYear + 1
WHERE ArtistID = 21;
%%sql
-- Confirm our change
SELECT * FROM Album WHERE ArtistID = 21;
| AlbumID | Title | ArtistID | ReleaseYear | Genre | Ranking |
|---|
This statement increments the ReleaseYear by 1 for all albums associated with the artist having ArtistID 21.
Remember to always double-check your UPDATE statements and include appropriate WHERE conditions to ensure that you are modifying only the intended rows.
The UPDATE statement is a powerful tool for modifying existing data in your database tables. It allows you to keep your data up to date and make necessary changes based on specific conditions.
Introduction to Soft Delete#
In database management, there are situations where you may want to keep records even after they are marked as deleted. Instead of permanently removing data from the database, you can implement a “soft delete” approach. Soft delete involves adding a column to the table that indicates whether a record is active or deleted, allowing you to retain historical data while still being able to filter out deleted records when querying the table.
Let’s explore the concept of soft delete using our example of the terrible trio artist and their albums.
Adding a Soft Delete Column#
To implement soft delete, we need to add a column to the Artist and Album tables that represents the deleted status of each record. We’ll call this column IsDeleted and set its default value to 0 (indicating an active record)
%%sql
ALTER TABLE Artist ADD COLUMN IsDeleted INTEGER DEFAULT 0;
ALTER TABLE Album ADD COLUMN IsDeleted INTEGER DEFAULT 0;
These statements add the IsDeleted column to the Artist and Album tables, respectively.
Soft Deleting Records#
Now, let’s say we want to soft delete the artist “The Terrible Trio” and their albums. Instead of using the DELETE statement, we’ll update the IsDeleted column to mark the records as deleted.
%%sql
UPDATE Artist SET IsDeleted = 1 WHERE Name = 'The Terrible Trio';
UPDATE Album SET IsDeleted = 1 WHERE ArtistID = (SELECT ArtistID FROM Artist WHERE Name = 'The Terrible Trio');
The first statement marks the artist “The Terrible Trio” as deleted by setting the IsDeleted column to 1. The second statement marks all the albums associated with “The Terrible Trio” as deleted by setting their IsDeleted column to 1.
Querying Soft Deleted Records#
When querying the Artist and Album tables, you can filter out the soft deleted records by adding a condition to check the IsDeleted column.
%%sql
SELECT * FROM Artist WHERE IsDeleted = 0;
| ArtistID | Name | Country | Founded | IsDeleted |
|---|---|---|---|---|
| 2 | The Beach Boys | United States | 1961 | 0 |
| 3 | Joni Mitchell | Canada | 1964 | 0 |
| 4 | Stevie Wonder | United States | 1961 | 0 |
| 5 | Nirvana | United States | 1987 | 0 |
| 6 | Fleetwood Mac | United Kingdom | 1967 | 0 |
| 7 | ? | Minnesota, United States | 1975 | 0 |
| 8 | Robert Zimmerman | United States | 1961 | 0 |
| 9 | Lauryn Hill | United States | 1988 | 0 |
| 10 | The Beatles | United Kingdom | 1960 | 0 |
| 11 | Radiohead | United Kingdom | 1985 | 0 |
Restoring Soft Deleted Records#
One of the advantages of soft delete is the ability to restore deleted records if needed. To restore a soft deleted record, you can update the IsDeleted column back to 0.
%%sql
UPDATE Artist SET IsDeleted = 0 WHERE Name = 'The Terrible Trio';
UPDATE Album SET IsDeleted = 0 WHERE ArtistID = (SELECT ArtistID FROM Artist WHERE Name = 'The Terrible Trio');
These statements restore the artist “The Terrible Trio” and their associated albums by setting the IsDeleted column back to 0.
Considerations#
When implementing soft delete, keep in mind the following considerations:
Soft delete adds an overhead to your queries, as you need to include the
IsDeletedcondition in yourWHEREclauses to filter out deleted records.Soft deleted records still occupy space in the database, so you need to have a strategy for eventually purging them if necessary.
If you have foreign key constraints, you may need to handle soft deletion carefully to maintain data integrity. You can consider using
ON DELETE SET NULLorON UPDATE CASCADEto manage the relationships between soft deleted records.
Soft delete provides a flexible approach to data deletion, allowing you to retain historical data while still being able to manage deleted records effectivel
%%sql
-- Let's get rid of our soft delete column
ALTER TABLE Artist DROP COLUMN IsDeleted;
ALTER TABLE Album DROP COLUMN IsDeleted;
Triggers#
Triggers are special types of stored procedures in SQL that are automatically executed in response to specific events or actions on a table, such as INSERT, UPDATE, or DELETE operations. Triggers allow you to enforce business rules, maintain data integrity, and perform additional actions before or after the triggering event occurs.
In SQLite, you can create triggers using the CREATE TRIGGER statement. The basic syntax for creating a trigger is as follows:
CREATE TRIGGER (IF NOT EXISTS) trigger_name
BEFORE|AFTER INSERT|UPDATE|DELETE ON table_name
BEGIN
-- Trigger actions
END;
trigger_name: The name you assign to the trigger.BEFORE|AFTER: Specifies whether the trigger should be executed before or after the triggering event.INSERT|UPDATE|DELETE: Specifies the event that activates the trigger.table_name: The name of the table on which the trigger is defined.BEGIN ... END: The block of SQL statements that define the trigger actions.
Let’s create a concrete example using our Artist and Album tables.
Suppose, for example, we’d like to keep track of the date each albums was updated to our database.
First, let’s add a column to accomodate this:
%%sql
ALTER TABLE Album ADD COLUMN LastUpdated DATETIME;
Now, let’s create trigger.
%%sql
CREATE TRIGGER IF NOT EXISTS update_album_date
AFTER UPDATE ON Album
BEGIN
UPDATE Album
SET LastUpdated = DATETIME('now')
WHERE AlbumID = NEW.AlbumID;
END;
Finally, let’s try it out:
%%sql
UPDATE Album
SET ReleaseYear = ReleaseYear - 1
WHERE AlbumID > 20;
And finally, we can see the results.
%%sql
SELECT * FROM Album WHERE AlbumID > 20;
| AlbumID | Title | ArtistID | ReleaseYear | Genre | Ranking | LastUpdated |
|---|---|---|---|---|---|---|
| 90001 | What's Going On | 90001 | 1970 | Soul | 1 | 2026-05-28 13:25:56 |
| 90003 | Cringeworthy Chronicles | 90003 | 2009 | Cacophony | None | 2026-05-28 13:25:56 |
| 90004 | Disastrous Ditties | 90003 | 2014 | Racket | None | 2026-05-28 13:25:56 |
Logging in Databases#
Logging is the process of recording specific events, changes, or actions that occur within a database. Logs serve as an audit trail, allowing database administrators and developers to track and monitor the activities happening in the database. They provide valuable information for debugging, troubleshooting, performance analysis, and security auditing.
Logs can capture various types of information, such as:
Data modifications (inserts, updates, deletes)
User authentication and access attempts
System errors and exceptions
Timestamps of when events occurred
User or application responsible for the actions
By maintaining logs, you can gain insights into the historical changes in your database, detect suspicious activities, and facilitate data recovery in case of failures or accidental modifications.
Now, let’s see how we can use a trigger to implement logging in our database.
Logging Trigger Example#
Suppose we want to create a log table that captures the changes made to the Artist table. Whenever an artist’s information is updated, we want to record the old and new values, along with the timestamp and the user who made the change.
First, let’s create the log table:
%%sql
DROP TABLE IF EXISTS ArtistLog;
CREATE TABLE ArtistLog (
LogID INTEGER PRIMARY KEY AUTOINCREMENT,
ArtistID INTEGER,
OldName VARCHAR(100),
NewName VARCHAR(100),
UpdatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
UpdatedBy VARCHAR(100)
);
The ArtistLog table has columns to store the ArtistID, the old and new values of the Name column, the timestamp of the update, and the user who made the change.
Next, let’s create the trigger that will populate the log table whenever an update occurs on the Artist table:
%%sql
CREATE TRIGGER IF NOT EXISTS ArtistUpdateTrigger
AFTER UPDATE ON Artist
FOR EACH ROW
BEGIN
INSERT INTO ArtistLog (ArtistID, OldName, NewName, UpdatedBy)
VALUES (OLD.ArtistID, OLD.Name, NEW.Name, 'Brendan Shea');
END;
Let’s break down the trigger:
The trigger is named
ArtistUpdateTrigger.It is defined as an
AFTER UPDATEtrigger on theArtisttable, meaning it will be executed after an update operation on theArtisttable.The
FOR EACH ROWclause specifies that the trigger will be executed for each row affected by the update.Inside the trigger, we have an
INSERTstatement that inserts a new row into theArtistLogtable.The
VALUESclause captures theArtistID, the old and new values of theNamecolumn using the specialOLDandNEWkeywords, and the user who made the changes. Here, I’ve hardcoded this as ‘Brendan Shea’, but in more complext applications, we could capture the details for each user.
Now, let’s test the trigger by updating some artists’ names (undoing our previous updates):
%%sql
UPDATE Artist
SET Name = 'Bob Dylan'
WHERE Name = 'Robert Zimmerman';
UPDATE Artist
SET Name = 'Prince'
WHERE Name = '?';
Now, let’s see what our log table looks like:
%%sql
SELECT * FROM ArtistLog;
| LogID | ArtistID | OldName | NewName | UpdatedAt | UpdatedBy |
|---|---|---|---|---|---|
| 1 | 8 | Robert Zimmerman | Bob Dylan | 2026-05-28 13:25:56 | Brendan Shea |
| 2 | 7 | ? | Prince | 2026-05-28 13:25:56 | Brendan Shea |
The result will show the logged entry with the ArtistID, the old and new values of the Name column, the timestamp of the update, and the user who made the change.
By using triggers for logging, you can automatically capture and store the changes made to your database tables, providing a valuable audit trail for monitoring, debugging, and historical analysis.
Case Study: Database Interaction Methods#
You can work with a database from several layers of the stack. In this chapter, focus on three common approaches: the command line, shell scripts, and ORMs.
1. Command line interface (CLI)#
Use the CLI when you want fast, direct interaction with the database.
Why it is useful:
quick inspection and ad hoc queries
simple admin tasks
good for learning what the database is actually doing
Example in Jupyter:
!sqlite3 music.db "SELECT * FROM artists WHERE genre = 'Rock'"
!sqlite3 music.db "INSERT INTO albums (title, artist_id, release_year) VALUES ('Abbey Road', 1, 1969)"
The ! prefix asks Jupyter to run a shell command. SQLite then executes the SQL and prints the result.
Where scripts run: server-side, client-side, and IDE-based#
The same script can live in three very different runtime locations, and the location often matters more than the language.
Server-side runtime. The script runs on a server you control — the database server, a backend application server, or a scheduled job runner. Rolling Stone’s nightly script that recomputes album rankings runs server-side.
Client-side runtime. The script runs in the user’s environment — a browser, a mobile app, or a desktop application. The reader’s vote-on-album-of-the-year form posts to the server, but the form validation runs client-side in JavaScript.
IDE-based scripting. The script runs inside an interactive development environment like Jupyter, VS Code, or a database tool’s query window. Most of this book runs in an IDE-based environment (Jupyter).
The same Python file can run in all three places. Where it runs decides what it can reach (local files, the network, the database) and who can change it.
Scripting environments at a glance#
A quick reference for the languages you are most likely to see around databases:
Language |
Where it runs |
Typical role |
How it connects to a database |
|---|---|---|---|
Bash / Unix shell |
Server-side, terminal |
Backups, exports, scheduled jobs |
Calls |
PowerShell |
Server-side on Windows |
Backups, exports, Windows-based admin |
SQL Server cmdlets and ODBC modules |
Python |
Server-side, IDE-based |
Data work, ETL, applications |
|
Perl |
Server-side |
Legacy scripts, text munging |
DBI driver |
JavaScript (browser) |
Client-side |
UI logic, form validation |
Indirectly — calls an API that talks to the database |
JavaScript (Node.js) |
Server-side, IDE-based |
Web back ends, APIs |
|
Most teams use more than one. The Rolling Stone editorial team might use Bash for nightly backups, Python for monthly chart updates, and JavaScript for the reader-facing voting page.
2. Scripts and ORMs#
Shell scripting#
Shell scripts are useful when you need repeatable automation.
Good uses:
backups
exports
scheduled batch jobs
data pipelines
%%bash
sqlite3 music.db .dump > music_backup.sql
sqlite3 -csv -header music.db "SELECT title, release_year FROM albums" > albums.csv
Object-relational mapping (ORM)#
An ORM lets you work with tables as language objects instead of writing raw SQL every time.
Why teams use ORMs:
cleaner application code
reusable models
safer parameter handling
easier integration with Python applications
A practical rule:
use the CLI for quick inspection
use scripts for automation
use an ORM inside larger software projects
When ORMs go bad: the N+1 query problem#
ORMs make application code cleaner, but they also hide a lot of SQL. Sometimes what looks like one Python loop turns into hundreds of queries.
Imagine an editorial intern at Rolling Stone who wants to print every album in the catalog along with its track titles:
for album in session.query(Album).all():
print(album.title)
for track in album.tracks: # <-- this line is the trap
print(' -', track.title)
The first query fetches every album. Then, for each album, the ORM runs another query to load that album’s tracks. One query becomes N+1 queries — one for the parent collection, plus one per child record. Multiply by hundreds of albums and the page that should have been instant is now spinning.
2026-05-28 13:25:56,963 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2026-05-28 13:25:56,964 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("albums")
2026-05-28 13:25:56,965 INFO sqlalchemy.engine.Engine [raw sql] ()
2026-05-28 13:25:56,966 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("albums")
2026-05-28 13:25:56,966 INFO sqlalchemy.engine.Engine [raw sql] ()
2026-05-28 13:25:56,967 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("tracks")
2026-05-28 13:25:56,968 INFO sqlalchemy.engine.Engine [raw sql] ()
2026-05-28 13:25:56,969 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("tracks")
2026-05-28 13:25:56,969 INFO sqlalchemy.engine.Engine [raw sql] ()
2026-05-28 13:25:56,971 INFO sqlalchemy.engine.Engine
CREATE TABLE albums (
id INTEGER NOT NULL,
title VARCHAR,
PRIMARY KEY (id)
)
2026-05-28 13:25:56,971 INFO sqlalchemy.engine.Engine [no key 0.00040s] ()
2026-05-28 13:25:56,972 INFO sqlalchemy.engine.Engine
CREATE TABLE tracks (
id INTEGER NOT NULL,
title VARCHAR,
album_id INTEGER,
PRIMARY KEY (id),
FOREIGN KEY(album_id) REFERENCES albums (id)
)
2026-05-28 13:25:56,973 INFO sqlalchemy.engine.Engine [no key 0.00037s] ()
2026-05-28 13:25:56,974 INFO sqlalchemy.engine.Engine COMMIT
2026-05-28 13:25:56,978 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2026-05-28 13:25:56,980 INFO sqlalchemy.engine.Engine INSERT INTO albums (title) VALUES (?) RETURNING id
2026-05-28 13:25:56,981 INFO sqlalchemy.engine.Engine [generated in 0.00007s (insertmanyvalues) 1/2 (ordered; batch not supported)] ('Album 0',)
2026-05-28 13:25:56,982 INFO sqlalchemy.engine.Engine INSERT INTO albums (title) VALUES (?) RETURNING id
2026-05-28 13:25:56,982 INFO sqlalchemy.engine.Engine [insertmanyvalues 2/2 (ordered; batch not supported)] ('Album 1',)
2026-05-28 13:25:56,984 INFO sqlalchemy.engine.Engine INSERT INTO tracks (title, album_id) VALUES (?, ?) RETURNING id
2026-05-28 13:25:56,984 INFO sqlalchemy.engine.Engine [generated in 0.00007s (insertmanyvalues) 1/4 (ordered; batch not supported)] ('Track 0', 1)
2026-05-28 13:25:56,985 INFO sqlalchemy.engine.Engine INSERT INTO tracks (title, album_id) VALUES (?, ?) RETURNING id
2026-05-28 13:25:56,986 INFO sqlalchemy.engine.Engine [insertmanyvalues 2/4 (ordered; batch not supported)] ('Track 1', 1)
2026-05-28 13:25:56,986 INFO sqlalchemy.engine.Engine INSERT INTO tracks (title, album_id) VALUES (?, ?) RETURNING id
2026-05-28 13:25:56,986 INFO sqlalchemy.engine.Engine [insertmanyvalues 3/4 (ordered; batch not supported)] ('Track 0', 2)
2026-05-28 13:25:56,988 INFO sqlalchemy.engine.Engine INSERT INTO tracks (title, album_id) VALUES (?, ?) RETURNING id
2026-05-28 13:25:56,988 INFO sqlalchemy.engine.Engine [insertmanyvalues 4/4 (ordered; batch not supported)] ('Track 1', 2)
2026-05-28 13:25:56,989 INFO sqlalchemy.engine.Engine COMMIT
--- Demonstrating N+1 ---
2026-05-28 13:25:56,990 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2026-05-28 13:25:56,991 INFO sqlalchemy.engine.Engine SELECT albums.id AS albums_id, albums.title AS albums_title
FROM albums
2026-05-28 13:25:56,992 INFO sqlalchemy.engine.Engine [generated in 0.00054s] ()
2026-05-28 13:25:56,995 INFO sqlalchemy.engine.Engine SELECT tracks.id AS tracks_id, tracks.title AS tracks_title, tracks.album_id AS tracks_album_id
FROM tracks
WHERE ? = tracks.album_id
2026-05-28 13:25:56,996 INFO sqlalchemy.engine.Engine [generated in 0.00078s] (1,)
2026-05-28 13:25:56,997 INFO sqlalchemy.engine.Engine SELECT tracks.id AS tracks_id, tracks.title AS tracks_title, tracks.album_id AS tracks_album_id
FROM tracks
WHERE ? = tracks.album_id
2026-05-28 13:25:56,998 INFO sqlalchemy.engine.Engine [cached since 0.002394s ago] (2,)
Run the cell above and look at the SQL that SQLAlchemy echoes. You will see:
One
SELECT ... FROM albumsquery.A separate
SELECT ... FROM tracks WHERE album_id = ?query for each album.
With two albums that is three queries. With two thousand albums it is two thousand and one — every page load punishing the database.
A real diagnostic step is to confirm the worst query’s plan. Chapter 8 covers EXPLAIN QUERY PLAN in detail, but the short version is below:
from sqlalchemy import text
result = session.execute(text("EXPLAIN QUERY PLAN SELECT * FROM tracks WHERE album_id = 1"))
for row in result:
print(row)
2026-05-28 13:25:57,003 INFO sqlalchemy.engine.Engine EXPLAIN QUERY PLAN SELECT * FROM tracks WHERE album_id = 1
2026-05-28 13:25:57,004 INFO sqlalchemy.engine.Engine [generated in 0.00064s] ()
(2, 0, 0, 'SCAN tracks')
from sqlalchemy.orm import joinedload
print('--- Fixed: single query with joinedload ---')
albums = session.query(Album).options(joinedload(Album.tracks)).all()
for album in albums:
for track in album.tracks:
pass
--- Fixed: single query with joinedload ---
2026-05-28 13:25:57,012 INFO sqlalchemy.engine.Engine SELECT albums.id AS albums_id, albums.title AS albums_title, tracks_1.id AS tracks_1_id, tracks_1.title AS tracks_1_title, tracks_1.album_id AS tracks_1_album_id
FROM albums LEFT OUTER JOIN tracks AS tracks_1 ON albums.id = tracks_1.album_id
2026-05-28 13:25:57,013 INFO sqlalchemy.engine.Engine [generated in 0.00064s] ()
With joinedload, SQLAlchemy issues a single SELECT with a join, and all the track rows come back attached to their albums. One query instead of N+1.
A good habit:
Turn on SQL logging (
echo=True) in development so you can see what your ORM is doing.When a page or job feels slow, count the queries before guessing.
Use eager loading (
joinedload,selectinload, or the equivalent in your ORM) when you know you will use a child collection.
The database does not know what your application meant to do. The convenience of an ORM can hide expensive patterns; SQL logging is how you find them.
Chapter Summary#
Use this list as a self-check. After working through the chapter, you should be able to do each of the following:
You can create tables and define their constraints.
You can insert single and multiple rows with
INSERT INTO.You can recognize and handle constraint violations.
You can use auto-incrementing primary keys to simplify inserts.
You can drive inserts dynamically with subqueries.
You can distinguish
DELETEfromDROP TABLE.You can implement soft deletes to preserve history.
You can update existing rows with
UPDATE.You can write triggers to automate database actions.
You can use triggers for auditing and monitoring.
You can script database work from the shell and from other programming languages.
You can describe how ORMs interact with databases.
You can name whether a script runs server-side, client-side, or inside an IDE, and explain why that matters.
You can compare the major scripting languages on where they run and how they connect to a database.
You can read the SQL an ORM produces, spot an N+1 pattern, and rewrite it as a single efficient query.
Lab: Practice Your SQL#
Practice with the Loop of the Recursive Dragon#
Sharpen what you just learned with a chapter-matched review set in the Loop of the Recursive Dragon — an adaptive review game with multiple question types and RPG-style mechanics, built for this book.
Glossary#
Use this list as a quick review sheet for the chapter.
Writing data#
INSERT INTO— Adds one or more new rows to a table.VALUES— Supplies the data for anINSERTstatement.UPDATE— Modifies existing rows in a table.SET— The clause inside anUPDATEstatement that assigns new values to columns.DELETE— Removes one or more rows from a table.AUTOINCREMENT— A property that automatically generates a unique integer for each new row.Soft delete — A pattern that marks a row as deleted without actually removing it from the table.
Cascading actions#
ON DELETE CASCADE— A foreign-key rule that deletes related child rows when the parent row is deleted.ON DELETE SET NULL— A foreign-key rule that sets related foreign keys toNULLwhen the parent row is deleted.
Triggers and logs#
TRIGGER— A database object that runs SQL automatically when a defined event occurs.BEFORE— A trigger timing that runs the trigger before the event happens.AFTER— A trigger timing that runs the trigger after the event happens.END— A keyword that marks the end of a code block or SQL statement.Database log — A record of transactions and changes made to a database, useful for auditing and recovery.
Scripting and shells#
Bash — A common Unix shell and command language.
PowerShell — Microsoft’s command-line shell and scripting language.
Linux terminal — A command-line interface for Linux systems.
Windows command prompt — The text-based command interpreter used on Windows.
Python — A general-purpose programming language known for readable syntax.
Client-side script — Code that runs inside the user’s web browser.
Server-side script — Code that runs on a server before the response is sent to the client.
Object-relational mapping#
Object — An instance of a class in an object-oriented programming language.
Object-relational mapping (ORM) — A technique that maps programming objects to rows in a relational database.
SQLAlchemy — A Python SQL toolkit and ORM.
Hibernate — A Java ORM that maps classes to database tables.
Entity Framework — Microsoft’s ORM for .NET applications.
Sequelize — A promise-based ORM for Node.js.
Scripting environments#
Server-side runtime — A script execution environment on a controlled server, such as a database server, application server, or scheduled job runner.
Client-side runtime — A script execution environment in the user’s own device, such as a browser, a mobile app, or a desktop application.
IDE-based scripting — A script execution environment inside an interactive development tool such as Jupyter or a database query window.
Perl — A scripting language with strong text-handling support, often seen in legacy database utilities.
JavaScript (browser) — JavaScript that runs inside a web browser; usually talks to a database indirectly through an API.
JavaScript (Node.js) — JavaScript that runs server-side using the Node.js runtime; can connect to databases directly.
N+1 query problem — A performance trap in which one query for a parent collection triggers one extra query per child record, multiplying the number of database round trips.
SQL logging — The practice of printing the SQL an application generates, used to find slow or unintended queries.
Eager loading — An ORM pattern that loads a parent record and its related child records in a single query, avoiding the N+1 problem.
SQL Code#
SQL Template |
Explanation |
|---|---|
|
SQL code to insert a new row into table t with specified column values. |
|
SQL code to insert rows into table t by selecting data from another table t2. |
|
SQL code to delete rows from table t that satisfy condition c. |
|
SQL code to delete rows from table t1 based on a subquery result from table t2. |
|
SQL code to update specific columns in table t for rows that satisfy condition c. |
|
SQL code to update table t1 using values from table t2 based on a correlated subquery. |
|
SQL code to create a trigger that executes before each row is inserted into table t. |
|
SQL code to create a trigger that executes after each row is updated in table t. |
|
SQL code to create an instead-of trigger on view v, executed when a delete operation is attempted. |