cs480 - Database Systems - 2025 spring

Course webpage for cs480 - 2025 spring taught by Boris Glavic

Lecture Notes for cs480

Overview

Lecture <2025-01-14 Tue>

Lecture <2025-01-16 Thu>

\(\mathbb{D}_1 = \{ a, b\}\)

\(\mathbb{D}_2 = \{1,2,3\}\)

\(\mathbb{D}_1 \times \mathbb{D}_2 = \{ (a,1), (b,2), (b,1), \ldots\}\)

\(\mathbf{Student} = (name, major, GPA, UIN)\)

\(t= (Peter, CS, 2.5, 3321342)\)

\(t' = (Jane, CS, 4.0, 1231232)\)

\(K = \{UIN\}\)

\(t.K = (3321342)\)

\(K' = \{UIN, GPA\}\)

Lecture <2025-01-21 Tue>

weather

temperature wctemp month day hour humidity wind
-10 -30 1 21 6 40 45
45 40 1 22 9 10 5
5 -3 1 23 6 92 60

selection + projection

  • Return month and day where the temperature is below 0

\[ \pi_{month,day}(\sigma_{temperature < 0}(weather)) \]

month day
1 21
  • Return month and day where either:

    • temperature is below 0
    • wctemp is below 0
    • wind > 100
    • humidity > 95

\[ \pi_{month,day}(\sigma_{temperature < 0 \lor wctemp < 0 \lor wind > 100 \lor humidity > 95 }(weather)) \]

month day
1 21
1 23

union

  • Return month and day where either (with union):

    • temperature is below 0
    • wctemp is below 0
    • wind > 100
    • humidity > 95
\begin{align*} &\pi_{month,day}(\\ &\sigma_{temperature < 0}(weather)\\ \cup &\sigma_{wctemp < 0}(weather)\\ \cup &\sigma_{wind > 100}(weather)\\ \cup &\sigma_{humidity > 95}(weather))\\ \end{align*}

cross product

  • Return all pairs days in the same month where the first day has humidity of at most 10 and the other day has a humidity of at least 80

    • result schema (month, day1, day2)

weather

temperature wctemp month day hour humidity wind
-10 -30 1 21 6 40 45
45 40 1 22 9 10 5
5 -3 1 23 6 92 60
4 20 2 1 0 97 20
\begin{align*} &\pi_{month,day1,day2}(\sigma_{month = month2 \land day1 < day2}(\\ &\rho_{month,day1}(\pi_{month,day}(\sigma_{humidity \leq 10}(weather)))\\ &\times\\ &\rho_{month2,day2}(\pi_{month,day}(\sigma_{humidity \geq 80}(weather))))) \end{align*}
  • filter rows where the month does not match
  • get rid of the month2 attributes
month day1 day2
1 22 23

set difference

Lecture <2025-01-23 Thu>

database

  • animal

    • primary key: name, species
name species weight height
Bob tiger 3500 5
Alice lion 4500 5.5
Peter mouse 1 0.25
Giliam mouse 1 0.23
  • cage

    • primary key: id
id size_sqf
1 4000
2 10
3 3000
  • occupant

    • id references cage
    • name and species references animal
name species id
Bob tiger 1
Alice lion 1
Peter mouse 2
Giliam mouse 2

queries

  • names of animals that either a lion or a tiger

\[ \pi_{name}(\sigma_{species = lion \lor species = tiger}(animal)) \]

  • find the the id and sqf for cages that have at least one mouse in them

\[ \pi_{id, size\_sqf}(\sigma_{id = cid \land species = mouse}(occupant \times \rho_{cid, size\_sqf}(cage))) \]

  • give me ids of empty cages

\[ \pi_{id}(cage) - \pi_{id}(occupants) \]

  • give me the id of cage (if it exists) that houses all of the mouses
  • as universal quantification is not supported directly in relational algebra as discussed so far we have to use the following logical equivalence

\[ \forall x: \phi(x) \Leftrightarrow \neg \exists x: \neg \phi(x) \]

  • in our example we have to remove from the set of cages the cages that are missing at least one mouse.
  • we first determine all pairs of cages and mouses (\(Q_{cagemouse}\))
  • then we determine which mouses are in which cage (\(Q_{hasmouse}\))
  • then we find which mouses are missing from which cage by removing from all pairs of mouses and cages, the pairs that exist (\(Q_{missingmouse}\))

    • this gives use pairs of cages and mouses such that the mouse is missing from this pariticular cage
  • finally, we remove from the set of all cages, the cages that are missing a mouse (\(Q\))
\begin{align*} Q_{cagemouse} &\leftarrow \pi_{id}(occupants) \times \pi_{name}(\sigma_{species=mouse}(animal))\\ Q_{hasmouse} &\leftarrow \pi_{id,name}(\sigma_{species=mouse}(occupant))\\ Q_{missingmouse} & \leftarrow Q_{cagemouse} - Q_{hasmouse}\\ Q &\leftarrow \pi_{id}(cage) - \pi_{id}(Q_{missingmouse}) \end{align*}

Lecture <2025-01-28 Tue>

\(R\)

A
1
2
3

\(S\)

B C
a 1
b 2

\(R \times S\)

A B C
1 a 1
1 b 2
2 a 1
2 b 2
3 a 1
3 b 2
  • all combinations of row from \(R\) and \(S\) such that \(A = C\)
  • \(\sigma_{A=C}(R \times S)\)
A B C
1 a 1
2 b 2
  • \(\sigma_{A<C}(R \times S)\)
A B C
1 b 2

Lecture <2025-01-30 Thu>

Selecting project topic

  • retail, stock, MSRP, suppliers, sale
  • webshop
  • bidding webshop
  • travel booking site
  • online gaming - store players
  • coffee or bakery
  • investment (stock accounts)
  • animal shelter
  • airport management (flight ticket)
  • OS simulator (file system + permission system)
  • art exhibition / dealership
  • based on the votes we will do a coffee shop as the project, I will upload requirements within the next two weeks

Lecture <2025-02-04 Tue>

Lecture <2025-02-06 Thu>

FROM clause

1
2
3
SELECT *
FROM student
LIMIT 3;
id name dept_name tot_cred
A1231 XXXXXX History 200
19991 Brandt History 97
44553 Peltier Physics 73
  • never do that even though it is allowed
1
2
3
SELECT s.name, s.name
FROM student s
LIMIT 3;
name name
XXXXXX XXXXXX
Brandt Brandt
Peltier Peltier
1
2
3
SELECT s1.name, s2.name
FROM student s1, student s2
LIMIT 10;
name name
XXXXXX XXXXXX
XXXXXX Brandt
XXXXXX Peltier
XXXXXX Levy
XXXXXX Snow
XXXXXX Brandt
XXXXXX Chavez
XXXXXX Peltier
XXXXXX Levy
XXXXXX Williams
1
SELECT * FROM student;
id name dept_name tot_cred
A1231 XXXXXX History 200
19991 Brandt History 97
44553 Peltier Physics 73
45678 Levy Physics 63
70557 Snow Physics 17
AB199 Brandt History 97
AB231 Chavez History 127
AB445 Peltier History 73
AB456 Levy History 63
AB543 Williams History 71
AB557 Sanchez History 55
AB705 Snow History 17
AB765 Brown History 75
AB766 Aoi History 77
AB987 Bourikas History 115
AB989 Tanaka History 137
11111 test History 67
1
SELECT * FROM takes;
id course_id sec_id semester year grade
19991 HIS-351 1 Spring 2010 B
44553 PHY-101 1 Fall 2009 B-
45678 CS-101 1 Fall 2009 F
45678 CS-101 1 Spring 2010 B+
45678 CS-319 1 Spring 2010 B

joins

1
2
SELECT s.name, s.dept_name, t.course_id, t.grade
FROM student s JOIN takes t ON (s.id = t.id);
name dept_name course_id grade
Brandt History HIS-351 B
Peltier Physics PHY-101 B-
Levy Physics CS-101 F
Levy Physics CS-101 B+
Levy Physics CS-319 B
1
2
3
SELECT 1
FROM student s JOIN takes t USING (id), student s1 JOIN takes t1 USING (id)
LIMIT 2;
?column?
1
1

keywords and identifiers are case insensitive

  • SQL keywords are case insensitive
  • SQL identifiers are case insensitive
1
SelEcT STUDENT.NaMe fROM stUdeNT LImIt 3;
name
XXXXXX
Brandt
Peltier
  • identifiers are case sensitive if they are enclosed in ""
  • Postgres is lower case by default
1
2
3
4
CREATE TABLE AnnoYingtest (
  A int,
  b int
);
CREATE TABLE
1
SELECT * FROM "annoyingtest";
a b
1
2
3
4
CREATE TABLE "AnnoYingtest2" (
  "A" int,
  "b" int
);
CREATE TABLE
1
SELECT * FROM annoyingtest2;
1
2
3
SELECT id
FROM student
WHERE dept_name = 'History' AND tot_cred < 80;
id
AB445
AB456
AB543
AB557
AB705
AB765
AB766
11111

WHERE clause

1
2
3
SELECT name, id
FROM student
WHERE dept_name = 'History' AND tot_cred < 80;
name id
Peltier AB445
Levy AB456
Williams AB543
Sanchez AB557
Snow AB705
Brown AB765
Aoi AB766
test 11111
  • credits between 40 and 80
1
2
3
SELECT name, id
FROM student
WHERE dept_name = 'History' AND tot_cred < 80 AND tot_cred > 40;
name id
Peltier AB445
Levy AB456
Williams AB543
Sanchez AB557
Brown AB765
Aoi AB766
test 11111
1
SELECT * FROM takes;
id course_id sec_id semester year grade
19991 HIS-351 1 Spring 2010 B
44553 PHY-101 1 Fall 2009 B-
45678 CS-101 1 Fall 2009 F
45678 CS-101 1 Spring 2010 B+
45678 CS-319 1 Spring 2010 B
  • name and id of students that have taken CS101 (list each student once)
1
2
3
SELECT DISTINCT name, s.id
FROM student s, takes t
WHERE s.id = t.id AND course_id = 'CS-101'
name id
Levy 45678
1
SELECT * FROM course;
course_id title dept_name credits
BIO-101 Intro. to Biology Biology 4
BIO-301 Genetics Biology 4
BIO-399 Computational Biology Biology 3
CS-101 Intro. to Computer Science Comp. Sci. 4
CS-190 Game Design Comp. Sci. 4
CS-315 Robotics Comp. Sci. 3
CS-319 Image Processing Comp. Sci. 3
CS-347 Database System Concepts Comp. Sci. 3
EE-181 Intro. to Digital Systems Elec. Eng. 3
FIN-201 Investment Banking Finance 3
HIS-351 World History History 3
MU-199 Music Video Production Music 3
PHY-101 Physical Principles Physics 4

Aggregation, ORDER-BY and LIMIT

  • return the smallest number of credits in the course table
1
2
3
4
SELECT credits
FROM course
ORDER BY credits ASC
LIMIT 1;
credits
3
1
SELECT * FROM time_slot;
time_slot_id day start_hr start_min end_hr end_min
A M 8 0 8 50
A W 8 0 8 50
A F 8 0 8 50
B M 9 0 9 50
B W 9 0 9 50
B F 9 0 9 50
C M 11 0 11 50
C W 11 0 11 50
C F 11 0 11 50
D M 13 0 13 50
D W 13 0 13 50
D F 13 0 13 50
E T 10 30 11 45
E R 10 30 11 45
F T 14 30 15 45
F R 14 30 15 45
G M 16 0 16 50
G W 16 0 16 50
G F 16 0 16 50
H W 10 0 12 30
1
SELECT * FROM time_slot ORDER BY time_slot_id;
  • aggregation limits what we can use in the SELECT clause
1
2
3
4
5
SELECT count(*), dept_name
FROM student,
     (SELECT name FROM student) s
WHERE tot_cred > 80
GROUP BY dept_name;
count dept_name
6 History
1
2
3
SELECT *
FROM student
GROUP BY dept_name;
dept_name
History
Physics

Lecture <2025-02-11 Tue>

DISTINCT and join

1
2
3
SELECT DISTINCT s.*
FROM student s, takes t
WHERE s.id = t.id;
id name dept_name tot_cred
98988 Tanaka Biology 120
98765 Bourikas Elec. Eng. 98
55739 Sanchez Music 38
45678 Levy Physics 46
19991 Brandt History 80
76653 Aoi Elec. Eng. 60
00128 Zhang Comp. Sci. 102
76543 Brown Comp. Sci. 58
54321 Williams Comp. Sci. 54
12345 Shankar Comp. Sci. 32
23121 Chavez Finance 110
44553 Peltier Physics 56

Lecture <2025-02-13 Thu>

Window function partition-by

1
SELECT * FROM student;
id name dept_name tot_cred
00128 Zhang Comp. Sci. 102
12345 Shankar Comp. Sci. 32
19991 Brandt History 80
23121 Chavez Finance 110
44553 Peltier Physics 56
45678 Levy Physics 46
54321 Williams Comp. Sci. 54
55739 Sanchez Music 38
70557 Snow Physics 0
76543 Brown Comp. Sci. 58
76653 Aoi Elec. Eng. 60
98765 Bourikas Elec. Eng. 98
98988 Tanaka Biology 120
1
2
3
SELECT name, dept_name,
       count(*) OVER (PARTITION BY dept_name ORDER BY name) AS deppos
FROM student;
name dept_name deppos
Tanaka Biology 1
Brown Comp. Sci. 1
Shankar Comp. Sci. 2
Williams Comp. Sci. 3
Zhang Comp. Sci. 4
Aoi Elec. Eng. 1
Bourikas Elec. Eng. 2
Chavez Finance 1
Brandt History 1
Sanchez Music 1
Levy Physics 1
Peltier Physics 2
Snow Physics 3
1
INSERT INTO student VALUES (12121, 'Snow', 'Physics', 121);
INSERT 0 1

Window function order-by

1
2
3
SELECT name, dept_name,
       count(*) OVER (PARTITION BY dept_name ORDER BY name) AS deppos
FROM student;
name dept_name deppos
Tanaka Biology 1
Brown Comp. Sci. 1
Shankar Comp. Sci. 2
Williams Comp. Sci. 3
Zhang Comp. Sci. 4
Aoi Elec. Eng. 1
Bourikas Elec. Eng. 2
Chavez Finance 1
Brandt History 1
Sanchez Music 1
Levy Physics 1
Peltier Physics 2
Snow Physics 4
Snow Physics 4
1
SELECT * FROM instructor;
id name dept_name salary
10101 Srinivasan Comp. Sci. 65000.00
12121 Wu Finance 90000.00
15151 Mozart Music 40000.00
22222 Einstein Physics 95000.00
32343 El Said History 60000.00
33456 Gold Physics 87000.00
45565 Katz Comp. Sci. 75000.00
58583 Califieri History 62000.00
76543 Singh Finance 80000.00
76766 Crick Biology 72000.00
83821 Brandt Comp. Sci. 92000.00
98345 Kim Elec. Eng. 80000.00

instructors that work for departments with less than 3 students

  • give me the names of instructors that work for departments with less then 3 student
1
2
3
4
5
6
7
WITH numstu AS (SELECT dept_name
                FROM student
                GROUP BY dept_name
                HAVING count(*) < 3)
SELECT name
FROM instructor
WHERE dept_name IN (SELECT dept_name FROM numstu)
name
Wu
Mozart
El Said
Califieri
Singh
Crick
Kim

instructors (only the 3 largest ones) order by the number of classes they teach (higher first)

1
SELECT * FROM teaches;
id course_id sec_id semester year
10101 CS-101 1 Fall 2009
10101 CS-315 1 Spring 2010
10101 CS-347 1 Fall 2009
12121 FIN-201 1 Spring 2010
15151 MU-199 1 Spring 2010
22222 PHY-101 1 Fall 2009
32343 HIS-351 1 Spring 2010
45565 CS-101 1 Spring 2010
45565 CS-319 1 Spring 2010
76766 BIO-101 1 Summer 2009
76766 BIO-301 1 Summer 2010
83821 CS-190 1 Spring 2009
83821 CS-190 2 Spring 2009
83821 CS-319 2 Spring 2010
98345 EE-181 1 Spring 2009
  • instructors (only the 3 largest ones) order by the number of classes they teach (higher first)
1
2
3
     SELECT count(*) AS nc, id
     FROM teaches
     GROUP BY id
nc id
3 83821
1 12121
1 98345
1 15151
1 22222
1 32343
3 10101
2 76766
2 45565
1
2
3
4
5
6
7
8
9
WITH numcourse AS (
     SELECT count(*) AS nc, id
     FROM teaches
     GROUP BY id)
SELECT i.name, nc
FROM instructor i, numcourse c
WHERE i.id = c.id
ORDER BY nc DESC
LIMIT 3;
name nc
Srinivasan 3
Brandt 3
Katz 2

instructors (only the 3 largest ones) order by the number of different classes

  • instructors (only the 3 largest ones) order by the number of different classes they teach (higher first)
1
2
3
4
5
6
7
8
9
WITH numcourse AS (
     SELECT count(DISTINCT course_id) AS nc, id
     FROM teaches
     GROUP BY id)
SELECT i.name, nc
FROM instructor i, numcourse c
WHERE i.id = c.id
ORDER BY nc DESC
LIMIT 3;
name nc
Srinivasan 3
Katz 2
Crick 2
1
2
3
4
5
6
7
8
9
WITH numcourse AS (
     SELECT count(course_id) AS nc, id
     FROM (SELECT DISTINCT course_id, id FROM teaches) t
     GROUP BY id)
SELECT i.name, nc
FROM instructor i, numcourse c
WHERE i.id = c.id
ORDER BY nc DESC
LIMIT 3;
name nc
Srinivasan 3
Katz 2
Crick 2
1
SELECT DISTINCT dept_name FROM student WHERE name = 'Snow';
dept_name
Physics
Physics

students that have not taken any classes (NOT EXISTS, EXCEPT, and OUTER JOIN)

  • give me students that have not taken any classes
1
2
3
SELECT *
FROM student s
WHERE NOT EXISTS(SELECT * FROM takes t WHERE s.id = t.id);
id name dept_name tot_cred
70557 Snow Physics 0
12121 Snow Physics 121
1
2
3
4
5
6
WITH nocourse AS (SELECT id FROM student s
               EXCEPT
               SELECT id FROM takes t)
SELECT *
FROM student s, nocourse c
WHERE s.id = c.id;
id name dept_name tot_cred id
12121 Snow Physics 121 12121
70557 Snow Physics 0 70557
1
2
3
SELECT s.*
FROM student s LEFT OUTER JOIN (SELECT id, 1 AS nevernull FROM takes) t ON (s.id = t.id)
WHERE t.nevernull IS NULL;
id name dept_name tot_cred
70557 Snow Physics 0
12121 Snow Physics 121

Students that have taken all CS courses (universal quantification)

  • give me students that have taken all computer science classes
1
SELECT * FROM course LIMIT 10;
course_id title dept_name credits
BIO-101 Intro. to Biology Biology 4
BIO-301 Genetics Biology 4
BIO-399 Computational Biology Biology 3
CS-101 Intro. to Computer Science Comp. Sci. 4
CS-190 Game Design Comp. Sci. 4
CS-315 Robotics Comp. Sci. 3
CS-319 Image Processing Comp. Sci. 3
CS-347 Database System Concepts Comp. Sci. 3
EE-181 Intro. to Digital Systems Elec. Eng. 3
FIN-201 Investment Banking Finance 3
1
SELECT * FROM takes LIMIT 1;
id course_id sec_id semester year grade
00128 CS-101 1 Fall 2009 A
  • give me students that have taken all computer science classes
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
WITH cscourse AS
     (SELECT * FROM course WHERE dept_name = 'Comp. Sci.'),
     takecs AS
     (SELECT id, course_id FROM takes WHERE course_id IN (SELECT course_id FROM cscourse)),
     postaken AS
     (SELECT id, course_id FROM student s, cscourse b),
     missingcs AS
     (SELECT id FROM (SELECT * FROM postaken EXCEPT SELECT * FROM takecs) s)
SELECT id
FROM student s
EXCEPT
SELECT id
FROM missingcs
id
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
WITH cscourse AS
     (SELECT * FROM course WHERE dept_name = 'Comp. Sci.'),
     takecs AS
     (SELECT id, course_id FROM takes WHERE course_id IN (SELECT course_id FROM cscourse)),
     postaken AS
     (SELECT id, course_id FROM student s, cscourse b),
     missingcs AS
     (SELECT id FROM (SELECT * FROM postaken EXCEPT SELECT * FROM takecs) s)
SELECT *
FROM student s
WHERE s.id NOT IN (SELECT * FROM missingcs);
id name dept_name tot_cred

Lecture <2025-02-24 Mon>

More on Students that have taken all CS courses (universal quantification)

  • give me students that have taken all computer science classes
  • give me students s such that

    • there does not exist a CS course c

      • such that the student s has not taken the course c
1
2
3
4
5
6
7
8
SELECT *
  FROM student s
 WHERE NOT EXISTS (SELECT *
                     FROM course c
                    WHERE dept_name = 'Comp. Sci.' AND
                      NOT EXISTS (SELECT *
                                    FROM takes t
                                  WHERE t.course_id = c.course_id AND t.id = s.id));
id name dept_name tot_cred
1
2
3
4
5
6
7
8
SELECT *
FROM student s,
     LATERAL (SELECT course_id
             FROM course c
            WHERE dept_name = 'Comp. Sci.' AND
              NOT EXISTS (SELECT *
                            FROM takes t
                           WHERE t.course_id = c.course_id AND t.id = s.id)) c
id name dept_name tot_cred course_id
00128 Zhang Comp. Sci. 102 CS-190
00128 Zhang Comp. Sci. 102 CS-315
00128 Zhang Comp. Sci. 102 CS-319
12345 Shankar Comp. Sci. 32 CS-319
19991 Brandt History 80 CS-101
19991 Brandt History 80 CS-190
19991 Brandt History 80 CS-315
19991 Brandt History 80 CS-319
19991 Brandt History 80 CS-347
23121 Chavez Finance 110 CS-101
23121 Chavez Finance 110 CS-190
23121 Chavez Finance 110 CS-315
23121 Chavez Finance 110 CS-319
23121 Chavez Finance 110 CS-347
44553 Peltier Physics 56 CS-101
44553 Peltier Physics 56 CS-190
44553 Peltier Physics 56 CS-315
44553 Peltier Physics 56 CS-319
44553 Peltier Physics 56 CS-347
45678 Levy Physics 46 CS-190
45678 Levy Physics 46 CS-315
45678 Levy Physics 46 CS-347
54321 Williams Comp. Sci. 54 CS-315
54321 Williams Comp. Sci. 54 CS-319
54321 Williams Comp. Sci. 54 CS-347
55739 Sanchez Music 38 CS-101
55739 Sanchez Music 38 CS-190
55739 Sanchez Music 38 CS-315
55739 Sanchez Music 38 CS-319
55739 Sanchez Music 38 CS-347
70557 Snow Physics 0 CS-101
70557 Snow Physics 0 CS-190
70557 Snow Physics 0 CS-315
70557 Snow Physics 0 CS-319
70557 Snow Physics 0 CS-347
76543 Brown Comp. Sci. 58 CS-190
76543 Brown Comp. Sci. 58 CS-315
76543 Brown Comp. Sci. 58 CS-347
76653 Aoi Elec. Eng. 60 CS-101
76653 Aoi Elec. Eng. 60 CS-190
76653 Aoi Elec. Eng. 60 CS-315
76653 Aoi Elec. Eng. 60 CS-319
76653 Aoi Elec. Eng. 60 CS-347
98765 Bourikas Elec. Eng. 98 CS-190
98765 Bourikas Elec. Eng. 98 CS-319
98765 Bourikas Elec. Eng. 98 CS-347
98988 Tanaka Biology 120 CS-101
98988 Tanaka Biology 120 CS-190
98988 Tanaka Biology 120 CS-315
98988 Tanaka Biology 120 CS-319
98988 Tanaka Biology 120 CS-347
12121 Snow Physics 121 CS-101
12121 Snow Physics 121 CS-190
12121 Snow Physics 121 CS-315
12121 Snow Physics 121 CS-319
12121 Snow Physics 121 CS-347
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
WITH allcourse AS (SELECT course_id FROM course WHERE dept_name = 'Physics'),
     cscnt AS (SELECT count(*) AS cnt
               FROM allcourse),
     stdcscnt AS (SELECT id, count(DISTINCT course_id) AS cnt
                  FROM takes
                  WHERE course_id IN (SELECT course_id FROM allcourse)
                  GROUP BY id)
SELECT *
 FROM student s, stdcscnt c, cscnt cs
WHERE s.id = c.id AND c.cnt = cs.cnt;
id name dept_name tot_cred id cnt cnt
44553 Peltier Physics 56 44553 1 1
1
SELECT DISTINCT dept_name FROM course;
dept_name
Finance
History
Physics
Music
Comp. Sci.
Biology
Elec. Eng.

Views

1
2
3
4
SELECT dept_name AS deptname,
       count(*) AS numst
FROM student
GROUP BY dept_name;
deptname numst
Finance 1
History 1
Physics 4
Music 1
Comp. Sci. 4
Biology 1
Elec. Eng. 2
1
2
3
4
5
6
CREATE VIEW numstud AS (
  SELECT dept_name AS deptname,
         count(*) AS numst
  FROM student
  GROUP BY dept_name
);
CREATE VIEW
1
SELECT * FROM numstud;
deptname numst
Finance 1
History 1
Physics 4
Music 1
Comp. Sci. 4
Biology 1
Elec. Eng. 2
1
DROP VIEW numstud;
DROP VIEW

Materialized View

1
2
3
4
5
6
CREATE MATERIALIZED VIEW numstud AS (
  SELECT dept_name AS deptname,
         count(*) AS numst
  FROM student
  GROUP BY dept_name
);
SELECT 7
1
SELECT * FROM numstud;
deptname numst
Finance 1
History 1
Physics 4
Music 1
Comp. Sci. 4
Biology 1
Elec. Eng. 2
1
DELETE FROM student WHERE dept_name = 'Finance';
DELETE 1
  • view has not been refreshed, we still get the finance student
1
SELECT * FROM numstud;
deptname numst
Finance 1
History 1
Physics 4
Music 1
Comp. Sci. 4
Biology 1
Elec. Eng. 2
1
REFRESH MATERIALIZED VIEW numstud;
REFRESH MATERIALIZED VIEW
  • view has not been refreshed, we still get the finance student
1
SELECT * FROM numstud;
deptname numst
History 1
Physics 4
Music 1
Comp. Sci. 4
Biology 1
Elec. Eng. 2

Recursive queries

  • reachability in a graph between node x and y

    • base case: there is a direct edge from x to y
    • recursive case:

      • there exists a path from x to some node z
      • and there exists an edge from z to y
      • \(\Rightarrow\) there is a path from x to y.
  • example application:

    • does there exist a path in a labyrinth?

../laby.jpg

1
2
3
4
CREATE TABLE labyrinth (
  x int,
  y int
);
CREATE TABLE
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
INSERT INTO labyrinth VALUES
 (0,0),
 (1,0),
 (2,0),
 (3,0),
 (2,1),
 (0,2),
 (1,2),
 (2,2),
 (0,3);
INSERT 0 9
1
2
3
4
5
6
CREATE TABLE edge (
   x1 int,
   y1 int,
   x2 int,
   y2 int
);
CREATE TABLE
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
DELETE FROM edge;
INSERT INTO edge VALUES
  (0,0,1,0),
  (1,0,2,0),
  (2,0,3,0),
  (2,0,2,1),
  (2,1,2,2),
  (2,2,1,2),
  (1,2,0,2),
  (0,2,0,3);
INSERT INTO edge (SELECT x2,y2,x1,y1 FROM edge);
DELETE 14
INSERT 0 8
INSERT 0 8
1
SELECT * FROM edge;
x1 y1 x2 y2
0 0 1 0
1 0 2 0
2 0 3 0
2 0 2 1
2 1 2 2
2 2 1 2
1 2 0 2
0 2 0 3
1 0 0 0
2 0 1 0
3 0 2 0
2 1 2 0
2 2 2 1
1 2 2 2
0 2 1 2
0 3 0 2
1
2
3
4
5
6
7
8
9
WITH RECURSIVE reach AS (
  SELECT * FROM edge -- init
  UNION
  SELECT r.x1, r.y1, e.x2, e.y2 -- recursive step
  FROM reach r, -- recursive table
       edge e
  WHERE r.x2 = e.x1 AND r.y2 = e.y1)
SELECT EXISTS (SELECT * FROM reach
               WHERE x1 = 0 AND y1 = 0 AND x2 = 0 AND y2 = 3);
exists
t
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
WITH RECURSIVE reach AS (
  SELECT * FROM edge -- init
  UNION
  SELECT r.x1, r.y1, e.x2, e.y2 -- recursive step
  FROM reach r, -- recursive table
       edge e
  WHERE r.x2 = e.x1 AND r.y2 = e.y1)
SELECT *
FROM reach
WHERE x1 = 0 AND y1 = 0;
x2 y2
1 0
0 0
2 0
2 1
3 0
2 2
1 2
0 2
0 3
1
2
3
DELETE FROM edge
WHERE (x1 = 2 AND y1 = 1 AND x2 = 2 AND y2 = 2)
   OR (x2 = 2 AND y2 = 1 AND x1 = 2 AND y1 = 2);
DELETE 2
1
2
3
4
5
6
7
8
9
WITH RECURSIVE reach AS (
  SELECT * FROM edge -- init
  UNION
  SELECT r.x1, r.y1, e.x2, e.y2 -- recursive step
  FROM reach r, -- recursive table
       edge e
  WHERE r.x2 = e.x1 AND r.y2 = e.y1)
SELECT EXISTS (SELECT * FROM reach
               WHERE x1 = 0 AND y1 = 0 AND x2 = 0 AND y2 = 3);
exists
f
1
2
3
4
5
6
7
8
WITH RECURSIVE reach AS (
  SELECT * FROM edge -- init
  UNION
  SELECT r.x1, r.y1, e.x2, e.y2 -- recursive step
  FROM reach r, -- recursive table
       edge e
  WHERE r.x2 = e.x1 AND r.y2 = e.y1)
SELECT x2,y2 FROM reach WHERE x1 = 0 AND y1 = 0 ORDER BY x2,y2;
x2 y2
0 0
1 0
2 0
2 1
3 0

Scalar expression

1
2
SELECT s.*, COALESCE(t.course_id, 'N/A') AS course
FROM student s LEFT OUTER JOIN takes t ON (s.id = t.id)
id name dept_name tot_cred course
00128 Zhang Comp. Sci. 102 CS-101
00128 Zhang Comp. Sci. 102 CS-347
12345 Shankar Comp. Sci. 32 CS-101
12345 Shankar Comp. Sci. 32 CS-190
12345 Shankar Comp. Sci. 32 CS-315
12345 Shankar Comp. Sci. 32 CS-347
19991 Brandt History 80 HIS-351
44553 Peltier Physics 56 PHY-101
45678 Levy Physics 46 CS-101
45678 Levy Physics 46 CS-101
45678 Levy Physics 46 CS-319
54321 Williams Comp. Sci. 54 CS-101
54321 Williams Comp. Sci. 54 CS-190
55739 Sanchez Music 38 MU-199
76543 Brown Comp. Sci. 58 CS-101
76543 Brown Comp. Sci. 58 CS-319
76653 Aoi Elec. Eng. 60 EE-181
98765 Bourikas Elec. Eng. 98 CS-101
98765 Bourikas Elec. Eng. 98 CS-315
98988 Tanaka Biology 120 BIO-101
98988 Tanaka Biology 120 BIO-301
70557 Snow Physics 0 N/A
12121 Snow Physics 121 N/A
1
2
3
SELECT *
FROM student s LEFT OUTER JOIN takes t ON (s.id = t.id)
WHERE t.id IS NULL;
id name dept_name tot_cred id course_id sec_id semester year grade
70557 Snow Physics 0
12121 Snow Physics 121

querying the catalog

1
SELECT * FROM pg_tables WHERE schemaname = 'public';
schemaname tablename tableowner tablespace hasindexes hasrules hastriggers rowsecurity
public annoyingtest postgres f f f f
public department postgres t f t f
public course postgres t f t f
public instructor postgres t f t f
public section postgres t f t f
public classroom postgres t f t f
public teaches postgres t f t f
public student postgres t f t f
public takes postgres t f t f
public advisor postgres t f t f
public time_slot postgres t f f f
public prereq postgres t f t f
public AnnoYingtest postgres f f f f
public AnnoYingtest2 postgres f f f f
public labyrinth postgres f f f f
public edge postgres f f f f
public flights postgres t f f f
public history postgres f f f f
public myemptytable postgres f f f f
public otherstudents postgres f f f f
public sales postgres f f f f
public nohistory_time postgres t f t f
public addr postgres t f t f
public person postgres f f t f
public seminars postgres f f t f

Explain and stats

1
EXPLAIN (SELECT * FROM student);
QUERY PLAN
Seq Scan on student (cost=0.00..14.50 rows=450 width=152)
1
SELECT count(*) FROM student;
count
13
1
ANALYZE student;
ANALYZE
1
EXPLAIN (SELECT * FROM student);
QUERY PLAN
Seq Scan on student (cost=0.00..1.13 rows=13 width=25)
1
EXPLAIN (SELECT * FROM student);
1
EXPLAIN (SELECT * FROM student WHERE dept_name = 'Biology');
QUERY PLAN
Seq Scan on student (cost=0.00..1.16 rows=1 width=25)
Filter: ((dept_name)::text = 'Biology'::text)
1
EXPLAIN (SELECT * FROM student s, takes t WHERE s.id = t.id);
QUERY PLAN
Hash Join (cost=1.29..17.25 rows=470 width=169)
Hash Cond: ((t.id)::text = (s.id)::text)
-> Seq Scan on takes t (cost=0.00..14.70 rows=470 width=144)
-> Hash (cost=1.13..1.13 rows=13 width=25)
-> Seq Scan on student s (cost=0.00..1.13 rows=13 width=25)
1
ANALYZE takes;
ANALYZE
1
EXPLAIN (SELECT * FROM student s, takes t WHERE s.id = t.id);
QUERY PLAN
Hash Join (cost=1.29..2.58 rows=21 width=53)
Hash Cond: ((t.id)::text = (s.id)::text)
-> Seq Scan on takes t (cost=0.00..1.21 rows=21 width=28)
-> Hash (cost=1.13..1.13 rows=13 width=25)
-> Seq Scan on student s (cost=0.00..1.13 rows=13 width=25)

Lecture <2025-02-27 Thu>

More Recursive query examples

  • counting from 1 to 10

    • initialization: return 1
    • counting: add 1 to the existing values
    • stop once the value is larger than 10
1
2
3
4
5
6
7
8
WITH RECURSIVE numtoten AS (
  SELECT 1 AS i -- initialization a single row with value 1
  UNION
  SELECT i+1 -- recursive step, take current numbers and add 1
  FROM numtoten
  WHERE i < 10 -- do not create numbers larger than 10
)
SELECT * FROM numtoten;
i
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
WITH RECURSIVE numtoten AS (
  SELECT 1::float AS i -- initialization a single row with value 1
  UNION
  SELECT i+1 + (random() / 200) -- recursive step, take current numbers and add 1
  FROM numtoten,
       generate_series(1,2) x(d)
  WHERE i < 5 -- do not create numbers larger than 10
)
SELECT * FROM numtoten;
i
1
2.0001714144150156
2.002122472330663
3.0031854510136395
3.0029968079597205
3.0037162709849383
3.0041431209581915
4.003530503973554
4.007014264260187
4.00512423904581
4.007912865898786
4.004371219541383
4.0043651355843615
4.004705866918702
4.007563690786552
5.004543663789732
5.009422058008416
5.007680352970594
5.011417885444241
5.00705035969399
5.0054175793351074
5.005559668900293
5.011746987965729
5.0045335864244125
5.0119941427562935
5.006957469470024
5.0125919802431955
5.008139706792797
5.005546201482753
5.009516307077774
5.009519911622328
1
SELECT * FROM generate_series(0,2)
generate_series
0
1
2
1
2
3
4
5
CREATE TABLE rev (
 mo INT,
 y INT,
 rev FLOAT
);
CREATE TABLE
1
2
3
4
INSERT INTO rev
(SELECT mo, y, random() * 10000 AS rev
 FROM generate_series(1,12) mo(mo),
      generate_series(2020,2025) y(y));
INSERT 0 72
1
2
3
SELECT y, mo, rev, sum(rev) OVER (ORDER BY y, mo) AS uptorev
FROM rev
ORDER BY y, mo;
y mo rev uptorev
2020 1 5107.982357905938 5107.982357905938
2020 2 1466.238517288412 6574.220875194351
2020 3 6869.56816595842 13443.78904115277
2020 4 6726.84066813237 20170.62970928514
2020 5 2158.9027659985713 22329.53247528371
2020 6 3982.9762720960703 26312.508747379783
2020 7 6186.857768619582 32499.366515999365
2020 8 7734.064561315102 40233.43107731447
2020 9 8626.959936033585 48860.39101334805
2020 10 1875.2785040257659 50735.66951737382
2020 11 2892.5949785298367 53628.264495903655
2020 12 9759.190723570313 63387.455219473966
2021 1 5244.862758310722 68632.31797778468
2021 2 9220.48497116908 77852.80294895376
2021 3 5152.546555396691 83005.34950435045
2021 4 3299.031336561684 86304.38084091214
2021 5 5121.299684340593 91425.68052525274
2021 6 3832.999066405176 95258.67959165791
2021 7 990.7525335243572 96249.43212518227
2021 8 5136.297979768015 101385.73010495029
2021 9 6661.398407489703 108047.12851243999
2021 10 4753.874282198473 112801.00279463847
2021 11 2868.3896272169386 115669.3924218554
2021 12 9283.22244433161 124952.61486618701
2022 1 3591.9303468459952 128544.54521303301
2022 2 5889.249521919544 134433.79473495256
2022 3 5004.062815063874 139437.85755001643
2022 4 6031.3037251994065 145469.16127521583
2022 5 2061.9100365816644 147531.07131179748
2022 6 3352.388693019741 150883.46000481723
2022 7 3035.637988660025 153919.09799347725
2022 8 3012.864179219632 156931.9621726969
2022 9 714.2772860936053 157646.2394587905
2022 10 7312.486382396779 164958.72584118726
2022 11 2827.9535211497687 167786.67936233702
2022 12 7743.448826402661 175530.12818873968
2023 1 1768.8703870648492 177298.99857580452
2023 2 5399.6425902119745 182698.6411660165
2023 3 8902.81520609791 191601.4563721144
2023 4 7581.313501504696 199182.76987361908
2023 5 4729.071371167251 203911.84124478634
2023 6 2298.0761152846508 206209.91736007098
2023 7 110.93981363924365 206320.85717371022
2023 8 4812.665727444904 211133.5229011551
2023 9 3078.05338362241 214211.57628477752
2023 10 6460.677208700334 220672.25349347785
2023 11 7115.878340216006 227788.13183369386
2023 12 4672.11106622031 232460.24289991416
2024 1 3313.5569192540393 235773.7998191682
2024 2 6402.108452311308 242175.9082714795
2024 3 8332.243447404553 250508.15171888407
2024 4 9743.962082127724 260252.1138010118
2024 5 3336.9365485551007 263589.0503495669
2024 6 3743.326463850276 267332.3768134172
2024 7 7841.149794562068 275173.52660797926
2024 8 3027.1098667150854 278200.63647469436
2024 9 9824.899028839613 288025.53550353396
2024 10 3634.617462350955 291660.1529658849
2024 11 4357.112969377906 296017.2659352628
2024 12 8426.748711940721 304444.0146472035
2025 1 9661.387179837524 314105.401827041
2025 2 7946.5337978395255 322051.93562488054
2025 3 9917.628673822737 331969.5642987033
2025 4 1421.2821364843696 333390.8464351877
2025 5 4406.819910356055 337797.66634554375
2025 6 5078.031532004526 342875.6978775483
2025 7 1549.6162797975032 344425.3141573458
2025 8 6144.529288592362 350569.84344593814
2025 9 1565.302551833032 352135.1459977712
2025 10 8276.416441111263 360411.5624388824
2025 11 1580.1872128072202 361991.74965168966
2025 12 9109.985497584019 371101.7351492737
1
2
3
4
5
SELECT y, mo, rev, (SELECT sum(rev)
                    FROM rev r2
                   WHERE r2.y < r1.y OR (r2.y = r1.y AND r2.mo <= r1.mo)) AS uptorev
FROM rev r1
ORDER BY y, mo;
1
2
3
SELECT y, mo, rev, sum(rev) OVER (PARTITION BY y ORDER BY y, mo) AS uptorev
FROM rev
ORDER BY y, mo;
y mo rev uptorev
2020 1 5107.982357905938 5107.982357905938
2020 2 1466.238517288412 6574.220875194351
2020 3 6869.56816595842 13443.78904115277
2020 4 6726.84066813237 20170.62970928514
2020 5 2158.9027659985713 22329.53247528371
2020 6 3982.9762720960703 26312.508747379783
2020 7 6186.857768619582 32499.366515999365
2020 8 7734.064561315102 40233.43107731447
2020 9 8626.959936033585 48860.39101334805
2020 10 1875.2785040257659 50735.66951737382
2020 11 2892.5949785298367 53628.264495903655
2020 12 9759.190723570313 63387.455219473966
2021 1 5244.862758310722 5244.862758310722
2021 2 9220.48497116908 14465.347729479803
2021 3 5152.546555396691 19617.894284876493
2021 4 3299.031336561684 22916.925621438175
2021 5 5121.299684340593 28038.225305778768
2021 6 3832.999066405176 31871.224372183944
2021 7 990.7525335243572 32861.9769057083
2021 8 5136.297979768015 37998.27488547632
2021 9 6661.398407489703 44659.67329296602
2021 10 4753.874282198473 49413.54757516449
2021 11 2868.3896272169386 52281.93720238143
2021 12 9283.22244433161 61565.15964671304
2022 1 3591.9303468459952 3591.9303468459952
2022 2 5889.249521919544 9481.17986876554
2022 3 5004.062815063874 14485.242683829412
2022 4 6031.3037251994065 20516.54640902882
2022 5 2061.9100365816644 22578.45644561048
2022 6 3352.388693019741 25930.845138630222
2022 7 3035.637988660025 28966.48312729025
2022 8 3012.864179219632 31979.34730650988
2022 9 714.2772860936053 32693.624592603486
2022 10 7312.486382396779 40006.110975000265
2022 11 2827.9535211497687 42834.06449615004
2022 12 7743.448826402661 50577.513322552695
2023 1 1768.8703870648492 1768.8703870648492
2023 2 5399.6425902119745 7168.512977276823
2023 3 8902.81520609791 16071.328183374733
2023 4 7581.313501504696 23652.64168487943
2023 5 4729.071371167251 28381.71305604668
2023 6 2298.0761152846508 30679.78917133133
2023 7 110.93981363924365 30790.728984970574
2023 8 4812.665727444904 35603.39471241548
2023 9 3078.05338362241 38681.44809603789
2023 10 6460.677208700334 45142.12530473823
2023 11 7115.878340216006 52258.00364495423
2023 12 4672.11106622031 56930.11471117454
2024 1 3313.5569192540393 3313.5569192540393
2024 2 6402.108452311308 9715.665371565347
2024 3 8332.243447404553 18047.9088189699
2024 4 9743.962082127724 27791.870901097624
2024 5 3336.9365485551007 31128.807449652726
2024 6 3743.326463850276 34872.133913503
2024 7 7841.149794562068 42713.283708065064
2024 8 3027.1098667150854 45740.39357478015
2024 9 9824.899028839613 55565.29260361976
2024 10 3634.617462350955 59199.91006597072
2024 11 4357.112969377906 63557.023035348626
2024 12 8426.748711940721 71983.77174728934
2025 1 9661.387179837524 9661.387179837524
2025 2 7946.5337978395255 17607.92097767705
2025 3 9917.628673822737 27525.549651499787
2025 4 1421.2821364843696 28946.831787984156
2025 5 4406.819910356055 33353.65169834021
2025 6 5078.031532004526 38431.68323034474
2025 7 1549.6162797975032 39981.29951014224
2025 8 6144.529288592362 46125.828798734605
2025 9 1565.302551833032 47691.13135056764
2025 10 8276.416441111263 55967.5477916789
2025 11 1580.1872128072202 57547.73500448612
2025 12 9109.985497584019 66657.72050207014
1
2
3
SELECT y, mo, rev, sum(rev) OVER (ORDER BY y, mo ROWS BETWEEN 3 PRECEDING AND 2 FOLLOWING) AS uptorev
FROM rev
ORDER BY y, mo;
y mo rev uptorev
2020 1 5107.982357905938 13443.78904115277
2020 2 1466.238517288412 20170.62970928514
2020 3 6869.56816595842 22329.53247528371
2020 4 6726.84066813237 26312.508747379783
2020 5 2158.9027659985713 27391.384158093428
2020 6 3982.9762720960703 33659.21020212011
2020 7 6186.857768619582 35416.601972195276
2020 8 7734.064561315102 30565.039808088677
2020 9 8626.959936033585 31298.73202061994
2020 10 1875.2785040257659 37074.94647209418
2020 11 2892.5949785298367 36132.95146178533
2020 12 9759.190723570313 37619.371871639305
2021 1 5244.862758310722 34144.958491002406
2021 2 9220.48497116908 35568.71132353832
2021 3 5152.546555396691 37797.41602934908
2021 4 3299.031336561684 31871.224372183944
2021 5 5121.299684340593 27617.11414739758
2021 6 3832.999066405176 23532.92715599652
2021 7 990.7525335243572 25041.77900808953
2021 8 5136.297979768015 26496.621953726317
2021 9 6661.398407489703 24243.71189660266
2021 10 4753.874282198473 29693.935274529096
2021 11 2868.3896272169386 32295.113087850736
2021 12 9283.22244433161 33048.06463000226
2022 1 3591.9303468459952 31390.729037576435
2022 2 5889.249521919544 32668.158480577367
2022 3 5004.062815063874 31861.67888994209
2022 4 6031.3037251994065 25930.845138630222
2022 5 2061.9100365816644 25374.552780444254
2022 6 3352.388693019741 22498.167437744345
2022 7 3035.637988660025 18208.381908774074
2022 8 3012.864179219632 19489.564565971446
2022 9 714.2772860936053 20255.60805053955
2022 10 7312.486382396779 24646.668183922473
2022 11 2827.9535211497687 23379.900582327296
2022 12 7743.448826402661 25766.678993319638
2023 1 1768.8703870648492 33955.216913323944
2023 2 5399.6425902119745 34224.044032431855
2023 3 8902.81520609791 36125.16188244935
2023 4 7581.313501504696 30679.78917133133
2023 5 4729.071371167251 29021.858597905728
2023 6 2298.0761152846508 28434.881735138653
2023 7 110.93981363924365 22610.119912663155
2023 8 4812.665727444904 21489.483619858795
2023 9 3078.05338362241 23876.29058890755
2023 10 6460.677208700334 26250.32553984321
2023 11 7115.878340216006 29452.942645458003
2023 12 4672.11106622031 31042.38537032441
2024 1 3313.5569192540393 36296.57543410655
2024 2 6402.108452311308 39579.860307533934
2024 3 8332.243447404553 35800.91851587303
2024 4 9743.962082127724 34872.133913503
2024 5 3336.9365485551007 39399.72678881103
2024 6 3743.326463850276 36024.728203214814
2024 7 7841.149794562068 37517.383784649865
2024 8 3027.1098667150854 31408.039164873095
2024 9 9824.899028839613 32428.2155856959
2024 10 3634.617462350955 37111.63783378635
2024 11 4357.112969377906 38931.87521906181
2024 12 8426.748711940721 43851.29915018625
2025 1 9661.387179837524 43944.02879516937
2025 2 7946.5337978395255 41730.69346930278
2025 3 9917.628673822737 41780.400410280934
2025 4 1421.2821364843696 38431.68323034474
2025 5 4406.819910356055 30319.912330304716
2025 6 5078.031532004526 28517.907821057554
2025 7 1549.6162797975032 20165.581699067847
2025 8 6144.529288592362 27020.71600369474
2025 9 1565.302551833032 24194.083306145905
2025 10 8276.416441111263 28226.0372717254
2025 11 1580.1872128072202 26676.420991927895
2025 12 9109.985497584019 20531.891703335536

Lecture <2025-03-11 Tue>

  • relational algebra (homework)

    • join (outer versus inner)
  • SQL query

    • when to use WITH
    • WHERE versus HAVING
    • RECURSIVE query

Joins

employee

name salary dept aid
Peter 50000 HR 1
Bob 30000 IT 2
Alice 100000 CEO 1
Felix 100 HR NULL

address

aid street city
1 gggg c1
2 hhhh c2
3 jjjj c1
  • crossproduct \(R \times S\) (\(\rho_{name,salary,dept,aid1}(employee) \times address\))
name salary dept aid1 aid street city
Peter 50000 HR 1 2 hhhh c2s
  • natural join \(R \bowtie S)
name salary dept aid street city
Peter 50000 HR 1 gggg c1
  • theta join \(R \bowtie_{\theta} S\) \(\rho_{name,salary,dept,aid1}(employee) \bowtie_{aid1 \neq aid} address\)
  • theta join \(R =\bowtie_{\theta} S\) \(\rho_{name,salary,dept,aid1}(employee) \bowtie_{aid1 = aid} address\)
name salary dept aid1 aid street city
Felix 100 HR NULL NULL NULL NULL

Joins in SQL

1
SELECT * FROM student LIMIT 3;
id name dept_name tot_cred
00128 Zhang Comp. Sci. 104
12345 Shankar Comp. Sci. 34
19991 Brandt History 82
1
SELECT * FROM takes LIMIT 3;
id course_id sec_id semester year grade
00128 CS-101 1 Fall 2009 A
00128 CS-347 1 Fall 2009 A-
12345 CS-101 1 Fall 2009 C
  • natural join
1
2
3
SELECT name, dept_name, course_id
FROM student s NATURAL JOIN takes t
LIMIT 3;
name dept_name course_id
Zhang Comp. Sci. CS-101
Zhang Comp. Sci. CS-347
Shankar Comp. Sci. CS-101

\(\sigma_{\theta}(R \times S) \equiv R \bowtie_{\theta} S\)

  • natural join
1
2
3
SELECT name, dept_name, course_id
FROM student s JOIN takes t ON (s.id = t.id)
LIMIT 1;
name dept_name course_id
Zhang Comp. Sci. CS-101
1
2
3
4
SELECT s.name, dept_name, course_id
FROM student s, takes t
WHERE s.id = t.id
LIMIT 1;
name dept_name course_id
Zhang Comp. Sci. CS-101
1
2
SELECT name, dept_name, course_id
FROM student s LEFT OUTER JOIN takes t ON (s.id = t.id);
name dept_name course_id
Zhang Comp. Sci. CS-101
Zhang Comp. Sci. CS-347
Shankar Comp. Sci. CS-101
Shankar Comp. Sci. CS-190
Shankar Comp. Sci. CS-315
Shankar Comp. Sci. CS-347
Brandt History HIS-351
Peltier Physics PHY-101
Levy Physics CS-101
Levy Physics CS-101
Levy Physics CS-319
Williams Comp. Sci. CS-101
Williams Comp. Sci. CS-190
Sanchez Music MU-199
Brown Comp. Sci. CS-101
Brown Comp. Sci. CS-319
Aoi Elec. Eng. EE-181
Bourikas Elec. Eng. CS-101
Bourikas Elec. Eng. CS-315
Tanaka Biology BIO-101
Tanaka Biology BIO-301
Snow Physics
Snow Physics
1
2
SELECT s1.id AS s1, s2.id AS s2
FROM student s1 FULL OUTER JOIN student s2  ON (s1.dept_name = s2.dept_name AND s1.id != s2.id);
s1 s2
00128 76543
00128 54321
00128 12345
12345 76543
12345 54321
12345 00128
19991
44553 12121
44553 70557
44553 45678
45678 12121
45678 70557
45678 44553
54321 76543
54321 12345
54321 00128
55739
70557 12121
70557 45678
70557 44553
76543 54321
76543 12345
76543 00128
76653 98765
98765 76653
98988
12121 70557
12121 45678
12121 44553
19991
98988
55739

WHERE vs HAVING

  • WHERE before aggregation or for queries without aggregation
  • HAVING selection after aggregation
1
2
3
4
5
SELECT count(*), dept_name
FROM student
WHERE tot_cred > 60
GROUP BY dept_name
HAVING count(*) > 1;
count dept_name
2 Elec. Eng.
1
2
3
4
5
6
SELECT *
FROM (SELECT avg(tot_cred), dept_name
      FROM student
      WHERE tot_cred > 60
      GROUP BY dept_name) sub
WHERE avg >22;
avg dept_name
122.0000000000000000 Biology
81.0000000000000000 Elec. Eng.
104.0000000000000000 Comp. Sci.
82.0000000000000000 History
123.0000000000000000 Physics
1
2
SELECT count(*) OVER (PARTITION BY dept_name ORDER BY tot_cred ASC), dept_name, id, tot_cred
FROM student;
count dept_name id tot_cred
1 Biology 98988 122
1 Comp. Sci. 12345 34
2 Comp. Sci. 54321 56
3 Comp. Sci. 76543 60
4 Comp. Sci. 00128 104
1 Elec. Eng. 76653 62
2 Elec. Eng. 98765 100
1 History 19991 82
1 Music 55739 40
1 Physics 70557 2
2 Physics 45678 48
3 Physics 44553 58
4 Physics 12121 123

universial quantification

1
2
3
4
5
6
7
8
SELECT *
FROM student s
WHERE NOT EXISTS (SELECT *
                  FROM course c
                  WHERE NOT EXISTS (SELECT *
                                    FROM takes t
                                    WHERE t.id = s.id
                                          AND c.course_id = t.course_id));
id name dept_name tot_cred

HW1 Q1

\[ \pi_{hotel,city,roomnr,price}(\sigma_{floor = 1}(room)) \]

Lecture <2025-03-18 Tue>

./university.jpg

  • order
supplier customer item
Bob Peter Chainsaw
Bob Alice Fork
Felix Peter Fork
Felix Alice Chainsaw
supplier customer item
Bob Peter Chainsaw
Bob Alice Fork
Felix Peter Fork
Felix Peter Chainsaw
Felix Alice Chainsaw
  • supplies-to
supplier customer
Bob Peter
Bob Alice
Felix Peter
Felix Alice
  • orders-item
customer item
Peter Chainsaw
Peter Fork
Alice Chainsaw
Alice Fork
  • supplies-item
supplier item
Bob Chainsaw
Bob Fork
Felix Chainsaw
Felix Fork

Lecture <2025-03-20 Thu>

student

name major zip city
Peter CS 60653 Chicago
Bob CS 60653 Chicago
Alice CS 60615 Chicago
Gertrud Bio 60777 Schaumburg
Lisa Bio 60777 Schaumburg
Hubert CS 55555 New York

\(major \not\rightarrow city\)

\(zip \not\rightarrow name\)

\(\sigma_3: zip \rightarrow city\) (holds on the instance, but rural areas may violate that)

\(\sigma_1: name \rightarrow major, zip, city\) (is a key, at least on the instance, but probably not in general)

\(\sigma_2: name, major \rightarrow zip, city\)

student2(name,major,zip) \(\pi_{name,major,zip}(student)\)

name major zip
Peter CS 60653
Bob CS 60653
Alice CS 60615
Gertrud Bio 60777
Lisa Bio 60777
Hubert CS 55555

address(city,zip) = \(\pi_{zip,city}(student)\) \(zip \rightarrow city\)

zip city
60653 Chicago
60615 Chicago
60777 Schaumburg
55555 New York

\(student2 \bowtie address\)

name major zip city
Peter CS 60653 Chicago
Bob CS 60653 Chicago
Alice CS 60615 Chicago
Gertrud Bio 60777 Schaumburg
Lisa Bio 60777 Schaumburg
Hubert CS 55555 New York

\(name \rightarrow major, zip, city\)

\(name \rightarrow major\)

  • \(zip \rightarrow city\)
  • \(zip, major \rightarrow city, major\)
  • \(major, city \rightarrow major\)

\(UIN \rightarrow Name\)

UIN Name
1 Peter
1 Bob

\(A,B \rightarrow A, C\)

\(A_1, \ldots, A_n\)

\(A_i, A_j \rightarrow A_i\)

\(S = \{e_1, e_2, e_3 \}\)

e_1 e2 e3 subset
0 0 0 {}
0 0 1 {e3}
0 1 0 {e2}
0 1 1 {e2,e3}
1 0 0 {e1}
1 0 1 {e1,e3}
1 1 0 {e1,e2}
1 1 1 {e1,e2,e3}
a b \(a \rightarrow b\)
1 1 1
1 0 0
0 0 1
0 1 1

\(\Sigma = \{ a \rightarrow b; bc \rightarrow d; b \rightarrow c \}\)

  • \(a \rightarrow c\) (transitivity)
  • \(a \rightarrow d\)

–\(b \rightarrow c\) (augmentation) \(b \rightarrow bc\)

  • \(b \rightarrow bc\) (transitivity) \(b \rightarrow d\)
  • \(b \rightarrow d\) (transitivty) \(a \rightarrow d\)

Lecture <2025-04-08 Tue>

A B C
1 2 4
2 2 4
3 3 5

\(R_1 = \{A,B\}\) and \(R_2 = \{B,C\}\)

A B
1 2
2 2
3 3
B C
2 4
3 5

\(R_1 \bowtie R_2\)

A B C
1 1 1
2 1 2
A B
1 1
2 1
B C
1 1
1 2
A B C
1 1 1
1 1 2
2 1 1
2 1 2
inst salary dept budget
Boris Glavic 0 CS 50k
Chris Kanich 100000 CS 50k
  • Candidate key \(inst\)

\(inst \rightarrow salary, dept, budget\) \(dept \rightarrow budget\)

Lecture <2025-04-10 Thu>

\[ S_1 = r_1(A), r_2(A) \]

\[ S_2 = r_2(A), r_1(A) \]

Lecture <2025-04-15 Tue>

\(S_1 = w_1(A), w_2(B)\)

\(S_2 = w_2(A), w_1(B)\)

Conflicting operations

\(T_1: w_1(A), w_1(B)\) \(T_2: r_2(B), r_2(A)\)

\(S_1 = o_1: w_1(A), o_2: r_2(B), o_3: r_2(A), o_4: w_1(B)\)

  • Conflicting operations:

    • \((o_1 < o_3)\)
    • \((o_2 < o_4)\)

\(S_2 = o_2: r_2(B), o_1: w_1(A), o_3: r_2(A), o_4: w_1(B)\)

  • Conflicting operations:

    • \((o_1 < o_3)\)
    • \((o_2 < o_4)\)

\(S_3 = w_1(A), w_1(B), r_2(B), r_2(A)\)

  • Conflicting operations:

    • \((o_1 < o_3)\)
    • \((o_4 < o_2)\)

\(S_4 = r_2(B), r_2(A), w_1(A), w_1(B)\)

  • Conflicting operations:

    • \((o_3 < o_1)\)
    • \((o_2 < o_4)\)

number of serial schedules

  • \(n!\)
  • \(T_1, T_2, T_3\)
  • \(T_1, T_3, T_3\)
  • \(T_2, T_1, T_3\)
  • \(T_2, T_3, T_1\)
  • \(T_3, T_1, T_2\)
  • \(T_3, T_2, T_1\)

Lecture <2025-04-17 Thu>

Lecture <2025-04-22 Tue>

Schedules

example 1

\[ S = w_1(A), r_2(B), r_3(B), w_2(B), r_4(A), c_1, c_2, c_3, c_4 \]

  • conflict serializable as the graph does not have cycles
  • recoverability related properties

    • recoverable
    • not cascadeless
    • not strict

../cg1.jpg

example 2

\[ S = w_1(A), w_4(C), r_2(B), r_3(B), w_2(B), r_4(A), w_3(B), r_1(C), c_1, c_2, c_3, c_4 \]

  • not conflict serializable as the graph has cycles
  • recoverability related properties:

    • not recoverable
    • not cascadeless
    • not strict

../cg2.jpg

Lecture <2025-04-24 Thu>

Lecture <2025-04-29 Tue>

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
DROP TABLE IF EXISTS r;
CREATE TABLE r (a int, b int, c int, d int);

INSERT INTO r (
SELECT a, round(random() * 100000) AS b, round(random() * 100000) AS c, round(random() * 100) AS d
FROM generate_series(1,1000000) g(a))
;

DROP TABLE IF EXISTS s;
CREATE TABLE s (a int, b int, c int, d int);

INSERT INTO s (
SELECT a, round(random() * 100000) AS b, round(random() * 100000) AS c, round(random() * 100) AS d
FROM generate_series(1,1000000) g(a))
;

ANALYZE;
DROP TABLE
CREATE TABLE
INSERT 0 1000000
DROP TABLE
CREATE TABLE
INSERT 0 1000000
ANALYZE
1
SELECT count(*) FROM r;
count
1000000
1
CREATE INDEX r_b_idx ON r(b);
CREATE INDEX
1
EXPLAIN (ANALYZE true, BUFFERS true) SELECT sum(c) FROM s WHERE b <= 30000;
QUERY PLAN
Finalize Aggregate (cost=11928.34..11928.35 rows=1 width=8) (actual time=19.158..20.142 rows=1 loops=1)
Buffers: shared hit=5406
-> Gather (cost=11928.13..11928.34 rows=2 width=8) (actual time=19.088..20.138 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=5406
-> Partial Aggregate (cost=10928.13..10928.14 rows=1 width=8) (actual time=17.414..17.414 rows=1 loops=3)
Buffers: shared hit=5406
-> Parallel Seq Scan on s (cost=0.00..10614.33 rows=125517 width=4) (actual time=0.010..14.062 rows=99993 loops=3)
Filter: (b <= 30000)
Rows Removed by Filter: 233341
Buffers: shared hit=5406
Planning:
Buffers: shared hit=37
Planning Time: 0.217 ms
Execution Time: 20.192 ms
1
ALTER TABLE r ADD CONSTRAINT pk_r PRIMARY KEY(a);
ALTER TABLE
1
SELECT * FROM r WHERE a = 1000000;
a b c d
1000000 55211 70312 24

Lecture <2025-05-01 Thu>

ER model

  • weak entity and identifying relationship

./weakent.jpg

  • in this case probably ssn would be a key for worker, but contract is only identified through the startdate + the worker it belongs to (assuming that a worker cannot have concurrent contracts). In this case contract should be a weak entity identified by (double diamond) to worker.

FD and Normalization Algorithms

Canonical Cover

  • union rule

    • \(\Sigma = \{ a \rightarrow b,c; a \rightarrow d\}\) can be merged into \(a \rightarrow b,c,d\)
  • extraneous attributes

    • remove from LHS or RHS based on attribute closure test (see slides)
  • fixed point computation

\(\Sigma = \{ A \rightarrow B, B \rightarrow C, AB \rightarrow CD\}\)

iteration 1

  • no FDs with same LHS (no union rule applicable)
  • claim that \(C\) is redundant in \(AB \rightarrow CD\)

    • \(AB^+\) using \(\Sigma' = \{ A \rightarrow B, B \rightarrow C, AB \rightarrow D\}\) \(AB^+ = ABCD \supseteq CD\)
  • So \(C\) redundant

iteration 2

\(\Sigma' = \{ A \rightarrow B, B \rightarrow C, AB \rightarrow D\}\)

  • no FDs with same LHS (no union rule applicable)
  • claim that \(B\) is extraneous in \(AB \rightarrow D\)

    • \(A^+ = ABCD \supseteq D\)
    • yes

\(\Sigma'' = \{ A \rightarrow B, B \rightarrow C, A \rightarrow D\}\)

iteration 3

  • union (yes) \(\Sigma''' = \{ A \rightarrow BD, B \rightarrow C\}\)
  • claim that \(B\) is extraneous in \(A \rightarrow BD\)

    • \(A^+\) with \(\Sigma'''' = \{ A \rightarrow D, B \rightarrow C\}\)
    • \(A^+ = AD \not\supseteq BD\)
    • B is not extraneous

iteration 4

\(\Sigma''' = \{ A \rightarrow BD, B \rightarrow C\}\)

  • union (no same LHS)
  • still nothing extraneous

break (have canonical cover)

  • return \(\Sigma_C = \{ A \rightarrow BD, B \rightarrow C\}\)

Decomposition 3NF decomposition

  • input needs canonical cover to be correct
  • create one table per FD
  • if necessary create table for some candidate key
  • remove tables whose attributes are subsets of another table

\(\Sigma_C = \{ A \rightarrow BD, B \rightarrow C\}\)

\(R_1 = (ABD)\) - contains candidate key A \(R_2 = (BC)\)

  • some candidate key? \(A\) because \(A^+ = ABCD\) and single attribute superkeys are always
  • if we would have \(R_i = (ABC)\) and \(R_j = BC\) as \(BC \subseteq ABC\) we would remove \(R_j\)

Decomposition BCNF

\(\Sigma_C = \{ A \rightarrow BD, B \rightarrow C\}\)

\(R = (ABCD)\)

\(B \rightarrow C\) with \(B\) not being a superkey, split fragment into two

\(R_1 = (ABD)\) \(R_2 = (BC)\)

iteration 2

  • is \(R_1\) in BCNF? yes
  • is \(R_2\) in BCNF? yes
Last updated on 24 Aug 2020
Published on 24 Aug 2020