Lecture Notes for CS480
Overview
Lecture
https://www.cs.uic.edu/~bglavic/cs480/2024-spring/homework/
Lecture
Crossproduct
\(S_1 = \{e_1, e_2 \}\) \(S_2 = \{a_1, a_2\}\)
\(S_1 \times S_2\)
| S1 | S2 |
|---|---|
| e1 | a1 |
| e1 | a2 |
| e2 | a1 |
| e2 | a2 |
Relations
\(\mathbb{D}_1 = \{ 0,1 \}\) \(\mathbb{D}_2 = \{ red, green, blue \}\)
\(\mathbf{Shirts}(color: \mathbb{D}_2, is-muscle: \mathbb{D}_1)\)
\(\mathbb{D}_2 \times \mathbb{D}_1 = \{ (red, 0), (green, 0), (blue, 1), \dlots \}\)
| color | is-muscle |
|---|---|
| green | 1 |
| red | 0 |
| blue | 1 |
| blue | 0 |
Candidate Keys
- Lastname + Firstname is unique
- Lastname or Firstname by itself is not
| Lastname | Firstname | Age |
|---|
- ID is unique
- Lastname + Firstname is unique
- Lastname or Firstname by itself is not
| ID | Lastname | Firstname | Age |
|---|
Lecture
Relational algebra
Selection example
$\sigma_{city = 'Chicago'}(Add)$
- table
Add
| state | city | zip |
|---|---|---|
| IL | Chicago | 66014 |
| IL | Schaumberg | 66077 |
| CA | San Francisco | 11555 |
- result
| state | city | zip |
|---|---|---|
| IL | Chicago | 66014 |
Lecture
Lecture
Project Use Case
Ideas
- library (books) - 9 (14)
- sports statistics - 9 (9)
- public transportation - 7
- inventory management - 2
- game ratings / store - 7
- online retailer - 7
- crime statistics - 2
- hospital management - 8
- stock market - 5
relational algebra
|
|
| id | name | dept_name | tot_cred |
|---|---|---|---|
| 00128 | Zhang | Comp. Sci. | 102 |
| 12345 | Shankar | Comp. Sci. | 32 |
| 19991 | Brandt | History | 80 |
| 23121 | Chavez | Finance | 110 |
| 44553 | Peltier | Physics | 56 |
| 98988 | Tanaka | Biology | 120 |
\(\gamma_{dept\_name; count(id)}(student)\)
\(\gamma_{count(id)}(\sigma_{dept\_name=CS}(student))\) \(\sigma_{tot\_cred > 100}(student)\)
\(\sigma_{dept\_name = History \lor dept\_name = Finance}(student)\)
\(\sigma_{dept\_name = History \land tot\_cred < 20}(student)\)
\(\pi_{tot\_cred / 10, name}(student)\)
\(\pi_{name}(\sigma_{dept\_name = History}(student))\)
\[ \pi_{name, course\_id}(student \bowtie_{id = id'} \rho_{id' \gets id}(takes)) \]
|
|
| id | course_id | sec_id | semester | year | grade |
|---|---|---|---|---|---|
| 00128 | CS-101 | 1 | Fall | 2009 | A |
| 00128 | CS-347 | 1 | Fall | 2009 | A- |
| 12345 | CS-101 | 1 | Fall | 2009 | C |
|
|
| course_id | title | dept_name | credits |
|---|---|---|---|
| BIO-101 | Intro. to Biology | Biology | 4 |
| BIO-301 | Genetics | Biology | 4 |
| BIO-399 | Computational Biology | Biology | 3 |
|
|
| id | name | dept_name | salary |
|---|---|---|---|
| 10101 | Srinivasan | Comp. Sci. | 65000.00 |
| 12121 | Wu | Finance | 90000.00 |
| 15151 | Mozart | Music | 40000.00 |
\(\pi_{name}(student) \cup \pi_{name}(instructor)\)
\(\pi_{name}(instructor) - \pi_{name}(student)\)
\(\pi_{name}(instructor) \cap \pi_{name}(student)\)
Lecture
|
|
| time_slot_id | day | start_hr | start_min | end_hr | end_min |
|---|---|---|---|---|---|
| A | M | 8 | 0 | 8 | 50 |
| A | W | 8 | 0 | 8 | 50 |
| A | F | 8 | 0 | 8 | 50 |
| B | M | 9 | 0 | 9 | 50 |
| B | W | 9 | 0 | 9 | 50 |
| B | F | 9 | 0 | 9 | 50 |
| C | M | 11 | 0 | 11 | 50 |
| C | W | 11 | 0 | 11 | 50 |
| C | F | 11 | 0 | 11 | 50 |
| D | M | 13 | 0 | 13 | 50 |
| D | W | 13 | 0 | 13 | 50 |
| D | F | 13 | 0 | 13 | 50 |
| E | T | 10 | 30 | 11 | 45 |
| E | R | 10 | 30 | 11 | 45 |
| F | T | 14 | 30 | 15 | 45 |
| F | R | 14 | 30 | 15 | 45 |
| G | M | 16 | 0 | 16 | 50 |
| G | W | 16 | 0 | 16 | 50 |
| G | F | 16 | 0 | 16 | 50 |
| H | W | 10 | 0 | 12 | 30 |
|
|
| count | length |
|---|---|
| 4 | 75 |
| 15 | 50 |
| 1 | 150 |
- return all student
|
|
| id | name | dept_name | tot_cred |
|---|---|---|---|
| 00128 | Zhang | Comp. Sci. | 102 |
| 12345 | Shankar | Comp. Sci. | 32 |
| 19991 | Brandt | History | 80 |
| 23121 | Chavez | Finance | 110 |
| 44553 | Peltier | Physics | 56 |
| 45678 | Levy | Physics | 46 |
| 54321 | Williams | Comp. Sci. | 54 |
| 55739 | Sanchez | Music | 38 |
| 70557 | Snow | Physics | 0 |
| 76543 | Brown | Comp. Sci. | 58 |
| 76653 | Aoi | Elec. Eng. | 60 |
| 98765 | Bourikas | Elec. Eng. | 98 |
| 98988 | Tanaka | Biology | 120 |
- give all info for student with id 12345
|
|
| id | name | dept_name | tot_cred |
|---|---|---|---|
| 12345 | Shankar | Comp. Sci. | 32 |
- names of students with tot_cred smaller than 10
|
|
| name |
|---|
| Snow |
- number of students that have taken more than 50 credits
|
|
| count |
|---|
| 9 |
- for each credit amount, the number of students with this many credits (per 10 credits)
|
|
| ?column? | count |
|---|---|
| 110 | 1 |
| 30 | 1 |
| 50 | 2 |
| 40 | 1 |
| 0 | 1 |
| 100 | 2 |
| 60 | 3 |
| 120 | 1 |
| 80 | 1 |
- the smallest, average, and maximum number of credits over all student
|
|
| min | avg | max |
|---|---|---|
| 0 | 65.6923076923076923 | 120 |
- the smallest, average, and maximum number of credits over all student per dept
|
|
| dept_name | min | avg | max |
|---|---|---|---|
| Finance | 110 | 110.0000000000000000 | 110 |
| History | 80 | 80.0000000000000000 | 80 |
| Physics | 0 | 34.0000000000000000 | 56 |
| Music | 38 | 38.0000000000000000 | 38 |
| Comp. Sci. | 32 | 61.5000000000000000 | 102 |
| Biology | 120 | 120.0000000000000000 | 120 |
| Elec. Eng. | 60 | 79.0000000000000000 | 98 |
|
|
| dept_name | diff |
|---|---|
| Music | 0 |
| History | 0 |
| Finance | 0 |
| Biology | 0 |
| Elec. Eng. | 38 |
| Physics | 56 |
| Comp. Sci. | 70 |
|
|
| cmax | id | name | dept_name | tot_cred |
|---|---|---|---|---|
| 0 | 00128 | Zhang | Comp. Sci. | 102 |
| 0 | 12345 | Shankar | Comp. Sci. | 32 |
| 0 | 19991 | Brandt | History | 80 |
| 0 | 23121 | Chavez | Finance | 110 |
| 0 | 44553 | Peltier | Physics | 56 |
| 0 | 45678 | Levy | Physics | 46 |
| 0 | 54321 | Williams | Comp. Sci. | 54 |
| 0 | 55739 | Sanchez | Music | 38 |
| 0 | 70557 | Snow | Physics | 0 |
| 0 | 76543 | Brown | Comp. Sci. | 58 |
| 0 | 76653 | Aoi | Elec. Eng. | 60 |
| 0 | 98765 | Bourikas | Elec. Eng. | 98 |
| 0 | 98988 | Tanaka | Biology | 120 |
|
|
| a |
|---|
| 102 |
| 32 |
| 80 |
| 110 |
| 56 |
| 46 |
| 54 |
| 38 |
| 0 |
| 58 |
| 60 |
| 98 |
| 120 |
|
|
| name |
|---|
| Snow |
Lecture
|
|
| id | name | dept_name | tot_cred |
|---|---|---|---|
| 98988 | Tanaka | Biology | 120 |
| 98765 | Bourikas | Elec. Eng. | 98 |
| 55739 | Sanchez | Music | 38 |
| 45678 | Levy | Physics | 46 |
| 19991 | Brandt | History | 80 |
| 76653 | Aoi | Elec. Eng. | 60 |
| 00128 | Zhang | Comp. Sci. | 102 |
| 76543 | Brown | Comp. Sci. | 58 |
| 54321 | Williams | Comp. Sci. | 54 |
| 12345 | Shankar | Comp. Sci. | 32 |
| 23121 | Chavez | Finance | 110 |
| 44553 | Peltier | Physics | 56 |
|
|
| id | name | dept_name | tot_cred |
|---|---|---|---|
| 70557 | Snow | Physics | 0 |
|
|
| id | name | dept_name | tot_cred |
|---|---|---|---|
| 70557 | Snow | Physics | 0 |
|
|
| id | name | dept_name | salary | numst |
|---|---|---|---|---|
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | 1 |
| 12121 | Wu | Finance | 90000.00 | 1 |
| 15151 | Mozart | Music | 40000.00 | 1 |
| 22222 | Einstein | Physics | 95000.00 | 1 |
| 32343 | El Said | History | 60000.00 | 1 |
| 33456 | Gold | Physics | 87000.00 | 1 |
| 45565 | Katz | Comp. Sci. | 75000.00 | 1 |
| 58583 | Califieri | History | 62000.00 | 1 |
| 76543 | Singh | Finance | 80000.00 | 1 |
| 76766 | Crick | Biology | 72000.00 | 1 |
| 83821 | Brandt | Comp. Sci. | 92000.00 | 1 |
| 98345 | Kim | Elec. Eng. | 80000.00 | 1 |
|
|
| id | name | dept_name | salary |
Lecture
Relational Algebra
orders
| custid | item | quantity | price | shippingfee |
|---|---|---|---|---|
| 1 | Toothpaste | 100 | 5.99 | 30 |
| 1 | Shampoo | 1 | 12.99 | 0 |
| 2 | Shampoo | 3 | 12.99 | 3 |
-
per customer compute to the total money they owe us
- for each order the quantity times price plus the shipping fee (\(\pi\))
- then summed up across all orders (\(\gamma\)) - group by
custid
- show only customers where the total is smaller 500 (\(\sigma\))
\(orderprices \leftarrow \rho_{custid, orderprice}(\pi_{custid, quantity * price + shippingfee}(orders))\) \(orderprices \leftarrow \rho_{orderprice \leftarrow quantity * price + shippingfee}(\pi_{custid, quantity * price + shippingfee}(orders))\)
| custid | orderprice |
|---|---|
| 1 | 629.00 |
| 1 | 12.99 |
| 2 | 38.97 |
\(totalcustorders \leftarrow \rho_{totalorders \leftarrow sum(orderprice)}(\gamma_{custid; sum(orderprice)}(orderprices))\)
| custid | totalorders |
|---|---|
| 1 | 641.99 |
| 2 | 38.97 |
\(\sigma_{totalorders < 500}(totalcustorders)\)
| custid | totalorders |
|---|---|
| 2 | 38.97 |
- get largest and average totalorders (maximum totalorders across all operators) (\(\gamma\))
\(maxttl \leftarrow \rho_{maxtotal,avgtotal}(\gamma_{max(totalorders),avg(totalorders)}(totalcustorders))\)
| maxtotal | avgtotal |
|---|---|
| 641.99 | 340.48 |
\(\pi_{maxtotal,avgtotal,custid}(\sigma_{maxtotal = totalorders}(maxttl \times totalcustorders))\)
\(\pi_{maxtotal,avgtotal,custid}(maxttl \bowtie_{maxtotal = totalorders} totalcustorders)\)
| maxtotal | avgtotal | custid |
|---|---|---|
| 641.99 | 340.48 | 1 |
- get customer with maximum totalorders
- equivalences
\(R \bowtie (S \bowtie T) \equiv (R \bowtie S) \bowtie T\)
Lecture
|
|
| CREATE TABLE |
|---|
| INSERT 0 4 |
|
|
| f | t |
|---|---|
| a | b |
| b | c |
| b | d |
| d | e |
- paths of length 2
|
|
| f | t |
|---|---|
| a | d |
| a | c |
| b | e |
- paths of length up to 2
|
|
| f | t |
|---|---|
| a | b |
| b | c |
| a | d |
| a | c |
| d | e |
| b | d |
| b | e |
- paths of length up to 3
|
|
| f | t |
|---|---|
| b | c |
| b | e |
| d | e |
| a | e |
| a | c |
| a | b |
| a | d |
| b | d |
|
|
| course_id | prereq_id | cnt |
|---|---|---|
| BIO-301 | BIO-101 | 1 |
| BIO-399 | BIO-101 | 1 |
| CS-190 | CS-101 | 1 |
| CS-315 | CS-101 | 1 |
| CS-319 | CS-101 | 1 |
| CS-347 | CS-101 | 1 |
| EE-181 | PHY-101 | 1 |
|
|
| f | t | cnt |
|---|---|---|
| a | b | 1 |
| b | c | 1 |
| b | d | 1 |
| d | e | 1 |
| a | c | 2 |
| a | d | 2 |
| b | e | 2 |
|
|
| INSERT 0 1 |
|
|
| f | t | cnt |
|---|---|---|
| a | b | 1 |
| b | c | 1 |
| b | d | 1 |
| d | e | 1 |
| a | a | 1 |
| a | b | 2 |
| a | a | 2 |
| a | c | 2 |
| a | d | 2 |
| b | e | 2 |
| a | b | 3 |
| a | a | 3 |
| a | c | 3 |
| a | d | 3 |
| a | e | 3 |
| a | b | 4 |
| a | a | 4 |
| a | c | 4 |
| a | d | 4 |
| a | e | 4 |
| a | b | 5 |
| a | a | 5 |
| a | c | 5 |
| a | d | 5 |
| a | e | 5 |
Lecture
|
|
| schemaname | count |
|---|---|
| public | 21 |
| pg_catalog | 64 |
| information_schema | 4 |
Lecture
- all student of the biology department
|
|
| id | name | dept_name | tot_cred |
|---|---|---|---|
| 98988 | Tanaka | Biology | 120 |
- all student of the biology department with at least 20 credits
|
|
| id | name | dept_name | tot_cred |
|---|---|---|---|
| 98988 | Tanaka | Biology | 120 |
- count student of the biology department with at least 20 credits
|
|
| count |
|---|
| 1 |
- find student with the least credits (return credits)
|
|
| maxttl |
|---|
| 120 |
- find all student information with the least credits (return credits)
|
|
| id | name | dept_name | tot_cred | maxttl |
|---|---|---|---|---|
| 98988 | Tanaka | Biology | 120 | 120 |
|
|
| id | name | dept_name | tot_cred |
|---|---|---|---|
| 98988 | Tanaka | Biology | 120 |
|
|
| id | name | dept_name | tot_cred |
|---|---|---|---|
| 98988 | Tanaka | Biology | 120 |
- every student that has the max credits in their department
|
|
| id | name | dept_name | tot_cred |
|---|---|---|---|
| 00128 | Zhang | Comp. Sci. | 102 |
| 19991 | Brandt | History | 80 |
| 23121 | Chavez | Finance | 110 |
| 44553 | Peltier | Physics | 56 |
| 55739 | Sanchez | Music | 38 |
| 98765 | Bourikas | Elec. Eng. | 98 |
| 98988 | Tanaka | Biology | 120 |
|
|
| id | name | dept_name | tot_cred |
|---|---|---|---|
| 00128 | Zhang | Comp. Sci. | 102 |
| 19991 | Brandt | History | 80 |
| 23121 | Chavez | Finance | 110 |
| 44553 | Peltier | Physics | 56 |
| 55739 | Sanchez | Music | 38 |
| 98765 | Bourikas | Elec. Eng. | 98 |
| 98988 | Tanaka | Biology | 120 |
|
|
| QUERY PLAN |
|---|
| Seq Scan on student s1 (cost=0.00..7053.62 rows=2 width=152) (actual time=0.062..0.142 rows=7 loops=1) |
| Filter: (tot_cred = (SubPlan 1)) |
| Rows Removed by Filter: 6 |
| SubPlan 1 |
| -> Aggregate (cost=15.63..15.64 rows=1 width=32) (actual time=0.007..0.007 rows=1 loops=13) |
| -> Seq Scan on student s2 (cost=0.00..15.62 rows=2 width=12) (actual time=0.003..0.005 rows=3 loops=13) |
| Filter: ((s1.dept_name)::text = (dept_name)::text) |
| Rows Removed by Filter: 10 |
| Planning Time: 1.681 ms |
| Execution Time: 0.255 ms |
|
|
| QUERY PLAN |
|---|
| Hash Join (cost=21.75..38.61 rows=2 width=152) (actual time=0.069..0.075 rows=7 loops=1) |
| Hash Cond: ((s1.tot_cred = (max(s2.tot_cred))) AND ((s1.dept_name)::text = (s2.dept_name)::text)) |
| -> Seq Scan on student s1 (cost=0.00..14.50 rows=450 width=152) (actual time=0.031..0.032 rows=13 loops=1) |
| -> Hash (cost=18.75..18.75 rows=200 width=90) (actual time=0.023..0.024 rows=7 loops=1) |
| Buckets: 1024 Batches: 1 Memory Usage: 9kB |
| -> HashAggregate (cost=16.75..18.75 rows=200 width=90) (actual time=0.017..0.019 rows=7 loops=1) |
| Group Key: s2.dept_name |
| Batches: 1 Memory Usage: 40kB |
| -> Seq Scan on student s2 (cost=0.00..14.50 rows=450 width=70) (actual time=0.001..0.002 rows=13 loops=1) |
| Planning Time: 1.001 ms |
| Execution Time: 0.140 ms |
- get for each students the buildings and room number they are taking classes sorted by student
|
|
| id | building | room_number |
|---|---|---|
| 00128 | Packard | 101 |
| 00128 | Taylor | 3128 |
| 12345 | Packard | 101 |
| 12345 | Taylor | 3128 |
| 12345 | Watson | 120 |
| 19991 | Painter | 514 |
| 23121 | Packard | 101 |
| 44553 | Watson | 100 |
| 45678 | Packard | 101 |
| 45678 | Watson | 100 |
| 54321 | Packard | 101 |
| 54321 | Taylor | 3128 |
| 55739 | Packard | 101 |
| 76543 | Packard | 101 |
| 76543 | Taylor | 3128 |
| 76653 | Taylor | 3128 |
| 98765 | Packard | 101 |
| 98765 | Watson | 120 |
| 98988 | Painter | 514 |
- courses (get all information) without prerequisites
|
|
| course_id | title | dept_name | credits |
|---|---|---|---|
| MU-199 | Music Video Production | Music | 3 |
| BIO-101 | Intro. to Biology | Biology | 4 |
| CS-101 | Intro. to Computer Science | Comp. Sci. | 4 |
| FIN-201 | Investment Banking | Finance | 3 |
| PHY-101 | Physical Principles | Physics | 4 |
| HIS-351 | World History | History | 3 |
|
|
| course_id | title | dept_name | credits | indicator | course_id | prereq_id |
|---|---|---|---|---|---|---|
| MU-199 | Music Video Production | Music | 3 | |||
| HIS-351 | World History | History | 3 | |||
| FIN-201 | Investment Banking | Finance | 3 | |||
| PHY-101 | Physical Principles | Physics | 4 | |||
| BIO-101 | Intro. to Biology | Biology | 4 | |||
| CS-101 | Intro. to Computer Science | Comp. Sci. | 4 |
|
|
| course_id | title | dept_name | credits |
|---|---|---|---|
| MU-199 | Music Video Production | Music | 3 |
| HIS-351 | World History | History | 3 |
| FIN-201 | Investment Banking | Finance | 3 |
| PHY-101 | Physical Principles | Physics | 4 |
| BIO-101 | Intro. to Biology | Biology | 4 |
| CS-101 | Intro. to Computer Science | Comp. Sci. | 4 |
|
|
| course_id | title | dept_name | credits |
|---|---|---|---|
| BIO-101 | Intro. to Biology | Biology | 4 |
| CS-101 | Intro. to Computer Science | Comp. Sci. | 4 |
| FIN-201 | Investment Banking | Finance | 3 |
| HIS-351 | World History | History | 3 |
| MU-199 | Music Video Production | Music | 3 |
| PHY-101 | Physical Principles | Physics | 4 |
|
|
| course_id | title | dept_name | credits |
|---|---|---|---|
| BIO-101 | Intro. to Biology | Biology | 4 |
| CS-101 | Intro. to Computer Science | Comp. Sci. | 4 |
| FIN-201 | Investment Banking | Finance | 3 |
| HIS-351 | World History | History | 3 |
| MU-199 | Music Video Production | Music | 3 |
| PHY-101 | Physical Principles | Physics | 4 |
Lecture
|
|
| CREATE FUNCTION |
|
|
| id | name | dept_name | tot_cred |
|---|---|---|---|
| 98988 | Tanaka | Biology | 120 |
|
|
|
|
<connection object at 0x1099a8c10; dsn: 'user=postgres password=xxx dbname=cs480_slides host=127.0.0.1 port=5450', closed: 0>
None
Cursor
- Cursors allow execution of SQL code
|
|
<cursor object at 0x10ada3040; closed: 0>
|
|
('Zhang', 'Comp. Sci.')
|
|
department Physics has 3 students department Biology has 1 students department Elec. Eng. has 2 students department Finance has 1 students department Comp. Sci. has 6 students department History has 1 students department Music has 1 students
|
|
('Zhang', 'Comp. Sci.')
|
|
Lecture
Music collection - music player
- albums: title, artist (one more), release date, genre (one or more),
- tracks: title, tracks belong to albums and have a track number within this album, length, need artists, writing credits
- artists: name, birthdate, wikipedia entry url
- album collections contain one or more albums
- playlists: have name, and contain songs (with order)
-
constraints:
- each album should have at least one tracks
- each track should have a name
- tracks should positive consecutive numbers
Lecture
Lecture
Lecture
Relational algebra to SQL
R(A,B,C) S(B,D,E)
\[ \sigma_{A=5}(\pi_{A,B}(R)) \bowtie \pi_{B,D}(S) \]
UNIONandUNION ALL
|
|
Testing FDs
- test $A \to C$
|
|
| DROP TABLE |
|---|
| CREATE TABLE |
| INSERT 0 4 |
|
|
| id | a | b | c |
|---|---|---|---|
| x1 | 1 | 1 | 1 |
| x2 | 1 | 2 | 3 |
| x3 | 1 | 3 | 4 |
| x4 | 2 | 3 | 5 |
- $\sigma_1: A \to C$
|
|
| ?column? |
|---|
| f |
- $\sigma_2: A,B \to C,D$
|
|
- $\sigma_3: A \to B,C$
|
|
| ?column? |
|---|
| f |
Lecture
Lecture
Lecture
Lecture
Postgres page layout
- https://www.postgresql.org/docs/current/storage-page-layout.html
- use postgres module to inspect raw page storage: https://www.postgresql.org/docs/16/pageinspect.html
|
|
| DROP TABLE |
|---|
| CREATE TABLE |
| INSERT 0 1000 |
|
|
| count |
|---|
| 1000 |
- install the extension that allows for inspection
|
|
- get page header data for page 0
|
|
| lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid |
|---|---|---|---|---|---|---|---|---|
| 0/1668F7A8 | 0 | 0 | 928 | 960 | 8192 | 8192 | 4 | 0 |
- get tuples on the page 0
|
|
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 8160 | 1 | 32 | 4461 | 4463 | 0 | (4,97) | 2 | 256 | 24 | \x010000002f000000 | ||
| 2 | 8128 | 1 | 32 | 4461 | 4463 | 0 | (4,98) | 2 | 256 | 24 | \x0200000006000000 | ||
| 3 | 8096 | 1 | 32 | 4461 | 0 | 0 | (0,3) | 2 | 2304 | 24 | \x0300000007000000 | ||
| 4 | 8064 | 1 | 32 | 4461 | 0 | 0 | (0,4) | 2 | 2304 | 24 | \x0400000024000000 | ||
| 5 | 8032 | 1 | 32 | 4461 | 0 | 0 | (0,5) | 2 | 2304 | 24 | \x0500000063000000 | ||
| 6 | 8000 | 1 | 32 | 4461 | 0 | 0 | (0,6) | 2 | 2304 | 24 | \x060000002b000000 | ||
| 7 | 7968 | 1 | 32 | 4461 | 0 | 0 | (0,7) | 2 | 2304 | 24 | \x070000005e000000 | ||
| 8 | 7936 | 1 | 32 | 4461 | 0 | 0 | (0,8) | 2 | 2304 | 24 | \x0800000041000000 | ||
| 9 | 7904 | 1 | 32 | 4461 | 0 | 0 | (0,9) | 2 | 2304 | 24 | \x090000003c000000 | ||
| 10 | 7872 | 1 | 32 | 4461 | 0 | 0 | (0,10) | 2 | 2304 | 24 | \x0a00000034000000 |
|
|
| UPDATE 2 |
|
|
| UPDATE 3 |
|
|
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 3 | 0 | ||||||||||
| 2 | 0 | 3 | 0 | ||||||||||
| 3 | 8160 | 1 | 32 | 4461 | 4464 | 0 | (0,227) | 16386 | 256 | 24 | \x0300000007000000 | ||
| 4 | 8128 | 1 | 32 | 4461 | 4464 | 0 | (0,228) | 16386 | 256 | 24 | \x0400000024000000 | ||
| 5 | 8096 | 1 | 32 | 4461 | 4464 | 0 | (4,99) | 2 | 256 | 24 | \x0500000063000000 | ||
| 6 | 8064 | 1 | 32 | 4461 | 0 | 0 | (0,6) | 2 | 2304 | 24 | \x060000002b000000 | ||
| 7 | 8032 | 1 | 32 | 4461 | 0 | 0 | (0,7) | 2 | 2304 | 24 | \x070000005e000000 | ||
| 8 | 8000 | 1 | 32 | 4461 | 0 | 0 | (0,8) | 2 | 2304 | 24 | \x0800000041000000 | ||
| 9 | 7968 | 1 | 32 | 4461 | 0 | 0 | (0,9) | 2 | 2304 | 24 | \x090000003c000000 | ||
| 10 | 7936 | 1 | 32 | 4461 | 0 | 0 | (0,10) | 2 | 2304 | 24 | \x0a00000034000000 |
Lecture
|
|
| CREATE TABLE |
|---|
| INSERT 0 1000000 |
|
|
| CREATE INDEX |
|---|
| CREATE INDEX |
|
|
| ANALYZE |
|
|
| QUERY PLAN |
|---|
| Finalize Aggregate (cost=11625.11..11625.12 rows=1 width=8) (actual time=15.214..16.839 rows=1 loops=1) |
| -> Gather (cost=11624.89..11625.10 rows=2 width=8) (actual time=15.139..16.835 rows=3 loops=1) |
| Workers Planned: 2 |
| Workers Launched: 2 |
| -> Partial Aggregate (cost=10624.89..10624.90 rows=1 width=8) (actual time=11.086..11.086 rows=1 loops=3) |
| -> Parallel Seq Scan on unc (cost=0.00..10614.33 rows=4224 width=4) (actual time=0.007..10.962 rows=3339 loops=3) |
| Filter: (c < 1000) |
| Rows Removed by Filter: 329995 |
| Planning Time: 0.503 ms |
| Execution Time: 16.879 ms |
|
|
| QUERY PLAN |
|---|
| Finalize Aggregate (cost=6990.86..6990.87 rows=1 width=8) (actual time=30.605..33.521 rows=1 loops=1) |
| -> Gather (cost=6990.64..6990.85 rows=2 width=8) (actual time=30.475..33.513 rows=3 loops=1) |
| Workers Planned: 2 |
| Workers Launched: 2 |
| -> Partial Aggregate (cost=5990.64..5990.65 rows=1 width=8) (actual time=23.652..23.653 rows=1 loops=3) |
| -> Parallel Index Scan using clust_idx on unc (cost=0.42..5776.36 rows=85712 width=4) (actual time=0.053..18.336 rows=66666 loops=3) |
| Index Cond: (a < 200000) |
| Planning Time: 1.492 ms |
| Execution Time: 33.609 ms |
|
|
| count |
|---|
| 99931 |
|
|
| QUERY PLAN |
|---|
| Aggregate (cost=2570.54..2570.55 rows=1 width=8) (actual time=7.501..7.503 rows=1 loops=1) |
| -> Bitmap Heap Scan on unc (cost=16.21..2568.03 rows=1004 width=4) (actual time=0.347..7.333 rows=919 loops=1) |
| Recheck Cond: (b < 100) |
| Heap Blocks: exact=846 |
| -> Bitmap Index Scan on unc_idx (cost=0.00..15.96 rows=1004 width=0) (actual time=0.182..0.183 rows=919 loops=1) |
| Index Cond: (b < 100) |
| Planning Time: 2.479 ms |
| Execution Time: 7.604 ms |
|
|
| QUERY PLAN |
|---|
| Finalize Aggregate (cost=12131.02..12131.03 rows=1 width=8) (actual time=49.272..52.012 rows=1 loops=1) |
| -> Gather (cost=12130.80..12131.01 rows=2 width=8) (actual time=49.154..52.007 rows=3 loops=1) |
| Workers Planned: 2 |
| Workers Launched: 2 |
| -> Partial Aggregate (cost=11130.80..11130.81 rows=1 width=8) (actual time=42.669..42.669 rows=1 loops=3) |
| -> Parallel Seq Scan on unc (cost=0.00..10614.33 rows=206588 width=4) (actual time=0.012..31.747 rows=166694 loops=3) |
| Filter: (b < 50000) |
| Rows Removed by Filter: 166639 |
| Planning Time: 2.867 ms |
| Execution Time: 52.157 ms |
|
|
| QUERY PLAN |
|---|
| Aggregate (cost=365.90..365.91 rows=1 width=8) (actual time=0.944..0.945 rows=1 loops=1) |
| -> Bitmap Heap Scan on unc (cost=5.18..365.88 rows=10 width=4) (actual time=0.041..0.938 rows=12 loops=1) |
| Recheck Cond: (b < 10) |
| Filter: (a < 100000) |
| Rows Removed by Filter: 82 |
| Heap Blocks: exact=92 |
| -> Bitmap Index Scan on unc_idx (cost=0.00..5.17 rows=100 width=0) (actual time=0.017..0.017 rows=94 loops=1) |
| Index Cond: (b < 10) |
| Planning Time: 1.716 ms |
| Execution Time: 1.019 ms |
|
|
| UPDATE 174266 |
|
|
| ANALYZE |
|
|
| QUERY PLAN |
|---|
| Finalize Aggregate (cost=13166.77..13166.78 rows=1 width=8) (actual time=31.870..34.994 rows=1 loops=1) |
| -> Gather (cost=13166.56..13166.77 rows=2 width=8) (actual time=31.764..34.989 rows=3 loops=1) |
| Workers Planned: 2 |
| Workers Launched: 2 |
| -> Partial Aggregate (cost=12166.56..12166.57 rows=1 width=8) (actual time=26.906..26.906 rows=1 loops=3) |
| -> Parallel Seq Scan on unc (cost=0.00..11961.33 rows=82090 width=4) (actual time=0.012..22.929 rows=66666 loops=3) |
| Filter: (a < 200000) |
| Rows Removed by Filter: 266667 |
| Planning Time: 1.070 ms |
| Execution Time: 35.060 ms |
Lecture
Join algorithms
Hash join and nested loop (small table r)
|
|
| DROP TABLE |
|---|
| CREATE TABLE |
| INSERT 0 1000 |
| DROP TABLE |
| CREATE TABLE |
| INSERT 0 1000000 |
| ANALYZE |
|
|
| QUERY PLAN |
|---|
| Gather (cost=1028.50..12267.84 rows=1000 width=32) (actual time=0.582..50.892 rows=1000 loops=1) |
| Output: r.a, r.b, r.c, r.d, s.a, s.b, s.c, s.d |
| Workers Planned: 2 |
| Workers Launched: 2 |
| Buffers: shared hit=5616 |
| -> Hash Join (cost=28.50..11167.84 rows=417 width=32) (actual time=27.469..43.645 rows=333 loops=3) |
| Output: r.a, r.b, r.c, r.d, s.a, s.b, s.c, s.d |
| Hash Cond: (s.a = r.a) |
| Buffers: shared hit=5616 |
| Worker 0: actual time=41.046..41.047 rows=0 loops=1 |
| Buffers: shared hit=1987 |
| Worker 1: actual time=41.084..41.085 rows=0 loops=1 |
| Buffers: shared hit=1926 |
| -> Parallel Seq Scan on public.s (cost=0.00..9572.67 rows=416667 width=16) (actual time=0.013..19.361 rows=333333 loops=3) |
| Output: s.a, s.b, s.c, s.d |
| Buffers: shared hit=5406 |
| Worker 0: actual time=0.014..18.685 rows=348615 loops=1 |
| Buffers: shared hit=1885 |
| Worker 1: actual time=0.015..18.182 rows=337440 loops=1 |
| Buffers: shared hit=1824 |
| -> Hash (cost=16.00..16.00 rows=1000 width=16) (actual time=0.328..0.328 rows=1000 loops=3) |
| Output: r.a, r.b, r.c, r.d |
| Buckets: 1024 Batches: 1 Memory Usage: 55kB |
| Buffers: shared hit=18 |
| Worker 0: actual time=0.352..0.352 rows=1000 loops=1 |
| Buffers: shared hit=6 |
| Worker 1: actual time=0.381..0.381 rows=1000 loops=1 |
| Buffers: shared hit=6 |
| -> Seq Scan on public.r (cost=0.00..16.00 rows=1000 width=16) (actual time=0.118..0.201 rows=1000 loops=3) |
| Output: r.a, r.b, r.c, r.d |
| Buffers: shared hit=18 |
| Worker 0: actual time=0.157..0.229 rows=1000 loops=1 |
| Buffers: shared hit=6 |
| Worker 1: actual time=0.187..0.262 rows=1000 loops=1 |
| Buffers: shared hit=6 |
| Planning: |
| Buffers: shared hit=141 |
| Planning Time: 2.009 ms |
| Execution Time: 50.986 ms |
|
|
| QUERY PLAN |
|---|
| Nested Loop (cost=0.00..22515424.50 rows=111111111 width=32) (actual time=76.346..50025.108 rows=20945 loops=1) |
| Output: r.a, r.b, r.c, r.d, s.a, s.b, s.c, s.d |
| Join Filter: ((r.a >= (s.a - 10)) AND (r.a <= (s.a + 10))) |
| Rows Removed by Join Filter: 999979055 |
| Buffers: shared hit=5412 |
| -> Seq Scan on public.s (cost=0.00..15406.00 rows=1000000 width=16) (actual time=0.024..68.647 rows=1000000 loops=1) |
| Output: s.a, s.b, s.c, s.d |
| Buffers: shared hit=5406 |
| -> Materialize (cost=0.00..21.00 rows=1000 width=16) (actual time=0.000..0.023 rows=1000 loops=1000000) |
| Output: r.a, r.b, r.c, r.d |
| Buffers: shared hit=6 |
| -> Seq Scan on public.r (cost=0.00..16.00 rows=1000 width=16) (actual time=0.006..0.063 rows=1000 loops=1) |
| Output: r.a, r.b, r.c, r.d |
| Buffers: shared hit=6 |
| Planning: |
| Buffers: shared hit=113 |
| Planning Time: 0.750 ms |
| JIT: |
| Functions: 6 |
| Options: Inlining true, Optimization true, Expressions true, Deforming true |
| Timing: Generation 0.307 ms, Inlining 47.066 ms, Optimization 16.492 ms, Emission 12.743 ms, Total 76.609 ms |
| Execution Time: 50043.921 ms |
|
|
| DROP TABLE |
|---|
| CREATE TABLE |
| INSERT 0 1000000 |
| DROP TABLE |
| CREATE TABLE |
| INSERT 0 1000000 |
| ANALYZE |
less memory
|
|
| QUERY PLAN |
|---|
| Merge Join (cost=230130.67..399330.10 rows=10947027 width=32) (actual time=290.547..1346.343 rows=9999389 loops=1) |
| Merge Cond: (r1.b = s.b) |
| -> Sort (cost=115063.84..117563.84 rows=1000000 width=16) (actual time=146.832..238.683 rows=1000000 loops=1) |
| Sort Key: r1.b |
| Sort Method: quicksort Memory: 63639kB |
| -> Seq Scan on r r1 (cost=0.00..15406.00 rows=1000000 width=16) (actual time=0.136..46.769 rows=1000000 loops=1) |
| -> Sort (cost=115063.84..117563.84 rows=1000000 width=16) (actual time=140.882..390.812 rows=9999395 loops=1) |
| Sort Key: s.b |
| Sort Method: quicksort Memory: 63639kB |
| -> Seq Scan on s (cost=0.00..15406.00 rows=1000000 width=16) (actual time=0.129..42.435 rows=1000000 loops=1) |
| Planning Time: 8.273 ms |
| JIT: |
| Functions: 7 |
| Options: Inlining false, Optimization false, Expressions true, Deforming true |
| Timing: Generation 0.176 ms, Inlining 0.000 ms, Optimization 0.194 ms, Emission 2.616 ms, Total 2.986 ms |
| Execution Time: 1543.345 ms |
|
|
| work_mem |
|---|
| 2MB |
| effective_cache_size |
| 2MB |
Merge join (small table r)
|
|
| DROP TABLE |
|---|
| CREATE TABLE |
| INSERT 0 1000000 |
| DROP TABLE |
| CREATE TABLE |
| INSERT 0 1000000 |
| ANALYZE |
|
|
| QUERY PLAN |
|---|
| Hash Join (cost=27906.00..57062.00 rows=1000000 width=32) (actual time=151.138..503.430 rows=1000000 loops=1) |
| Hash Cond: (r.a = s.a) |
| -> Seq Scan on r (cost=0.00..15406.00 rows=1000000 width=16) (actual time=0.014..35.329 rows=1000000 loops=1) |
| -> Hash (cost=15406.00..15406.00 rows=1000000 width=16) (actual time=149.873..149.874 rows=1000000 loops=1) |
| Buckets: 1048576 Batches: 1 Memory Usage: 55067kB |
| -> Seq Scan on s (cost=0.00..15406.00 rows=1000000 width=16) (actual time=0.015..38.646 rows=1000000 loops=1) |
| Planning Time: 7.518 ms |
| Execution Time: 521.277 ms |
|
|
| QUERY PLAN |
|---|
| Hash Join (cost=27906.00..168891.05 rows=10932905 width=32) (actual time=174.887..2121.716 rows=9997000 loops=1) |
| Hash Cond: (r.b = s.b) |
| -> Seq Scan on r (cost=0.00..15406.00 rows=1000000 width=16) (actual time=0.024..41.629 rows=1000000 loops=1) |
| -> Hash (cost=15406.00..15406.00 rows=1000000 width=16) (actual time=174.398..174.399 rows=1000000 loops=1) |
| Buckets: 1048576 Batches: 1 Memory Usage: 55067kB |
| -> Seq Scan on s (cost=0.00..15406.00 rows=1000000 width=16) (actual time=0.004..34.842 rows=1000000 loops=1) |
| Planning Time: 2.192 ms |
| JIT: |
| Functions: 10 |
| Options: Inlining false, Optimization false, Expressions true, Deforming true |
| Timing: Generation 0.214 ms, Inlining 0.000 ms, Optimization 0.663 ms, Emission 11.283 ms, Total 12.161 ms |
| Execution Time: 2420.994 ms |
less memory
- 5MB for sorting and hashing and 10MB for buffer pool size
|
|
| QUERY PLAN |
|---|
| Gather (cost=1028.50..12267.84 rows=1000 width=32) (actual time=0.869..50.570 rows=1000 loops=1) |
| Workers Planned: 2 |
| Workers Launched: 2 |
| Buffers: shared hit=5616 |
| -> Hash Join (cost=28.50..11167.84 rows=417 width=32) (actual time=27.042..42.886 rows=333 loops=3) |
| Hash Cond: (s.a = r.a) |
| Buffers: shared hit=5616 |
| -> Parallel Seq Scan on s (cost=0.00..9572.67 rows=416667 width=16) (actual time=0.012..19.056 rows=333333 loops=3) |
| Buffers: shared hit=5406 |
| -> Hash (cost=16.00..16.00 rows=1000 width=16) (actual time=0.359..0.359 rows=1000 loops=3) |
| Buckets: 1024 Batches: 1 Memory Usage: 55kB |
| Buffers: shared hit=18 |
| -> Seq Scan on r (cost=0.00..16.00 rows=1000 width=16) (actual time=0.107..0.213 rows=1000 loops=3) |
| Buffers: shared hit=18 |
| Planning: |
| Buffers: shared hit=141 |
| Planning Time: 1.244 ms |
| Execution Time: 50.697 ms |
index join
|
|
| DROP TABLE |
|---|
| CREATE TABLE |
| INSERT 0 1000001 |
| DROP TABLE |
| CREATE TABLE |
| INSERT 0 1000000 |
| ANALYZE |
|
|
| QUERY PLAN |
|---|
| Merge Join (cost=30539.45..47292.07 rows=1000000 width=32) |
| Merge Cond: (r1.a = s.a) |
| -> Index Scan using r_pkey on r r1 (cost=0.42..31389.44 rows=1000001 width=16) |
| -> Index Scan using s_pkey on s (cost=0.42..31389.42 rows=1000000 width=16) |
|
|
| QUERY PLAN |
|---|
| Merge Join (cost=25665.20..41340.04 rows=1000000 width=8) |
| Merge Cond: (r1.a = s.a) |
| -> Index Only Scan using r_pkey on r r1 (cost=0.42..25980.44 rows=1000001 width=4) |
| -> Index Only Scan using s_pkey on s (cost=0.42..25980.42 rows=1000000 width=4) |
|
|
| correlation |
|---|
| 1 |
only one table clustered
|
|
| DROP TABLE |
|---|
| CREATE TABLE |
| INSERT 0 1000001 |
| DROP TABLE |
| CREATE TABLE |
| INSERT 0 1000000 |
| ANALYZE |
|
|
| correlation |
|---|
| 0.012891567 |
|
|
| QUERY PLAN |
|---|
| Hash Join (cost=32789.00..65469.02 rows=1000000 width=32) |
| Hash Cond: (r1.a = s.a) |
| -> Seq Scan on r r1 (cost=0.00..15406.01 rows=1000001 width=16) |
| -> Hash (cost=15406.00..15406.00 rows=1000000 width=16) |
| -> Seq Scan on s (cost=0.00..15406.00 rows=1000000 width=16) |
|
|
| QUERY PLAN |
|---|
| Hash Join (cost=32789.00..65469.01 rows=333333 width=32) |
| Hash Cond: (r1.a = s.a) |
| Join Filter: (r1.b < s.b) |
| -> Seq Scan on r r1 (cost=0.00..15406.01 rows=1000001 width=16) |
| -> Hash (cost=15406.00..15406.00 rows=1000000 width=16) |
| -> Seq Scan on s (cost=0.00..15406.00 rows=1000000 width=16) |
Lecture
- distributed databases (5)
- NoSQL (3)
- query compilation + main memory databases (vector-databases) 3)
- statistics - cost estimate (2)
- query optimization - join reordering (1)
- disk-resident hash index (1)
- MVCC
- Aries (1)
- light-weight indexes (zone maps) + partitioning
Lecture
relational algebra -> SQL transaction
- $\Pi_{a_1, \ldots, a_n}(Q)$
|
|
- $\sigma_C(R)$
|
|
- $R \cup S$
|
|
- $R \bowtie S$
|
|
- $\gamma_{f(a); G}(R)$
|
|
- R(A,B,C), S(D,E,F), T(G,H)
- $(\gamma_{sum(A), E}(R \bowtie_{C=D} S)) \bowtie_{E = G} T$
|
|
ER to SQL
|
|