There will be several quizzes and homework assignments during the course.

Homework 1 - Get Acquainted with W3C Prov

The purpose of this homework is to get some practical experience with the W3C Prov standard.

Task 0 - Read the Prov Primer and Work through the Prov Tutorial

Due: 09/12 Deliverable: nothing

Task 1 - Design a provenance graph

Due: 09/12 Deliverable: a Prov graph serialized as a JSON file (send via email)

In this task you should pick a simple example process and model it as a PROV-JSON document. Then run it through the Prov Checker to ensure that it complies with the standard.
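
As a rough illustration of what a PROV-JSON document looks like, the following Python sketch builds a small one and serializes it. The process (an agent editing a draft into a report), the `ex` prefix, and every identifier are made up for this example; your own process and names will differ, and the output should still be validated with the Prov Checker.

```python
import json


def build_document():
    """Return a tiny PROV-JSON document as a plain Python dict.

    All identifiers (ex:draft, ex:report, ex:editing, ex:alice) are
    hypothetical and only serve to show the overall structure.
    """
    return {
        "prefix": {"ex": "http://example.org/"},
        # Things that exist: the input draft and the resulting report.
        "entity": {"ex:draft": {}, "ex:report": {}},
        # The process that turned the draft into the report.
        "activity": {"ex:editing": {}},
        # Who was responsible for the activity.
        "agent": {"ex:alice": {}},
        # Relations are keyed by (blank-node) identifiers.
        "used": {
            "_:u1": {"prov:activity": "ex:editing", "prov:entity": "ex:draft"}
        },
        "wasGeneratedBy": {
            "_:g1": {"prov:entity": "ex:report", "prov:activity": "ex:editing"}
        },
        "wasAssociatedWith": {
            "_:a1": {"prov:activity": "ex:editing", "prov:agent": "ex:alice"}
        },
        "wasDerivedFrom": {
            "_:d1": {
                "prov:generatedEntity": "ex:report",
                "prov:usedEntity": "ex:draft",
            }
        },
    }


# Serialize to the JSON text that would be saved and submitted.
print(json.dumps(build_document(), indent=2))
```

Building the document as a dictionary and letting `json.dumps` do the serialization avoids hand-written syntax errors such as missing commas or quotes.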

Task 2 - Design a fine-grained provenance graph for a query

Due: 09/12 Deliverable: a Prov graph serialized as a JSON file (send via email)

In this task you should create a provenance graph for a SQL query (if you do not have any background in SQL, have a look at these slides). Consider a table of temperature probe readings that stores, for each measurement, the probe that took the measurement, the time the measurement took place, and the measured temperature.

Probe  Time   Temperature
P1     11:00  45
P1     12:00  47
P1     1:00   54
P1     2:00   56
P2     11:00  47
P2     12:00  49
P2     1:00   52
P2     2:00   48

The following query computes the average temperature between 11:00 am and 12:00 pm for each probe, returning the average only for probes whose minimum temperature in that interval is larger than a threshold (46 in this query).

SELECT Probe, avg(Temperature) AS avgtemp
FROM measurements
WHERE Time BETWEEN '11:00' AND '12:00'
GROUP BY Probe
HAVING min(Temperature) > 46;

Evaluated over the instance of the measurements table shown above, this query returns:

Probe  avgtemp
P2     48
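
If you want to check this result yourself, the following sketch loads the table into an in-memory SQLite database and runs the query unchanged. It assumes Python's built-in `sqlite3` module and stores `Time` as text, exactly as written in the table; the column types are an assumption, since the assignment does not specify a schema.

```python
import sqlite3

# The measurements table from the assignment, with times kept as text.
ROWS = [
    ("P1", "11:00", 45), ("P1", "12:00", 47), ("P1", "1:00", 54), ("P1", "2:00", 56),
    ("P2", "11:00", 47), ("P2", "12:00", 49), ("P2", "1:00", 52), ("P2", "2:00", 48),
]

QUERY = """
SELECT Probe, avg(Temperature) AS avgtemp
FROM measurements
WHERE Time BETWEEN '11:00' AND '12:00'
GROUP BY Probe
HAVING min(Temperature) > 46;
"""


def run_query():
    """Create the table in memory, run the query, and return the result rows."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: text for Probe and Time, integer for Temperature.
    conn.execute(
        "CREATE TABLE measurements (Probe TEXT, Time TEXT, Temperature INTEGER)"
    )
    conn.executemany("INSERT INTO measurements VALUES (?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(run_query())  # [('P2', 48.0)]
```

Only the 11:00 and 12:00 rows pass the WHERE clause. P1 is then dropped because its minimum in that interval (45) is not larger than 46, while P2 (47 and 49) is kept with an average of 48. Note that the text comparison on `Time` happens to exclude 1:00 and 2:00 here; a real schema would more likely use a proper time type.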

Create a Prov graph modeling this scenario. Entities should be tracked at the granularity of rows. Use collections to model which rows belong to a table.
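
One possible way to express "these rows belong to this table" in PROV-JSON is the `hadMember` relation, sketched below for the table part only. The `ex` prefix, the identifiers `ex:measurements` and `ex:m1`, `ex:m2`, ..., and the attribute names are hypothetical; the query activity, the result row, and the derivation between rows are deliberately left out, since designing them is the point of the task.

```python
import json


def table_as_collection(table_id, rows):
    """Model a table as a collection entity whose members are row entities.

    Only membership is modeled here. The identifiers and attribute names
    are illustrative assumptions, not prescribed by the assignment.
    """
    doc = {
        "prefix": {"ex": "http://example.org/"},
        # The table itself is an entity; hadMember marks it as a collection.
        "entity": {table_id: {}},
        "hadMember": {},
    }
    for i, (probe, time, temperature) in enumerate(rows, start=1):
        row_id = f"ex:m{i}"
        # One entity per row, with the column values kept as attributes.
        doc["entity"][row_id] = {
            "ex:probe": probe,
            "ex:time": time,
            "ex:temperature": temperature,
        }
        # One membership statement per row.
        doc["hadMember"][f"_:hm{i}"] = {
            "prov:collection": table_id,
            "prov:entity": row_id,
        }
    return doc


# Two rows of the measurements table, enough to show the pattern.
sample_rows = [("P2", "11:00", 47), ("P2", "12:00", 49)]
print(json.dumps(table_as_collection("ex:measurements", sample_rows), indent=2))
```

The query result can be modeled in the same way as a second collection with its own row entities.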