# Exercise Sheet — Summarizing Synthetic Reading Sessions

## Before you run the example

1. Open `synthetic_reading_sessions.csv`. What are the two numerical variables?
2. Predict whether the mean reading time will be above or below the median reading time.
3. Explain why an invented dataset can be useful for learning without representing real learners.

## Run and verify

From this release directory, run:

```bash
python3 summarize_reading_sessions.py synthetic_reading_sessions.csv
```

Compare the terminal output with `expected-output.txt`. A matching result shows that the published teaching example is reproducible on your machine; it does not support any general conclusion about reading behaviour.
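If you prefer to check the match programmatically rather than by eye, the comparison can be sketched as follows. This is a minimal sketch, assuming the script writes its summary to standard output and that all three files sit in the current directory. The helper names `outputs_match` and `run_and_compare` are hypothetical and not part of the release.

```python
import subprocess
import sys
from pathlib import Path


def outputs_match(actual: str, expected: str) -> bool:
    """Compare two outputs line by line, ignoring trailing whitespace
    and leading or trailing blank lines."""
    actual_lines = [line.rstrip() for line in actual.strip().splitlines()]
    expected_lines = [line.rstrip() for line in expected.strip().splitlines()]
    return actual_lines == expected_lines


def run_and_compare() -> bool:
    """Run the teaching script and compare its stdout with the expected file.

    Assumption: the script prints its summary to standard output.
    """
    result = subprocess.run(
        [sys.executable, "summarize_reading_sessions.py",
         "synthetic_reading_sessions.csv"],
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with an error
    )
    expected = Path("expected-output.txt").read_text(encoding="utf-8")
    return outputs_match(result.stdout, expected)
```

Calling `run_and_compare()` from the release directory returns `True` when the outputs agree. Ignoring trailing whitespace is a design choice that avoids false mismatches from line-ending differences between operating systems; use a stricter comparison if exact bytes matter.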

## Interpret the summaries

4. State the sample size, mean reading time, median reading time, and sample standard deviation.
5. Interpret the correlation cautiously. Why should eight invented observations not be treated as research evidence?
6. Identify one additional variable a future real study might measure, without collecting that information in this teaching repository.
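To see how the summaries named in questions 4 and 5 are computed, here is a minimal sketch using Python's standard `statistics` module. The values below are placeholders invented for this illustration and are not taken from `synthetic_reading_sessions.csv`; the variable names and the `pearson` helper are likewise hypothetical, and the released script may compute its summaries differently.

```python
import math
import statistics

# Placeholder values for illustration only. They are NOT the contents of
# synthetic_reading_sessions.csv, and the variable names are hypothetical.
reading_minutes = [10.0, 12.0, 15.0, 40.0]
other_variable = [5.0, 6.0, 8.0, 18.0]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


sample_size = len(reading_minutes)
mean_minutes = statistics.mean(reading_minutes)
median_minutes = statistics.median(reading_minutes)
# statistics.stdev uses the n - 1 denominator (sample standard deviation);
# statistics.pstdev would give the population version instead.
sd_minutes = statistics.stdev(reading_minutes)
r = pearson(reading_minutes, other_variable)

print(f"n={sample_size} mean={mean_minutes:.2f} median={median_minutes:.2f} "
      f"sd={sd_minutes:.2f} r={r:.3f}")
```

With so few observations, a single unusual value can shift the mean, the standard deviation, and the correlation substantially, which is one reason to interpret them cautiously.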

## Publisher workflow prompt

7. Describe why the dataset, script, expected output, exercise sheet, answer guidance, and errata/changelog should be released together under one pinned version.
