Flowman Project 'weather' version 1.0
Description: A simple but comprehensive example project for Flowman that uses publicly available weather data. The project demonstrates many features of Flowman, such as reading and writing data, transforming data, joining, filtering and aggregating. It also generates meaningful documentation that includes data quality tests.
Generated at 2022-10-13T08:43:08.6
Mapping 'weather/aggregates'
Description: This mapping calculates the aggregated metrics per year and per country
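As a rough illustration of what such an aggregation computes, here is a minimal Python sketch. It is not the Flowman mapping itself: the input rows, the country codes, and the function name `aggregate_min_max` are hypothetical, and it assumes the measurements have already been joined with the stations so that each row carries a country.

```python
# Hypothetical input rows: (country, year, air_temperature in degrees Celsius).
rows = [
    ("DE", 2013, -3.5),
    ("DE", 2013, 28.0),
    ("FR", 2013, 1.0),
    ("FR", 2013, 31.5),
]

def aggregate_min_max(rows):
    """Group rows by (country, year) and track the min and max temperature."""
    result = {}
    for country, year, temp in rows:
        key = (country, year)
        if key not in result:
            # First value seen for this group is both min and max.
            result[key] = (temp, temp)
        else:
            lo, hi = result[key]
            result[key] = (min(lo, temp), max(hi, temp))
    return result

aggregates = aggregate_min_max(rows)
```

Each key of `aggregates` is one (country, year) pair, which is why a primary key on those two columns is a natural check for the resulting table.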
Outputs
Relation 'weather/aggregates'
Description: The aggregate table contains the minimum and maximum temperature values per year and country
Physical Resources
[file] file:/tmp/weather/aggregates
Sources
[file] file:/tmp/weather/measurements/year=2013
[file] file:/tmp/weather/stations
Schema
| Quality Check | Result | Remarks |
|---|---|---|
| There must be exactly one entry per country and year: PRIMARY KEY (country,year) | ERROR | |
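To make the intent of a primary key check concrete, the following Python sketch counts how often each key combination occurs and reports the duplicates. It is only an illustration under assumptions: the function name `primary_key_violations` and the sample rows are hypothetical and do not reflect how Flowman executes the check internally.

```python
from collections import Counter

def primary_key_violations(rows, key_columns):
    """Return key tuples that occur more than once, with their counts."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}

# Hypothetical sample data: the second and third rows share the same key.
sample = [
    {"country": "DE", "year": 2013, "min_temperature": -3.5},
    {"country": "FR", "year": 2013, "min_temperature": 1.0},
    {"country": "FR", "year": 2013, "min_temperature": 2.0},
]

violations = primary_key_violations(sample, ["country", "year"])
```

An empty result means that the key columns uniquely identify every row.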
Relation 'weather/measurements'
Description: This model contains all individual measurements
Physical Resources
[file] file:/tmp/weather/measurements
Sources
[file] s3a://dimajix-training/data/weather/2013
Schema
| No | Column Name | Data Type | Constraints | Description | Source Columns | Quality Checks |
|---|---|---|---|---|---|---|
| 1 | usaf | INT | | The USAF (US Air Force) id of the weather station | [weather/measurements_raw].raw_data | |
| 2 | wban | INT | | The WBAN id of the weather station | [weather/measurements_raw].raw_data | |
| 3 | date | DATE | | The date when the measurement was made | [weather/measurements_raw].raw_data | |
| 4 | time | STRING | | The time when the measurement was made | [weather/measurements_raw].raw_data | |
| 5 | report_type | STRING | | The report type of the measurement | [weather/measurements_raw].raw_data | |
| 6 | wind_direction | INT | | The direction from which the wind blows, in degrees | [weather/measurements_raw].raw_data | IS NOT NULL: ERROR; (wind_direction >= 0 AND wind_direction <= 360) OR wind_direction_qual <> 1: ERROR |
| 7 | wind_direction_qual | STRING | | The quality indicator of the wind direction. 1 means trustworthy quality. | [weather/measurements_raw].raw_data | |
| 8 | wind_observation | STRING | | | [weather/measurements_raw].raw_data | |
| 9 | wind_speed | FLOAT | | The wind speed in m/s | [weather/measurements_raw].raw_data | |
| 10 | wind_speed_qual | STRING | | The quality indicator of the wind speed. 1 means trustworthy quality. | [weather/measurements_raw].raw_data | |
| 11 | air_temperature | FLOAT | | The air temperature in degrees Celsius | [weather/measurements_raw].raw_data | |
| 12 | air_temperature_qual | STRING | | The quality indicator of the air temperature. 1 means trustworthy quality. | [weather/measurements_raw].raw_data | IS NOT NULL: ERROR; IS IN (0,1,2,3,4,5,6,7,8,9): ERROR |
| 13 | year | INT | NOT NULL | The year of the measurement, used for partitioning the data | | IS NOT NULL: ERROR; IS BETWEEN 1901 AND 2022: ERROR |
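The column-level checks listed in the schema above can be read as simple per-value predicates. The Python sketch below spells them out. The function names are hypothetical, the quality codes are assumed to be stored as strings (matching the STRING column type), and the treatment of missing values (a `None` fails the range check) is an assumption that may differ from the SQL NULL semantics Flowman applies.

```python
# Assumed representation of the allowed quality codes 0..9 as strings.
QUALITY_CODES = {str(i) for i in range(10)}

def check_not_null(value):
    """IS NOT NULL: the value must be present."""
    return value is not None

def check_range(value, lower, upper):
    """IS BETWEEN lower AND upper (inclusive); None fails in this sketch."""
    return value is not None and lower <= value <= upper

def check_in(value, allowed):
    """IS IN (...): the value must be one of the allowed values."""
    return value in allowed

def check_wind_direction(row):
    """(wind_direction >= 0 AND wind_direction <= 360) OR wind_direction_qual <> 1.

    A direction outside 0..360 is tolerated only when the quality
    indicator is not '1' (i.e. the value is not marked trustworthy).
    """
    in_range = check_range(row["wind_direction"], 0, 360)
    return in_range or row["wind_direction_qual"] != "1"
```

For example, a row with `wind_direction` 999 passes only if its `wind_direction_qual` differs from `"1"`, since 999 is then treated as a placeholder rather than a trusted measurement.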
| Quality Check | Result | Remarks |
|---|---|---|
| The measurement has to refer to an existing station: FOREIGN KEY (usaf,wban) REFERENCES stations(usaf,wban) | ERROR | |
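A foreign key check of this kind verifies that every (usaf, wban) pair in the measurements also appears in the stations. The following Python sketch shows the idea with an in-memory set lookup. The function name `foreign_key_violations` and the sample rows are hypothetical, and this is not a description of how Flowman evaluates the check.

```python
def foreign_key_violations(child_rows, parent_rows, columns):
    """Return child rows whose key does not exist among the parent rows."""
    parent_keys = {tuple(parent[c] for c in columns) for parent in parent_rows}
    return [
        child for child in child_rows
        if tuple(child[c] for c in columns) not in parent_keys
    ]

# Hypothetical sample data: the second measurement has no matching station.
stations = [{"usaf": 10010, "wban": 99999}]
measurements = [
    {"usaf": 10010, "wban": 99999, "air_temperature": 4.2},
    {"usaf": 20020, "wban": 99999, "air_temperature": 7.1},
]

orphans = foreign_key_violations(measurements, stations, ["usaf", "wban"])
```

An empty `orphans` list would indicate that every measurement refers to a known station.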
Relation 'weather/stations'
Description: The 'stations' table contains metadata on all weather stations
Physical Resources
[file] file:/tmp/weather/stations
Sources
[file] s3a://dimajix-training/data/weather/isd-history
Schema
| Quality Check | Result | Remarks |
|---|---|---|
| PRIMARY KEY (usaf,wban) | ERROR | |
Target 'weather/aggregates'
Description: Write aggregated measurements per year
Phases
CREATE
BUILD
TRUNCATE
VERIFY
DESTROY
Target 'weather/measurements'
Description: Write extracted measurements per year
Phases
CREATE
BUILD
TRUNCATE
VERIFY
DESTROY
Target 'weather/metrics'
Description: Collect relevant metrics from measurements, to be published to a metrics collector
Target 'weather/stations'
Description: This build target is used to write the weather stations
Phases
CREATE
BUILD
TRUNCATE
VERIFY
DESTROY