Enriching matchup metadata#
After a catalogue has been populated by
find_and_catalogue(), each matchup
STAC Item carries a small set of core properties (time of overpass, bounding
box, sensor pair, …). The enrichment system lets you compute additional
properties — area, time difference, solar elevation, land fraction, and
anything you define yourself — and write them back to the item JSON so they
persist in the catalogue and can be used to filter matchups.
Overview#
An enricher is any callable with the signature:
def my_enricher(matchup: Matchup) -> Dict[str, Any]: ...
It receives a Matchup and returns a plain
dict of property names to values.
enrich() iterates over all matchup items, reconstructs
the Matchup domain object for each, calls every enricher, and merges the
resulting dicts back into item.properties. If the item has a
self_href, the updated JSON is written to disk immediately, so the properties
persist after the process exits and can be ingested into the central catalogue.
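The loop described above can be pictured with a minimal sketch. This is not eomatch's implementation: plain dicts stand in for STAC Items and Matchup objects, and `run_enrichers`, `duration`, and `broken` are hypothetical names used only for illustration. It shows each enricher being called in turn and its returned dict merged into the item's properties, with a failing enricher logged and skipped for that item.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger("enrich_sketch")

# Stand-ins: a real run works on STAC Items and Matchup objects, not dicts.
Item = Dict[str, Any]
Enricher = Callable[[Item], Dict[str, Any]]


def run_enrichers(items: List[Item], enrichers: List[Enricher]) -> int:
    """Call every enricher on every item, merge results, return items processed."""
    for item in items:
        for enricher in enrichers:
            try:
                new_props = enricher(item)
            except Exception as exc:
                # A failing enricher is skipped for this item only.
                logger.warning("enricher %r failed: %s", enricher, exc)
                continue
            for key, value in new_props.items():
                # Keys that already exist are left untouched.
                item["properties"].setdefault(key, value)
    return len(items)


def duration(item: Item) -> Dict[str, Any]:
    """Toy enricher: derive one property from the stand-in item."""
    return {"duration_s": item["end"] - item["start"]}


def broken(item: Item) -> Dict[str, Any]:
    """Toy enricher that always fails."""
    raise ValueError("no data")


items = [{"start": 10, "end": 70, "properties": {}}]
count = run_enrichers(items, [broken, duration])
print(count, items[0]["properties"])
```

The failing enricher does not prevent `duration` from contributing its property, which mirrors the per-enricher isolation the documentation describes.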
Quick start#
from eomatch.mu_stac import MatchupCatalogue
from eomatch.enrich import enrich
from eomatch.enrich.time_diff import time_diff
from eomatch.enrich.geometric import geometric
from eomatch.enrich.solar_elevation import solar_elevation
from eomatch.enrich.land_fraction import land_fraction

catalogue = MatchupCatalogue.open("/data/my_catalogue")

n = enrich(
    catalogue,
    enrichers=[time_diff, geometric, solar_elevation, land_fraction],
)
print(f"Enriched {n} matchup item(s).")
Or from the command line:
eomatch-enrich \
    --catalogue /data/my_catalogue \
    --enricher eomatch.enrich.time_diff.time_diff \
    --enricher eomatch.enrich.geometric.geometric \
    --enricher eomatch.enrich.solar_elevation.solar_elevation \
    --enricher eomatch.enrich.land_fraction.land_fraction
Add --overwrite to replace properties that were written in an earlier run.
Add -v / --verbose for debug-level logging.
Built-in enrichers#
All built-in enrichers live as submodules of eomatch.enrich.
solar_elevation and land_fraction require the optional enrich
extra:
pip install 'eomatch[enrich]'
| Module | Property added | Notes |
|---|---|---|
| time_diff | time_diff_s | Signed seconds: … |
| geometric | collocation_area_km2 | Area on the WGS-84 ellipsoid via … |
| solar_elevation | solar_elevation_deg | Solar elevation at the collocation centroid midpoint. Positive = daytime, negative = night-time. Requires … |
| land_fraction | land_fraction | Fraction of the collocation region over land (0.0 = ocean, 1.0 = land), computed from Natural Earth 1:110m polygons. Requires … |
Filtering by enriched properties#
Once properties have been written to the catalogue, pass a properties
filter to get_events() to
restrict results to matchups that meet your criteria:
# Keep only matchups with an overpass time difference under 15 minutes
events = catalogue.get_events(
    properties={"time_diff_s": {"lt": 900}}
)

# Daytime ocean matchups with a large overlap area
events = catalogue.get_events(
    properties={
        "solar_elevation_deg": {"gt": 0},
        "land_fraction": {"lt": 0.05},
        "collocation_area_km2": {"gt": 5000},
    }
)
Supported filter operators:
| Operator | Meaning |
|---|---|
| lt | property < threshold |
| … | property ≤ threshold |
| gt | property > threshold |
| … | property ≥ threshold |
| … | property == value (also the default when the condition is a plain value) |
| … | property != value |
| … | property is a member of a list |
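To make the filter semantics concrete, here is a minimal sketch of how such a condition dict could be evaluated against an item's properties. It is an illustration, not the logic inside get_events(): the `matches` function is hypothetical, only the lt and gt operators used in the examples above are included, and treating a missing property as a non-match is an assumption.

```python
import operator
from typing import Any, Dict

# Only the operators that appear in the examples above are included here.
OPERATORS = {"lt": operator.lt, "gt": operator.gt}


def matches(properties: Dict[str, Any], conditions: Dict[str, Any]) -> bool:
    """Return True if every condition holds for the given properties."""
    for name, condition in conditions.items():
        if name not in properties:
            return False  # assumption: a missing property never matches
        value = properties[name]
        if isinstance(condition, dict):
            # {"lt": 900} style: every operator in the dict must hold.
            for op_name, threshold in condition.items():
                if not OPERATORS[op_name](value, threshold):
                    return False
        elif value != condition:
            # A plain value is treated as an equality test.
            return False
    return True


props = {"time_diff_s": 420.0, "solar_elevation_deg": 35.2, "land_fraction": 0.0}

print(matches(props, {"time_diff_s": {"lt": 900}}))
print(matches(props, {"solar_elevation_deg": {"gt": 0}, "land_fraction": {"lt": 0.05}}))
print(matches(props, {"land_fraction": 0.0}))
print(matches(props, {"time_diff_s": {"gt": 900}}))
```

The first three calls match and the last does not, since 420 seconds is below the 900-second threshold.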
Writing a custom enricher#
Any callable that accepts a Matchup and returns
a dict qualifies as an enricher. To make it available on the CLI, place it
in an importable module and pass the dotted path:
# my_package/enrichers.py
from typing import Any, Dict


def cloud_cover(matchup) -> Dict[str, Any]:
    """Estimate cloud cover fraction from product metadata."""
    fractions = [
        getattr(p, "cloud_cover", None) for p in matchup.products
    ]
    valid = [f for f in fractions if f is not None]
    return {"cloud_cover_mean": sum(valid) / len(valid) if valid else None}
eomatch-enrich \
    --catalogue /data/my_catalogue \
    --enricher my_package.enrichers.cloud_cover
Enrichers that raise an exception are logged as warnings and skipped for that item — other enrichers in the same run are unaffected.
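Because an enricher is just a callable, it can be exercised in isolation before it is run against a catalogue. The sketch below repeats the cloud_cover logic and calls it with `SimpleNamespace` stand-ins; the real Matchup class is not used, and the only thing assumed about it is the products attribute the enricher already relies on.

```python
from types import SimpleNamespace
from typing import Any, Dict


def cloud_cover(matchup) -> Dict[str, Any]:
    """Same logic as the cloud_cover enricher shown earlier in this section."""
    fractions = [getattr(p, "cloud_cover", None) for p in matchup.products]
    valid = [f for f in fractions if f is not None]
    return {"cloud_cover_mean": sum(valid) / len(valid) if valid else None}


# Hypothetical stand-ins for a Matchup and its products.
with_metadata = SimpleNamespace(
    products=[SimpleNamespace(cloud_cover=0.25), SimpleNamespace(cloud_cover=0.75)]
)
partial = SimpleNamespace(
    products=[SimpleNamespace(cloud_cover=0.5), SimpleNamespace()]
)
without_metadata = SimpleNamespace(products=[SimpleNamespace()])

print(cloud_cover(with_metadata))     # {'cloud_cover_mean': 0.5}
print(cloud_cover(partial))           # {'cloud_cover_mean': 0.5}
print(cloud_cover(without_metadata))  # {'cloud_cover_mean': None}
```

Returning None when no product carries the metadata keeps the enricher from raising, so the key is still written with an explicit null value.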
The overwrite flag#
By default, enrich() skips keys that already exist
in item.properties. Pass overwrite=True (Python) or --overwrite
(CLI) to replace existing values:
# Re-run after updating the land_fraction enricher
enrich(catalogue, enrichers=[land_fraction], overwrite=True)
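The effect of the flag on a single item can be sketched as a dictionary merge. `merge_properties` is a hypothetical helper written for this illustration, not part of eomatch, and the property values are made up.

```python
from typing import Any, Dict


def merge_properties(
    existing: Dict[str, Any], new: Dict[str, Any], overwrite: bool = False
) -> Dict[str, Any]:
    """Merge new properties into a copy of existing ones."""
    merged = dict(existing)
    for key, value in new.items():
        # Existing keys are kept unless overwrite is requested.
        if overwrite or key not in merged:
            merged[key] = value
    return merged


stored = {"land_fraction": 0.12, "time_diff_s": 420.0}
recomputed = {"land_fraction": 0.08, "solar_elevation_deg": 35.2}

default_merge = merge_properties(stored, recomputed)
forced_merge = merge_properties(stored, recomputed, overwrite=True)

print(default_merge["land_fraction"])  # 0.12, the stored value is kept
print(forced_merge["land_fraction"])   # 0.08, the stored value is replaced
```

In both cases the new solar_elevation_deg key is added; only keys that already exist are affected by the flag.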
Ingesting enriched properties into the central catalogue#
Enriched properties are written to the local item.properties dict and
persisted to the on-disk JSON immediately. The next time you run
eomatch-ingest the updated Items — including all new properties — are
pushed to the central pgSTAC catalogue, where they become queryable via the
CQL2 filter extension:
# Local enrichment
eomatch-enrich \
    --catalogue /data/my_catalogue \
    --enricher eomatch.enrich.time_diff.time_diff \
    --enricher eomatch.enrich.land_fraction.land_fraction
# Push to central catalogue
eomatch-ingest --catalogue /data/my_catalogue
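Once the items are in the central catalogue, a query on the enriched properties might look like the following CQL2-JSON search body. This is a sketch under assumptions: the collection id "matchups" is hypothetical, and whether the deployment expects the bare property name or a "properties." prefix depends on how the STAC API is configured. No request is sent; the snippet only builds and prints the body.

```python
import json

# CQL2-JSON equivalent of "time_diff_s < 900 AND land_fraction < 0.05".
cql2_filter = {
    "op": "and",
    "args": [
        {"op": "<", "args": [{"property": "time_diff_s"}, 900]},
        {"op": "<", "args": [{"property": "land_fraction"}, 0.05]},
    ],
}

# Body for a STAC API POST /search request that uses the filter extension.
search_body = {
    "collections": ["matchups"],  # hypothetical collection id
    "filter-lang": "cql2-json",
    "filter": cql2_filter,
    "limit": 100,
}

print(json.dumps(search_body, indent=2))
```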