statcast-bigquery idempotently ingests MLB Statcast pitch-by-pitch data into BigQuery and ships the documentation a SQL or LLM agent needs to query it. Its headline feature is a round-trip integrity proof: it reconstructs each team's season win-loss record and run differential from the raw pitch rows and reconciles them against MLB's official standings, so you can verify that no games are missing.
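The idea behind the round-trip check can be sketched in a few lines of plain Python. This is only an illustration, not the package's actual implementation: the function names `final_scores`, `team_records`, and `reconcile` are hypothetical, and the sketch assumes each pitch row carries the Statcast columns `game_pk`, `home_team`, `away_team`, `post_home_score`, and `post_away_score`, and that the rows have already been filtered to regular-season games.

```python
from collections import defaultdict


def final_scores(pitch_rows):
    """Collapse pitch rows into one (home, away, home_runs, away_runs) per game.

    Assumption: scores never decrease within a game, so the maximum
    post-pitch score seen for each side is that side's final score.
    """
    games = {}
    for row in pitch_rows:
        home, away, home_runs, away_runs = games.get(
            row["game_pk"], (row["home_team"], row["away_team"], 0, 0)
        )
        games[row["game_pk"]] = (
            home,
            away,
            max(home_runs, row["post_home_score"]),
            max(away_runs, row["post_away_score"]),
        )
    return games


def team_records(pitch_rows):
    """Rebuild each team's wins, losses, and run differential from pitch rows."""
    records = defaultdict(lambda: {"w": 0, "l": 0, "run_diff": 0})
    for home, away, home_runs, away_runs in final_scores(pitch_rows).values():
        records[home]["run_diff"] += home_runs - away_runs
        records[away]["run_diff"] += away_runs - home_runs
        if home_runs == away_runs:
            continue  # tied or incomplete game: no win or loss is credited
        winner, loser = (home, away) if home_runs > away_runs else (away, home)
        records[winner]["w"] += 1
        records[loser]["l"] += 1
    return dict(records)


def reconcile(pitch_rows, official):
    """Return the teams whose rebuilt record differs from the official one.

    `official` maps team -> {"w": ..., "l": ..., "run_diff": ...}
    (a hypothetical shape chosen for this sketch).
    """
    rebuilt = team_records(pitch_rows)
    return {
        team: {"rebuilt": rebuilt.get(team), "official": expected}
        for team, expected in official.items()
        if rebuilt.get(team) != expected
    }
```

An empty result from `reconcile` means every team's rebuilt record agrees with the standings it was compared against. A missing game shows up as a mismatch for both teams that played in it.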
pip install statcast-bigquery
gcloud auth application-default login
statcast-bigquery sync \
--start 2024-04-01 --end 2024-10-31 \
--table myproject.mydataset.statcast_pitches
# resumable multi-season backfill
statcast-bigquery sync --start 2015-04-01 --end 2026-05-11 \
--chunk-by year --resume \
--table myproject.mydataset.statcast_pitches