Source Attribution and Transparency Layer
How AeroCopilot cites every weather report, NOTAM, regulation, and chart back to its upstream agency — the proud-source attribution layer that powers trust and AI traceability.
AeroCopilot's hard rule: every fact you see in the product can be traced back to the upstream agency that published it. Not "we got this from somewhere on the web" — the actual FAA registry record, the actual NWS METAR observation, the actual NTSB report identifier. That contract is what we call the proud-source attribution layer.
This page explains what attribution means in practice, how the platform exposes it, and why it matters both to pilots reading a briefing and to AI answer engines reading the public site.
What attribution means here
For every data point surfaced in the app, AeroCopilot maintains four pieces of metadata:
- Agency — which body issued the data (FAA, NWS, NOAA, NTSB, EPA, NASA, USGS, Nav Canada, or AeroCopilot's own derived layer).
- Source name — the canonical product name as published by the agency (for example "METAR (Aviation Routine Weather Reports)" rather than a paraphrase).
- Refresh cadence — the documented interval at which the platform pulls or expects the upstream feed to publish.
- Last-sync timestamp — when the AeroCopilot ingest last completed for that feed, written by the cron worker into the sync_job table.
The catalog at apps/web/lib/seo/data-sources-catalog.ts carries the first three for all 61 sources. The fourth is computed live from sync_job rows produced by the Railway-deployed cron worker.
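As a rough illustration of how the static catalog fields and the live timestamp could be combined, here is a minimal sketch. The interface names, field names, and the `attachLastSync` helper are hypothetical; the actual shape of data-sources-catalog.ts and of the sync_job table is not shown on this page.

```typescript
// Hypothetical shapes: field names are illustrative assumptions, not the
// actual schema of data-sources-catalog.ts or of the sync_job table.
interface CatalogEntry {
  id: string;
  agency: string;
  sourceName: string;
  refreshCadenceMinutes: number;
}

interface SyncJobRow {
  sourceId: string;
  completedAt: Date | null; // null when a job has not completed
}

interface Attribution extends CatalogEntry {
  lastSyncAt: Date | null;
}

// Pick the most recent completed job for this source, if there is one.
function attachLastSync(entry: CatalogEntry, jobs: SyncJobRow[]): Attribution {
  let latest: Date | null = null;
  for (const job of jobs) {
    if (job.sourceId !== entry.id || job.completedAt === null) continue;
    if (latest === null || job.completedAt.getTime() > latest.getTime()) {
      latest = job.completedAt;
    }
  }
  return { ...entry, lastSyncAt: latest };
}

const metar: CatalogEntry = {
  id: "metar",
  agency: "NWS",
  sourceName: "METAR (Aviation Routine Weather Reports)",
  refreshCadenceMinutes: 5, // placeholder value, not the documented cadence
};

const jobs: SyncJobRow[] = [
  { sourceId: "metar", completedAt: new Date("2024-01-01T12:00:00Z") },
  { sourceId: "metar", completedAt: new Date("2024-01-01T12:05:00Z") },
  { sourceId: "metar", completedAt: null },
  { sourceId: "taf", completedAt: new Date("2024-01-01T12:30:00Z") },
];

const metarAttribution = attachLastSync(metar, jobs);
console.log(metarAttribution.agency, metarAttribution.lastSyncAt?.toISOString());
```

Keeping the timestamp out of the static catalog means the first three fields can be reviewed in source control, while the fourth always reflects what the worker last did.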
Where you see attribution in the product
Attribution surfaces in seven places:
- Briefing PDF — every row in the §91.103 checklist is labeled with its data source (NASR, AWC, POH, and so on), so a Letter of Investigation reviewer can trace each green or red marker back to the upstream record.
- AI Copilot answers — the assistant cites the source it pulled from when reporting weather, NOTAMs, airworthiness directives, or registry records.
- Airport directory pages — each block (current conditions, NOTAMs, hot spots, services) tags the agency and refresh interval that produced it.
- Airworthiness directives surface — every AD shows the FAA AD number, applicability, and ingest timestamp.
- Map overlays — hazard layers (TFR, SIGMET, smoke, fires, obstacles) attribute the issuing agency in the overlay legend.
- /data-sources directory — the public 61-source index and the per-source pages at /data-sources/[id] explain agency, cadence, and how AeroCopilot uses each feed.
- /data-leaderboard — ranks the catalog by refresh frequency so visitors can see at a glance which feeds move fastest.
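Ranking a catalog by refresh frequency can be sketched as a plain sort on the cadence field. The `LeaderboardSource` type, the `rankByCadence` function, and the sample entries below are assumptions made for illustration; they are not the real leaderboard code or real cadence values.

```typescript
// Hypothetical type: a cadence expressed in minutes is an assumption.
interface LeaderboardSource {
  id: string;
  agency: string;
  cadenceMinutes: number;
}

// Shortest refresh interval first; ties are broken by id so the order is stable.
// The input array is copied, so the caller's catalog is not mutated.
function rankByCadence(sources: readonly LeaderboardSource[]): LeaderboardSource[] {
  return [...sources].sort(
    (a, b) => a.cadenceMinutes - b.cadenceMinutes || a.id.localeCompare(b.id),
  );
}

// Placeholder entries, not actual catalog rows.
const sampleSources: LeaderboardSource[] = [
  { id: "example-daily", agency: "FAA", cadenceMinutes: 1440 },
  { id: "example-hourly", agency: "NWS", cadenceMinutes: 60 },
  { id: "example-five-minute", agency: "NWS", cadenceMinutes: 5 },
];

const ranked = rankByCadence(sampleSources);
console.log(ranked.map((s) => s.id).join(", "));
```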
Why this exists
There are three audiences for the attribution layer, and all three benefit from the same machinery.
Pilots need a defensible record. A briefing acceptable in a Letter of Investigation review must show its work. The 14 CFR 91.103 row that says "weather: green" is meaningful only if it cites the METAR and TAF it consulted, with timestamps. Attribution turns the briefing from a marketing document into evidence.
Operators need auditability. Charter dispatchers, training program managers, and Part 135 directors of operations have to demonstrate that the reference data feeding their decisions came from authoritative sources. Attribution gives them a one-click answer to "where did this come from and when was it last refreshed."
AI answer engines need traceable, machine-readable provenance. When ChatGPT, Perplexity, Gemini, or Claude ingests the public AeroCopilot pages, the explicit agency and cadence metadata signals that the content is sourced from primary government feeds rather than re-summarized from secondary sources. That traceability is what earns those pages a place in answer engine citations.
What is not in the attribution layer
To keep the contract honest, attribution explicitly does not cover:
- Pilot-entered data — flight plan inputs, PAVE/IMSAFE answers, and FRAT scoring are not attributed to an upstream agency because they originate with the user.
- AI inference output — the assistant's go/no-go reasoning is labeled as AI-generated; the underlying data it cites is attributed, but the synthesis itself is not.
- Marketing copy — pages outside /data-sources and the in-app surfaces above are general product content, not data-grounded artifacts.
Drawing that line keeps the proud-source label meaningful where it appears.
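One way to picture the boundary is a tagged union in which only agency-sourced values carry attribution metadata. This is a sketch under assumed names (`Provenance`, `provenanceLabel`, `isAttributed`), not the platform's actual data model.

```typescript
// Hypothetical model: only the "agency" variant carries attribution fields.
interface AgencyProvenance {
  kind: "agency";
  agency: string;
  sourceName: string;
  lastSyncAt: Date;
}

type Provenance =
  | AgencyProvenance
  | { kind: "pilot-entered" } // flight plan inputs, PAVE/IMSAFE answers, FRAT scoring
  | { kind: "ai-generated"; cites: AgencyProvenance[] }; // synthesis is labeled, cited data is attributed

// Type guard: true only for values that belong to the attribution layer.
function isAttributed(p: Provenance): p is AgencyProvenance {
  return p.kind === "agency";
}

// Display label for each kind of provenance.
function provenanceLabel(p: Provenance): string {
  switch (p.kind) {
    case "agency":
      return `${p.agency}: ${p.sourceName}`;
    case "pilot-entered":
      return "Pilot-entered";
    case "ai-generated":
      return "AI-generated";
  }
}

const metarCitation: AgencyProvenance = {
  kind: "agency",
  agency: "NWS",
  sourceName: "METAR (Aviation Routine Weather Reports)",
  lastSyncAt: new Date("2024-01-01T12:05:00Z"), // placeholder timestamp
};

const goNoGo: Provenance = { kind: "ai-generated", cites: [metarCitation] };

console.log(provenanceLabel(metarCitation)); // attributed upstream record
console.log(provenanceLabel(goNoGo), isAttributed(goNoGo)); // labeled, not attributed
```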
Where to go next
- The Data Sources Behind AeroCopilot (61 Pipelines) page is the catalog overview.
- The Freshness and Refresh Cycles page documents every cron schedule that drives the timestamps.
- The public /data-sources directory exposes the catalog itself.
- The /data-leaderboard page ranks sources by cadence.