Open Primitive

The Case Against the Translation Layer

On search, AI, and what gets lost when someone else summarizes the record for you.

The problem

The most consequential information about your daily life is public. The government records it, indexes it, and publishes it. The DOT files every flight delay with tail number, cause code, and minute-level precision. NHTSA maintains crash test data for every vehicle sold in the United States, plus a database of every active recall and every fatality investigation. The National Library of Medicine indexes the peer-reviewed biomedical literature, more than 35 million citations. CMS publishes patient outcome scores for every hospital in the country. The EPA tracks contaminant violations at every public water system. The FDA records every adverse drug event and every food recall.

Almost no one reads it. Not because it's hidden. Because it's not readable.

A flight delay report is a 47-column CSV. A crash test filing is a structured XML document. A clinical trial abstract assumes you understand what a hazard ratio is. The record exists. The interface to the record has never been built for a person making a decision.
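The 47-column CSV is unreadable, not unusable. A minimal sketch of what "reading" it takes, using column names (FL_DATE, OP_CARRIER, ARR_DELAY) modeled on the BTS on-time performance export; check the headers of your own download, since the selectable fields vary:

```python
# Average arrival delay per carrier from a BTS-style on-time CSV.
# SAMPLE is invented data in the assumed column layout.
import csv
import io
from collections import defaultdict

SAMPLE = """FL_DATE,OP_CARRIER,ORIGIN,DEST,ARR_DELAY
2024-01-02,AS,SEA,SFO,-4
2024-01-02,AS,SEA,LAX,12
2024-01-02,DL,SEA,ATL,38
"""

def mean_arrival_delay(csv_text):
    """Mean ARR_DELAY in minutes per carrier; early arrivals are negative."""
    delays = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["ARR_DELAY"]:  # blank delay: cancelled or diverted flight
            delays[row["OP_CARRIER"]].append(float(row["ARR_DELAY"]))
    return {carrier: sum(v) / len(v) for carrier, v in delays.items()}

print(mean_arrival_delay(SAMPLE))  # {'AS': 4.0, 'DL': 38.0}
```

Ten lines of code, because the record itself is clean. The missing piece has never been parsing; it is an interface that runs this kind of query for a person who does not write code.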

Two industries promised to fix this. Neither did.

In 1998, Brin and Page published "The Anatomy of a Large-Scale Hypertextual Web Search Engine," introducing PageRank — the insight that a link from one document to another was an implicit citation, and that a document cited by many authoritative sources was likely authoritative itself [1]. The algorithm was elegant. The mission was genuine: organize the world's information and make it universally accessible.

Then they discovered advertising.

The search index that exists today is not optimized for accuracy. It is optimized for engagement, which correlates with accuracy only loosely and degrades over time as publishers learn to optimize for the index instead of the reader. A search for "is vitamin D effective" returns manufacturer pages, supplement retailers, health media that depends on supplement advertising, and a small number of actual studies buried behind pagination. The signal and the noise are presented with identical authority.

Search doesn't show you what is true. It shows you what has been linked to. Those are not the same thing.

The deeper problem is structural. Search returns documents. The federal record is not a document — it is a database, a filing system, an XML feed. It does not have inbound links. It does not rank. The DOT's on-time performance data is not findable through search in any useful sense. You can find articles about it. You cannot find it.

What AI did

Large language models trained on human feedback — RLHF, as described by Ouyang et al. in 2022 [2] — are optimized to generate responses that humans rate as helpful, harmless, and honest. In practice, this means responses that sound authoritative, are structured clearly, and do not hedge excessively. Humans rate confident answers higher than uncertain ones, even when the uncertainty is warranted.

The result is a system that presents conclusions without showing its work. Ask an AI whether creatine supplementation improves athletic performance. It will tell you yes, with some qualifications. It will not show you the 53,000 studies in PubMed, the meta-analyses that found effect sizes ranging from negligible to significant depending on the population, the three large RCTs that contradict the headline finding, or the funding sources of the most-cited papers.

It cannot show you these things. It has no access to the primary record. It was trained on text that described the primary record, filtered through the same engagement-optimized search index described above, then further filtered by the preferences of human raters who had no more access to the original data than you do.
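You, however, do have access. The PubMed index is one HTTP request away through NCBI's E-utilities; the esearch endpoint and its db, term, retmode, and retmax parameters below are the documented interface, and the query string is only an illustration:

```python
# Build an NCBI E-utilities esearch URL that returns matching PMIDs as JSON.
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_search_url(term, retmax=20):
    """URL for the first `retmax` PubMed IDs matching a query term."""
    return EUTILS + "?" + urlencode(
        {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    )

url = pubmed_search_url("creatine supplementation randomized controlled trial")
print(url)
# Fetching it (e.g. with urllib.request.urlopen) returns JSON whose
# esearchresult.idlist field holds the PMIDs of the primary studies.
```

The studies the model paraphrases are sitting behind that URL, with authors, funding disclosures, and full abstracts attached.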

An AI translation isn't neutral. Something in the original doesn't survive the trip.

This is not a criticism of the technology. It is a description of what the technology is. RLHF-trained language models are extraordinarily good at synthesizing text into fluent, structured summaries. That capability is genuinely useful for many tasks. It is the wrong tool when the requirement is fidelity to primary sources, because synthesis by definition involves loss.

The translational gap

Consider the specific case of a family deciding whether to purchase a vehicle. The available data is substantial. NHTSA's 5-Star Safety Ratings program generates frontal crash, side crash, and rollover scores for every rated vehicle, expressed as a star rating derived from physical tests with instrumented dummies [3]. The agency also maintains a database of every Technical Service Bulletin and Recall Campaign, searchable by make, model, and year. Complaints from owners — more than 10,000 filed monthly — are publicly accessible and full-text searchable.
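That recall database is not an abstraction; NHTSA exposes it through a public JSON API at api.nhtsa.gov. The recallsByVehicle endpoint sketched below is the documented one, though the response's field names should be checked against a live result:

```python
# Query URL for every recall campaign filed against one make/model/year.
from urllib.parse import urlencode

NHTSA_RECALLS = "https://api.nhtsa.gov/recalls/recallsByVehicle"

def recalls_url(make, model, model_year):
    """Build the NHTSA recalls-by-vehicle query URL."""
    return NHTSA_RECALLS + "?" + urlencode(
        {"make": make, "model": model, "modelYear": model_year}
    )

print(recalls_url("honda", "civic", 2020))
# Fetching the URL returns JSON; its "results" array lists each campaign,
# including the defect description and the remedy.
```

Every open campaign on the car in the showroom, by direct request to the agency that issued it.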

The typical consumer interaction with this data is to ask a salesperson, read a magazine review, or — increasingly — ask an AI assistant. None of these channels surfaces the primary record. The salesperson has an incentive. The review is written by a journalist who did not read the crash test methodology. The AI summarizes text that described summaries of the original data.

At each step in that chain, something is lost. The specific star ratings. The raw injury probability scores underlying the stars. The open recall status. The pattern of complaints that appears in the database before a recall is issued.

The data that would change the decision is available. The interface that would surface it has never been built.

The sources

BTS Form 41 / ATADS: Flight-level on-time, cancellation, and cause data for all US carriers, reported via the DOT Bureau of Transportation Statistics. Updated monthly.

FAA NASSTATUS: Real-time National Airspace System status: ground stops, ground delay programs, and arrival/departure delays at every US airport. Live.

NHTSA 5-Star Ratings: Frontal, side, and rollover crash test scores for every tested vehicle, derived from probability of serious injury in standardized physical tests. Updated per model year.

NHTSA Complaints & Recalls: Every owner-submitted complaint and every active recall campaign, searchable by vehicle; approximately 10,000 complaints are filed per month. Updated continuously.

PubMed / MEDLINE: 35+ million citations to peer-reviewed biomedical literature, indexed by the National Library of Medicine and searchable by MeSH term. Updated daily.

CMS Care Compare: Star ratings, mortality rates, readmission rates, and patient experience scores for every hospital that accepts Medicare. Updated monthly.

EPA SDWIS: Violation records for every public water system in the US, distinguishing health-based violations from monitoring and reporting failures. Updated quarterly.

FDA FAERS: Adverse event reports submitted by patients, physicians, and manufacturers for every drug on the US market; unverified but unfiltered. Updated quarterly.

FDA Enforcement: Every food, drug, and device recall published by the FDA, classified by severity: Class I (serious health risk), Class II, or Class III. Updated continuously.

What Open Primitive does

Open Primitive is not an AI. It does not summarize. It does not interpret. It connects to primary-source federal databases and renders their output in a form a person can read and act on.

The design constraint is strict: every number shown must be traceable to a specific filing, a specific database record, a specific row in a government dataset. If the underlying data is ambiguous, the interface shows the ambiguity. If the data is missing, it says so.
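One way to make that constraint mechanical rather than aspirational is to never pass a bare number around: every value the interface renders travels with its source, and "missing" is an explicit state instead of a silent zero. The class and field names below are illustrative, not Open Primitive's actual schema:

```python
# A value that cannot be displayed without its provenance.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SourcedValue:
    value: Optional[float]  # None means the record has no figure; say so
    source: str             # dataset name, e.g. "NHTSA 5-Star Ratings"
    record_id: str          # the specific filing or row the number came from

    def render(self) -> str:
        if self.value is None:
            return f"no data on record ({self.source})"
        return f"{self.value} [{self.source}: {self.record_id}]"

print(SourcedValue(5.0, "NHTSA 5-Star Ratings", "2024 frontal test").render())
print(SourcedValue(None, "EPA SDWIS", "example system").render())
```

The type system then does the auditing: any number on screen arrived with its citation, because there is no other way to construct one.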

The current tools cover seven categories of federal data where the gap between the published record and public awareness is widest and the stakes of the decisions are highest: air travel, vehicle safety, health evidence, hospital quality, water quality, drug adverse events, and food safety.

What it doesn't do

Open Primitive does not tell you what to decide. It shows you what the record says. The difference matters.

A tool that tells you "Alaska Airlines is the safest choice" has made a judgment about how to weight on-time performance against cancellation rates against your specific route and departure time. That judgment belongs to you. A tool that shows you the DOT data, the FAA live status, and the BTS track record for each carrier gives you what you need to make it.

The goal is not to replace judgment. It is to restore the precondition for it: access to the actual record, in a form you can read.

Open questions

Several design problems remain unsolved. Federal databases are not designed for real-time consumer queries — rate limits, schema changes, and data gaps are operational realities, not edge cases. The line between "making data readable" and "making data interpretive" is not always clear; a composite score that ranks airlines involves weighting choices that embed assumptions. As more tools are added, the question of what counts as "primary source" becomes harder to answer cleanly.
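The weighting problem is concrete. In the sketch below, with invented numbers, the same two carriers swap rank depending only on how heavily on-time rate is weighted against cancellation rate; neither ordering is "the data," both are editorial choices:

```python
# Composite score: weighted blend of on-time rate and (1 - cancellation rate).
def composite(on_time, cancel, w):
    """Higher is better; w is the weight given to on-time performance."""
    return w * on_time + (1 - w) * (1 - cancel)

A = (0.86, 0.030)  # hypothetical carrier: better on-time rate
B = (0.82, 0.005)  # hypothetical carrier: fewer cancellations

for w in (0.9, 0.2):
    winner = "A" if composite(*A, w) > composite(*B, w) else "B"
    print(f"w={w}: carrier {winner} ranks first")
# w=0.9: carrier A ranks first
# w=0.2: carrier B ranks first
```

Any single ranking silently picks a w. Showing both underlying rates instead leaves that choice with the reader.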

These problems are worth naming because they are not decorative. They are the reasons no one has built this before. The commercial incentive runs in the opposite direction: AI answers are cheaper to produce, easier to scale, and more satisfying to users in the short term than raw data interfaces. The correct answer to "is this car safe" is not a star rating. It is a 90-minute reading session. Most people will not do that. Open Primitive tries to make it possible in five minutes.

Whether that is enough is an open question. The goal is to make the primary record accessible enough that more people use it, more often, for the decisions where it matters most.

References

  1. Brin, S. & Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30(1–7), 107–117.
  2. Ouyang, L. et al. (2022). Training language models to follow instructions with human feedback. arXiv:2203.02155.
  3. NHTSA (2024). New Car Assessment Program (NCAP) 5-Star Safety Ratings Methodology. US Department of Transportation.
  4. Bureau of Transportation Statistics (2024). Airline On-Time Statistics and Delay Causes. Form 41 / ATADS. US Department of Transportation.
  5. National Library of Medicine (2024). PubMed Overview. US National Institutes of Health.