Public Data Federation

Nine domains,
one federated graph.

Public data federation on Gyri combines installable seeded catalogs (trials, papers, Open Targets (OT) entity rollups, SEC filings, DailyMed) with live keyless HTTP APIs (arXiv, Internet Archive, EIA, USWTDB, CDC, EPA, NASA, and more). Where interconnect keys exist, these sources join to CRM and comms data via multihop GraphQL.

74+ installable public data sources — seeded mirrors, Open Targets rollups, live HTTP APIs, and interconnect keys. Enable federation packs in Settings, then browse domains and see where each source cross-federates.

Related: Gyri Science · Knowledge graph for AI · Native integrations · Federated search guide

Federation domains

9 domains · 74 sources

Every source below lists the other domains it multihops into via typed keys, live federators, or entity rollups — not keyword overlap.

Domain 01

Scholarly & trials

580k+ trials, OpenAlex, and PubMed — the literature and NCT spine.

6 sources
Domain 02

Biomedical knowledge graph

Open Targets 26.03 rollups on drugs, proteins, and diseases.

24 sources
Domain 03

Regulatory & filings

SEC section FTS, FDA labels, OpenFDA, and federal awards.

9 sources
Domain 04

Open archives

arXiv, Internet Archive, Wayback, and fatcat on the scholarly spine.

4 sources
Domain 05

Corporate & identity

GLEIF, Wikidata, registries, and public engineering context.

9 sources
Domain 06

Live federators

USWTDB → EIA → SEC multihop — the flagship live chain.

4 sources
Domain 07

Space & orbital APIs

Launch schedules, orbital elements, and NEO catalogs.

3 sources
Domain 08

Geo, health & environment

CDC, EPA, FAA, Open-Meteo, and OpenStreetMap endpoints.

9 sources
Domain 09

Interconnect glue

DOI, NCT, CIK, ChEMBL, UniProt, and Wikidata QID keys.

6 sources
Source catalog

Every source
with cross-federation.

Domain 01

Scholarly & trials

Papers, trials, authors, and citation edges — install the scholarly pack for seeded mirrors plus live OpenAlex merge on every biomedSearch.

6 sources in this domain

Domain 02

Biomedical knowledge graph

Open Targets 26.03 rollups on 22k drugs, 19k proteins, and 28k diseases — GWAS, ClinVar, PharmGKB, expression, and ChEMBL bioactivity in one catalog.

24 sources in this domain

Domain 03

Regulatory & filings

SEC EDGAR item-level FTS, FDA DailyMed SPL sections, OpenFDA FAERS rollups, Federal Register, USAspending, and sponsor ↔ trial joins via CIK and NCT.

9 sources in this domain

Domain 04

Open archives

arXiv preprints, Internet Archive texts, Wayback CDX, and fatcat releases — install the open archives pack to normalize preprints on the scholarly spine.

4 sources in this domain

Domain 05

Corporate & identity

GLEIF, Wikidata, SAM.gov, GitHub, npm, OpenCorporates, OpenSanctions, ProPublica nonprofits, and UK Companies House — entity and registry context.

9 sources in this domain

Domain 06

Live federators

Keyless HTTP multihop — USWTDB → EIA → SEC operator filings is the flagship chain. Optional Google Patents SERP when configured.

4 sources in this domain

Domain 07

Space & orbital APIs

Live launch schedules, NORAD orbital elements, and NEO catalogs — bridgeable to pad weather, provider identity, and public filings via coordinates and agency names.

3 sources in this domain

Domain 08

Geo, health & environment

CDC, EPA CAMPD & FRS, FAA airports, NHTSA, Open-Meteo, Overpass, Nominatim, and USDA FoodData — installable public geo and health HTTP connectors.

9 sources in this domain

Domain 09

Interconnect glue

Shared keys join domains — not keyword overlap. Agents traverse DOI, NCT, CIK, ChEMBL, UniProt, and Wikidata QID in one GraphQL hop chain.

6 sources in this domain
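
To make the contrast between typed keys and keyword overlap concrete, here is a minimal in-memory sketch. The records, field names, and the `join_on_key` helper are hypothetical illustrations and not Gyri's schema or API; the NCT and CIK values are invented. It shows only the general idea: rows join when a shared identifier matches exactly, regardless of how similar their text is.

```python
from collections import defaultdict

# Hypothetical records; all identifiers are invented for illustration.
trials = [
    {"nct": "NCT00000001", "sponsor_cik": "0000000101", "title": "Phase 2 study of drug A"},
    {"nct": "NCT00000002", "sponsor_cik": "0000000202", "title": "Phase 3 study of drug B"},
]
filings = [
    {"cik": "0000000101", "form": "10-K", "section": "Risk Factors"},
    {"cik": "0000000101", "form": "8-K", "section": "Press release"},
    {"cik": "0000000303", "form": "10-Q", "section": "MD&A"},
]


def join_on_key(left, left_key, right, right_key):
    """Inner-join two record lists on an exact key match (no text similarity)."""
    index = defaultdict(list)
    for row in right:
        index[row[right_key]].append(row)
    return [
        (left_row, right_row)
        for left_row in left
        for right_row in index.get(left_row[left_key], [])
    ]


# Join trials to filings through the sponsor's CIK.
pairs = join_on_key(trials, "sponsor_cik", filings, "cik")
for trial, filing in pairs:
    print(trial["nct"], "->", filing["form"], filing["section"])
```

The second trial and the third filing drop out because no key matches, even though a keyword search for "study" or "10-Q" would still surface them.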

The public seed

Pre-loaded catalog,
ready to query.

Gyri maintains a large public data seed in Postgres, not a one-off export you rebuild each quarter. Papers, trials, SEC filings, and typed entities are indexed for full-text search, biomedSearch, and secFilingSearch from day one. Live API federators merge fresh hits with the mirror and passively ingest them back into it.

580k+ studies

Clinical trials

ClinicalTrials.gov bulk mirror plus AACT enrichment — NCT IDs, sponsors, phases, conditions, and PubMed links in Postgres FTS.

22k drugs · 19k proteins

Open Targets graph

OT 26.03 rollups: associations, GWAS, ClinVar, PharmGKB, expression, FAERS, cancer biomarkers, and ChEMBL bioactivity on entity rows.

OpenAlex + PubMed

Scholarly literature

Papers, authors, institutions, citation edges, and live OpenAlex search merge — passive ingest on every query.

Item-level FTS

SEC EDGAR

10-K, 10-Q, and 8-K filings with parsed sections: secFilingSearch covers Business, Risk Factors, and MD&A.

6.7k+ DailyMed

FDA labels

SPL section rollups on drug entities (boxed warnings, indications, and adverse reactions), indexed for biomedSearch FTS.

EIA → SEC multihop

Live federation chain

Install the power-grid federator pack: wind turbine → power plant → operator → live SEC filings in one GraphQL query.

The problem

Public data without
federation fails.

01

Public data lives in silos

Trials live on ClinicalTrials.gov, papers on PubMed, filings on EDGAR, targets on Open Targets — researchers manually tab-hop for one diligence question.

02

Web search cannot multihop

Perplexity and Google return snippets — not typed joins from NCT ID → sponsor → latest 10-K Risk Factors with source links.

03

DIY mirrors rot immediately

Teams rebuild OpenAlex dumps and OT exports quarterly. Gyri maintains seeded catalogs, live federators, and interconnect keys as platform infrastructure you install once.

With Gyri

Seed + live APIs
+ your stack.

Installable packs

Enable sources in Settings

Turn on scholarly, regulatory, archives, and live federator packs for your workspace — biomedSearch, secFilingSearch, and keyword_search work before you connect private sources.

OT entity rollups

Evidence graph, not flat files

Open Targets enrichments live on BiomedEntity rows — traverse otAssociation, gwas, pharmacogenomics, and chemblTargets in GraphQL.

Live + local hybrid

Fresh results, persistent graph

Live OpenAlex, arXiv, EDGAR, and EIA API calls merge with the local mirror — top hits passively ingest so the next query compounds.
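
The merge-then-ingest pattern described above can be sketched in a few lines. This is an illustration under stated assumptions and not Gyri's implementation: the `hybrid_search` function, the dict-based mirror, and the `fake_live_api` stub are all hypothetical, and substring matching stands in for real full-text search.

```python
def hybrid_search(query, mirror, live_fetch, ingest_top=2):
    """Merge local mirror hits with live hits, then ingest the top live hits."""
    needle = query.lower()
    local_hits = [doc for doc in mirror.values() if needle in doc["title"].lower()]
    live_hits = live_fetch(query)

    # Deduplicate by id; the local copy wins when both sides return the same doc.
    merged = {doc["id"]: doc for doc in local_hits}
    for doc in live_hits:
        merged.setdefault(doc["id"], doc)

    # Passive ingest: persist the top live hits so the next query finds them locally.
    for doc in live_hits[:ingest_top]:
        mirror.setdefault(doc["id"], doc)

    return list(merged.values())


def fake_live_api(query):
    # Stand-in for a keyless HTTP call; returns canned results.
    return [
        {"id": "W1", "title": "Wind turbine wake effects"},
        {"id": "W2", "title": "Offshore wind turbine siting"},
    ]


mirror = {"W1": {"id": "W1", "title": "Wind turbine wake effects"}}

first = hybrid_search("wind turbine", mirror, fake_live_api)
# Second query with the live source unavailable: W2 is now served from the mirror.
second = hybrid_search("wind turbine", mirror, lambda q: [])
```

After the first call the mirror holds both documents, so the second call returns the same two results without any live source.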

Join your stack

Public + private federation

Link public entities to Pipedrive deals, Gmail threads, and Slack decisions — cited synthesis across open and operational data.

Native integrations

Private stack connectors
live & planned.

Join public entities to CRM, email, Slack, and docs. 145 native connectors ship today; Salesforce and more are on the build queue.

Connector | Description | Status
Anthropic (Claude) | BYOK Claude agents + MCP | Live
OpenAI | BYOK GPT stored agents | Live
LLM Gateway | OpenRouter · LiteLLM · Gemini slugs | Live
Cursor Cloud Agents | Workflow external agents + MCP | Live
Microsoft 365 | Outlook, Calendar, OneNote | Live
Google Workspace | Gmail, Calendar, Drive | Live
Pipedrive | CRM deals pipeline | Live
HubSpot | CRM, marketing, sequences | Live
Zoho CRM | SMB CRM & automation | Live
Dynamics 365 | Microsoft enterprise CRM | Live
Notion | Wikis & project docs | Live
Insightly | CRM opportunities & projects | Live
Slack | Team comms | Live
GitHub | Engineering context | Live
Linear | Issue tracking | Live
Jira | Issues & sprints | Live
Asana | Work management | Live
monday.com | Boards & workflows | Live
ClickUp | Tasks & docs hub | Live
GitLab | Repos, CI & issues | Live
Datadog | Metrics & APM | Live
Sentry | Error tracking | Live
Zoom | Meetings & recordings | Live
Microsoft Teams | Chat, meetings, channels | Live
Confluence | Team wikis & specs | Live
Airtable | Structured ops bases | Live
Google Ads | Paid search metrics | Live
Google Analytics | Web & app traffic | Live
Google Search Console | Organic search & SEO | Live
Mixpanel | Product analytics | Live
Amplitude | Behavioral analytics | Live
PostHog | Product OS & flags | Live
Segment | Customer data pipeline | Live
Hotjar | Heatmaps & recordings | Live
FullStory | Session replay | Live
Pendo | In-app guides & NPS | Live
Grafana | Dashboards & alerts | Live
Resend | Transactional email | Live
Stripe | Payments & billing | Live
Xero | Accounting & AR | Live
Greenhouse | ATS & hiring | Live
DocuSign | E-signatures | Live
Figma | Design files & comments | Live
Miro | Whiteboards & workshops | Live
Mailchimp | Email campaigns | Live
Zapier | No-code automations | Live
Klaviyo | Lifecycle email & SMS | Live
Meta Ads | Facebook & Instagram ads | Live
Apollo | Contact enrichment | Live
Gong | Call intelligence | Live
Outreach | Sales engagement | Live
Salesloft | Revenue workflows | Live
Calendly | Meeting booking | Live
Intercom | Support tickets | Live
Zendesk | Ticketing & help center | Live
PostgreSQL | Database federation | Live
MySQL | Database federation | Live
MongoDB | Database federation | Live
SQLite | Database federation | Live
SQL Server | Database federation | Live
Snowflake | Database federation | Live
BigQuery | Database federation | Live
Redis | Database federation | Live
DynamoDB | Database federation | Live
Elasticsearch | Database federation | Live
Freshsales | Sales CRM & sequences | Live
Close | Inside sales dialer CRM | Live
Copper | Google Workspace CRM | Live
Attio | Flexible relationship CRM | Live
Folk | Lightweight team CRM | Live
Affinity | Relationship intelligence | Live
monday CRM | Pipeline on monday boards | Live
Front | Shared inbox & routing | Live
Missive | Team email collaboration | Live
Discord | Community & team chat | Live
Twilio | SMS & voice logs | Live
SendGrid | Deliverability & sends | Live
Coda | Docs + tables hybrid | Live
SharePoint | Intranet & file libraries | Live
Dropbox | Cloud files & Paper | Live
Box | Enterprise content | Live
Evernote | Notes & capture | Live
Google Docs | Docs, Sheets, Slides | Live
Trello | Kanban boards | Live
Basecamp | Projects & message boards | Live
Wrike | Enterprise PMO | Live
Smartsheet | Sheets-driven PM | Live
Height | Autonomous issue tracking | Live
Bitbucket | Atlassian git hosting | Live
CircleCI | CI/CD pipelines | Live
Jenkins | Build automation | Live
PagerDuty | Incident response | Live
Vercel | Deployments & previews | Live
Netlify | Web deploys & forms | Live
Terraform | IaC state & modules | Live
Freshdesk | Omnichannel support | Live
Help Scout | Shared mailbox support | Live
Gorgias | E-commerce support | Live
ServiceNow | ITSM & workflows | Live
Statuspage | Incidents & uptime | Live
Gainsight | Customer success platform | Live
LinkedIn Ads | B2B paid social | Live
Braze | Customer engagement | Live
Customer.io | Behavioral messaging | Live
ActiveCampaign | Automation & CRM email | Live
Marketo | Enterprise marketing automation | Live
Hootsuite | Social scheduling | Live
Buffer | Social publishing | Live
NetSuite | ERP & finance | Live
Bill.com | AP & spend | Live
Brex | Corporate cards & spend | Live
Ramp | Spend management | Live
PayPal | Payments & payouts | Live
Square | POS & invoicing | Live
Chargebee | Subscription billing | Live
ZoomInfo | B2B contact database | Live
Clearbit | Firmographic enrichment | Live
Lusha | Contact data | Live
Chorus | Conversation analytics | Live
6sense | Intent & ABM | Live
Lever | Recruiting CRM | Live
Ashby | Modern ATS | Live
BambooHR | HRIS & PTO | Live
Rippling | HR, IT & payroll | Live
Gusto | Payroll & benefits | Live
Deel | Global contractors | Live
Workday | Enterprise HCM | Live
Lattice | Performance & engagement | Live
PandaDoc | Proposals & contracts | Live
Ironclad | CLM workflows | Live
Dropbox Sign | Sign & track docs | Live
Canva | Brand & marketing assets | Live
Mural | Visual collaboration | Live
Loom | Async video updates | Live
Lucid | Diagrams & flowcharts | Live
Snowflake | Warehouse queries | Live
Databricks | Lakehouse & notebooks | Live
Tableau | Dashboards & viz | Live
Looker | Semantic BI layer | Live
Power BI | Microsoft analytics | Live
BigQuery | Google warehouse | Live
Make | Visual integrations | Live
n8n | Self-hosted workflows | Live
Typeform | Forms & surveys | Live
Cal.com | Open scheduling | Live
Use cases

From diligence
to deal desk.

Competitive diligence

SEC 8-K press releases cross-linked to ClinicalTrials.gov NCT IDs and OpenAlex citation graphs, producing a cited brief for business development (BD).

Science & IP

Open Targets drug → protein → GWAS multihop, plus arXiv preprints and optional Google Patents SERP — for R&D and investor memos.

Account research

SEC sponsor graph + Wikidata crosswalks joined to your CRM account — public identity without manual lookup.

Power & infrastructure

Wind turbine → EIA plant → SEC operator filings multihop — live public-data federation in one query.

Space & launch tracking

Upcoming launch → pad weather window and launch provider → Wikidata / SEC context — plus CelesTrak element lookup for on-orbit payloads.

Try it

One multihop query
across public sources.

biomedSearch → sponsor → secFilings
secFilingSearch(query: "pipeline risk") → cited sections
USWTDB → EIA plant → SEC operator (live multihop)
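
As a rough illustration of the third chain, the hops can be modeled as key lookups across three tables. Everything here is a hypothetical stand-in: the table contents, the field names (`eia_plant_id`, `operator_cik`), the identifiers, and the `multihop` helper are assumptions for the sketch and not Gyri's GraphQL schema or the real source payloads.

```python
# Hypothetical lookup tables standing in for three public sources.
turbines = {"T-1": {"eia_plant_id": "P-10"}}  # USWTDB-like turbine rows
plants = {"P-10": {"name": "Example Wind Farm", "operator_cik": "0000000101"}}  # EIA-like
filings_by_cik = {"0000000101": [{"form": "10-K", "section": "Risk Factors"}]}  # SEC-like


def multihop(turbine_id):
    """Follow turbine -> plant -> operator filings; return None if a hop is missing."""
    turbine = turbines.get(turbine_id)
    if turbine is None:
        return None
    plant = plants.get(turbine["eia_plant_id"])
    if plant is None:
        return None
    return {
        "turbine": turbine_id,
        "plant": plant["name"],
        "filings": filings_by_cik.get(plant["operator_cik"], []),
    }


result = multihop("T-1")
missing = multihop("T-404")  # unknown turbine: the chain stops at the first hop
```

Each hop succeeds only when the previous row carries the key for the next source, which is why the chain depends on interconnect keys existing between the datasets.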

See the Science page for biomedical depth · How Gyri works for federation architecture

Start free trial

Query public data on Gyri.

Free trial: install public data packs and live federators, and get cited answers joined to your stack.

Frequently asked questions

What public data does Gyri federate?

Gyri ships 74+ installable public sources: ClinicalTrials.gov (580k+ mirror), OpenAlex, PubMed, Open Targets rollups, SEC EDGAR with section FTS, FDA DailyMed, arXiv, Internet Archive, EIA, USWTDB, CDC, EPA, FAA, Open-Meteo, OpenStreetMap, NASA/CelesTrak/Launch Library live APIs, and optional Google Scholar/Patents SERP. Enable packs in Settings → Public data.

What is the Gyri public data seed?

The public seed is a pre-loaded Postgres catalog you install into your workspace — papers, trials, filings, and entities indexed for full-text and graph search. Live API calls merge with the mirror and passively ingest new hits.

Can I join public data to my CRM and email?

Yes. Multihop GraphQL links public entities (sponsors, CIKs, NCT IDs) to Pipedrive deals, Gmail threads, and Slack messages — cited synthesis across open and private sources.

Do I need API keys for public federation?

Core public federators (SEC EDGAR, arXiv, Internet Archive, ClinicalTrials.gov mirror, OpenAlex live search) are keyless. Optional enrichments (Google Scholar, Patents) use SERP API keys when configured.

Gyri think verbs

26 MCP think verbs —
not API wrappers.

Gyri federates public and private sources into one graph. Agents orient with recall and ground, capture with reflect, and commit with promote: the full epistemic lifecycle via MCP think.

think verb=recall

Semantic orient — load prior cited conclusions before you re-derive from scratch.

think verb=ground

FTS-only orient on live federated CRM, email, and Slack — zero embed cost.

think verb=consider

Classify novelty against the workspace corpus before you commit a thesis.

think verb=reflect

Capture drafts, tensions, and questions as typed thoughts on the graph.

think verb=traverse

Multihop GraphQL in one think call — deal → contact → email → ticket.

think verb=fetch

Hydrate full rows by ref when search or traverse surfaced the right node.

think verb=promote

Gated commit — draft thought → durable cited insight the whole team recalls.

think verb=wish

Natural-language buildout — queue workspace types, connectors, and workflows.

Full catalog via MCP think verb=catalog · Why Gyri think verbs beat API wrappers · MCP for Claude & Cursor