Your agent found a compelling paragraph about "pricing pressure at Acme Corp." The problem: Acme was never mentioned by name in the retrieved email — the model inferred it from a similar account story three deals away. Or the opposite failure: keyword search returned every Slack message containing "Glean" but missed the thread where a champion wrote "evaluating their workplace search vendor" — semantically obvious to a human, invisible to exact-match retrieval.
Both outcomes trace to the same architectural mistake: treating enterprise retrieval as a single technique. Vector-only RAG misses exact strings, SKUs, ticket IDs, and competitor names spelled correctly once in a CRM note. Keyword-only search misses paraphrases, relationship hops, and the connective tissue between a deal, its contacts, and their comms history.
Hybrid search platforms for enterprise AI combine a keyword leg (precise lexical matches across federated sources), a graph leg (typed records and multihop traversal), and rank fusion that assembles a small, cited context pack for agents. That pattern is how Gyri keeps MCP agents grounded, and it is why GTM operators should demand it before trusting AI on customer-facing workflows.
This article walks through retrieval failure modes, how each leg works, how results merge, what changes for agents, and how to tune the stack without a search-engine PhD.
Retrieval failure modes: why one technique is never enough
Before choosing vendors or tuning prompts, name the failures you are trying to prevent. GTM retrieval breaks in predictable ways:
Semantic-only (vector RAG) failures
Vector search retrieves chunks similar in embedding space to the query. It excels at fuzzy questions — "accounts showing renewal hesitation" — when the language in source records varies.
It fails when precision matters:
- Exact identifiers. Contract clause numbers, product SKUs, security questionnaire IDs, and competitor trademarks appear once, verbatim. Embeddings may rank a paraphrase higher than the canonical string.
- Rare strings. A prospect's internal project codename or a custom CRM field value may appear in only one email. Similarity to generic "project update" chunks drowns it out.
- Negative evidence. "We are not moving forward with the Glean pilot" is semantically close to positive Glean evaluation threads — dangerous for competitive intel.
- Relationship blindness. Vectors do not follow edges. Similarity does not answer "which contacts on this deal emailed after the pricing call."
Teams building custom RAG stacks often discover these gaps in week three of a pilot — after reps report that the bot "almost" knows the account. See RAG vs Knowledge Graph for Company AI for when vectors alone hit a ceiling.
Keyword-only failures
Full-text and keyword search (PostgreSQL tsvector, Elasticsearch BM25, etc.) fix the precision problem. They return records containing the tokens you asked for.
They fail when language diverges:
- Paraphrase and synonym gaps. "Workplace search incumbent" does not match "Glean" unless someone indexed aliases or your query expands terms manually.
- Cross-record questions. "Show me every open enterprise deal where support volume spiked" requires joining CRM opportunity status to ticket counts — not keyword ranking within one index.
- Bucket silos. A keyword hit in Gmail does not automatically surface the linked HubSpot opportunity unless identity keys and graph bridges exist.
- Over-retrieval. Searching a common company name returns hundreds of mentions; the agent context window fills with noise before the one thread that mattered appears.
Classic enterprise search portals stop here: ten blue links, no synthesis, no graph. Useful for lookup. Insufficient for agents.
Graph-only failures
A typed knowledge graph — deals, contacts, companies, insights, bridges to email and Slack — answers multihop questions search boxes cannot. GraphQL traversal can walk deal → contacts → messages → tickets in one request, as described in Multihop GraphQL for Business Intelligence.
Graphs fail when entry points are wrong:
- Cold start on identity. If keyword retrieval cannot resolve "Acme" to company:acct_4821, graph traversal has nowhere to begin.
- Unstructured signal in bodies. Commitments live in email prose, not CRM picklists. The graph needs keyword retrieval to find candidate messages before edges attach them to accounts.
- Scale without ranking. Traversing every contact on every open deal is expensive; you still need relevance ranking on the final candidate set.
The insight for operators: keyword search plus graph is not "use two products." It is one retrieval pipeline where each leg covers the other's blind spots.
| Failure | Vector-only | Keyword-only | Graph-only | Hybrid (keyword + graph) |
|---|---|---|---|---|
| Exact competitor name in CRM note | Weak | Strong | N/A without entry | Strong |
| Paraphrased churn signal in email | Strong | Weak | Needs text hits | Strong |
| Deal → contact → email hop | Fails | Fails | Strong | Strong |
| Common name over-retrieval | Noisy | Noisy | Risk if unranked | Ranked fusion |
| Citable record identity | Chunk IDs | Record IDs | Record IDs | Record IDs |
The keyword leg: precision across federated sources
The keyword leg is Gyri's first pass in knowledge base search workflows: find candidate records by lexical match, scoped to the user's permissions and grouped by source type.
What keyword search should return
Mature keyword retrieval for business AI returns typed hits, not anonymous text blobs:
- Documents and chunks — enablement decks, security PDFs, Notion pages
- Email messages — thread ID, participants, date, snippet
- CRM records — accounts, deals, contacts with stable IDs
- Insights — persisted synthesis from prior agent runs
- Workspace records — custom types your RevOps team defines (competitors, playbooks, health scores)
- Comms — Slack messages with permalinks
Each hit should carry a score within its bucket (email vs deal vs insight), a snippet suitable for agent peeks, and a ref the graph can hydrate (get / citation hydration) on the next step.
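As a rough illustration of what a typed hit could look like, here is a minimal Python sketch. The `Hit` class, its field names, and the sample refs are assumptions made for this example, not Gyri's actual schema. It only shows the shape described above: a bucket, a score that is meaningful within that bucket, a snippet, and a ref to hydrate later.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One typed keyword hit (illustrative shape, not a real schema)."""
    bucket: str    # source type, e.g. "email", "deal", "insight"
    ref: str       # stable record reference the graph can hydrate later
    score: float   # relevance score, comparable only within the same bucket
    snippet: str   # short preview the agent can peek at


def group_by_bucket(hits):
    """Group hits by bucket, best-scoring first within each bucket."""
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit.bucket].append(hit)
    return {
        bucket: sorted(items, key=lambda h: h.score, reverse=True)
        for bucket, items in grouped.items()
    }


# Made-up sample data for illustration only.
hits = [
    Hit("email", "email:msg_1", 0.82, "Re: pricing follow-up"),
    Hit("deal", "deal:opp_7", 0.61, "Acme expansion, stage: negotiation"),
    Hit("email", "email:msg_2", 0.90, "Budget approved for Q3"),
]
grouped = group_by_bucket(hits)
```

Keeping scores inside their bucket until a later fusion step avoids comparing an email score against a deal score directly.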
Query semantics operators care about
Enterprise keyword layers need more than naive AND:
- Token AND by default: "Acme pricing" requires both terms, reducing noise.
- Explicit OR: "Glean OR Coveo" for competitive sweeps across phrasing variants.
- Automatic OR retry: when a strict query returns empty, a controlled broadening prevents false "no data" answers that trigger hallucination filling.
- Source filters: sourceTypes limits a query to CRM-only for "what stage is this deal" vs comms-only for "who mentioned budget."
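The four behaviors in this list can be sketched over an in-memory record list. This is a toy model under stated assumptions: the function names, the record shape, and the `source_types` parameter are hypothetical, and a production keyword layer would sit on a real full-text index rather than set comparisons.

```python
import re


def tokenize(text):
    """Lowercase alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def parse_query(query):
    """Split on an explicit uppercase OR; tokens inside each part are ANDed."""
    groups = [tokenize(part) for part in re.split(r"\s+OR\s+", query)]
    return [g for g in groups if g]


def keyword_search(records, query, source_types=None):
    """Return (matching refs, mode), where mode is 'strict' or 'or_retry'."""
    def run(groups):
        matched = []
        for rec in records:
            if source_types and rec["source_type"] not in source_types:
                continue  # source filter
            tokens = tokenize(rec["text"])
            # A record matches if every token of at least one group is present.
            if any(group <= tokens for group in groups):
                matched.append(rec["ref"])
        return matched

    groups = parse_query(query)
    strict = run(groups)
    if strict or not groups:
        return strict, "strict"
    # Controlled broadening: retry with every token as its own OR group.
    broadened = [{token} for group in groups for token in group]
    return run(broadened), "or_retry"


# Made-up records for illustration only.
records = [
    {"ref": "email:1", "source_type": "email", "text": "Acme pricing call moved to Friday"},
    {"ref": "deal:7", "source_type": "deal", "text": "Acme renewal, competitor Coveo mentioned"},
    {"ref": "slack:3", "source_type": "slack", "text": "Glean pilot feedback thread"},
]
```

Returning the mode alongside the refs lets the caller tell the agent that results came from a broadened query, so a loose match is not presented as an exact one.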
Federated search for business AI covers how live connectors feed these buckets across CRM, Gmail, Slack, and docs without a stale overnight index.
Keyword federators and custom types
Not everything lives in a central Postgres FTS index. Gyri supports keyword federators: named buckets backed by HTTP APIs or workspace-type search configs. Custom record types declare which fields materialize into search_text, whether search is indexed, delegated across a graph bridge, or fetched live from an external system.
That matters for GTM teams with long-tail data: partner portals, product usage APIs, proprietary research databases. The keyword leg reaches them through the same MCP keyword_search surface agents already use — one query, many buckets.
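One way to picture "one query, many buckets" is a small registry that fans a query out to named federators. The class below is a hypothetical sketch, not Gyri's implementation: `FederatedKeywordSearch`, `indexed_docs`, and `partner_portal` are invented names, and the portal function merely simulates a live HTTP backend that times out.

```python
class FederatedKeywordSearch:
    """Fan one query out to named buckets, each backed by its own search function."""

    def __init__(self):
        self._federators = {}

    def register(self, bucket, search_fn):
        self._federators[bucket] = search_fn

    def search(self, query, buckets=None):
        hits, errors = {}, {}
        for bucket, search_fn in self._federators.items():
            if buckets is not None and bucket not in buckets:
                continue
            try:
                hits[bucket] = search_fn(query)
            except Exception as exc:
                # One failing backend should not blank out the other buckets.
                errors[bucket] = str(exc)
        return hits, errors


def indexed_docs(query):
    """Stand-in for an indexed bucket (substring match over two fake docs)."""
    docs = {
        "doc:sec_1": "Security questionnaire answers",
        "doc:deck_2": "Pricing enablement deck",
    }
    q = query.lower()
    return [ref for ref, text in docs.items() if q in text.lower()]


def partner_portal(query):
    """Stand-in for a live HTTP federator; here it always times out."""
    raise TimeoutError("partner portal did not respond")


search = FederatedKeywordSearch()
search.register("docs", indexed_docs)
search.register("partner_portal", partner_portal)
hits, errors = search.search("pricing")
```

Reporting per-bucket errors separately from hits matters for agents: "the partner portal was unreachable" is a different answer from "no partner data exists."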
When the keyword leg wins alone
Use keyword-first retrieval when:
- The user or agent supplies an exact name, ID, or competitor string
- Compliance requires matching verbatim contract language
- You are disambiguating entities ("Acme Corp" vs "Acme Labs") before graph expansion
Do not stop there for synthesis questions — but never skip it for entry.
The graph leg: relationships keyword search cannot see
Once keyword retrieval identifies anchor records, the graph leg traverses relationships to assemble operational context.
Typed records and bridges
An agentic knowledge base models business objects explicitly:
- A deal links to contacts, activities, and emails
- A company links to deals, support tickets, and insights
- A competitor node collects mentions from Slack and email via bridges
- An insight persists a prior synthesis with citation closure back to sources
Bridges connect records across systems: this Slack thread ↔ this opportunity ↔ this champion's Gmail thread. Keyword search finds text; bridges explain why that text belongs to the account the rep asked about.
Multihop traversal
GTM questions are inherently multihop:
- Which contacts on this deal sent email after the last pricing call?
- What competitor names appeared in Slack from anyone on the account team this quarter?
- Which support themes correlate with deals stuck in negotiation?
Graph traversal answers these in one query shape instead of chaining six keyword searches manually. Vector similarity cannot reliably perform those joins; keyword search cannot express the join at all without exporting to a spreadsheet.
Example pattern (conceptual — your workspace schema may differ):
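```graphql
# Conceptual query: field names and arguments depend on your workspace schema.
query AccountContext($companyId: ID!) {
  company(id: $companyId) {
    name
    # Hop 1: open deals on the account
    deals(status: OPEN) {
      stage
      amount
      # Hop 2: contacts on each deal, then their recent emails
      contacts {
        name
        emails(since: "30d", limit: 5) { subject snippet citationUrl }
      }
    }
    # Bounded comms and prior insights, each with a citable link
    slackMessages(since: "30d", limit: 10) { channel text permalink }
    insights(type: COMPETITIVE, limit: 5) { summary citations { ref url } }
  }
}
```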
The graph leg returns structured, citable nodes — the substrate for AI answers with citations, not prose invented to connect dots.
Insights as graph-native memory
Keyword search also retrieves prior insights — competitive summaries, churn patterns, call briefs — stored as typed graph nodes with evidence links. That is how agents avoid starting from zero every session: yesterday's research is today's retrieval hit, discoverable by keyword and linked to the entities it describes.
Ranking fusion: assembling the agent context pack
Raw retrieval from both legs can overwhelm any model context window. Semantic + keyword search pipelines need a fusion stage that:
- Dedupes the same email surfaced by keyword hit and graph traversal
- Ranks by relevance, recency, record type, and workflow intent
- Caps token budget so synthesis sees complete records, not truncated noise
- Preserves citation refs for every included item
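A minimal sketch of such a fusion stage follows. It assumes each hit is a dict with `ref`, `score`, and `text` keys (hypothetical names), that scores were already made comparable upstream, and that a whitespace word count is an acceptable stand-in for a real tokenizer. It ranks on score alone; recency, record type, and workflow intent would be folded into that score before this step.

```python
def fuse(keyword_hits, graph_hits, token_budget):
    """Dedupe by ref, rank by score, and fill a budget with whole records only."""
    best = {}
    for hit in keyword_hits + graph_hits:
        current = best.get(hit["ref"])
        # Same record from both legs: keep the higher-scoring copy.
        if current is None or hit["score"] > current["score"]:
            best[hit["ref"]] = hit

    ranked = sorted(best.values(), key=lambda h: h["score"], reverse=True)

    pack, used = [], 0
    for hit in ranked:
        cost = len(hit["text"].split())  # crude token estimate
        if used + cost > token_budget:
            continue  # skip the record rather than truncate it
        pack.append(hit)  # the ref travels with the record for citation
        used += cost
    return pack
```

Skipping an oversized record instead of cutting it mid-text keeps every item in the pack complete and citable, at the cost of occasionally leaving budget unused.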
Typical fusion flow
A practical hybrid pipeline for enterprise agents:
```
User question
→ Keyword pass (anchors + high-precision hits)
→ Graph expand (hops from anchor IDs, bounded depth)
→ Optional semantic rerank (within fused candidate set only)
→ Context pack (ranked, cited, size-limited)
→ Synthesis with claim-to-source mapping
```
Note the order: semantic rerank after keyword and graph narrowing — not instead of them. Using embeddings to retrieve from the entire corpus without lexical anchors reproduces the failure modes above.
Bucket-aware scoring
Scores are not globally comparable across source types. An email snippet ranked 0.82 is not automatically better than a deal record at 0.61 — different buckets, different score distributions. Fusion rules should:
- Boost record types relevant to the workflow (deal notes for pre-call briefs, tickets for renewal risk)
- Apply recency decay for comms; slower decay for canonical docs
- Penalize orphan hits lacking graph linkage to the account in scope
- Prefer insights with fresh citation closure over stale summaries
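These rules can be expressed as a handful of multipliers. The sketch below is one possible formulation, not a prescribed one: the boost table, the half-life values, the 0.5 orphan penalty, and the hit fields (`bucket`, `score`, `age_days`, `linked_to_account`) are all illustrative assumptions, and raw scores are assumed to be positive.

```python
# Hypothetical tuning tables; real values would come from golden-question reviews.
WORKFLOW_BOOSTS = {"pre_call_brief": {"deal": 1.3, "email": 1.1}}
HALF_LIFE_DAYS = {"email": 14, "slack": 7, "doc": 180}
DEFAULT_HALF_LIFE_DAYS = 90
ORPHAN_PENALTY = 0.5


def normalize_within_bucket(hits):
    """Divide each score by the best score in its own bucket."""
    best = {}
    for hit in hits:
        best[hit["bucket"]] = max(best.get(hit["bucket"], 0.0), hit["score"])
    return [{**hit, "norm": hit["score"] / best[hit["bucket"]]} for hit in hits]


def fused_score(hit, workflow):
    """Combine normalized relevance with boost, recency decay, and orphan penalty."""
    boost = WORKFLOW_BOOSTS.get(workflow, {}).get(hit["bucket"], 1.0)
    half_life = HALF_LIFE_DAYS.get(hit["bucket"], DEFAULT_HALF_LIFE_DAYS)
    decay = 0.5 ** (hit["age_days"] / half_life)  # halves every half_life days
    penalty = 1.0 if hit["linked_to_account"] else ORPHAN_PENALTY
    return hit["norm"] * boost * decay * penalty
```

Normalizing inside each bucket first is what makes the later multiplication meaningful: an email at 0.82 and a deal at 0.61 can both be the best hit of their kind.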
Operators define golden questions per workflow and inspect which records enter the context pack — not just the final prose.
Agent impact: why hybrid retrieval changes MCP behavior
Agents do not experience retrieval as a UI ranking problem. They experience it as tool success or silent failure on the next MCP turn.
Fewer false "no data" answers
When keyword OR-retry and graph expansion run before synthesis, agents less often conclude the workspace is empty — a trigger for confident fabrication. Partial, cited context beats a blank slate.
Stable tool surface
Gyri exposes hybrid retrieval through MCP tools (keyword_search, GraphQL search, get hydration) and GraphQL multihop queries — the same graph Claude and Cursor agents use in IDE workflows and automated runs. Agents learn one pattern: search → hydrate → synthesize → optionally write back.
Write-back requires correct anchors
Agents that write back — filing insights, updating custom records, triggering workflows — need the right record IDs from retrieval. Keyword-plus-graph fusion returns deal:, insight:, and company: refs agents can pass to mutation tools. Vector chunks without record identity make write-back fragile or impossible.
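To make the anchor requirement concrete, the sketch below filters a context pack down to refs that are safe to hand to a mutation tool. The `kind:id` format follows the deal:, insight:, and company: prefixes mentioned above; the `RecordRef` type, the `WRITABLE_KINDS` set, and the pack shape are assumptions for illustration.

```python
from typing import NamedTuple


class RecordRef(NamedTuple):
    kind: str
    record_id: str


# Assumed set of record kinds an agent is allowed to write against.
WRITABLE_KINDS = {"deal", "insight", "company"}


def parse_ref(ref):
    """Parse 'kind:id' into a RecordRef, rejecting anything without both parts."""
    kind, sep, record_id = ref.partition(":")
    if not sep or not kind or not record_id:
        raise ValueError(f"not a typed record ref: {ref!r}")
    return RecordRef(kind, record_id)


def writable_anchors(pack):
    """Keep only pack items whose ref names a writable record kind."""
    anchors = []
    for item in pack:
        try:
            ref = parse_ref(item["ref"])
        except (KeyError, ValueError):
            continue  # anonymous chunk or malformed ref: nothing to anchor to
        if ref.kind in WRITABLE_KINDS:
            anchors.append(ref)
    return anchors
```

An anonymous chunk ID such as a bare vector-store key falls through the `ValueError` branch, which is the fragile case the paragraph above describes.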
Workflow examples
| Workflow | Keyword leg role | Graph leg role |
|---|---|---|
| Pre-call brief | Find account name, champion email | Deal → contacts → recent emails → tickets |
| Competitive digest | "Competitor OR alias" in Slack/email | Link mentions → competitor node → prior insights |
| Renewal risk | Support theme keywords | Account → tickets + CRM health + comms timeline |
| Enablement Q&A | Exact feature name in docs | Doc → linked product insights → recent release notes |
When fusion is tuned for a workflow, agent runs become replayable: the same question pulls the same evidence skeleton — auditable for enablement QA.
Tuning hybrid retrieval without a search team
You do not need to reimplement BM25. You do need operational discipline:
Start with golden questions
Pick five to ten real questions per persona (AE, CS, RevOps). For each, document:
- Expected anchor entities (account, deal, competitor)
- Minimum source types required
- Acceptable recency window
- Citations a human would click to sign off
Run them weekly after connector changes. Regression here predicts field trust better than model upgrades.
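A golden question can be stored as data and checked against the context pack itself rather than the final prose. The sketch below assumes a hypothetical pack shape (dicts with `ref`, `source_type`, and `age_days`) and covers the first three documented items; sign-off citations could be checked the same way as anchors.

```python
from dataclasses import dataclass


@dataclass
class GoldenQuestion:
    question: str
    expected_anchors: set        # refs that must appear in the pack
    required_source_types: set   # source types that must be represented
    max_age_days: int            # acceptable recency window


def check(golden, pack):
    """Return a list of failure messages; an empty list means the pack passes."""
    failures = []
    refs = {item["ref"] for item in pack}
    missing = golden.expected_anchors - refs
    if missing:
        failures.append(f"missing anchors: {sorted(missing)}")
    types = {item["source_type"] for item in pack}
    missing_types = golden.required_source_types - types
    if missing_types:
        failures.append(f"missing source types: {sorted(missing_types)}")
    stale = sorted(i["ref"] for i in pack if i["age_days"] > golden.max_age_days)
    if stale:
        failures.append(f"outside recency window: {stale}")
    return failures
```

Running `check` for every golden question after a connector change turns "the bot feels worse" into a specific missing anchor or stale record.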
Tune keyword before embeddings
Most production issues trace to:
- Missing aliases on competitor and product records
- Custom fields not in keywordIndex for workspace types
- Over-broad default queries filling context with wrong accounts
- Under-connected bridges (Slack channel not linked to CRM account)
Fix lexical and graph configuration first. Embedding rerankers polish edges; they do not fix wrong entry points.
Cap graph depth and fan-out
Multihop traversal without limits returns the entire account history. Set sensible bounds — last 30 days of email, top five support themes, open deals only — per workflow skill. GraphQL parameters encode those limits so agents inherit them.
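The effect of depth and fan-out caps is easy to see on a toy adjacency map. This is a plain breadth-first traversal written for illustration, not Gyri's traversal engine. The graph shape and the `max_depth` and `max_fanout` parameter names are assumptions, and it presumes each neighbor list is already ordered by relevance or recency so that truncation keeps the best edges.

```python
from collections import deque


def bounded_expand(graph, anchor, max_depth=2, max_fanout=5):
    """Breadth-first expansion from an anchor, capped in depth and per-node fan-out."""
    seen = {anchor}
    order = [anchor]
    queue = deque([(anchor, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth cap
        # Only the first max_fanout neighbors are followed from each node.
        for neighbor in graph.get(node, [])[:max_fanout]:
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append((neighbor, depth + 1))
    return order


# Made-up graph for illustration only.
graph = {
    "company:acct_4821": ["deal:1", "deal:2", "deal:3"],
    "deal:1": ["contact:a", "contact:b"],
    "contact:a": ["email:x"],
}
nodes = bounded_expand(graph, "company:acct_4821", max_depth=2, max_fanout=2)
```

With these caps the third deal and the email three hops out never enter the candidate set, which is the point: limits are decided per workflow, not discovered when the context window overflows.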
Measure what matters
Track:
- Citation click rate on agent answers (trust proxy)
- Empty retrieval rate before synthesis
- Wrong-entity rate in QA sampling (Acme vs Acme Labs)
- Time to brief vs manual tab ritual
If citation clicks are low, suspect that fusion is sending the wrong records before you blame the prompt.
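Three of these metrics fall out of a simple run log. The sketch below assumes each logged run is a dict with `pack_size`, `citation_clicked`, and `qa_wrong_entity` (set to `None` when the run was not QA-sampled); those field names are invented for the example, and time-to-brief is left out because it needs a manual baseline.

```python
def retrieval_metrics(runs):
    """Compute empty-retrieval, citation-click, and wrong-entity rates from run logs."""
    def rate(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    answered = [r for r in runs if r["pack_size"] > 0]
    sampled = [r for r in runs if r["qa_wrong_entity"] is not None]
    return {
        # Share of runs where synthesis started from an empty context pack.
        "empty_retrieval_rate": rate(len(runs) - len(answered), len(runs)),
        # Share of non-empty answers where a reader clicked at least one citation.
        "citation_click_rate": rate(
            sum(1 for r in answered if r["citation_clicked"]), len(answered)
        ),
        # Share of QA-sampled runs that anchored on the wrong entity.
        "wrong_entity_rate": rate(
            sum(1 for r in sampled if r["qa_wrong_entity"]), len(sampled)
        ),
    }
```

Each rate uses its own denominator on purpose: click rate only makes sense over answers that had citations to click, and wrong-entity rate only over runs a human actually reviewed.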
When to add semantic rerank
Add embedding rerank inside the fused candidate set when:
- Users ask in natural language without entity names
- Paraphrase diversity is high (international teams, informal Slack)
- Keyword recall is verified good but ordering feels off
Avoid using semantic search as the only retrieval stage for GTM workflows with legal, pricing, or competitive exposure.
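Reranking inside the candidate set can be sketched with plain cosine similarity. The example assumes every candidate already carries an `embedding` produced by whatever embedding model you use (none is called here), and the three-dimensional vectors in the usage note are toy values. The key property is that `rerank` only reorders what keyword and graph retrieval already selected; it never searches the wider corpus.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(query_vec, candidates, top_k=None):
    """Reorder an existing candidate set by similarity to the query vector."""
    ranked = sorted(
        candidates,
        key=lambda c: cosine(query_vec, c["embedding"]),
        reverse=True,
    )
    return ranked if top_k is None else ranked[:top_k]
```

For instance, with a query vector of `[1, 0, 0]` and candidates embedded at `[0, 1, 0]`, `[1, 0.1, 0]`, and `[0.5, 0.5, 0]`, the second candidate moves to the front, but no record outside those three can appear in the result.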
The bottom line
A reliable hybrid search architecture for enterprise AI combines three ideas:
- Keyword retrieval for precision, federation, and entity resolution across CRM, comms, and docs
- Graph traversal for multihop questions and typed, citable record identity
- Rank fusion that assembles a bounded, deduplicated context pack agents can synthesize and audit
Pure semantic search misses exact matches and relationships. Pure keyword search misses paraphrases and joins. Graph-only retrieval lacks entry points. Together, they form the retrieval substrate beneath an agentic knowledge base — not a chat wrapper on a single index.
If your agents still hallucinate account stories or miss competitor mentions spelled correctly once in CRM, inspect the retrieval pipeline before you swap models. The fix is usually hybrid — keyword search plus graph — tuned to the workflows your revenue team already runs manually across tabs.
To see hybrid retrieval, cited synthesis, and MCP agents on your stack, start your free trial.