
The Enterprise Context Problem


Why Organizational Knowledge Remains the Binding Constraint on Enterprise AI Value Creation

[Figure: Organizational workflow visualization on a digital dashboard]

ABSTRACT

Enterprise AI deployments are failing at unprecedented rates. MIT reports that 95% of generative AI pilots deliver no measurable P&L impact (2025). S&P Global finds that 42% of companies have abandoned the majority of their AI initiatives—more than double the rate from just one year prior. RAND documents an 80% overall AI project failure rate, exactly twice the failure rate of non-AI IT projects. These failures are not attributable to model quality—frontier models from OpenAI, Anthropic, and Google have converged in capability and are racing toward commodity pricing. The binding constraint is organizational context: the institutional knowledge of how a specific enterprise operates, makes decisions, allocates resources, and creates value. This paper presents a systematic analysis of why current architectural approaches—API connectors, cloud-based RAG, and manually constructed knowledge graphs—have reached fundamental ceilings. We propose a research framework for what we term the Enterprise Context Engine: a device-level, zero-connector architecture that constructs organizational knowledge graphs autonomously through passive observation of existing work patterns. We analyze the economic case for context infrastructure, position this approach against both enterprise AI incumbents and industrial edge AI deployments, and identify the architectural principles—drawn from distributed computing, identity federation, and endpoint security—that make this approach both technically feasible and commercially viable. The paper concludes with a research agenda and validation methodology for empirical testing.


Keywords: enterprise AI, organizational context, knowledge graphs, edge computing, device-level intelligence, institutional knowledge, distributed inference, context infrastructure, enterprise data fragmentation


1. The Enterprise AI Value Crisis


The enterprise AI industry is experiencing a crisis of value realization that no amount of model improvement can resolve. While adoption metrics paint a picture of momentum—McKinsey reports 88% of organizations now deploy AI in at least one function—the financial impact data tells a fundamentally different story. The gap between AI adoption and AI value creation is not narrowing. It is widening.

1.1 The Evidence of Systemic Failure


Five independent research programs, employing different methodologies and surveying different populations, have converged on the same conclusion: enterprise AI is failing at rates that demand structural explanation, not incremental improvement.

  • 95% Pilot Failure (MIT, 2025): The MIT-NANDA “GenAI Divide” report, based on 150 executive interviews, 350 employee surveys, and analysis of 300 public deployments, found that only 5% of enterprise generative AI pilots achieve rapid revenue acceleration. The remaining 95% stall with no measurable P&L impact. The researchers explicitly attributed this to brittle workflows, weak contextual learning, and misalignment with day-to-day operations—not model quality.

  • 42% Abandonment Rate (S&P Global, 2025): S&P Global surveyed 1,006 IT and business professionals and found that 42% of companies abandoned the majority of their AI initiatives before reaching production—a surge from just 17% one year prior. The average organization scrapped 46% of AI proofs-of-concept. This is not experimentation discovering what does not work; this is systematic failure at the transition from prototype to production.

  • 80% Project Failure (RAND Corporation): RAND’s analysis, based on interviews with 65 senior data scientists and engineers, confirms that over 80% of AI projects fail to reach production—exactly twice the failure rate of non-AI IT projects. The root causes are organizational, not technological: misalignment between AI capabilities and business problems, inadequate data infrastructure, and the persistent gap between what AI needs to know and what organizations can tell it.

  • 60% Data Readiness Abandonment (Gartner, 2026): Gartner predicts that through 2026, organizations will abandon 60% of AI projects not supported by AI-ready data. Sixty-three percent of organizations either do not have—or are unsure whether they have—the data management practices required for AI.

  • Elusive ROI (Deloitte, 2026): Deloitte surveyed 3,235 leaders across 24 countries. While 85% of organizations increased AI spending and 91% plan to increase again, only 20% report actual revenue growth from AI. ROI payback averages two to four years—three to six times longer than the seven-to-twelve-month payback expected for typical technology investments.


Taken together, these findings describe a market spending $2.5 trillion annually on AI while systematically failing to capture value from it. The question is not whether enterprise AI will succeed—the technology is manifestly capable. The question is what structural barrier prevents capable technology from delivering organizational value.

1.2 The Structural Diagnosis: Context Starvation


Across every major research program examining enterprise AI failure, the root cause converges on a single theme: AI models are starved of organizational context. They possess general intelligence but lack specific knowledge of how the organization they serve actually operates.

Consider what “organizational context” encompasses: not merely data stored in systems, but the institutional understanding of how decisions are made (as opposed to how process documentation says they should be made), which vendor contacts are reliable, what the informal escalation paths are that bypass the org chart, why a particular deal was structured the way it was, and the accumulated judgment that distinguishes experienced employees from new hires. This is the knowledge that Fortune 500 companies lose an estimated $31.5 billion annually failing to capture and share.

IBM estimates that 68% of enterprise data remains completely unanalyzed, and 82% of enterprises experience workflow disruptions from siloed data. But even this understates the problem, because the most valuable organizational knowledge is not “data” in any system at all. It exists as institutional memory—distributed across the behaviors, communications, and decisions of employees who, at turnover rates of 15–20% annually, are continuously walking out the door with irreplaceable context. Sixty-seven percent of IT leaders report explicit concern about knowledge loss from employee departures.

This is the enterprise context problem: AI models that can reason brilliantly about any topic lack the specific organizational knowledge needed to reason usefully about this company, this deal, this decision, this team.


2. The Architectural Ceilings of Current Approaches


Four generations of enterprise AI architecture have attempted to solve the context problem. Each has improved on its predecessor. None has addressed the fundamental challenge.

2.1 The Connector Model: Linear Scaling, Structural Blind Spots


Every major enterprise AI platform—Glean, Moveworks, Microsoft Copilot, Google Agentspace, Salesforce Agentforce—relies on API connectors to access organizational data. Glean, the market leader, maintains over 100 connectors and has demonstrated extraordinary product-market fit, reaching $200M ARR in December 2025, doubling from $100M in just nine months, with a $7.2 billion valuation.

This growth validates the demand for organizational knowledge access. But the connector architecture has three structural limitations that cannot be resolved through engineering effort:

  • Linear maintenance cost: Each connector represents a permanent engineering liability. When Salesforce changes their API, when Jira ships a new version, when a SaaS vendor deprecates an endpoint, the connector breaks. The average enterprise now runs 342 SaaS applications. No connector-based platform can maintain integrations with the full portfolio, and each new application an organization adopts requires a new integration.

  • API-shaped data: Connectors capture what APIs expose—structured records and metadata. They cannot capture the decision context that explains why data changed. A Salesforce API returns that a deal moved to Closed-Won; it cannot return that the decision was made in a Slack thread after reviewing a shared Google Doc, informed by a phone call the previous afternoon. The “why” lives outside the transaction system.

  • The on-premise blind spot: Connectors require cloud-accessible APIs. For government agencies, defense contractors, healthcare organizations, and financial institutions operating in regulated environments, the hardest systems to integrate—on-premise legacy systems, mainframe terminals, custom internal tools—are precisely where the most critical institutional knowledge resides.


2.2 The RAG Ceiling: Semantic Similarity Is Not Organizational Understanding


Standard Retrieval Augmented Generation (RAG) implementations retrieve document chunks based on vector similarity—finding text that “sounds related” to a query. This approach has a well-documented accuracy ceiling. Industry practitioners estimate standard RAG achieves approximately 70% accuracy, with heavily optimized implementations rarely exceeding 80%. For enterprise workflows where decisions carry regulatory, financial, or safety consequences, this accuracy floor is disqualifying.

The deeper limitation is architectural: RAG performs similarity search over flat document chunks. It cannot follow structural relationships. “Show me everything connected to this deal” is not a semantic similarity problem—it is a graph traversal problem. RAG finds text that sounds related. Organizational intelligence requires understanding how entities are actually connected through work patterns, communication flows, and decision chains.
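The distinction can be made concrete with a toy sketch (the graph and entity names are invented for illustration): retrieving "everything connected to this deal" is a traversal over explicit edges, which no amount of embedding similarity reproduces.

```python
# A toy contrast between similarity lookup and graph traversal.
# The edges and node names below are invented for illustration.
from collections import deque

# Edges discovered from work patterns: node -> connected entities.
edges = {
    "deal-1042": ["acme-corp", "slack-thread-88"],
    "acme-corp": ["j.rivera"],
    "slack-thread-88": ["pricing-doc"],
}

def connected(start):
    """'Everything connected to this deal' is a breadth-first traversal
    over known relationships, not a nearest-neighbor search over embeddings."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nbr in edges.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen - {start}

print(sorted(connected("deal-1042")))
# ['acme-corp', 'j.rivera', 'pricing-doc', 'slack-thread-88']
```

A vector index would only surface chunks whose text resembles the query; the traversal surfaces the Slack thread and the document because they are structurally linked to the deal, regardless of wording.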

2.3 The Knowledge Graph Construction Stall


Knowledge graphs were expected to provide the relational intelligence that RAG lacks. The market is growing—projected at 24.3% CAGR, reaching $3.54 billion by 2029—and the ROI evidence is compelling: Forrester documented 320% ROI over three years with $9.86M in benefits for organizations that successfully deploy knowledge graph platforms.

But adoption has stalled. BARC Research found that among AI adopters, only 27% had knowledge graphs in production in late 2025—barely an uptick from 26% in early 2024. Earlier-stage evaluations and proofs-of-concept actually declined, indicating that the implementation pipeline has slowed rather than accelerated. The reason is structural: manual knowledge graph construction is prohibitively expensive, requires specialized domain expertise that does not generalize across industries, and produces static artifacts that begin degrading immediately as organizations evolve.

The knowledge graph market has a paradox: the technology delivers extraordinary value when successfully implemented, but the implementation methodology—manual construction by domain experts—cannot scale to match the rate of organizational change.

2.4 The Screen Capture Dead End: Microsoft Recall


In May 2024, Microsoft announced Recall—a Copilot+ PC feature that captured screenshots every five seconds and used on-device AI to make everything searchable. The concept acknowledged the right problem: device-level activity contains organizational context that APIs miss. But the implementation chose the wrong observation mechanism.

The backlash was immediate: cybersecurity researchers called it “a surveillance tool masquerading as a productivity aid,” the UK Information Commissioner’s Office launched an inquiry, and Microsoft pulled the feature before public release. Even after a complete redesign with opt-in consent and biometric authentication, Recall’s enterprise adoption remains limited.

The Recall failure is instructive not because device-level intelligence is wrong, but because screen capture is the wrong observation layer. Capturing pixels produces unstructured image data requiring multimodal AI interpretation, stores sensitive visual content locally, and creates a searchable archive of everything on screen, including personal activity. The lesson is not that device-level context capture cannot work; it is that the observation mechanism must be architecturally appropriate to the enterprise trust model.

Similarly, Rewind AI (later Limitless) built a consumer version of always-on screen and audio recording. Despite strong initial interest, the company pivoted to hardware, was acquired by Meta in December 2025, and the Mac application was discontinued—confirming that screen capture cannot be sustained as a standalone product even where consent is implicit.


3. Toward a Solution: The Enterprise Context Engine


We propose a research framework for what we term the Enterprise Context Engine (ECE)—a fundamentally different approach to enterprise AI context that addresses the structural limitations identified above. The ECE is defined by a set of architectural principles, each derived from established precedent in adjacent domains, that collectively solve the context problem without inheriting the ceiling of current approaches.

This section describes what the architecture solves and why each principle is necessary. It does not constitute a complete implementation specification—rather, it defines the research agenda and the architectural constraints that any valid implementation must satisfy.

3.1 Principle 1: Device-Level Interception (Zero-Connector Observation)


Problem addressed: API connectors scale linearly, miss decision context, and cannot reach legacy systems.

Architectural principle: Observe organizational knowledge at the device layer—where every digital interaction in the enterprise ultimately occurs—rather than pulling data through application-specific API integrations.

Every SaaS application an employee interacts with renders in a browser or native application on a managed corporate device. At this layer, application data is already structured: browsers parse HTML into Document Object Models; applications make API calls returning structured JSON; operating systems expose application state through accessibility frameworks. These are the same observation points used by enterprise Data Loss Prevention (DLP) platforms—Symantec, Forcepoint, Digital Guardian, Zscaler—that are already deployed on 80%+ of enterprise endpoints.

The critical distinction from Microsoft Recall is the observation layer. Screen capture operates on pixels—unstructured visual data requiring multimodal interpretation. Device-level interception operates on structured application data that the operating system and browser have already parsed. This is the difference between photographing a database and querying it directly. The enterprise security industry has already demonstrated that this observation model is technically feasible, legally permissible, and organizationally accepted.

A device-level approach provides three structural advantages: it captures context from every application the employee touches (including legacy systems and custom tools with no API); it eliminates connector maintenance entirely; and it captures the full decision trace—the user consulted three reports, referenced a Slack conversation, reviewed an email chain, then made the change—not just the final transaction logged by the API.
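As a minimal illustration of why structured interception differs from pixel capture, the sketch below parses a hypothetical CRM page fragment with Python's standard `html.parser`. The markup, field names, and values are invented for the example; the point is that the browser has already structured the data, so extraction is a parse, not an image-interpretation problem.

```python
from html.parser import HTMLParser

# Hypothetical fragment of a CRM opportunity page as the browser has
# already parsed it; entity, field names, and values are illustrative only.
CRM_DOM = """
<div class="record" data-entity="Opportunity" data-id="OPP-1042">
  <span data-field="Account">Acme Corp</span>
  <span data-field="Stage">Closed-Won</span>
  <span data-field="Owner">j.rivera</span>
</div>
"""

class FieldExtractor(HTMLParser):
    """Collects (field, value) pairs from data-field annotations."""
    def __init__(self):
        super().__init__()
        self.entity = None
        self.fields = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "data-entity" in attrs:
            self.entity = (attrs["data-entity"], attrs.get("data-id"))
        if "data-field" in attrs:
            self._current = attrs["data-field"]

    def handle_data(self, data):
        if self._current and data.strip():
            self.fields[self._current] = data.strip()
            self._current = None

parser = FieldExtractor()
parser.feed(CRM_DOM)
print(parser.entity)   # ('Opportunity', 'OPP-1042')
print(parser.fields)   # {'Account': 'Acme Corp', 'Stage': 'Closed-Won', 'Owner': 'j.rivera'}
```

This is the "querying the database rather than photographing it" distinction in miniature: the same record captured as a screenshot would require a multimodal model just to recover what the DOM already states.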

3.2 Principle 2: Autonomous Knowledge Graph Construction


Problem addressed: Manual knowledge graph construction cannot scale to match organizational change velocity.

Architectural principle: The organizational knowledge graph must construct and maintain itself through AI-driven entity extraction, relationship inference, and schema evolution—without manual ontology definition or ongoing human curation.

The 27% production adoption rate for knowledge graphs is not a demand problem—the ROI evidence is compelling. It is a construction methodology problem. Manual knowledge graph creation requires domain experts to define ontologies, map entities, specify relationships, and continuously update the graph as the organization evolves. This approach produces static artifacts that begin degrading on day one.

An autonomous construction model uses specialized AI processes to continuously discover entities (people, projects, accounts, products, departments), infer relationships from observed behavioral patterns (communication frequency, co-occurrence in meetings, collaborative editing patterns), reconstruct decision traces across systems, and evolve the graph schema as new organizational patterns emerge. The knowledge graph becomes a living model of the organization that updates in real time as people work—not a snapshot that requires periodic manual refresh.

This model transforms the value equation: rather than investing months in manual construction for each deployment, the graph begins producing value within days of deployment, and its accuracy improves continuously as more behavioral data is observed.
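One simple form of the relationship inference described above is weighted co-occurrence over observed interaction events. The sketch below uses an invented event stream; a production system would add time decay, edge typing, and entity resolution on top of this idea.

```python
from collections import Counter
from itertools import combinations

# Illustrative event stream: each event lists the entities one observed
# interaction touched (people, accounts, documents). All names are invented.
events = [
    {"j.rivera", "acme-corp", "q3-renewal-doc"},
    {"j.rivera", "m.chen", "acme-corp"},
    {"m.chen", "acme-corp", "q3-renewal-doc"},
    {"j.rivera", "acme-corp"},
]

# Relationship inference by weighted co-occurrence: entities that keep
# appearing in the same interactions are likely related.
edge_weights = Counter()
for event in events:
    for a, b in combinations(sorted(event), 2):
        edge_weights[(a, b)] += 1

# Keep edges observed more than once as candidate graph relationships.
graph = {edge: w for edge, w in edge_weights.items() if w >= 2}
print(graph)
```

Because the weights update with every new event, the graph strengthens and prunes edges continuously as people work, rather than waiting for a manual refresh cycle.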

3.3 Principle 3: Tiered Inference and Context Compression


Problem addressed: AI context windows are finite; raw organizational data at scale is infinite.

Architectural principle: The knowledge graph must function as a compression and relevance layer between captured data and AI model inference, maintaining organizational knowledge at multiple levels of abstraction to deliver precisely the context each query requires.

Current frontier models support approximately one million tokens of context. A single active employee generates that volume of meaningful interaction data in one to two weeks. Across a 10,000-person organization, total captured context reaches petabyte scale. Any architecture that attempts to feed raw captured data to a model is architecturally dead on arrival.

The knowledge graph solves this through tiered abstraction: raw events are compressed into interaction summaries, which aggregate into relationship summaries, which distill into entity cards. The compression ratio from raw captured data to the abstraction level that feeds AI inference is typically 1,000:1 or greater. An organization-wide strategic query—“which accounts show risk signals across both sales and engineering?”—assembles perhaps 10,000–15,000 tokens of compressed context, leaving 98.5% of the model’s context window available for reasoning.

Recent research from Google DeepMind further informs the inference architecture. A December 2025 study evaluating 180 agent configurations found that multi-agent systems degraded performance by 39–70% on sequential reasoning tasks, with independent agents amplifying errors 17.2x. This finding argues strongly for a single-agent inference model augmented with domain-specific capabilities, rather than the multi-agent architectures favored by many current enterprise AI platforms.

3.4 Principle 4: Distributed Edge Compute


Problem addressed: Centralized GPU infrastructure is expensive; data sovereignty requirements prohibit cloud transmission of sensitive context.

Architectural principle: Distribute AI inference compute across the corporate device fleet—leveraging idle CPU, RAM, and increasingly available NPU resources on employee endpoints—rather than concentrating processing in centralized cloud or on-premise GPU infrastructure.

This principle draws directly from the BOINC distributed computing model, which demonstrated that volunteer computing can be 420 times cheaper than equivalent cloud computing on Amazon EC2. In a 2,000-person organization where each device contributes modest idle resources, the aggregate pool rivals a mid-range data center at zero incremental hardware cost. IDC projects 94% AI-capable PC penetration within three years, with NPUs becoming standard in corporate hardware refresh cycles.

The distributed model also provides a data sovereignty architecture by default: raw context data processes locally on the device where it was captured. Only compressed knowledge representations—entities, relationships, abstracted activity signals—propagate to the organizational graph. Content never leaves the device. This satisfies the data residency and minimization requirements of GDPR, CCPA, and sector-specific regimes without requiring architectural workarounds.
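A minimal placement policy for borrowing idle endpoint compute might look like the following sketch. The device records and thresholds are illustrative assumptions, not a scheduler specification.

```python
# Sketch of fleet-level task placement under the distributed-compute
# principle: schedule background graph work onto the idlest devices.
devices = [
    {"id": "laptop-01", "idle_cpu": 0.92, "on_ac_power": True},
    {"id": "laptop-02", "idle_cpu": 0.40, "on_ac_power": False},
    {"id": "vdi-117",   "idle_cpu": 0.95, "on_ac_power": True},
]

def eligible(d, min_idle=0.85):
    # Only borrow compute the user is demonstrably not using,
    # and never drain a battery for background inference.
    return d["idle_cpu"] >= min_idle and d["on_ac_power"]

pool = sorted((d for d in devices if eligible(d)),
              key=lambda d: d["idle_cpu"], reverse=True)
print([d["id"] for d in pool])   # ['vdi-117', 'laptop-01']
```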

3.5 Principle 5: Identity-Based Federation


Problem addressed: Individual device-level graphs must aggregate into organizational intelligence without exposing individual behavior data.

Architectural principle: Use the enterprise identity provider—Active Directory, Entra ID, Okta—as the federation backbone, inheriting existing organizational structure, access controls, and lifecycle management rather than building a parallel permission system.

Every enterprise already operates an identity provider that contains the organizational skeleton: reporting chains, department boundaries, security group memberships, role assignments, and location data. This infrastructure solves three problems simultaneously: it defines how individual knowledge graphs aggregate upward (the reporting chain as aggregation path); it governs who can query what (security groups as access control); and it manages the employee lifecycle automatically (SCIM provisioning ensures new hires inherit context and departing employees’ knowledge is preserved as organizational assets, not lost).

The federation model ensures that organizational intelligence emerges from individual context without surveillance. At the team level, a manager sees promoted entity summaries—“The Acme account has high activity density this week across three team members”—not individual interaction logs. At the organizational level, executives access aggregate entity maps and strategic signals with no individual behavior data visible. The identity provider’s existing trust architecture—already vetted by legal, security, and compliance teams—governs the entire system.
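The federation boundary can be sketched as follows: devices emit only entity-level activity counts, and aggregation follows the reporting chain from the identity provider, so individual attribution is dropped before a manager's view is assembled. All names and counts below are invented.

```python
from collections import Counter

# Reporting chain as mirrored from the identity provider: report -> manager.
reporting_chain = {"j.rivera": "m.chen", "k.patel": "m.chen"}

# Per-device summaries: entity -> interaction count. No content, no logs,
# no timestamps leave the device.
device_summaries = {
    "j.rivera": Counter({"acme-corp": 14, "globex": 2}),
    "k.patel":  Counter({"acme-corp": 9}),
}

def team_view(manager):
    """Aggregate entity activity across a manager's reports; individual
    attribution is discarded at the federation boundary."""
    total = Counter()
    for person, mgr in reporting_chain.items():
        if mgr == manager:
            total += device_summaries.get(person, Counter())
    return total

print(team_view("m.chen"))   # Counter({'acme-corp': 23, 'globex': 2})
```

The manager learns that the Acme account has high activity density across the team, but cannot recover who generated which interactions from the aggregated view.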

3.6 Principle 6: Unified Deployment via Existing Enterprise Tooling


Problem addressed: Enterprises will not adopt a product that requires managing separate components for each architectural function.

Architectural principle: Deploy as a single managed agent—one install, one configuration surface, one update channel—distributed through existing enterprise MDM and policy infrastructure, covering physical endpoints, virtual desktops, mobile devices, and cloud workloads.

The endpoint security industry has already proven this deployment model. CrowdStrike Falcon, SentinelOne, and Carbon Black deploy single agents across heterogeneous enterprise environments—physical laptops, VDI sessions, mobile containers, cloud VMs, Kubernetes clusters—managed through a single console. The ECE inherits this deployment pattern wholesale: one agent that embeds all architectural capabilities, pushed via the MDM infrastructure (Intune, Jamf, SCCM) that IT teams already operate.

Virtual Desktop Infrastructure (VDI) environments represent a particularly strong initial deployment target. The agent is baked into the gold image—when IT provisions 5,000 virtual desktops, all 5,000 boot with the agent pre-installed. VDI also amplifies the distributed compute thesis: virtual desktop environments are notorious for wasted compute, with sessions typically at 5–10% CPU utilization during standard knowledge work.

3.7 Principle 7: Lessons from Industrial Edge AI


Problem addressed: Enterprise edge AI has a track record in industrial contexts; those lessons must transfer to knowledge-worker deployment.

Edge AI has entered mainstream deployment in industrial settings: Schneider Electric runs predictive maintenance on factory controllers; Tesco deploys solar-powered sensors across 3,000 locations; Siemens Energy monitors grid transformers through physics-informed neural networks. These deployments demonstrate that edge inference at enterprise scale is proven and viable.

However, the ECE represents a categorically different class of edge deployment. Industrial edge AI processes structured sensor telemetry—temperature, vibration, RFID signals—through pre-trained classification models. The ECE captures unstructured human workflow and constructs new knowledge through entity extraction and relationship inference. Industrial edge AI performs inference on data it was trained to recognize; the ECE performs discovery on data that has never been structured before.

Four lessons from industrial edge AI transfer directly:

  • Data quality dominates algorithmic sophistication. Every successful industrial deployment confirms this. For the ECE, the equivalent challenge is entity resolution—reconciling the same entity across different names, systems, and contexts. Entity resolution is the hardest engineering problem in this architecture, and it determines success or failure more than model selection.

  • Narrow beachhead before company-wide. Every industrial success (Schneider, Tesco, Siemens) started with one specific process. Every failure (McDonald’s AI voice ordering) tried to go broad too fast. The ECE must solve one high-value workflow for one persona before expanding.

  • Trust architecture is the adoption gate. The ECE handles far more sensitive data than vibration sensors. It demands transparency (users see what was captured), control (users can delete or correct), and revocable consent for federation from individual to organizational levels.

  • Human-in-the-loop is a feature. The McDonald’s failure is a direct cautionary tale: when AI was given full autonomy over open-ended inputs, the results were catastrophic. The ECE must deploy as inform-only—surfacing context and recommending actions—with autonomy earned incrementally through demonstrated accuracy.


3.8 Principle 8: Regulatory Architecture by Design


Industrial edge AI operates in a comparatively simple regulatory environment—vibration data from a transformer is not personally identifiable information. The ECE operates at the most complex regulatory intersection in enterprise software: employee monitoring law (varying by jurisdiction, with EU Works Council requirements being the most restrictive), data protection (GDPR, CCPA, HIPAA, SOX), and AI governance (the EU AI Act classifies workplace AI systems as high-risk, requiring transparency, human oversight, and impact assessments).

The on-device processing model provides a strong compliance foundation—raw data never leaves the employee’s device. But the federation layer introduces complexity that industrial edge AI never encounters: data residency requirements, cross-border transfer restrictions, and the right to erasure all apply to federated organizational knowledge. Any valid implementation must treat regulatory architecture as a first-class design constraint, not a compliance add-on.


4. Market Failure Analysis: Why Current Vendors Cannot Solve This


If the enterprise context problem is as severe as the evidence indicates, why hasn’t it been solved? This section analyzes why the leading enterprise AI vendors are structurally constrained from implementing the architecture described above.

4.1 The Connector Incumbents


| Vendor | Approach | Structural Constraint |
| --- | --- | --- |
| Glean | 100+ API connectors; enterprise search + agents | Connector architecture is the product. Pivoting to device-level interception would require abandoning the platform's core engineering investment and rewriting the sales narrative. |
| Microsoft Copilot | Microsoft 365 Graph API + partner connectors | Limited to data within Microsoft's ecosystem and approved connectors. Recall's failure precludes revisiting device-level capture. Constrained by Microsoft's enterprise customer sensitivity to privacy incidents. |
| Google Agentspace | Google Workspace + third-party connectors | Same connector dependency as Glean, with narrower initial ecosystem. Google's cloud-first business model conflicts with on-device processing. |
| Salesforce Agentforce | CRM data + approved integrations | Scoped to Salesforce data and partner ecosystem. Platform economics incentivize keeping data within Salesforce rather than observing the broader enterprise context. |


Each incumbent is trapped by its own architecture: their connector investments are sunk costs, their engineering teams are organized around connector maintenance, and their pricing models assume cloud-processed, API-sourced data. A pivot to device-level interception would require writing off their existing platform infrastructure and rebuilding from scratch—a classic innovator’s dilemma.

4.2 The Adjacent Players


4.2.1 Celonis and Process Mining

Celonis performs task mining and process intelligence on desktop activity, making it the closest architectural neighbor to the ECE. However, Celonis captures process execution sequences—click paths, task completion times, process compliance—rather than organizational knowledge. Its output is a process graph, not a knowledge graph. Celonis answers “how does this process run?”; the ECE answers “what does this organization know?” The domains are adjacent but categorically different.

4.2.2 Screenpipe (Open Source)

Screenpipe provides open-source personal screen and audio capture with local AI processing. It validates the demand for device-level context capture but operates at the individual level without organizational federation, enterprise deployment infrastructure, or the self-constructing knowledge graph that transforms raw observation into organizational intelligence.

4.2.3 Foundation Capital’s Context Graphs Thesis

In December 2025, Foundation Capital published an investment thesis identifying “context graphs” as the next infrastructure layer for enterprise AI—validating the market opportunity from the venture capital perspective. The ECE architecture is aligned with this thesis while proposing a specific architectural approach (device-level interception, autonomous construction) that the thesis does not prescribe.


5. The Economic Case for Context Infrastructure


5.1 The Cost of Context Starvation


The economic impact of missing organizational context manifests across five measurable dimensions:

| Cost Category | Annual Impact (10,000 employees) | Source / Basis |
| --- | --- | --- |
| Knowledge loss from turnover | $8–12M | Extrapolated from IDC $31.5B aggregate⁷; 42% of institutional knowledge resides solely with individuals |
| AI project failure | $12–20M | Average enterprise AI portfolio investment, 80% failure rate (RAND)⁴ |
| Redundant discovery | $5–8M | Employees spend 20% of work time searching for information or recreating knowledge that exists elsewhere |
| Cross-functional blind spots | $3–5M | Engineering and Sales working the same customer issue through different tools, neither aware of the other |
| Consulting displacement | $2–4M | Organizations hire consultants to answer questions a context-aware AI could answer: how does this organization actually operate? |

Conservative aggregate: $30–49M annually for a 10,000-person organization, with the majority of value driven by eliminating knowledge loss and reducing AI project failure rates.
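The aggregate can be checked by summing the low and high ends of the table's ranges (figures in $M per year, taken from the table above):

```python
# Summing the low and high ends of each cost-category range ($M/year).
costs = {
    "knowledge_loss":      (8, 12),
    "ai_project_failure":  (12, 20),
    "redundant_discovery": (5, 8),
    "blind_spots":         (3, 5),
    "consulting":          (2, 4),
}
low = sum(lo for lo, hi in costs.values())
high = sum(hi for lo, hi in costs.values())
print(low, high)   # 30 49
```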

5.2 The Context Multiplier Effect


Glean’s trajectory provides direct market validation: reaching $200M ARR in under four years proves enterprises will pay aggressively for organizational knowledge access. The Forrester Total Economic Impact study on knowledge graph platforms documented 320% ROI over three years. These returns accrue to connector-based, manually constructed approaches with all their documented limitations. An architecture that eliminates connectors, automates graph construction, and captures the decision-level context that current approaches miss structurally should deliver a multiple of this value.

The broader market context reinforces this: worldwide AI spending will reach $2.5 trillion in 2026, but with failure rates above 80%, organizations face over $2 trillion in at-risk investment. The ECE does not compete with AI models or AI applications—it provides the context infrastructure that makes every other AI investment more effective. This is an infrastructure play, not an application play.

5.3 The Infrastructure Thesis


Three structural dynamics favor a context infrastructure approach:

  • Models are commoditizing; context is not. GPT-4, Claude, Gemini, and their successors are converging in capability and racing toward zero on price. Organizational context—the institutional knowledge of how a specific company operates—is unique, proprietary, and impossible to replicate through model improvement. Context is the durable moat.

  • Connector cost scales with applications; device-level cost scales with devices. Each new SaaS application requires a new connector integration. A device-level architecture deploys once per device and automatically observes every application the user touches, including applications launched next quarter that don’t exist yet, at zero incremental integration cost.

  • The security precedent is established. Enterprise endpoints already host 3–5 agents (CrowdStrike, DLP, Zscaler, MDM, UEM) that observe user behavior for security. The ECE is architecturally less invasive than any of them—capturing structured application data, never screenshots or keystrokes—and uses the identical deployment infrastructure.



6. Research Agenda and Validation Methodology


The architecture described above represents a research framework, not a validated product. This section outlines the empirical questions that must be answered to confirm or refute the thesis.

6.1 Core Hypotheses to Validate


| Hypothesis | Validation Method | Success Criteria |
| --- | --- | --- |
| H1: Device-level interception captures richer context than API connectors | Side-by-side comparison: extract knowledge triples from browser DOM observation vs. Salesforce API for the same user session | DOM-sourced graph contains >30% more relationship edges than API-sourced graph |
| H2: Small language models (1–3B) can perform entity extraction on device at acceptable accuracy | F1 score measurement on manually labeled test set of enterprise application data | F1 > 0.85 on entity resolution; F1 > 0.80 on relationship extraction |
| H3: Tiered compression achieves 1,000:1 without information loss for typical enterprise queries | Query equivalence testing: compare model answers using full context vs. compressed graph context | >95% answer equivalence on 100-query benchmark |
| H4: Distributed compute across idle endpoints provides viable inference capacity | Throughput measurement on 10-device test cluster running distributed embedding generation | Aggregate throughput within 2x of single-server GPU baseline |
| H5: AD-based federation preserves privacy while enabling organizational queries | Red team exercise: attempt to reconstruct individual behavior from team/org-level graph queries | Zero individual behavior reconstruction from aggregated graph layers |

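H2’s success criterion can be made concrete with a small scoring harness. The sketch below is illustrative, not a prescribed benchmark: the entity names are invented, and micro-F1 by set overlap is one reasonable formulation among several.

```python
def f1_score(predicted, gold):
    """Micro-F1 over sets of extracted items (entities or relation triples)."""
    predicted, gold = set(predicted), set(gold)
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)          # true positives: exact-match overlap
    precision = tp / len(predicted)
    recall = tp / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative labeled sample: gold entities vs. SLM output for one record.
gold = {"Acme Corp", "Q3 renewal", "Jane Doe", "Salesforce case #4821"}
pred = {"Acme Corp", "Q3 renewal", "Jane Doe", "case 4821"}  # one miss, one spurious

score = f1_score(pred, gold)
print(f"entity F1 = {score:.2f}")  # 0.75 here; H2 requires > 0.85 at scale
```

In practice the harness would average this score over the full labeled test set, and relation triples would be scored the same way with (subject, predicate, object) tuples as the set elements.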

6.2 Phased Validation Approach


Phase 1: Proof of Concept (Weeks 1–4). Build a browser extension POC that captures DOM content from 3–5 SaaS applications and feeds a local knowledge graph. Benchmark extracted knowledge triples against equivalent API connector output. Validate SLM entity extraction accuracy on consumer hardware.
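The Phase 1 benchmark reduces to comparing relationship-edge counts from each source. A minimal sketch, assuming the extension has already serialized observed DOM fields into flat records (the record shape, field names, and values here are hypothetical stand-ins):

```python
def to_triples(record):
    """Turn one observed application record into (subject, predicate, object) edges."""
    subject = f"{record['app']}:{record['id']}"
    return [(subject, field, value) for field, value in record["fields"].items()]

# Hypothetical session: the rendered DOM exposes fields the API omits,
# e.g., inline-linked records and who is currently viewing the case.
api_record = {"app": "salesforce", "id": "case-4821",
              "fields": {"status": "open", "owner": "jdoe"}}
dom_record = {"app": "salesforce", "id": "case-4821",
              "fields": {"status": "open", "owner": "jdoe",
                         "linked_ticket": "JIRA-991", "viewed_by": "asmith"}}

api_edges = to_triples(api_record)
dom_edges = to_triples(dom_record)
gain = (len(dom_edges) - len(api_edges)) / len(api_edges)
print(f"DOM graph has {gain:.0%} more edges")  # 100% here; H1 requires >30%
```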

Phase 2: Competitive Positioning (Weeks 2–6). Develop differentiation briefs against Celonis, Glean, and Microsoft Recall. Engage with the Foundation Capital Context Graphs thesis. Publish this paper as thought leadership to validate demand and generate inbound interest.

Phase 3: Technical Validation (Weeks 4–10). Prototype distributed compute coordination across a test cluster. Benchmark tiered inference latency on consumer hardware. Validate the single-agent-with-domain-adapters architecture against multi-agent baselines.

Phase 4: Market Validation (Weeks 8–16). Conduct 10–15 enterprise buyer interviews targeting CISOs and CIOs at mid-market companies. Test the pitch: “You already deploy DLP agents on every endpoint. We use the same infrastructure for knowledge extraction instead of threat detection.” Develop the legal/privacy framework for compliant deployment across jurisdictions.

6.3 Open Research Questions


  • Entity resolution at enterprise scale: Can SLMs reliably reconcile the same entity across different names, abbreviations, and organizational jargon in real time? This is the single hardest technical challenge in the architecture.

  • Graph schema evolution: How does an autonomously constructed knowledge graph adapt its ontology as the organization restructures, acquires companies, or enters new markets?

  • Trust architecture for knowledge federation: What consent and transparency mechanisms are required for employees to accept organizational knowledge aggregation? Does value delivery (personal productivity gains) offset privacy sensitivity?

  • Cross-border federation under GDPR: How does the federated graph handle data residency requirements when organizational units span EU and non-EU jurisdictions?

  • Adversarial robustness: Can the graph be poisoned by deliberate misinformation in observed communications? What validation mechanisms prevent graph contamination?
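The entity-resolution question above can be illustrated with a first-pass baseline: canonicalization plus fuzzy string matching. The alias list, suffix set, and 0.8 threshold below are illustrative; the failure it exposes is precisely why resolution remains the open question.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase, strip punctuation, and drop common corporate suffixes."""
    cleaned = "".join(c for c in name.lower() if c.isalnum() or c.isspace())
    tokens = [t for t in cleaned.split()
              if t not in {"inc", "corp", "corporation", "llc"}]
    return " ".join(tokens)

def same_entity(a, b, threshold=0.8):
    """Fuzzy match on normalized names; threshold is an illustrative choice."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

aliases = ["Acme Corp.", "ACME", "Acme Corporation", "acme-corp"]
canonical = aliases[0]
resolved = [alias for alias in aliases if same_entity(alias, canonical)]
# "acme-corp" misses the threshold (the hyphen fuses the suffix into the
# token, defeating suffix stripping) — exactly the kind of jargon case
# that pushes the problem beyond string matching toward learned resolution.
print(resolved)
```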



7. Conclusion


The enterprise AI industry is on track to spend $2.5 trillion in 2026 while failing at rates no other technology category would tolerate. The root cause is not model quality—frontier models are converging in capability and plummeting in cost. The root cause is organizational context: the institutional knowledge that makes AI useful for a specific enterprise rather than generally capable for anyone.

Current architectural approaches have reached documented ceilings. API connectors capture what systems expose, not what organizations know. RAG finds similar text, not structural relationships. Knowledge graphs deliver extraordinary ROI but cannot be constructed at the rate organizations evolve. Screen capture was rejected by the market on privacy grounds.

The Enterprise Context Engine framework proposes a different path: observe organizational knowledge where it naturally occurs—on the devices where people work—using observation mechanisms the enterprise has already accepted for security. Construct the knowledge graph autonomously rather than manually. Compress organizational context through tiered abstraction to fit within model constraints. Distribute compute across hardware the organization already owns. Federate individual knowledge into organizational intelligence through the identity infrastructure the enterprise already operates.

Each of these principles draws from established precedent: DLP platforms for device-level observation, BOINC for distributed compute, Active Directory for identity federation, endpoint security for unified deployment. The innovation is not any single principle—it is their integration into a coherent architecture for solving the enterprise context problem.

Whether this specific architecture proves viable is an empirical question that the validation methodology in Section 6 is designed to answer. What is not an empirical question is the severity of the problem: a $2.5 trillion market with an 80–95% failure rate is a market waiting for the right infrastructure layer. The organization that builds it will define the next era of enterprise software.


References


1. McKinsey & Company. “The State of AI in 2025: Agents, Innovation, and Transformation.” March 2025.

2. MIT Sloan Management Review & NANDA. “The GenAI Divide: State of AI in Business 2025.” August 2025.

3. S&P Global Market Intelligence. “Voice of the Enterprise: AI & Machine Learning, Use Cases 2025.” March 2025.

4. RAND Corporation. “The Root Causes of Failure for Artificial Intelligence Projects.” RR-A2680-1. 2024.

5. Gartner. “Lack of AI-Ready Data Puts AI Projects at Risk.” February 2025.

6. Deloitte. “State of AI in the Enterprise, 7th Edition.” 2026.

7. IDC. “The High Cost of Not Finding Information.” Cited in Panopto Workplace Knowledge Study.

8. Sinequa. “IT Leaders Survey on Knowledge Loss.” 2022.

9. BARC Research. “Knowledge Graphs in AI: Adoption and Trends.” 2025.

10. Glean, Inc. “Glean Surpasses $200M in ARR for Enterprise AI.” December 2025.

11. MarketsandMarkets. “Knowledge Graph Market — Global Forecast to 2030.” 2025.

12. Productiv. “SaaS Management Report.” 2024.

13. MarketsandMarkets. “Endpoint Security Market Report.” 2025.

14. Google DeepMind. “Evaluating Multi-Agent Systems for Sequential Reasoning Tasks.” December 2025.

15. Emerald Insight. “Knowledge Loss Induced by Organizational Member Turnover.” The Learning Organization, 2023.

16. IBM. “Enterprise Data Utilization Study.”

17. Foundation Capital. “Context Graphs: The Next Infrastructure Layer for Enterprise AI.” December 2025.

18. European Parliament. Regulation (EU) 2024/1689 (AI Act), Article 26.

19. Stardog / Forrester. “Total Economic Impact of Knowledge Graph Platforms.” 2024.

20. Gartner. “Worldwide AI Spending Forecast.” 2026.




