ETL structured and unstructured data into a governed, real-time knowledge graph
Tom Sawyer Data Streams is a schema-driven platform to ETL your structured and unstructured data into a single, governed, query-ready knowledge graph. Subscribe to Apache Kafka (or Confluent) topics sourced from databases, files, and APIs; apply precise transformations and filters; then run flows continuously to normalize, enrich, and link changing streams in real time. Persist the knowledge graph in a graph database for scalable sharing and downstream analysis. Integrate with existing pipelines and catalogs, and use the graph as a high-quality context layer for AI, including RAG and reasoning.
Tom Sawyer Data Streams reduces integration effort across legacy and streaming systems, fits existing pipelines and tools, and delivers a complete, accurate picture for lineage, impact analysis, and operational decisions.
Watch this short introduction to Data Streams for unifying data into a governed, real-time knowledge graph.
Be the first to get early access to Data Streams. Your valuable input will help shape future releases, and our team of graph gurus can provide expert advice and guidance to help accelerate your data transformation objectives.
Enterprises face fragmented sources, legacy systems, and real-time demands. Tom Sawyer Data Streams supports a range of high-impact scenarios for data architects and AI platform engineers, from migration and continuous sync to validating generative AI results and integrating agentic results into a single-source-of-truth database.
Discover how Data Streams addresses common challenges with repeatable patterns and proven tooling.
ETL legacy relational models to a governed, query-ready knowledge graph. Introspect the source schema, design visual flows to rename, merge, or split tables and convert properties into edges, then execute once to materialize the analytics-ready knowledge graph without expensive data migrations.
Keep operational systems and streams in lockstep with an always-current knowledge graph. Ingest Kafka topics and database changes, apply SpEL transformations and filters, and continuously normalize, enrich, and link records as they arrive.
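As a rough illustration of the filter-then-transform pattern described above, the Python sketch below processes events lazily as they arrive. It is not the Data Streams API: in the product these rules would be expressed in SpEL inside a flow, and the event fields (`id`, `amount`) and helper names (`run_flow`, `normalize`) are hypothetical.

```python
def run_flow(events, keep, transform):
    """Lazily apply a filter and a transformation to each event as it arrives."""
    for event in events:
        if keep(event):
            yield transform(event)


def normalize(event):
    """Hypothetical normalization: tidy the identifier and round the amount."""
    return {
        "id": event["id"].strip().upper(),
        "amount": round(float(event["amount"]), 2),
    }


# Stand-in for records consumed from a topic (assumed shape).
incoming = iter([
    {"id": " tx-1 ", "amount": "19.999"},
    {"id": "tx-2", "amount": "-5"},
    {"id": "tx-3", "amount": "42"},
])

results = list(
    run_flow(incoming, keep=lambda e: float(e["amount"]) > 0, transform=normalize)
)
```

Because `run_flow` is a generator, each record is handled as soon as it is read, which mirrors continuous execution more closely than a batch job would.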
Improve LLM accuracy by feeding normalized events into feature stores, embeddings, and graph-backed retrieval. Then validate outputs against authoritative nodes and lineage, and write agent results back to keep a single source of truth.
Create golden entities across silos to power consistent analytics and operations. Normalize identifiers, apply matching rules in SpEL, and persist deduplicated nodes and relationships for trusted 360° views.
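The normalize-then-match idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Data Streams implements matching (there the rules would be written in SpEL): the silo names, the record fields, and the choice of a lowercased email as the matching key are all hypothetical.

```python
from collections import defaultdict


def normalize_email(raw):
    """Hypothetical matching key: trim whitespace and lowercase the address."""
    return raw.strip().lower()


def build_golden_entities(records):
    """Merge records that share a normalized email into one golden entity."""
    golden = {}
    sources = defaultdict(set)
    for rec in records:
        key = normalize_email(rec["email"])
        entity = golden.setdefault(key, {"email": key})
        # Keep the first non-empty value seen for each attribute.
        for field, value in rec.items():
            if field in ("email", "source"):
                continue
            if value and not entity.get(field):
                entity[field] = value
        sources[key].add(rec["source"])
    # Record which silos contributed to each entity (simple provenance).
    for key, entity in golden.items():
        entity["sources"] = sorted(sources[key])
    return golden


records = [
    {"source": "crm", "email": " Ada@Example.com ", "name": "Ada Lovelace", "phone": ""},
    {"source": "billing", "email": "ada@example.com", "name": "", "phone": "555-0100"},
    {"source": "crm", "email": "grace@example.com", "name": "Grace Hopper", "phone": ""},
]
golden = build_golden_entities(records)
```

Real matching rules are usually fuzzier than an exact key, but the shape is the same: derive a comparable key, merge attributes, and keep track of where each value came from.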
Fuse transactions, accounts, devices, and geolocation to surface suspicious patterns in real time. Materialize relationship signals like shared devices or rings, enrich events in flow, and route high-confidence alerts downstream.
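To make the "shared devices" signal concrete, here is a minimal Python sketch that derives account-to-account relationships from login events. The event shape (`account_id`, `device_id`) and the `shared_device_edges` helper are assumptions for illustration, not part of Data Streams.

```python
from collections import defaultdict
from itertools import combinations


def shared_device_edges(logins, min_accounts=2):
    """Return (account_a, account_b, device) triples for accounts sharing a device."""
    accounts_by_device = defaultdict(set)
    for event in logins:
        accounts_by_device[event["device_id"]].add(event["account_id"])

    edges = []
    for device, accounts in sorted(accounts_by_device.items()):
        if len(accounts) >= min_accounts:
            # Every pair of accounts seen on the same device becomes a relationship.
            for a, b in combinations(sorted(accounts), 2):
                edges.append((a, b, device))
    return edges


logins = [
    {"account_id": "acct-1", "device_id": "dev-A"},
    {"account_id": "acct-2", "device_id": "dev-A"},
    {"account_id": "acct-3", "device_id": "dev-B"},
    {"account_id": "acct-2", "device_id": "dev-C"},
    {"account_id": "acct-4", "device_id": "dev-C"},
]
edges = shared_device_edges(logins)
```

In this sample, `acct-2` links two otherwise unrelated accounts through two devices, which is the kind of chain a ring investigation would follow.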
Consolidate schemas, datasets, policies, and lineage edges to accelerate impact analysis and compliance. Track how data moves, who uses it, and which rules apply, with the graph persisted in a graph database for broad sharing and auditability.
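Impact analysis over lineage edges amounts to a downstream graph traversal. The Python sketch below shows the idea with a breadth-first search over an in-memory adjacency map. The asset names are invented, and in practice this question would more likely be asked as a query against the graph database.

```python
from collections import deque


def downstream_impact(lineage, start):
    """Breadth-first traversal returning every asset reachable from `start`."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Hypothetical lineage: each key feeds the assets listed in its value.
lineage = {
    "orders_table": ["orders_clean"],
    "orders_clean": ["revenue_report", "churn_features"],
    "churn_features": ["churn_model"],
}
impacted = downstream_impact(lineage, "orders_table")
```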
Data Streams makes it simple to connect to your topics and build your knowledge graph model: connect to your streams and apply transformations using the visual data flow editor.
View the resulting schema in an easily understandable graph drawing or tree view.
Apply transformations to convert topics from nodes to edges, and rename topics for consistency and clarity.
View the resulting schema as you work. See a graph or tree view of the schema.
Save your knowledge graph model to a graph database sink and deploy the streams for continuous execution.
Tom Sawyer Data Streams ETLs both structured and unstructured data by subscribing to Apache Kafka (or Confluent) topics you provision. It then transforms and persists those events into your knowledge graph model. Use your tool of choice, such as Kafka Connect, Change Data Capture tools, or custom producers, to publish from your system of record and subscribe to the topic in Data Streams. You control which topics represent nodes, edges, or attributes, and you can secure access with existing Kafka authentication and encryption settings.
With Kafka, you can create topics from almost any system and feed them into Data Streams.
Create Kafka topics from your data sources for easy integration and transformation in Data Streams.
Tom Sawyer Data Streams' intuitive web-based designer lets you assemble sources, transformations, conditions, and sinks to build and monitor data flows, accelerating delivery while reducing custom code.
Automatic graph layout organizes complex pipelines for instant clarity, so you can iterate faster, validate logic visually, and promote changes with confidence, shortening the path from prototype to production.

The Tom Sawyer Data Streams visual data flow editor makes it easy to build and validate your data flow.
Transformations turn raw stream events into a coherent, query-ready model. Incoming topics enter the flow as nodes; you then promote the right connections by converting selected nodes into edges and standardize naming to match your domain vocabulary. The result is a knowledge graph model that reflects how things actually relate, not just how they were recorded upstream.
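The node-to-edge conversion described above can be pictured with a small Python sketch. It assumes a hypothetical "employment" topic whose records carry two foreign keys. The `Node` and `Edge` classes and the `node_to_edge` helper are illustrative only and do not reflect Data Streams' internal model.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    key: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    source: str
    target: str
    properties: dict = field(default_factory=dict)


def node_to_edge(node, source_prop, target_prop, edge_label):
    """Turn a node that only links two other records into an edge between them."""
    # Everything except the two endpoint keys is carried over as edge properties.
    props = {
        k: v for k, v in node.properties.items() if k not in (source_prop, target_prop)
    }
    return Edge(edge_label, node.properties[source_prop], node.properties[target_prop], props)


# A record from the hypothetical "employment" topic, initially modeled as a node.
employment = Node(
    "employment", "emp-17", {"person_id": "p-1", "company_id": "c-9", "since": 2019}
)
edge = node_to_edge(employment, "person_id", "company_id", "WORKS_AT")
```

The renamed label (`WORKS_AT` instead of `employment`) stands in for the step of standardizing names to match the domain vocabulary.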
For technical teams, this approach reduces custom ETL operations and enforces consistency without heavy refactoring. You define intent visually, validate the evolving structure against your schema, and keep semantics stable across sources. Expressions (SpEL) give you precise, lightweight control while keeping the overall flow simple to reason about and easy to review.

With Data Streams transformations, it's easy to build the knowledge graph model you wish you had.
Data Streams persists the model to a graph database for durability, sharing, and governance. The stored graph becomes a query-ready foundation for analytics, visualization, and AI, and is compatible with existing pipelines, catalogs, and security controls. Versionable schemas and clear provenance help teams evolve models safely while maintaining an authoritative, auditable source of truth.
Data Streams supports authentication with OAuth 2.0 and Keycloak for secure, multi-user environments. Align with enterprise standards for access and encryption while maintaining a clean, auditable boundary around sensitive data.
Once your data flows are running, the resulting knowledge graph lives in your environment and fits into your existing infrastructure. Persisted in a graph database, the model plugs into your existing pipelines, catalogs, governance, AI workflows, and security policies. You can query the model directly, push extracts to warehouses, trigger downstream jobs, and integrate with AI, lineage, MDM, or observability tools.
The model and data are yours, so you can evolve schemas, add sources, and make version changes without vendor lock-in.
Tom Sawyer Data Streams supplies the structured, governed context that modern AI systems require. With your query-ready knowledge graph in place, AI teams gain a governed context layer for training, inference, and agent workflows.
The graph provides clean entities, explicit relationships, and provenance, which are ideal for AI feature engineering, retrieval/grounding, and constraint-aware prompts.
Prepare and process data within AI pipelines: Normalize, enrich, and link events into structured entities and relationships suitable for feature stores, embeddings, and vector indexes.
Support LLM (large language model) creation for improved accuracy: Provide graph-backed retrieval and grounding during training, fine-tuning, and RAG to reduce hallucinations and tighten answers.
Validate generative AI results: Check outputs against authoritative nodes and edges, enforce policy constraints via graph rules, and trace impact through lineage for reproducible evaluation.
Integrate agentic results into a single source of truth: Write agent plans, tool calls, and derived facts back into the governed graph to maintain a consolidated, provenance-rich record for ongoing learning and operations.
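As one way to picture validating generative AI results against the graph, the Python sketch below checks claims against a set of authoritative edges. It assumes the claims have already been extracted from model output as (subject, relation, object) triples. That extraction step, the sample data, and the `validate_claims` helper are all hypothetical.

```python
def validate_claims(claims, authoritative_edges):
    """Split (subject, relation, object) claims into supported and unsupported."""
    supported, unsupported = [], []
    for claim in claims:
        if claim in authoritative_edges:
            supported.append(claim)
        else:
            unsupported.append(claim)
    return supported, unsupported


# Edges assumed to come from the governed knowledge graph.
authoritative_edges = {
    ("acme", "SUPPLIES", "widgetco"),
    ("widgetco", "OWNS", "plant-7"),
}

# Triples assumed to have been extracted from a model's answer.
claims = [
    ("acme", "SUPPLIES", "widgetco"),
    ("acme", "OWNS", "plant-7"),
]
supported, unsupported = validate_claims(claims, authoritative_edges)
```

Unsupported claims are not necessarily false; they are simply not backed by the graph, so a reasonable policy is to flag them for review rather than reject them outright.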
For added value, pair Data Streams with Tom Sawyer Software’s graph and data visualization and analysis products to explore, analyze, and communicate the value of the graph. Our best-in-class automatic graph layout quickly transforms complex connections into clean, easy-to-understand visualizations.
Explorations is a no-code graph intelligence application for analysts of all levels to connect to graph databases, build queries visually without query language expertise, and instantly explore the results as an interactive knowledge graph. With built-in graph analysis algorithms, users can uncover hidden patterns and gain deeper insights effortlessly.

Perspectives is a powerful, low-code development platform for building standalone applications or embedding advanced data analysis and visualization into existing systems. Use advanced layouts, styling, and UX components to build custom apps and dashboards on top of your graph model.

Data Streams helps teams break through data silos so they can see the full picture. It provides a practical path to build a usable knowledge graph from disparate systems and streams, and to turn that graph into real-time insight for operations and analysis.
Consolidate relational, NoSQL, files, APIs, and event streams via Kafka into one consistent, query-ready model.
Visual flow design and expressive transformations reduce custom ETL work and shorten integration cycles.
Execute flows continuously to normalize, enrich, and link events as they arrive for timely decisions.
Schema-driven mapping, validation, and filtering produce cleaner entities and relationships.
Model relationships for impact analysis, lineage tracing, and root-cause investigation.
Persist the graph to a graph database for shared access, versionable schemas, and clear provenance.
Your model and data live in your database, ready for use with existing pipelines, tools, and policies.
Works with your Kafka authentication and encryption settings.
Start with a single migration or CDC flow and expand as value is proven across teams.
Copyright © 2025 Tom Sawyer Software. All rights reserved. | Terms of Use | Privacy Policy