Overview
SourceContent represents document chunks: uploaded documents split into searchable units. Each chunk maintains a reference to its original source, preserving context and traceability. For why structured memory matters over simple chunk search, see the concept overview.
Fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier |
| content | string | Chunk text content |
| sourceId | string | Reference to original document |
| sourceType | string | Document type (e.g., pdf, markdown, text) |
| metadata | object | Additional metadata (page number, section, etc.) |
| createdAt | string | Creation timestamp |
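As a rough illustration, the fields above could be modeled in TypeScript as follows. This is a sketch, not an official SDK type: the interface, the `isSourceContent` guard, and the example values are hypothetical, and the ISO 8601 timestamp in the example is an assumption, since the table only says `createdAt` is a string.

```typescript
// Hypothetical type mirroring the Fields table; not an official SDK definition.
interface SourceContent {
  id: string; // Unique identifier
  content: string; // Chunk text content
  sourceId: string; // Reference to original document
  sourceType: string; // Document type, e.g. "pdf", "markdown", "text"
  metadata: Record<string, unknown>; // Page number, section, etc.
  createdAt: string; // Creation timestamp (exact format is not documented here)
}

// Runtime guard: checks that an unknown value has the documented fields
// with the documented primitive types.
function isSourceContent(value: unknown): value is SourceContent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const stringFields = ["id", "content", "sourceId", "sourceType", "createdAt"];
  const meta = v["metadata"];
  return (
    stringFields.every((field) => typeof v[field] === "string") &&
    typeof meta === "object" &&
    meta !== null &&
    !Array.isArray(meta)
  );
}

// Example chunk with made-up values (assumed ISO 8601 timestamp).
const exampleChunk: unknown = {
  id: "chunk_001",
  content: "Quarterly revenue grew in the third quarter.",
  sourceId: "doc_42",
  sourceType: "pdf",
  metadata: { page: 3 },
  createdAt: "2024-01-15T09:30:00Z",
};
```

A guard like this is useful when chunks arrive as untyped JSON, because it narrows the value to `SourceContent` before any field is read.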
Supported Formats
| Format | Extensions | Notes |
|---|---|---|
| PDF | .pdf | Page-level chunking |
| Markdown | .md, .mdx | Section-level chunking |
| Text | .txt | Paragraph-level chunking |
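The format table can be expressed as a small client-side lookup, for example to check a file before uploading it. This is only a sketch: `FORMATS_BY_EXTENSION` and `lookupFormat` are hypothetical names, the `sourceType` values are taken from the examples in the Fields table, and the case-insensitive matching of extensions is an assumption rather than documented behavior.

```typescript
// Chunking granularity per format, as listed in the Supported Formats table.
type ChunkLevel = "page" | "section" | "paragraph";

interface FormatInfo {
  sourceType: string; // Assumed to match the sourceType examples (pdf, markdown, text)
  chunkLevel: ChunkLevel;
}

// Hypothetical lookup table keyed by file extension.
const FORMATS_BY_EXTENSION: Record<string, FormatInfo> = {
  ".pdf": { sourceType: "pdf", chunkLevel: "page" },
  ".md": { sourceType: "markdown", chunkLevel: "section" },
  ".mdx": { sourceType: "markdown", chunkLevel: "section" },
  ".txt": { sourceType: "text", chunkLevel: "paragraph" },
};

// Returns the format info for a filename, or undefined if the extension
// is not in the table. Extension matching is case-insensitive (an assumption).
function lookupFormat(filename: string): FormatInfo | undefined {
  const dot = filename.lastIndexOf(".");
  if (dot < 0) return undefined;
  const ext = filename.slice(dot).toLowerCase();
  return FORMATS_BY_EXTENSION[ext];
}
```

For instance, `lookupFormat("guide.mdx")` yields the Markdown entry with section-level chunking, while an unlisted extension such as `.docx` yields `undefined`.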
Chunking
Documents are automatically split into chunks optimized for semantic search, with no manual configuration required. Each chunk retains metadata linking it back to its position in the original document, so you can always trace a search result to the exact page, section, or paragraph it came from.
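Tracing a chunk back to its origin might look like the sketch below. The metadata keys `page`, `section`, and `paragraph` are assumptions made for illustration; the documentation only says that metadata may include a page number, section, and similar details, so the real key names may differ. `ChunkRef` and `describeOrigin` are likewise hypothetical.

```typescript
// Minimal view of a chunk: only the fields needed for tracing.
interface ChunkRef {
  sourceId: string;
  sourceType: string;
  metadata: Record<string, unknown>;
}

// Builds a human-readable description of where a chunk came from,
// using whichever position fields are present in its metadata.
// The keys "page", "section", and "paragraph" are assumed names.
function describeOrigin(chunk: ChunkRef): string {
  const meta = chunk.metadata;
  const parts: string[] = [`source ${chunk.sourceId}`];
  if (typeof meta["page"] === "number") {
    parts.push(`page ${meta["page"]}`);
  }
  if (typeof meta["section"] === "string") {
    parts.push(`section "${meta["section"]}"`);
  }
  if (typeof meta["paragraph"] === "number") {
    parts.push(`paragraph ${meta["paragraph"]}`);
  }
  return parts.join(", ");
}

// Example search hit with made-up values.
const pdfHit: ChunkRef = {
  sourceId: "doc_42",
  sourceType: "pdf",
  metadata: { page: 3 },
};
```

Here `describeOrigin(pdfHit)` produces `source doc_42, page 3`; a chunk with no position metadata falls back to the source identifier alone.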
Relationship to Knowledge
Facts extracted from SourceContent can become Knowledge. The original SourceContent serves as the source reference, maintaining a clear provenance chain from raw document to confirmed fact.
Related Pages
Data Types
Concept overview of all data types
Knowledge API
SourceContent is managed alongside Knowledge

