Overview

SourceContent represents document chunks: uploaded documents that have been split into searchable units. Each chunk keeps a reference to its original source, preserving context and traceability. For why structured memory matters over simple chunk search, see the concept overview.

Fields

Field        Type     Description
id           string   Unique identifier
content      string   Chunk text content
sourceId     string   Reference to original document
sourceType   string   Document type (e.g., pdf, markdown, text)
metadata     object   Additional metadata (page number, section, etc.)
createdAt    string   Creation timestamp
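
As a rough illustration, a record with the fields above could be modeled in Python as follows. The `SourceContent` TypedDict, the sample values, the `page` metadata key, and the ISO 8601 timestamp are assumptions made for this sketch; the field table does not specify identifier formats, metadata keys, or the timestamp format.

```python
from typing import Any, TypedDict


class SourceContent(TypedDict):
    """Shape of one chunk, mirroring the field table (illustrative only)."""

    id: str
    content: str
    sourceId: str
    sourceType: str
    metadata: dict[str, Any]
    createdAt: str


# Hypothetical sample record: every value below is made up for illustration.
chunk: SourceContent = {
    "id": "sc_001",
    "content": "The API uses OAuth2 for authentication.",
    "sourceId": "doc_api_spec_v2",
    "sourceType": "pdf",
    "metadata": {"page": 3},  # assumed key; the table only says "page number, section, etc."
    "createdAt": "2024-01-01T00:00:00Z",  # assumed ISO 8601 string
}
```

Keeping the camelCase keys matches the field names in the table, so a record like this can be compared directly against whatever the API returns.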

Supported Formats

Format     Extensions   Notes
PDF        .pdf         Page-level chunking
Markdown   .md, .mdx    Section-level chunking
Text       .txt         Paragraph-level chunking
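
For client code that wants to check a file before uploading it, the format table can be expressed as a small lookup. This is a sketch only: `CHUNKING_BY_EXTENSION` and `describe_upload` are hypothetical names, and mapping each extension to the `sourceType` strings `pdf`, `markdown`, and `text` is an assumption based on the examples in the Fields table.

```python
from pathlib import Path

# Extension -> (assumed sourceType, chunking granularity from the format table).
CHUNKING_BY_EXTENSION = {
    ".pdf": ("pdf", "page"),
    ".md": ("markdown", "section"),
    ".mdx": ("markdown", "section"),
    ".txt": ("text", "paragraph"),
}


def describe_upload(filename: str) -> tuple[str, str]:
    """Return (sourceType, chunking granularity) for a supported file name."""
    ext = Path(filename).suffix.lower()
    if ext not in CHUNKING_BY_EXTENSION:
        raise ValueError(f"Unsupported format: {ext or filename}")
    return CHUNKING_BY_EXTENSION[ext]
```

For example, `describe_upload("API Specification v2.pdf")` returns `("pdf", "page")`, while an extension outside the table raises `ValueError`.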

Chunking

Documents are automatically split into chunks optimized for semantic search; no manual configuration is required. Each chunk retains metadata linking it back to its position in the original document, so you can always trace a search result to the exact page, section, or paragraph it came from.
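
To make the idea of position metadata concrete, here is a conceptual sketch of paragraph-level chunking for a plain-text document. It is not the service's actual algorithm, which runs automatically and is not described here; the `chunk_paragraphs` function and the `paragraph` metadata key are hypothetical.

```python
import re


def chunk_paragraphs(text: str, source_id: str) -> list[dict]:
    """Split plain text on blank lines and record each paragraph's position."""
    # One or more blank lines separate paragraphs; empty pieces are dropped.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks = []
    for index, paragraph in enumerate(paragraphs, start=1):
        chunks.append(
            {
                "content": paragraph,
                "sourceId": source_id,  # link back to the original document
                "sourceType": "text",
                "metadata": {"paragraph": index},  # assumed key: 1-based position
            }
        )
    return chunks


chunks = chunk_paragraphs("First.\n\nSecond line\ncontinues.\n\n\nThird.", "doc_1")
```

Because each chunk carries both `sourceId` and its position, a search hit on any chunk can be followed back to the paragraph it came from.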

Relationship to Knowledge

Facts extracted from SourceContent can become Knowledge. The original SourceContent serves as the source reference, maintaining a clear provenance chain from raw document to confirmed fact.
Document: "API Specification v2.pdf"
  |-- SourceContent chunk 1: "Authentication section"
  |     --> Knowledge: "API uses OAuth2 authentication"
  |-- SourceContent chunk 2: "Endpoints section"
  |     --> Knowledge: "Base URL is /api/v2/"
  |-- SourceContent chunk 3: "Error codes section"
        --> Knowledge: "Error 429 means rate limit exceeded"
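
The provenance chain in the diagram can be sketched as two linked records. The `Chunk` and `Knowledge` classes, the `source_content_id` back-reference, and the `trace` helper are assumptions for illustration; this page does not define the Knowledge schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    id: str
    source_id: str  # the original document
    content: str


@dataclass(frozen=True)
class Knowledge:
    fact: str
    source_content_id: str  # hypothetical back-reference to the chunk


def trace(fact: Knowledge, chunks: dict[str, Chunk]) -> tuple[str, str]:
    """Return (document, chunk content) that a fact was extracted from."""
    chunk = chunks[fact.source_content_id]
    return chunk.source_id, chunk.content


# Sample data taken from the diagram; the chunk ids are made up.
chunks = {
    "chunk-1": Chunk("chunk-1", "API Specification v2.pdf", "Authentication section"),
    "chunk-3": Chunk("chunk-3", "API Specification v2.pdf", "Error codes section"),
}
fact = Knowledge("API uses OAuth2 authentication", "chunk-1")
origin = trace(fact, chunks)
```

Storing only the chunk id on the fact keeps the chain one-directional: a fact points at its chunk, and the chunk points at its document.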

Data Types

Concept overview of all data types

Knowledge API

SourceContent is managed alongside Knowledge