The Chat API is a Chat Completions interface that automatically injects Neuradex knowledge into LLM conversations. It offers a familiar OpenAI SDK-style API, but your project's knowledge base is injected as context automatically. In addition, tools that define an execute function are run in a loop by the SDK, so you don't need to write the logic that returns tool results to the LLM yourself.

Overview

import { NdxClient } from '@neuradex/sdk';

const client = new NdxClient({
  apiKey: process.env.NEURADEX_API_KEY,
  projectId: 'your-project-id',
});

// Text generation (with automatic memory injection)
const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Tell me about our return policy' }],
  memory: { enabled: true },
});

// Receive via streaming
for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}
When memory is enabled, Neuradex automatically retrieves knowledge and episodes related to the query and injects them into the LLM context. No need to build your own RAG pipeline.

Method List

create(params): Create a Chat Completion (streaming supported)

create()

Creates a Chat Completion and returns a ChatStream.

Parameters

model (string, required): Model ID to use (e.g., 'gpt-4o', 'gpt-4o-mini')
messages (ChatMessage[], required): Array of messages
tools (Record<string, ChatTool>): Tool definitions. Include an execute function for automatic execution by the SDK.
maxToolRoundtrips (number, default: 5): Maximum number of tool auto-execution roundtrips
memory (ChatMemoryOption): Memory injection options
temperature (number): Temperature parameter (0-2)
maxTokens (number): Maximum generation tokens
stream (boolean, default: true): Enable/disable streaming
onText ((text: string) => void): Callback for each text chunk
onToolCall ((call) => void): Callback when a tool is called
onToolResult ((result) => void): Callback when a tool result is received

Return value: ChatStream

ChatStream provides a multi-consumption pattern. You can access text, events, or final results from the same stream.
interface ChatStream {
  textStream: AsyncIterable<string>;           // Stream of text chunks
  fullStream: AsyncIterable<ChatStreamEvent>;  // Stream of all events
  text: Promise<string>;                       // Final complete text
  toolCalls: Promise<ToolCallInfo[]>;          // All tool call info
  usage: Promise<ChatUsage | null>;            // Token usage
  finishReason: Promise<string>;               // Completion reason
}
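To make the multi-consumption idea concrete, here is a minimal sketch of how one event source can back both a chunk-by-chunk view and a final-text Promise. This is not the SDK's internal implementation; the SketchEvent type and makeChatStream function are illustrative names invented for this example.

```typescript
// Sketch only: one buffered event source backing two views of the same stream.
type SketchEvent =
  | { type: 'text-delta'; textDelta: string }
  | { type: 'finish' };

function makeChatStream(events: SketchEvent[]) {
  // Collect the text deltas once; both views read from this buffer.
  const deltas = events
    .filter((e): e is Extract<SketchEvent, { type: 'text-delta' }> => e.type === 'text-delta')
    .map((e) => e.textDelta);

  async function* textStream() {
    for (const d of deltas) yield d; // chunk-by-chunk view
  }

  return {
    textStream: textStream(),
    text: Promise.resolve(deltas.join('')), // final-text view
  };
}
```

The real ChatStream drains the network response once and fans the events out, so awaiting `text` and iterating `textStream` never trigger two requests.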

Automatic Memory Injection

When the memory option is enabled, knowledge and episodes related to the user’s query are automatically injected into the LLM context.
const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'What is the latest return policy?' }],
  memory: {
    enabled: true,
    maxTokens: 4000,         // Token budget for context
    includeEpisodes: true,   // Include Q&A history
  },
});

console.log(await stream.text);

ChatMemoryOption

enabled (boolean, required): Whether to enable memory injection
maxTokens (number): Maximum tokens for injected context
includeEpisodes (boolean): Whether to include episodes (Q&A history, change history)

Automatic Tool Execution

When a tool definition includes an execute function, the SDK automatically handles tool execution and returns the results to the LLM. It loops automatically until the LLM decides no more tools are needed, or until maxToolRoundtrips is reached.
const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Check the weather in Tokyo and tell me if I need an umbrella' }],
  tools: {
    getWeather: {
      description: 'Get current weather for a city',
      parameters: {
        type: 'object',
        properties: {
          city: { type: 'string', description: 'City name' },
        },
        required: ['city'],
      },
      // The SDK automatically executes this and returns the result to the LLM
      execute: async ({ city }) => {
        const res = await fetch(`https://api.weather.example/v1/${city}`);
        const data = await res.json();
        return JSON.stringify(data);
      },
    },
  },
  maxToolRoundtrips: 3,
});

// Get the final answer including tool execution results
console.log(await stream.text);
If a tool without an execute function is called, the stream ends with finishReason: 'tool_calls'. In this case, you can get the called tool info from the toolCalls property and handle it yourself.
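The loop you would run yourself in that case can be sketched as follows. This is a self-contained simulation, not SDK code: callModel stands in for a raw completion call, and the Turn and ToolCall shapes are assumptions made for the example.

```typescript
// Hypothetical shapes for one model turn in a manual tool loop.
type ToolCall = { id: string; name: string; args: Record<string, unknown> };
type Turn =
  | { finishReason: 'tool_calls'; toolCalls: ToolCall[] }
  | { finishReason: 'stop'; text: string };

async function runManualToolLoop(
  callModel: (messages: unknown[]) => Promise<Turn>,
  handlers: Record<string, (args: Record<string, unknown>) => Promise<string>>,
  messages: unknown[],
  maxRoundtrips = 5,
): Promise<string> {
  for (let i = 0; i < maxRoundtrips; i++) {
    const turn = await callModel(messages);
    if (turn.finishReason !== 'tool_calls') return turn.text;
    // Execute each requested tool and append its result as a `tool` message,
    // which is exactly what the SDK does for you when `execute` is present.
    for (const call of turn.toolCalls) {
      const handler = handlers[call.name];
      if (!handler) throw new Error(`No handler for tool: ${call.name}`);
      const result = await handler(call.args);
      messages.push({ role: 'tool', tool_call_id: call.id, content: result });
    }
  }
  throw new Error('maxToolRoundtrips exceeded');
}
```

Handing the results back as `tool` messages and re-calling the model is the whole trick; the SDK's automatic execution is this loop run on your behalf.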

Streaming

textStream — Text only

const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});

for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);  // Real-time display
}

fullStream — All events

When you need to handle tool calls and completion events as well:
const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Question' }],
  tools: { /* ... */ },
});

for await (const event of stream.fullStream) {
  switch (event.type) {
    case 'text-delta':
      process.stdout.write(event.textDelta);
      break;
    case 'tool-call':
      console.log(`Tool call: ${event.name}`, event.args);
      break;
    case 'tool-result':
      console.log(`Tool result: ${event.name}`, event.result);
      break;
    case 'roundtrip-complete':
      console.log(`Roundtrip ${event.roundtrip} complete`);
      break;
    case 'finish':
      console.log(`Done: ${event.finishReason}`);
      if (event.usage) {
        console.log(`Tokens: ${event.usage.totalTokens}`);
      }
      break;
    case 'error':
      console.error(`Error: ${event.error}`);
      break;
  }
}

Callback style

You can also process events via callbacks without consuming the stream:
const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Question' }],
  onText: (text) => process.stdout.write(text),
  onToolCall: (call) => console.log(`Tool: ${call.name}`),
  onToolResult: (result) => console.log(`Result: ${result.result}`),
});

// Wait for completion
const finalText = await stream.text;

Non-streaming — Final result only

const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Give me a short answer' }],
});

// Get final text without consuming the stream
const text = await stream.text;
const usage = await stream.usage;

console.log(text);
console.log(`Tokens used: ${usage?.totalTokens}`);

Type Definitions

ChatMessage

interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string | null;
  name?: string;
  tool_call_id?: string;
  tool_calls?: ChatToolCall[];
}
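For illustration, here is what one tool roundtrip looks like as a ChatMessage sequence. The types are repeated locally so the snippet stands alone, and the ChatToolCall shape shown is an assumption following the OpenAI convention, since it is not spelled out on this page.

```typescript
// Assumed ChatToolCall shape (OpenAI convention); not documented here.
interface ChatToolCall {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
}

interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string | null;
  name?: string;
  tool_call_id?: string;
  tool_calls?: ChatToolCall[];
}

// One hypothetical tool roundtrip as a message sequence.
const toolRound: ChatMessage[] = [
  { role: 'user', content: 'Weather in Tokyo?' },
  {
    role: 'assistant',
    content: null, // assistant turn that only requests a tool
    tool_calls: [
      {
        id: 'call_1',
        type: 'function',
        function: { name: 'getWeather', arguments: '{"city":"Tokyo"}' },
      },
    ],
  },
  // The tool result links back to the request via tool_call_id.
  { role: 'tool', tool_call_id: 'call_1', content: '{"tempC":18,"rain":true}' },
];
```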

ChatTool

interface ChatTool<TArgs = Record<string, unknown>> {
  description: string;
  parameters: Record<string, unknown>;  // JSON Schema
  execute?: (args: TArgs) => Promise<string> | string;
}
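The TArgs type parameter lets you give execute typed arguments instead of a loose record. A small sketch (the interface is repeated locally so it compiles standalone; WeatherArgs and the tool body are made up for the example):

```typescript
// Local copy of the ChatTool interface so this snippet stands alone.
interface ChatTool<TArgs = Record<string, unknown>> {
  description: string;
  parameters: Record<string, unknown>; // JSON Schema
  execute?: (args: TArgs) => Promise<string> | string;
}

// Supplying TArgs types the destructured arguments of `execute`.
interface WeatherArgs {
  city: string;
}

const getWeather: ChatTool<WeatherArgs> = {
  description: 'Get current weather for a city',
  parameters: {
    type: 'object',
    properties: { city: { type: 'string' } },
    required: ['city'],
  },
  execute: ({ city }) => `Sunny in ${city}`, // `city` is typed as string here
};
```

Note that the JSON Schema in parameters and the TArgs type are not checked against each other; keeping them in sync is your responsibility.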

ChatStreamEvent

type ChatStreamEvent =
  | { type: 'text-delta'; textDelta: string }
  | { type: 'tool-call'; toolCallId: string; name: string; args: Record<string, unknown> }
  | { type: 'tool-result'; toolCallId: string; name: string; result: string }
  | { type: 'roundtrip-complete'; roundtrip: number }
  | { type: 'finish'; usage: ChatUsage | null; finishReason: string }
  | { type: 'error'; error: string };

ChatUsage

interface ChatUsage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

Use Cases

Customer Support Bot with Memory

import { NdxClient } from '@neuradex/sdk';

const client = new NdxClient({
  apiKey: process.env.NEURADEX_API_KEY,
  projectId: process.env.NEURADEX_PROJECT_ID,
});

async function handleCustomerQuery(question: string): Promise<string> {
  const stream = client.chat.create({
    model: 'gpt-4o',
    messages: [
      {
        role: 'system',
        content: 'You are a customer support assistant. Answer politely.',
      },
      { role: 'user', content: question },
    ],
    memory: {
      enabled: true,
      maxTokens: 8000,
      includeEpisodes: true,  // Also reference past Q&A
    },
  });

  return await stream.text;
}

AI Agent with Tools

const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [
    { role: 'system', content: 'You are a task management assistant.' },
    { role: 'user', content: 'Remind me about tomorrow\'s meeting' },
  ],
  tools: {
    createReminder: {
      description: 'Create a reminder',
      parameters: {
        type: 'object',
        properties: {
          title: { type: 'string' },
          datetime: { type: 'string', format: 'date-time' },
        },
        required: ['title', 'datetime'],
      },
      execute: async ({ title, datetime }) => {
        await db.reminders.create({ title, datetime });
        return `Reminder "${title}" set for ${datetime}`;
      },
    },
    searchCalendar: {
      description: 'Search calendar events',
      parameters: {
        type: 'object',
        properties: {
          query: { type: 'string' },
          date: { type: 'string', format: 'date' },
        },
        required: ['query'],
      },
      execute: async ({ query, date }) => {
        const events = await calendar.search(query, date);
        return JSON.stringify(events);
      },
    },
  },
  memory: { enabled: true },
  maxToolRoundtrips: 5,
});

for await (const event of stream.fullStream) {
  if (event.type === 'tool-call') {
    console.log(`Executing: ${event.name}...`);
  }
  if (event.type === 'text-delta') {
    process.stdout.write(event.textDelta);
  }
}

Next Steps

React

Build chat UI with useChat hook

Memory API

Context assembly

Knowledge API

Knowledge management