The Chat API is a Chat Completions interface that automatically injects Neuradex knowledge into LLM conversations. It offers a familiar OpenAI SDK-style API, but your project's knowledge base is injected as context automatically. In addition, tools that define an execute function are run in a loop by the SDK, so you don't need to write the logic that returns tool results to the LLM yourself.

Overview

import { NdxClient } from '@neuradex/sdk';

const client = new NdxClient({
  apiKey: process.env.NEURADEX_API_KEY,
  projectId: 'your-project-id',
});

// Text generation (with automatic memory injection)
const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Tell me about our return policy' }],
  memory: { enabled: true },
});

// Receive via streaming
for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}
When memory is enabled, Neuradex automatically retrieves knowledge and episodes related to the query and injects them into the LLM context. No need to build your own RAG pipeline.

Method List

create(params): Create a Chat Completion (streaming supported)

create()

Creates a Chat Completion and returns a ChatStream.

Parameters

model (string, required): Model ID to use (e.g., 'gpt-4o', 'gpt-4o-mini')
messages (ChatMessage[], required): Array of messages
tools (Record<string, ChatTool>): Tool definitions. Include an execute function for automatic execution by the SDK.
maxToolRoundtrips (number, default: 5): Maximum number of tool auto-execution roundtrips
memory (ChatMemoryOption): Memory injection options
temperature (number): Temperature parameter (0-2)
maxTokens (number): Maximum generation tokens
stream (boolean, default: true): Enable/disable streaming
onText ((text: string) => void): Callback for each text chunk
onToolCall ((call) => void): Callback when a tool is called
onToolResult ((result) => void): Callback when a tool result is received

Return value: ChatStream

ChatStream provides a multi-consumption pattern. You can access text, events, or final results from the same stream.
interface ChatStream {
  textStream: AsyncIterable<string>;           // Stream of text chunks
  fullStream: AsyncIterable<ChatStreamEvent>;  // Stream of all events
  text: Promise<string>;                       // Final complete text
  toolCalls: Promise<ToolCallInfo[]>;          // All tool call info
  usage: Promise<ChatUsage | null>;            // Token usage
  finishReason: Promise<string>;               // Completion reason
}
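To make the multi-consumption idea concrete, here is a minimal sketch of how one event source can back both a chunk-by-chunk view and a final-text Promise. This is not the SDK's internal implementation; the SketchEvent type and makeChatStream function are illustrative names invented for this example.

```typescript
// Sketch only: one buffered event source backing two views of the same stream.
type SketchEvent =
  | { type: 'text-delta'; textDelta: string }
  | { type: 'finish' };

function makeChatStream(events: SketchEvent[]) {
  // Collect the text deltas once; both views read from this buffer.
  const deltas = events
    .filter((e): e is Extract<SketchEvent, { type: 'text-delta' }> => e.type === 'text-delta')
    .map((e) => e.textDelta);

  async function* textStream() {
    for (const d of deltas) yield d; // chunk-by-chunk view
  }

  return {
    textStream: textStream(),
    text: Promise.resolve(deltas.join('')), // final-text view
  };
}
```

The real ChatStream drains the network response once and fans the events out, so awaiting `text` and iterating `textStream` never trigger two requests.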

Automatic Memory Injection

When the memory option is enabled, knowledge and episodes related to the user’s query are automatically injected into the LLM context.
const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'What is the latest return policy?' }],
  memory: {
    enabled: true,
    maxTokens: 4000,         // Token budget for context
    includeEpisodes: true,   // Include Q&A history
  },
});

console.log(await stream.text);

ChatMemoryOption

enabled (boolean, required): Whether to enable memory injection
maxTokens (number): Maximum tokens for injected context
includeEpisodes (boolean): Whether to include episodes (Q&A history, change history)

Automatic Tool Execution

When a tool definition includes an execute function, the SDK automatically handles tool execution and returns the results to the LLM. It loops automatically until the LLM decides no more tools are needed, or until maxToolRoundtrips is reached.
const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Check the weather in Tokyo and tell me if I need an umbrella' }],
  tools: {
    getWeather: {
      description: 'Get current weather for a city',
      parameters: {
        type: 'object',
        properties: {
          city: { type: 'string', description: 'City name' },
        },
        required: ['city'],
      },
      // The SDK automatically executes this and returns the result to the LLM
      execute: async ({ city }) => {
        const res = await fetch(`https://api.weather.example/v1/${city}`);
        const data = await res.json();
        return JSON.stringify(data);
      },
    },
  },
  maxToolRoundtrips: 3,
});

// Get the final answer including tool execution results
console.log(await stream.text);
If a tool without an execute function is called, the stream ends with finishReason: 'tool_calls'. In this case, you can get the called tool info from the toolCalls property and handle it yourself.
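The loop you would run yourself in that case can be sketched as follows. This is a self-contained simulation, not SDK code: callModel stands in for a raw completion call, and the Turn and ToolCall shapes are assumptions made for the example.

```typescript
// Hypothetical shapes for one model turn in a manual tool loop.
type ToolCall = { id: string; name: string; args: Record<string, unknown> };
type Turn =
  | { finishReason: 'tool_calls'; toolCalls: ToolCall[] }
  | { finishReason: 'stop'; text: string };

async function runManualToolLoop(
  callModel: (messages: unknown[]) => Promise<Turn>,
  handlers: Record<string, (args: Record<string, unknown>) => Promise<string>>,
  messages: unknown[],
  maxRoundtrips = 5,
): Promise<string> {
  for (let i = 0; i < maxRoundtrips; i++) {
    const turn = await callModel(messages);
    if (turn.finishReason !== 'tool_calls') return turn.text;
    // Execute each requested tool and append its result as a `tool` message,
    // which is exactly what the SDK does for you when `execute` is present.
    for (const call of turn.toolCalls) {
      const handler = handlers[call.name];
      if (!handler) throw new Error(`No handler for tool: ${call.name}`);
      const result = await handler(call.args);
      messages.push({ role: 'tool', tool_call_id: call.id, content: result });
    }
  }
  throw new Error('maxToolRoundtrips exceeded');
}
```

Handing the results back as `tool` messages and re-calling the model is the whole trick; the SDK's automatic execution is this loop run on your behalf.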

Streaming

textStream — Text only

const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});

for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);  // Real-time display
}

fullStream — All events

When you need to handle tool calls and completion events as well:
const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Question' }],
  tools: { /* ... */ },
});

for await (const event of stream.fullStream) {
  switch (event.type) {
    case 'text-delta':
      process.stdout.write(event.textDelta);
      break;
    case 'tool-call':
      console.log(`Tool call: ${event.name}`, event.args);
      break;
    case 'tool-result':
      console.log(`Tool result: ${event.name}`, event.result);
      break;
    case 'roundtrip-complete':
      console.log(`Roundtrip ${event.roundtrip} complete`);
      break;
    case 'finish':
      console.log(`Done: ${event.finishReason}`);
      if (event.usage) {
        console.log(`Tokens: ${event.usage.totalTokens}`);
      }
      break;
    case 'error':
      console.error(`Error: ${event.error}`);
      break;
  }
}

Callback style

You can also process events via callbacks without consuming the stream:
const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Question' }],
  onText: (text) => process.stdout.write(text),
  onToolCall: (call) => console.log(`Tool: ${call.name}`),
  onToolResult: (result) => console.log(`Result: ${result.result}`),
});

// Wait for completion
const finalText = await stream.text;

Non-streaming — Final result only

const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Give me a short answer' }],
});

// Get final text without consuming the stream
const text = await stream.text;
const usage = await stream.usage;

console.log(text);
console.log(`Tokens used: ${usage?.totalTokens}`);

Type Definitions

ChatMessage

interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string | null;
  name?: string;
  tool_call_id?: string;
  tool_calls?: ChatToolCall[];
}
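For illustration, here is what one tool roundtrip looks like as a ChatMessage sequence. The types are repeated locally so the snippet stands alone, and the ChatToolCall shape shown is an assumption following the OpenAI convention, since it is not spelled out on this page.

```typescript
// Assumed ChatToolCall shape (OpenAI convention); not documented here.
interface ChatToolCall {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
}

interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string | null;
  name?: string;
  tool_call_id?: string;
  tool_calls?: ChatToolCall[];
}

// One hypothetical tool roundtrip as a message sequence.
const toolRound: ChatMessage[] = [
  { role: 'user', content: 'Weather in Tokyo?' },
  {
    role: 'assistant',
    content: null, // assistant turn that only requests a tool
    tool_calls: [
      {
        id: 'call_1',
        type: 'function',
        function: { name: 'getWeather', arguments: '{"city":"Tokyo"}' },
      },
    ],
  },
  // The tool result links back to the request via tool_call_id.
  { role: 'tool', tool_call_id: 'call_1', content: '{"tempC":18,"rain":true}' },
];
```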

ChatTool

interface ChatTool<TArgs = Record<string, unknown>> {
  description: string;
  parameters: Record<string, unknown>;  // JSON Schema
  execute?: (args: TArgs) => Promise<string> | string;
}
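The TArgs type parameter lets you give execute typed arguments instead of a loose record. A small sketch (the interface is repeated locally so it compiles standalone; WeatherArgs and the tool body are made up for the example):

```typescript
// Local copy of the ChatTool interface so this snippet stands alone.
interface ChatTool<TArgs = Record<string, unknown>> {
  description: string;
  parameters: Record<string, unknown>; // JSON Schema
  execute?: (args: TArgs) => Promise<string> | string;
}

// Supplying TArgs types the destructured arguments of `execute`.
interface WeatherArgs {
  city: string;
}

const getWeather: ChatTool<WeatherArgs> = {
  description: 'Get current weather for a city',
  parameters: {
    type: 'object',
    properties: { city: { type: 'string' } },
    required: ['city'],
  },
  execute: ({ city }) => `Sunny in ${city}`, // `city` is typed as string here
};
```

Note that the JSON Schema in parameters and the TArgs type are not checked against each other; keeping them in sync is your responsibility.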

ChatStreamEvent

type ChatStreamEvent =
  | { type: 'text-delta'; textDelta: string }
  | { type: 'tool-call'; toolCallId: string; name: string; args: Record<string, unknown> }
  | { type: 'tool-result'; toolCallId: string; name: string; result: string }
  | { type: 'roundtrip-complete'; roundtrip: number }
  | { type: 'finish'; usage: ChatUsage | null; finishReason: string }
  | { type: 'error'; error: string };

ChatUsage

interface ChatUsage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

Use Cases

Customer Support Bot with Memory

import { NdxClient } from '@neuradex/sdk';

const client = new NdxClient({
  apiKey: process.env.NEURADEX_API_KEY,
  projectId: process.env.NEURADEX_PROJECT_ID,
});

async function handleCustomerQuery(question: string): Promise<string> {
  const stream = client.chat.create({
    model: 'gpt-4o',
    messages: [
      {
        role: 'system',
        content: 'You are a customer support assistant. Answer politely.',
      },
      { role: 'user', content: question },
    ],
    memory: {
      enabled: true,
      maxTokens: 8000,
      includeEpisodes: true,  // Also reference past Q&A
    },
  });

  return await stream.text;
}

AI Agent with Tools

const stream = client.chat.create({
  model: 'gpt-4o',
  messages: [
    { role: 'system', content: 'You are a task management assistant.' },
    { role: 'user', content: 'Remind me about tomorrow\'s meeting' },
  ],
  tools: {
    createReminder: {
      description: 'Create a reminder',
      parameters: {
        type: 'object',
        properties: {
          title: { type: 'string' },
          datetime: { type: 'string', format: 'date-time' },
        },
        required: ['title', 'datetime'],
      },
      execute: async ({ title, datetime }) => {
        await db.reminders.create({ title, datetime });
        return `Reminder "${title}" set for ${datetime}`;
      },
    },
    searchCalendar: {
      description: 'Search calendar events',
      parameters: {
        type: 'object',
        properties: {
          query: { type: 'string' },
          date: { type: 'string', format: 'date' },
        },
        required: ['query'],
      },
      execute: async ({ query, date }) => {
        const events = await calendar.search(query, date);
        return JSON.stringify(events);
      },
    },
  },
  memory: { enabled: true },
  maxToolRoundtrips: 5,
});

for await (const event of stream.fullStream) {
  if (event.type === 'tool-call') {
    console.log(`Executing: ${event.name}...`);
  }
  if (event.type === 'text-delta') {
    process.stdout.write(event.textDelta);
  }
}

Next Steps

React

Build chat UI with useChat hook

Memory API

Context assembly

Knowledge API

Knowledge management