
Neuron Instance

The object returned by createNeuron(). Provides methods for chatting, state management, and lifecycle control.

Methods

send(message): AsyncGenerator<string>

Stream tokens from the LLM as an async generator.

ts
for await (const token of neuron.send('Hello!')) {
  element.textContent += token
}
  • Adds the message and response to conversation history automatically
  • Throws if the model isn't loaded yet or a generation is already in progress
  • Use neuron.isLoading to check if ready
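
Because send() throws when the model is still loading or another generation is in flight, it helps to check both flags and wrap the stream in a try/catch. A minimal sketch of that guard pattern — the stub below is hypothetical, standing in for a real createNeuron() instance so the snippet runs on its own:

```ts
// Hypothetical stub in place of a loaded neuron, so the guard
// pattern below runs without downloading a model.
const neuron = {
  isLoading: false,
  isGenerating: false,
  async *send(_message: string) {
    yield 'Hello'
    yield '!'
  },
}

let output = ''
if (!neuron.isLoading && !neuron.isGenerating) {
  try {
    for await (const token of neuron.send('Hi')) {
      output += token
    }
  } catch (err) {
    // errors surface through the async generator
    console.error('generation failed:', err)
  }
}
```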

complete(message): Promise<string>

Get the full response as a single string. Convenience wrapper around send().

ts
const reply = await neuron.complete('Tell me a joke')
console.log(reply)

stop(): void

Stop generation mid-stream. The send() async generator will exit gracefully.

ts
neuron.stop()
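
To illustrate the contract end to end, here is a sketch using a hypothetical stub whose generator checks a stopped flag between tokens — the real send() generator exits the same way once stop() is called:

```ts
// Hypothetical stub: stands in for a real neuron so the
// stop-mid-stream behavior can be shown in isolation.
function makeStubNeuron() {
  let stopped = false
  return {
    stop() { stopped = true },
    async *send(_message: string) {
      for (const token of ['Once', ' upon', ' a', ' time']) {
        if (stopped) return // exit gracefully, no throw
        yield token
      }
    },
  }
}

const stub = makeStubNeuron()
let text = ''
for await (const token of stub.send('Tell a story')) {
  text += token
  if (text.length >= 9) stub.stop() // e.g. user clicked a stop button
}
// text holds only the tokens produced before stop() took effect
```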

setModel(modelId): Promise<void>

Switch to a different model. Downloads if not cached. Reuses the existing worker.

ts
await neuron.setModel('Llama-3.2-3B-Instruct-q4f16_1-MLC')

setPersonalityDocs(docs): void

Replace personality documents at runtime. Takes effect on the next send() call.

ts
neuron.setPersonalityDocs([
  { type: 'zero-shot', content: 'You are now a chef.' },
  { type: 'knowledge', content: 'You specialize in French cuisine.' },
])

setSystemPrompt(prompt): void

Replace the system prompt entirely. Clears any personality docs.

ts
neuron.setSystemPrompt('You are a helpful assistant.')

setTemperature(value): void

Update sampling temperature for subsequent send() calls. No engine rebuild — the new value applies on the next message.

ts
neuron.setTemperature(0.1)  // strict, factual
neuron.setTemperature(0.8)  // creative, more variation

setMaxTokens(value): void

Update the per-response token cap. Takes effect on the next send().

ts
neuron.setMaxTokens(2048)

setFrequencyPenalty(value): void

Update repetition discouragement (0.0 to 2.0, default 0.5). Higher values reduce repeated phrases.

ts
neuron.setFrequencyPenalty(0.7)

setMaxHistoryTurns(value): void

Update how many recent turns are sent in subsequent prompts. Doesn't truncate stored history — only changes what's included in future requests.

ts
neuron.setMaxHistoryTurns(20)

Sampler hot-swap (0.3.0+)

The four set* methods above all hot-swap without touching the engine — ideal for live settings UIs where users tweak temperature mid-chat. For per-instance routing where the model changes, use setModel() which reloads weights but reuses the worker.
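
As a sketch, a settings panel might batch all four into one apply step. applySamplerSettings and the recording stub below are hypothetical helpers for illustration; only the set* calls are the actual API:

```ts
interface SamplerSettings {
  temperature: number
  maxTokens: number
  frequencyPenalty: number
  maxHistoryTurns: number
}

// Apply a whole settings panel at once; each call hot-swaps
// without an engine rebuild and takes effect on the next send().
function applySamplerSettings(n: {
  setTemperature(v: number): void
  setMaxTokens(v: number): void
  setFrequencyPenalty(v: number): void
  setMaxHistoryTurns(v: number): void
}, s: SamplerSettings): void {
  n.setTemperature(s.temperature)
  n.setMaxTokens(s.maxTokens)
  n.setFrequencyPenalty(s.frequencyPenalty)
  n.setMaxHistoryTurns(s.maxHistoryTurns)
}

// Recording stub (hypothetical) just to show the calls made:
const applied: Record<string, number> = {}
applySamplerSettings({
  setTemperature: v => { applied.temperature = v },
  setMaxTokens: v => { applied.maxTokens = v },
  setFrequencyPenalty: v => { applied.frequencyPenalty = v },
  setMaxHistoryTurns: v => { applied.maxHistoryTurns = v },
}, { temperature: 0.8, maxTokens: 1024, frequencyPenalty: 0.5, maxHistoryTurns: 10 })
```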

getHistory(): Array<{ role, content }>

Get a copy of the conversation history.

ts
const history = neuron.getHistory()
// [{ role: 'user', content: 'Hi' }, { role: 'assistant', content: 'Hello!' }]

// Save to localStorage
localStorage.setItem('chat', JSON.stringify(history))

setHistory(messages): void

Restore conversation history from a saved state.

ts
const saved = JSON.parse(localStorage.getItem('chat') || '[]')
neuron.setHistory(saved)

clearHistory(): void

Clear all conversation history. The next send() starts a fresh conversation.

ts
neuron.clearHistory()

destroy(): void

Terminate the worker and release its resources. Call this when unmounting a component or leaving a page.

ts
// In Vue
onUnmounted(() => neuron.destroy())

// In React
useEffect(() => () => neuron.destroy(), [])

Properties

All properties are read-only.

isLoading: boolean

true while the model is downloading or initializing. Check before calling send().

ts
if (!neuron.isLoading) {
  for await (const token of neuron.send('Hi')) {
    element.textContent += token
  }
}

isGenerating: boolean

true while the LLM is actively generating a response.

ts
if (neuron.isGenerating) {
  showStopButton()
}

loadProgress: { percent: number, text: string }

Current model loading progress.

ts
// Use with onProgress callback for reactive updates
createNeuron({
  onProgress: (pct, text) => {
    progressBar.style.width = `${pct}%`
    statusText.textContent = text
  },
})

// Or poll (less ideal)
console.log(neuron.loadProgress) // { percent: 45, text: "Loading params_shard_3..." }

Part of the AgentLayerZero platform