The AI Engine is a unified Zustand store that manages all aspects of the AI-powered conversational experience. It combines multiple specialized slices to handle chat interactions, audio processing, 3D scene management, and speech recognition.
It provides a single source of truth for application state, so any component can read or update the state it needs.
Because every scenario or template reuses the same AI Engine store, new ones are easy to add and maintain.
Architecture Overview
The AI Engine is built using a modular slice architecture, combining four key components:
Chat Slice: Manages conversation state and message handling
Audio Slice: Handles text-to-speech, audio playback, and lip-sync
Scene Slice: Controls 3D avatar animations and camera positioning
Speech Recognition Slice: Manages speech recognition state
/src
├── /store
│   ├── aiEngine.js                  # Main Zustand store combining all slices
│   ├── audioSlice.js                # Manages audio playback and processing
│   ├── chatSlice.js                 # Manages chat messages and state
│   ├── constants.js                 # Shared constants (statuses, animations, moods)
│   ├── sceneSlice.js                # Manages 3D scene and avatar state
│   ├── speechRecognitionSlice.js    # Manages speech recognition state
Core Components
Store Structure (aiEngine.js)
The main store combines all slices into a single unified interface.
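As a rough illustration of that combination, here is a minimal sketch that assumes the usual Zustand slices pattern. The slice creator names and their state fields are hypothetical, and a small stand-in for Zustand's vanilla `createStore` is defined inline so the snippet runs without dependencies; the real aiEngine.js would presumably use Zustand itself.

```javascript
// Stand-in for `createStore` from 'zustand/vanilla' (same basic shape:
// getState / setState / subscribe), so this sketch runs without dependencies.
function createStore(initializer) {
  let state;
  const listeners = new Set();
  const getState = () => state;
  const setState = (partial) => {
    const next = typeof partial === 'function' ? partial(state) : partial;
    if (Object.is(next, state)) return;
    const previous = state;
    state = { ...state, ...next }; // shallow merge, as Zustand does by default
    listeners.forEach((listener) => listener(state, previous));
  };
  const subscribe = (listener) => {
    listeners.add(listener);
    return () => listeners.delete(listener);
  };
  const api = { getState, setState, subscribe };
  state = initializer(setState, getState, api);
  return api;
}

// Hypothetical slice creators: each returns its own state and actions.
const createChatSlice = (set) => ({
  messages: [],
  addMessage: (message) =>
    set((state) => ({ messages: [...state.messages, message] })),
});

const createAudioSlice = (set) => ({
  audioStatus: 'idle',
  setAudioStatus: (audioStatus) => set({ audioStatus }),
});

const createSceneSlice = (set) => ({
  animation: 'Idle',
  setAnimation: (animation) => set({ animation }),
});

const createSpeechRecognitionSlice = (set) => ({
  isListening: false,
  setListening: (isListening) => set({ isListening }),
});

// The unified store: every slice is spread into one state object.
const aiEngine = createStore((...args) => ({
  ...createChatSlice(...args),
  ...createAudioSlice(...args),
  ...createSceneSlice(...args),
  ...createSpeechRecognitionSlice(...args),
}));

aiEngine.getState().addMessage({ role: 'user', text: 'Hello' });
aiEngine.getState().setAnimation('Talking');
```

Because all slices share one state object, an action in one slice can read or update fields owned by another, which is what makes cross-slice coordination (for example, audio status driving avatar animation) straightforward.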
To simplify binding and avoid typos, key constants are defined in src/store/constants.js. The file includes the audio statuses, avatar animations, moods, chat statuses, and more.
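For illustration only, a constants file of this kind might look like the sketch below. Every name and value here is hypothetical; the actual contents of src/store/constants.js may differ.

```javascript
// Hypothetical contents of a constants file; real names and values may differ.
// Object.freeze guards the lookup tables against accidental mutation.
const AUDIO_STATUS = Object.freeze({
  IDLE: 'idle',
  LOADING: 'loading',
  PLAYING: 'playing',
});

const CHAT_STATUS = Object.freeze({
  IDLE: 'idle',
  WAITING: 'waiting',
});

const AVATAR_ANIMATION = Object.freeze({
  IDLE: 'Idle',
  TALKING: 'Talking',
});

// Referencing a constant keeps each status string in one place, so renaming
// a status touches a single file instead of every comparison.
function isPlaying(audioStatus) {
  return audioStatus === AUDIO_STATUS.PLAYING;
}
```

Components and slices would then compare against `AUDIO_STATUS.PLAYING` rather than the raw string `'playing'`.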
Data Flow
Message Processing Pipeline
1. User Input → Chat Slice
   Via text input or speech recognition
   Message added to conversation history
   API call initiated
2. AI Response → Chat Slice
   Response received from backend
   Messages parsed with metadata (animation, expression)
   Added to message queue
3. Audio Generation → Audio Slice
   TTS API called for each message
   Audio URLs cached
   Playback queue managed
4. Avatar Synchronization → Scene Slice
   Avatar FSM triggered on state changes
   Animation and mood updated
   Camera position adjusted
5. Playback → Audio & Scene
   Audio played sequentially
   Lip-sync data processed
   Avatar animations synchronized
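The pipeline above can be sketched roughly as follows. This is a simplified illustration, not the store's actual code: the `engine` object, the `deps` functions, and the message fields are all hypothetical, the backend and audio calls are injected as stubs, and lip-sync and camera handling are left out.

```javascript
// Sketch of the pipeline stages. Backend and audio calls are injected so the
// flow can be followed (and run) without a real server or audio device.
async function processUserMessage(text, engine, deps) {
  // User input -> chat slice: record the message, then call the chat API.
  engine.history.push({ role: 'user', text });
  const replies = await deps.sendChat(engine.history);

  // AI response -> chat slice: queue each parsed message with its metadata.
  for (const reply of replies) {
    engine.history.push({ role: 'assistant', text: reply.text });
    engine.queue.push(reply);
  }

  // Audio generation -> audio slice: one TTS call per message, cached by text.
  for (const reply of engine.queue) {
    if (!engine.audioCache.has(reply.text)) {
      engine.audioCache.set(reply.text, await deps.synthesize(reply.text));
    }
  }

  // Avatar synchronization and playback: clips play one at a time, and the
  // avatar state is updated just before each clip starts.
  while (engine.queue.length > 0) {
    const reply = engine.queue.shift();
    engine.scene = { animation: reply.animation, mood: reply.expression };
    await deps.playAudio(engine.audioCache.get(reply.text));
  }
  engine.scene = { animation: 'Idle', mood: 'neutral' };
}

// Usage with stubbed dependencies (all hypothetical).
const engine = { history: [], queue: [], audioCache: new Map(), scene: null };
const played = [];
const deps = {
  sendChat: async () => [
    { text: 'Hi there!', animation: 'Wave', expression: 'happy' },
    { text: 'How can I help?', animation: 'Talking', expression: 'neutral' },
  ],
  synthesize: async (text) => `/audio/${encodeURIComponent(text)}.mp3`,
  playAudio: async (url) => {
    played.push(url);
  },
};
const done = processUserMessage('Hello', engine, deps);
```

Awaiting each `playAudio` call before moving on is what keeps playback sequential; generating all audio up front is one possible choice, and a real implementation might overlap generation with playback instead.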
Integration Points
Backend APIs
/chat: Message processing endpoint
/tts: Text-to-speech generation
Scenario-based routing for context
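A hedged sketch of how a client might call these two endpoints is shown below. The payload fields (`scenario`, `messages`, `text`), the use of POST with JSON, and the response shapes are assumptions made for illustration; the real request contract is not specified here. The `fetchImpl` parameter is a hypothetical hook that lets the helper be exercised without a server.

```javascript
// Hypothetical request helper: method, headers, and payload shape are
// assumptions, not the documented contract of the backend.
async function postJson(path, body, fetchImpl = fetch) {
  const response = await fetchImpl(path, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error(`${path} failed with status ${response.status}`);
  }
  return response.json();
}

// The scenario id travels with the chat request so the backend can select
// the matching context (assumed mechanism for scenario-based routing).
const sendChat = (scenario, messages, fetchImpl) =>
  postJson('/chat', { scenario, messages }, fetchImpl);

// Requests synthesized speech for a single message.
const requestSpeech = (text, fetchImpl) =>
  postJson('/tts', { text }, fetchImpl);
```

In a browser, omitting `fetchImpl` falls back to the global `fetch`, and the relative paths resolve against the page origin.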
Frontend Components
Chat interfaces consume message state
3D viewers (vanilla Three.js or React Three Fiber) subscribe to scene state
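One way a non-React consumer, such as a vanilla Three.js viewer, could subscribe to just the part of the state it cares about is sketched below. The `watch` helper and the `animation` field are hypothetical; the helper only relies on `getState()` and `subscribe(listener)`, which Zustand stores expose, and a tiny stand-in store is used so the snippet runs on its own.

```javascript
// Calls onChange only when the selected value changes. Works with any store
// exposing getState() and subscribe(listener), as Zustand stores do.
function watch(store, selector, onChange) {
  let current = selector(store.getState());
  return store.subscribe((state) => {
    const next = selector(state);
    if (!Object.is(next, current)) {
      const previous = current;
      current = next;
      onChange(next, previous);
    }
  });
}

// Tiny stand-in store for the demo (hypothetical state fields).
function makeDemoStore(initial) {
  let state = initial;
  const listeners = new Set();
  return {
    getState: () => state,
    setState: (partial) => {
      state = { ...state, ...partial };
      listeners.forEach((listener) => listener(state));
    },
    subscribe: (listener) => {
      listeners.add(listener);
      return () => listeners.delete(listener);
    },
  };
}

const demoStore = makeDemoStore({ animation: 'Idle', messages: [] });
const animationLog = [];
const stopWatching = watch(
  demoStore,
  (state) => state.animation,
  (next, previous) => animationLog.push(`${previous} -> ${next}`)
);

demoStore.setState({ messages: ['Hello'] }); // ignored: animation unchanged
demoStore.setState({ animation: 'Talking' }); // logged
stopWatching();
demoStore.setState({ animation: 'Idle' }); // not logged: unsubscribed
```

Filtering by selector keeps the viewer from reacting to unrelated updates such as new chat messages. In React components, the equivalent is usually passing a selector to the store hook.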