===| 💬 DISCORD | 📖 DOCUMENTATION | 🎥 SETUP VIDEO |===
- Crystal LipSync is a lightweight, real-time lip sync and eye animation solution for Unity: audio-, text-, and mic-driven lip sync, natural blinking, eye movement, look-target tracking, synkinetic brows, and mood-based mouth shaping ... all in one unified setup.
- It ships as pure C# source code (no DLLs, no native plugins, no black boxes). You can inspect, modify, and extend every line.
Why Crystal LipSync?
- Real-time FFT spectral analysis on CPU (no preprocessing, no baking, no waiting)
- Zero external dependencies (no cloud, no API keys, no runtime downloads)
- Works on every Unity platform, including WebGL (v1.3.0+)
- Supports audio-driven lip sync and text-driven lip sync (no voice-over required)
- Designed for indie workflows: fast setup, auto-mapping, and clean integrations
Integrations
- Game Creator 2 (deep integration)
- Crystal LipSync integrates natively with Game Creator 2 via a one-click Setup Wizard and visual scripting instructions.
- Wizard provisions: lip sync, eye blink, eye movement, brow sync, look tracking, jaw bone setup, auxiliary mesh syncing
- Auto-detection + auto-mapping for common character types
- Seven visual scripting instructions to: play/stop speech, change moods, swap audio sources, toggle blinking, control text lip sync
- Custom GC2 properties for: triggering text lip sync from Typewriter, reading current dialogue text, and routing audio through chosen sources
- Pixel Crushers Dialogue System
- Text-to-LipSync integration included
- Audio-to-LipSync works out of the box (no integration required)
- Yarn Spinner (full text + audio pipeline)
- Includes a dedicated Yarn Spinner assembly with:
- CrystalYarnLipSync: per-character bridge + auto-provisioning
- CrystalYarnTextPresenter: text-driven lip sync during dialogue
- CrystalYarnVoiceOverPresenter: audio-driven lip sync with voice-over clips
- Includes demo scenes for both text and audio workflows.
Core Features
- Real-time audio-driven lip sync
- Feed any AudioSource (voice-over, runtime TTS, streamed clips, microphone) and get smooth, frame-accurate viseme weights every frame.
- Supports 3D spatial audio: if Unity would otherwise produce silent spectrum data, Crystal LipSync can fall back to direct PCM sampling so lip sync stays reliable.
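To illustrate the fallback idea (not the asset's actual code; all names below are hypothetical): when the FFT spectrum comes back silent, a raw PCM energy estimate can still drive mouth openness.

```csharp
using System;

// Illustrative sketch of the spectrum-silence fallback. In Unity you would
// fill `spectrum` via AudioSource.GetSpectrumData and `samples` via
// AudioSource.GetOutputData; this helper is engine-free.
public static class PcmFallback
{
    // Root-mean-square loudness of a PCM buffer; roughly [0, 1] for
    // normalized samples. Usable as a crude mouth-openness signal.
    public static float Rms(float[] samples)
    {
        double sum = 0.0;
        foreach (float s in samples) sum += (double)s * s;
        return (float)Math.Sqrt(sum / samples.Length);
    }

    // True when every spectrum bin is effectively zero, i.e. the FFT path
    // produced no usable data (as can happen with some spatialized audio).
    public static bool SpectrumIsSilent(float[] spectrum, float epsilon = 1e-7f)
    {
        foreach (float bin in spectrum)
            if (bin > epsilon) return false;
        return true;
    }
}
```

A controller could check `SpectrumIsSilent` each frame and switch to the RMS path only while the spectrum is empty.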
- 15-viseme system (industry standard)
- Maps speech to the professional 15-viseme set for believable mouth animation from close-up dialogue to cinematic sequences.
- Multi-blendshape per viseme (ARKit/FACS-ready)
- Each viseme can drive multiple blendshapes at independent weights (ideal for anatomical rigs).
- Example: "AA" can blend jawOpen (80%) + mouthFunnel (20%).
- Includes built-in ARKit presets that auto-map tuned multi-blendshape combinations. Use the "+" button per viseme to add/adjust blendshapes.
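Conceptually, a multi-blendshape viseme is just a list of (shape, weight) pairs scaled by the viseme's current activation. A minimal sketch, with illustrative type and member names (not the asset's API):

```csharp
using System.Collections.Generic;

// One viseme driving several blendshapes at independent weights.
public sealed class VisemeMapping
{
    // (blendshape name, weight in percent at full viseme activation)
    public readonly List<(string Shape, float Weight)> Targets = new();

    public VisemeMapping Add(string shape, float weight)
    {
        Targets.Add((shape, weight));
        return this;
    }

    // Final per-blendshape weights for an activation in [0, 1].
    public Dictionary<string, float> Evaluate(float activation)
    {
        var result = new Dictionary<string, float>();
        foreach (var (shape, weight) in Targets)
            result[shape] = weight * activation;
        return result;
    }
}
```

With the "AA" example above, `new VisemeMapping().Add("jawOpen", 80f).Add("mouthFunnel", 20f).Evaluate(0.5f)` yields jawOpen 40 and mouthFunnel 10; in Unity each value would then be written with `SkinnedMeshRenderer.SetBlendShapeWeight`.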
- Combined blendshape + jaw bone mode
- Drive blendshapes and jaw rotation together for more natural speech. The Setup Wizard auto-detects jaw bones and applies model-appropriate axes and max angles (iClone CC, DAZ Genesis, ARKit/FACS, etc.).
- Text-driven lip sync (multilingual, no audio required)
- Animate the mouth directly from dialogue text, including common letter groups (th, sh, ch, ee, oo). Timing automatically matches your typewriter reveal speed so mouth motion stays synced to what the player sees.
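A simplified sketch of how letter-group handling can work: scan the text and prefer two-letter groups over single letters. The viseme labels and the mapping below are assumptions for illustration, not the asset's shipped tables.

```csharp
using System.Collections.Generic;

// Toy text-to-viseme tokenizer: digraphs (th, sh, ch, ee, oo) win over
// single letters; everything else falls into crude vowel/consonant buckets.
public static class TextVisemes
{
    static readonly Dictionary<string, string> Groups = new()
    {
        ["th"] = "TH", ["sh"] = "CH", ["ch"] = "CH",
        ["ee"] = "IY", ["oo"] = "UW",
    };

    public static List<string> Tokenize(string text)
    {
        var visemes = new List<string>();
        string lower = text.ToLowerInvariant();
        for (int i = 0; i < lower.Length; i++)
        {
            // Try a two-letter group first.
            if (i + 1 < lower.Length &&
                Groups.TryGetValue(lower.Substring(i, 2), out string v))
            {
                visemes.Add(v);
                i++; // consume the second letter of the group
                continue;
            }
            char c = lower[i];
            if ("aeiou".IndexOf(c) >= 0) visemes.Add("AA"); // crude vowel bucket
            else if (char.IsLetter(c)) visemes.Add("DD");   // crude consonant bucket
            else if (c == ' ') visemes.Add("sil");          // silence between words
        }
        return visemes;
    }
}
```

Each emitted viseme would then be held for a duration matched to the typewriter reveal speed, so mouth motion stays aligned with the visible text.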
- Live microphone lip sync
- Capture mic input and drive visemes in real time using the same FFT analyzer (no ML, no phoneme recognition, no cloud). Useful for avatars, social VR, streaming overlays, and player-voice-driven characters. Optional monitoring or silent mode (no echo).
Eye & Face Animation
- Natural eye movement (alive even when silent)
- Saccades (large gaze shifts every few seconds)
- Micro-saccades (tiny involuntary motion)
- Slow drift (natural fixation instability)
- VOR (Vestibulo-Ocular Reflex): stabilizes gaze during head motion
- Auto-detects eye bones and look blendshapes (ARKit, DAZ, VRM, iClone CC, generic) and picks the best method automatically.
- Hybrid blendshape/bone eye movement
- If a model only supports partial look blendshapes (example: vertical only), Crystal LipSync drives missing axes with eye bone rotation automatically. Full coverage = blendshape mode, no coverage = bone mode, partial coverage = hybrid.
- Look target tracking
- Assign any Transform as a look target. Eyes track smoothly with configurable weight, smoothing, and max deflection. Even at full tracking, a small amount of natural micro-movement remains so the gaze never looks "locked."
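The tracking behavior described above reduces to a small amount of per-frame math. A minimal, engine-free sketch under assumed parameter names (a real implementation would use Unity's `Mathf` and a noise source for the jitter):

```csharp
using System;

// One smoothing step for a single gaze axis, in degrees: blend toward the
// weighted target, clamp to the max deflection, then add a small residual
// micro-movement so full tracking never looks "locked".
public static class GazeMath
{
    public static float Step(
        float currentDeg, float targetDeg,
        float trackingWeight,   // 0 = ignore target, 1 = full tracking
        float smoothing,        // per-step blend factor in (0, 1]
        float maxDeflectionDeg, // how far the eye may turn from center
        float microJitterDeg)   // residual natural micro-movement
    {
        float desired = targetDeg * trackingWeight;
        desired = Math.Clamp(desired, -maxDeflectionDeg, maxDeflectionDeg);
        float next = currentDeg + (desired - currentDeg) * smoothing;
        // Real code would sample jitter from smoothed noise; this just
        // shows where the term enters the sum.
        return next + microJitterDeg;
    }
}
```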
- Synkinetic brows
- Brows subtly follow vertical gaze direction (stronger raise, gentler lower) for a natural reflex feel. Auto-detects brow blendshapes across common conventions, and supports multi-blendshape weighting via the same "+" system.
- Natural blinking
- Randomized blink intervals with double blinks, half blinks, and configurable open/close speeds. Auto-detects blink shapes across ARKit, VRM, DAZ, iClone CC, and custom naming. Auxiliary meshes (like eyelashes) can sync automatically.
Smart Setup & Quality-of-Life
- Automatic rig detection: multi-tier scoring recognizes naming conventions for:
- VRChat-style rigs
- Reallusion iClone CC3/CC4/CC5
- DAZ Genesis 2/3/8/9
- ARKit/FACS
- VRM / UniVRM
- Generic/custom rigs
- Also selects the correct face mesh automatically when multiple SkinnedMeshRenderers exist, filtering out non-face meshes.
- Auxiliary mesh synchronization
- For characters with separate eyelashes/beards/teeth meshes, the Blendshape Synchronizer mirrors viseme and blink weights from the master mesh to auxiliary meshes in LateUpdate. The Setup Wizard can detect and provision this automatically.
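Conceptually, mirroring only copies weights for shape names both meshes share. An engine-free sketch (illustrative only; in Unity this would run in `LateUpdate` and write via `SkinnedMeshRenderer.SetBlendShapeWeight`):

```csharp
using System.Collections.Generic;

// Given the master mesh's current blendshape weights (by name) and the
// shape names an auxiliary mesh actually has, compute the weights to copy.
public static class AuxSync
{
    public static Dictionary<string, float> Mirror(
        IReadOnlyDictionary<string, float> masterWeights,
        IEnumerable<string> auxShapeNames)
    {
        var result = new Dictionary<string, float>();
        foreach (string name in auxShapeNames)
            if (masterWeights.TryGetValue(name, out float w))
                result[name] = w; // only shapes both meshes share are copied
        return result;
    }
}
```

Running after animation (in `LateUpdate`) matters: it guarantees the auxiliary meshes see the master's final weights for the frame.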
- One-click Setup Wizard
- A single editor window provisions everything:
- AudioSource + Controller
- BlendshapeTarget with auto-mapped visemes (including ARKit multi-blend presets)
- Jaw Bone Target (auto axis config)
- Eye blink + eye movement (auto-detected)
- Brow sync
- Blendshape Synchronizer (aux meshes)
- Text lip sync
- Runs as one Undo action and is safe to re-run (skips existing components).
- Mood system
- Four moods: Neutral, Happy, Angry, Sad ... each with its own mapping set. Switch at runtime from code or visual scripting so the same audio can "read" differently depending on emotion.
- Shareable profiles
- Use ScriptableObject profiles to store analysis settings and per-viseme multipliers. Tune once per voice type and reuse across many characters.
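As a rough idea of what such a profile asset could look like (a hypothetical sketch; the asset's actual field names and ranges will differ):

```csharp
using UnityEngine;

// Hypothetical shareable profile: analysis settings plus per-viseme
// multipliers, tuned once per voice type and reused across characters.
[CreateAssetMenu(menuName = "Crystal LipSync/Voice Profile (Example)")]
public class VoiceProfileExample : ScriptableObject
{
    [Range(0.1f, 4f)] public float gain = 1f;         // overall input gain
    public float smoothing = 0.15f;                   // weight smoothing time
    public float[] visemeMultipliers = new float[15]; // one per viseme
}
```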
- Lightweight, transparent, and build-friendly
- Pure source code: IL2CPP/AOT-friendly, debugger-friendly, modifiable
- Minimal overhead: one FFT pass + blendshape writes per frame
- No allocations during playback after initialization
- No platform-specific binaries or hidden dependencies
Who it's for
- Indie devs who want professional-looking dialogue without a massive animation budget
- Visual novels / RPGs that need expressive characters even without voice-over
- Game Creator 2 users who want a native-feeling workflow (wizard + instructions)
- DAZ / iClone CC / ARKit / VRChat / VRM creators who want reliable auto-mapping
- Yarn Spinner / Pixel Crushers Dialogue System users who want text + voice-over lip sync integrated cleanly