SceneLine: AI-Powered Dubbing Practice Platform

I recently released a new project called SceneLine, an AI-powered dubbing practice platform that lets language learners practice dubbing through immersive film and TV scene dialogues.
Why I Built This
The hardest part of language learning is often not vocabulary or grammar but language sense: the natural, authentic way native speakers express themselves. Traditional listening and reading exercises struggle to develop this, but dubbing practice is a great fit:
- Immersive scenarios: Real movie/TV dialogues, not textbook sentences
- Multi-character interactions: Different tones, emotions, and pacing
- Instant feedback: Know whether your pronunciation is accurate
But traditional dubbing practice has a pain point: no feedback. You say the lines along with the video, but don't know if you're doing it right.
SceneLine aims to solve this.
Core Features
🎙️ Real-time ASR (Speech Recognition)
- Uses FunASR for speech recognition
- Resident-process mode keeps the model loaded in memory, making recognition roughly 10x faster than starting a new process per request
- Real-time comparison of your pronunciation against the original
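The comparison step can be sketched as a simple token-level match between the ASR transcript and the original line. This is a hedged illustration, not SceneLine's actual scoring code; `tokenize` and `scoreLine` are hypothetical names, and a real implementation would need Unicode-aware tokenization and fuzzier matching:

```typescript
// Illustrative sketch: token-level match between ASR output and the original line.
// ASCII-only normalization here; a real system would handle Unicode and homophones.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ") // strip punctuation
    .split(/\s+/)
    .filter(Boolean);
}

// Returns a 0-100 score: fraction of reference words found in the transcript.
function scoreLine(original: string, recognized: string): number {
  const ref = tokenize(original);
  const hyp = new Set(tokenize(recognized));
  if (ref.length === 0) return 0;
  const hits = ref.filter((w) => hyp.has(w)).length;
  return Math.round((hits / ref.length) * 100);
}

console.log(scoreLine("We were on a break!", "we were on a break")); // prints 100
```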
🔊 40+ TTS Voices
- Microsoft Edge TTS support
- 40+ voice options
- Filter by gender/locale to find the best reference voice
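The gender/locale filter might look like the following sketch. The `Voice` shape and `filterVoices` helper are assumptions for illustration; the sample entries are real Edge TTS voice names, but the actual list would be fetched via node-edge-tts:

```typescript
// Illustrative sketch: filtering a TTS voice catalog by locale and/or gender.
interface Voice {
  name: string;
  locale: string;
  gender: "Male" | "Female";
}

// Sample entries only; the full ~40-voice list comes from Edge TTS at runtime.
const voices: Voice[] = [
  { name: "en-US-AriaNeural", locale: "en-US", gender: "Female" },
  { name: "en-US-GuyNeural", locale: "en-US", gender: "Male" },
  { name: "en-GB-RyanNeural", locale: "en-GB", gender: "Male" },
];

function filterVoices(
  list: Voice[],
  opts: { locale?: string; gender?: Voice["gender"] }
): Voice[] {
  return list.filter(
    (v) =>
      (!opts.locale || v.locale === opts.locale) &&
      (!opts.gender || v.gender === opts.gender)
  );
}

console.log(filterVoices(voices, { locale: "en-US", gender: "Male" }).map((v) => v.name));
```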
🎭 Multi-character Dialogue Practice
- Supports multi-role scenes (like Friends dialogues)
- Individual scoring for each character
- Practice multiple roles by yourself
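Per-character scoring can be pictured as a simple aggregation over line results. This is a sketch under assumptions: `LineResult` and `averageByCharacter` are hypothetical names, and each practiced line is assumed to already carry a 0-100 score:

```typescript
// Illustrative sketch: average each character's line scores into one per-role score.
interface LineResult {
  character: string;
  score: number; // 0-100, from the per-line comparison step
}

function averageByCharacter(results: LineResult[]): Record<string, number> {
  const sums: Record<string, { total: number; count: number }> = {};
  for (const r of results) {
    if (!sums[r.character]) sums[r.character] = { total: 0, count: 0 };
    sums[r.character].total += r.score;
    sums[r.character].count += 1;
  }
  const averages: Record<string, number> = {};
  for (const c of Object.keys(sums)) {
    averages[c] = Math.round(sums[c].total / sums[c].count);
  }
  return averages;
}

console.log(
  averageByCharacter([
    { character: "Ross", score: 80 },
    { character: "Ross", score: 90 },
    { character: "Rachel", score: 70 },
  ])
);
```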
📊 Practice History & Statistics
- Three view modes: Overview / By Script / Details
- Track your progress curve
- Identify weak areas
🔁 Smart Deduplication
- Content hash-based script deduplication
- Automatically merges identical content to avoid repetition
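A minimal sketch of content-hash deduplication: hash each script's normalized text and keep only the first copy of each hash. The whitespace normalization rule is an assumption for illustration, not SceneLine's documented behavior:

```typescript
import { createHash } from "crypto";

// Illustrative sketch: dedupe scripts by a SHA-256 hash of normalized content.
// Normalization (trim + collapse whitespace) is an assumed rule, so that
// formatting-only differences still count as the same script.
function scriptHash(content: string): string {
  const normalized = content.trim().replace(/\s+/g, " ");
  return createHash("sha256").update(normalized).digest("hex");
}

function dedupe(scripts: string[]): string[] {
  const seen = new Set<string>();
  return scripts.filter((s) => {
    const h = scriptHash(s);
    if (seen.has(h)) return false; // identical content already kept
    seen.add(h);
    return true;
  });
}
```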
Tech Stack
Frontend
- React + Vite: Fast development experience
- Tailwind CSS: Clean UI design
- TypeScript: Type safety
Backend
- Express + TypeScript: API service
- FunASR: Core speech recognition
- node-edge-tts: TTS wrapper
AI/ML
- FunASR: Alibaba DAMO Academy's open-source ASR framework
- ModelScope: Model hub
- faster-whisper: Accelerated Whisper ASR
Quick Start
One-click Launch (Recommended)
git clone https://github.com/hugcosmos/SceneLine.git
cd SceneLine
./start.sh
First launch will:
- Ask whether you're in mainland China (to auto-configure mirror sources)
- Download the ASR model (~2GB; takes 6-9 minutes the first time)
Then visit http://localhost:5000
Docker Deployment
docker-compose up -d
System Requirements
- Node.js: 20+
- Python: 3.9-3.11 (for ASR dependencies; torch doesn't support 3.12+)
- Memory: Minimum 4GB (ASR model uses ~2GB)
- Disk: 3GB+ free space
- FFmpeg: For audio format conversion
Project Structure
sceneline/
├── server/              # Backend (Express + TypeScript)
│   ├── lib/             # Core libraries (ASR, TTS)
│   └── routes/          # API routes
├── client/              # Frontend (React + Vite + Tailwind)
│   └── src/pages/       # Page components
├── shared/              # Shared type definitions
├── models/              # ASR model cache
├── tts-cache/           # TTS audio cache
└── docker-compose.yml
License
MIT License. Fully open source; contributions welcome.
Future Plans
- Multi-user mode: Support multiple users practicing simultaneously with real-time comparison
- Streaming ASR: Faster real-time recognition with lower latency
- Intelligent scoring system: A more systematic and human-friendly scoring mechanism
- TTS upgrade: Support for richer voice options
- Multiple TTS providers: Integration with more API providers (ElevenLabs, iFlytek, Baidu, etc.)
Links
- GitHub: github.com/hugcosmos/SceneLine
- Online Demo: (Coming soon)

Discuss
💬 Join the discussion on GitHub
If you're also learning languages or interested in AI + education, give it a try and share your feedback!
Made with 💙 by Nicky & AI