SceneLine: AI-Powered Dubbing Practice Platform 🎬

I recently released a new project called SceneLine: an AI-powered dubbing practice platform that lets language learners practice dubbing with immersive film and TV scene dialogues.

Why I Built This

The hardest part of language learning is often not vocabulary or grammar, but language sense: that natural, authentic way of expression. Traditional listening and reading exercises struggle to develop it, but dubbing practice does:

  • Immersive scenarios: Real movie/TV dialogues, not textbook sentences
  • Multi-character interactions: Different tones, emotions, and pacing
  • Instant feedback: Know whether your pronunciation is accurate

But traditional dubbing practice has a pain point: no feedback. You say the lines along with the video, but don't know if you're doing it right.

SceneLine aims to solve this.

Core Features

๐ŸŽ™๏ธ Real-time ASR (Speech Recognition)

  • Uses FunASR for speech recognition
  • Resident process mode for a ~10x performance improvement
  • Real-time comparison of your pronunciation against the original
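The feedback loop can be sketched as: run ASR on the learner's recording, then compare the transcript against the script line word by word. Below is a minimal sketch using Python's difflib; the function name and scoring formula are illustrative, not SceneLine's actual implementation:

```python
import difflib
import re

def normalize(text: str) -> list[str]:
    # Lowercase and keep only word characters, so "Hello!" matches "hello"
    return re.findall(r"[a-z']+", text.lower())

def pronunciation_score(reference: str, transcript: str) -> float:
    # Word-level similarity (0-100) between the script line and the ASR transcript
    ref, hyp = normalize(reference), normalize(transcript)
    matcher = difflib.SequenceMatcher(None, ref, hyp)
    return round(matcher.ratio() * 100, 1)
```

A real scorer would likely also weigh word order and phoneme-level errors, but a sequence ratio is enough to show the comparison idea.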

🔊 40+ TTS Voices

  • Microsoft Edge TTS support
  • 40+ voice options
  • Filter by gender/locale to find the best reference voice
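Voice filtering might look like the following. The ShortName/Gender/Locale field names match the metadata that the Edge TTS voice list exposes, but the sample data and filter function here are illustrative, not SceneLine's actual code:

```python
# Illustrative subset of Edge TTS voice metadata; not SceneLine's real data.
VOICES = [
    {"ShortName": "en-US-AriaNeural", "Gender": "Female", "Locale": "en-US"},
    {"ShortName": "en-US-GuyNeural",  "Gender": "Male",   "Locale": "en-US"},
    {"ShortName": "en-GB-RyanNeural", "Gender": "Male",   "Locale": "en-GB"},
]

def filter_voices(voices, gender=None, locale=None):
    # None means "any"; otherwise every given criterion must match
    return [
        v for v in voices
        if (gender is None or v["Gender"] == gender)
        and (locale is None or v["Locale"] == locale)
    ]
```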

🎭 Multi-character Dialogue Practice

  • Supports multi-role scenes (like Friends dialogues)
  • Individual scoring for each character
  • Practice multiple roles by yourself
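Per-character scoring can be sketched as averaging line scores grouped by speaker. The scene shape and field names below are hypothetical, not SceneLine's actual data model:

```python
from collections import defaultdict

# Hypothetical scene: each line has a character and the learner's score for it
scene = [
    {"character": "Joey",     "line": "How you doin'?", "score": 92.0},
    {"character": "Chandler", "line": "Could I BE any busier?", "score": 78.5},
    {"character": "Joey",     "line": "I'm Joey.", "score": 88.0},
]

def score_by_character(lines):
    # Average per-line scores separately for each character
    scores = defaultdict(list)
    for entry in lines:
        scores[entry["character"]].append(entry["score"])
    return {name: round(sum(s) / len(s), 1) for name, s in scores.items()}
```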

📊 Practice History & Statistics

  • Three view modes: Overview / By Script / Details
  • Track your progress curve
  • Identify weak areas

๐Ÿ“ Smart Deduplication

  • Content hash-based script deduplication
  • Automatically merges identical content to avoid repetition
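A minimal sketch of content-hash deduplication, assuming SHA-256 over whitespace-normalized text (SceneLine's actual hash function and normalization rules may differ):

```python
import hashlib

def content_hash(script_text: str) -> str:
    # Collapse whitespace so reformatted copies of the same script hash the same
    canonical = " ".join(script_text.split())
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

class ScriptStore:
    """Keeps exactly one script per distinct content hash."""

    def __init__(self):
        self._by_hash: dict[str, str] = {}

    def add(self, text: str) -> bool:
        # Returns True if stored as new, False if merged into an existing entry
        h = content_hash(text)
        if h in self._by_hash:
            return False
        self._by_hash[h] = text
        return True
```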

Tech Stack

Frontend

  • React + Vite: Fast development experience
  • Tailwind CSS: Clean UI design
  • TypeScript: Type safety

Backend

  • Express + TypeScript: API service
  • FunASR: Core speech recognition
  • node-edge-tts: TTS wrapper

AI/ML

  • FunASR: Alibaba DAMO Academy's open-source ASR framework
  • ModelScope: Model hub
  • faster-whisper: Accelerated Whisper ASR

Quick Start

git clone https://github.com/hugcosmos/SceneLine.git
cd SceneLine
./start.sh

First launch will:

  • Ask if you're in mainland China (auto-configures mirror sources)
  • Download ASR model (~2GB, takes 6-9 minutes first time)

Then visit http://localhost:5000

Docker Deployment

docker-compose up -d

System Requirements

  • Node.js: 20+
  • Python: 3.9-3.11 (ASR dependencies; torch doesn't support 3.12+)
  • Memory: Minimum 4GB (ASR model uses ~2GB)
  • Disk: 3GB+ free space
  • FFmpeg: For audio format conversion

Project Structure

sceneline/
├── server/          # Backend (Express + TypeScript)
│   ├── lib/         # Core libraries (ASR, TTS)
│   └── routes/      # API routes
├── client/          # Frontend (React + Vite + Tailwind)
│   └── src/pages/   # Page components
├── shared/          # Shared type definitions
├── models/          # ASR model cache
├── tts-cache/       # TTS audio cache
└── docker-compose.yml

License

MIT License. Fully open source; contributions welcome.

Future Plans

  • Multi-user mode: Support multiple users practicing simultaneously with real-time comparison
  • Streaming ASR: Faster real-time recognition with lower latency
  • Intelligent scoring system: A more systematic and human-friendly scoring mechanism
  • TTS upgrade: Support for richer voice options
  • Multiple TTS providers: Integration with more API providers (ElevenLabs, iFlytek, Baidu, etc.)


Discuss

💬 Join the discussion on GitHub


If you're also learning languages or interested in AI + education, give it a try and share your feedback!

Made with 💙 by Nicky & AI