The pipeline as small parts.

Every step of the network ships in the repo as a script. Use the scripts stand-alone, fork them, or embed them in your own product. The website is one way to consume the rails; the repo is the other.

open source · audit-friendly · runs locally on Apple Silicon
01 · python

publish.py

Flow A — the website pipeline: download → transcribe → diarize → translate → neural TTS → publish to the catalog.

$ python scripts/pipeline/publish.py <youtube_url>
02 · python

publish_clone.py

Flow B / Plan B — cross-lingual voice clone. Each speaker's own voice, speaking Mandarin, via Qwen3-TTS in-context cloning.

$ python scripts/pipeline/publish_clone.py <url>
03 · python

publish_clone_vc.py

Flow C / Plan C — clean native TTS + seed-vc timbre transfer. Clear, standard pronunciation in the speaker's voice; the hard multi-speaker case.

$ python scripts/pipeline/publish_clone_vc.py <vid> --audio <wav>
04 · python

glossary.py

Keeps crypto / AI / web3 / startup proper nouns in English during translation by combining three mechanisms: an instruction, source masking, and Chinese→English repair.

>>> from glossary import repair, instruction
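
To make the three glossary mechanisms concrete, here is a minimal sketch of how an instruction, source masking, and Chinese→English repair could fit together. It is not the code in glossary.py: the names `repair` and `instruction` come from the import line above, but their signatures, the `mask` / `unmask` helpers, the `[[T0]]` placeholder format, and the sample glossary entries are all assumptions made for illustration.

```python
import re

# Hypothetical glossary: English term -> Chinese renderings a translator might emit.
GLOSSARY = {
    "Ethereum": ["以太坊"],
    "Bitcoin": ["比特币"],
    "OpenAI": [],
}


def instruction(terms=GLOSSARY):
    """Build prompt text asking the translator to leave glossary terms in English."""
    return "Keep these terms in English, untranslated: " + ", ".join(sorted(terms))


def mask(text, terms=GLOSSARY):
    """Swap each glossary term found in `text` for a placeholder token."""
    mapping = {}
    # Longest terms first, so a longer term is never split by a shorter one.
    for i, term in enumerate(sorted(terms, key=len, reverse=True)):
        token = f"[[T{i}]]"
        pattern = re.compile(rf"\b{re.escape(term)}\b")
        if pattern.search(text):
            text = pattern.sub(token, text)
            mapping[token] = term
    return text, mapping


def unmask(text, mapping):
    """Restore the original English terms after translation."""
    for token, term in mapping.items():
        text = text.replace(token, term)
    return text


def repair(text, terms=GLOSSARY):
    """Replace known Chinese renderings with the English term (last line of defence)."""
    for term, renderings in terms.items():
        for zh in renderings:
            text = text.replace(zh, term)
    return text
```

In this sketch the order would be: send `instruction()` with the translation request, translate the output of `mask()`, call `unmask()` on the result, then run `repair()` to catch any term the translator rendered in Chinese anyway.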

One command, multilingual.

The pipeline is scriptable end-to-end. Point an agent at a URL and it returns a publishable, Chinese-dubbed episode with transcripts and metadata.

# Plan C: clean voice conversion
$ python scripts/pipeline/publish_clone_vc.py <vid> --audio out.wav
→ interviews.js + clones.json updated
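
An agent or script could drive the three flows through a thin wrapper like the one below. This is a sketch under stated assumptions: the script paths and arguments are copied from the usage lines on this page, while `build_command`, `run_flow`, the "A" / "B" / "C" keys, and the idea that exit code 0 means success are hypothetical and not part of the repo.

```python
import subprocess
import sys

# Script paths as shown in the usage lines on this page.
SCRIPTS = {
    "A": "scripts/pipeline/publish.py",           # website pipeline, neural TTS
    "B": "scripts/pipeline/publish_clone.py",     # cross-lingual voice clone
    "C": "scripts/pipeline/publish_clone_vc.py",  # native TTS + timbre transfer
}


def build_command(flow, target, audio=None):
    """Return the argv list for one flow.

    `target` is the URL for flows A and B, or the video id for flow C.
    """
    if flow not in SCRIPTS:
        raise ValueError(f"unknown flow: {flow!r}")
    cmd = [sys.executable, SCRIPTS[flow], target]
    if flow == "C":
        if audio is None:
            raise ValueError("flow C needs a WAV path for --audio")
        cmd += ["--audio", audio]
    return cmd


def run_flow(flow, target, audio=None):
    """Run the script from the repo root and return its exit code.

    Assumption: the scripts signal success with exit code 0.
    """
    return subprocess.run(build_command(flow, target, audio)).returncode
```

Building the argv list separately from running it keeps the wrapper easy to log and to test without launching the pipeline.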

Every episode is addressable.

Each episode has a stable URL — /interview/<id> — with audio, bilingual transcript, and metadata. Embed it in your own client.

GET /interview/9nnHC66MBqE
→ { audio, transcript, summary }
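
A client could fetch an episode along these lines. This is a minimal sketch that assumes the endpoint returns a JSON object with the three fields shown above; the page does not state the response format or the host, so `base_url` is left as a parameter and the helper names are hypothetical.

```python
import json
from urllib.parse import quote, urljoin
from urllib.request import urlopen

# Fields shown in the example response on this page.
EXPECTED_FIELDS = {"audio", "transcript", "summary"}


def interview_url(base_url, episode_id):
    """Build the stable episode URL: <base_url>/interview/<id>."""
    return urljoin(base_url.rstrip("/") + "/", "interview/" + quote(episode_id, safe=""))


def parse_episode(body):
    """Decode a response body, assuming it is a JSON object with the expected fields."""
    data = json.loads(body)
    missing = EXPECTED_FIELDS - set(data)
    if missing:
        raise KeyError(f"missing fields: {sorted(missing)}")
    return data


def fetch_episode(base_url, episode_id, timeout=10):
    """GET the episode and return it as a dict."""
    with urlopen(interview_url(base_url, episode_id), timeout=timeout) as resp:
        return parse_episode(resp.read().decode("utf-8"))
```

For example, `fetch_episode("https://example.com", "9nnHC66MBqE")` would request `https://example.com/interview/9nnHC66MBqE`, with `example.com` standing in for the real host.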