NotebookLM is a self-hosted platform for creating, managing, and enhancing audiobooks. It combines several tools and models (an LLM-based script parser, a Parler-TTS speech server, and an Audiocraft/MAGNeT background-sound module) to parse, transcribe, and improve manuscripts and produce high-quality audio content.
Module | Description | Tech | Source | Status |
---|---|---|---|---|
Script parser | Convert to scene structured json | LLM Multi-shot Prompt | script_parser.py | Done |
TTS server | Parler-tts server to generate speech for dialogs in scenes | Parler-tts | tts_generator.py | Done |
AudioGen module | Audiocraft/magnet background sound creator | Audiocraft/magnet | | In progress |
Basic audiobook | Linear workflow for full audio creation without scene logic | python | audiobook.py | In progress |
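To make the "scene structured json" and "linear workflow" rows more concrete, here is a minimal sketch of what such a structure and a linear pass over it could look like. The field names (`scenes`, `dialogs`, `speaker`, `text`) and the `linear_dialog_queue` helper are hypothetical; the actual schema in `script_parser.py` and the flow in `audiobook.py` may differ.

```python
import json

# Hypothetical scene-structured JSON. The field names are assumptions for
# illustration, not the schema produced by script_parser.py.
SCRIPT_JSON = """
{
  "title": "Sample Story",
  "scenes": [
    {"scene_id": 1, "setting": "forest at night",
     "dialogs": [
       {"speaker": "Narrator", "text": "The forest was quiet."},
       {"speaker": "Anna", "text": "Did you hear that?"}
     ]},
    {"scene_id": 2, "setting": "cabin",
     "dialogs": [
       {"speaker": "Ben", "text": "It was only the wind."}
     ]}
  ]
}
"""


def linear_dialog_queue(script):
    """Flatten scenes into an ordered list of (scene_id, speaker, text)."""
    queue = []
    for scene in script["scenes"]:
        for dialog in scene["dialogs"]:
            queue.append((scene["scene_id"], dialog["speaker"], dialog["text"]))
    return queue


script = json.loads(SCRIPT_JSON)
queue = linear_dialog_queue(script)
for scene_id, speaker, text in queue:
    # A real pipeline would send each line to the TTS server at this point.
    print(f"[scene {scene_id}] {speaker}: {text}")
```

Flattening in document order ignores scene-level information such as the setting, which is what "without scene logic" suggests; a scene-aware workflow could use that field to pick background sound.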
An experiment to build a production-grade audiobook content generation system that helps publishers build on their IP and reach a larger audience.
GPU Model | torch.compile | Speech Generator (secs) | Script Parser (Local) (secs) | Script Parser (Online) (secs) | Total Time (secs) |
---|---|---|---|---|---|
GTX 1060 | no | 2775.97 | - | 87.37 | 2863.34 |
RTX 4050 | no | 489.93 | - | 93.82 | 583.75 |
RTX 4050 | yes | - | - | - | - |
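The totals in the table are the sum of the speech generator and online script parser times. The short sketch below reproduces that arithmetic and the resulting speedup between the two GPUs. The dictionary layout and the `total_time` helper are illustrative only; the numbers are copied from the rows without `torch.compile`.

```python
# Timings in seconds, copied from the benchmark rows without torch.compile.
timings = {
    "GTX 1060": {"speech": 2775.97, "parser_online": 87.37},
    "RTX 4050": {"speech": 489.93, "parser_online": 93.82},
}


def total_time(row):
    """Total time = speech generation + online script parsing."""
    return round(row["speech"] + row["parser_online"], 2)


totals = {gpu: total_time(row) for gpu, row in timings.items()}

# Speedup of the RTX 4050 relative to the GTX 1060.
speech_speedup = timings["GTX 1060"]["speech"] / timings["RTX 4050"]["speech"]
total_speedup = totals["GTX 1060"] / totals["RTX 4050"]

print(totals)
print(f"speech speedup: {speech_speedup:.2f}x, total speedup: {total_speedup:.2f}x")
```

On these numbers the RTX 4050 is roughly 5.7x faster at speech generation and about 4.9x faster end to end, because the online script parser time does not depend on the local GPU.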
NotebookLlama: Additional resources and use cases. Learn more
WishList - An RTX 4090 GPU-based computer for building the end-to-end product in-house. Buy me a Razer Blade 18 or ROG Zephyrus M16
Specific Speakers - Jon, Lea, Gary, Jenna, Mike, Laura
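
Parler-TTS picks a voice from a free-text description, so a named speaker can be requested by mentioning the name in that description. Below is a minimal sketch of assigning these speakers to characters and building such descriptions. The round-robin policy, the `assign_speakers` and `voice_description` helpers, and the description wording are all assumptions for illustration, not code from this repository.

```python
# Speaker names taken from the list above.
SPEAKERS = ["Jon", "Lea", "Gary", "Jenna", "Mike", "Laura"]


def assign_speakers(characters, speakers=SPEAKERS):
    """Assign a speaker to each character round-robin (hypothetical policy)."""
    return {name: speakers[i % len(speakers)] for i, name in enumerate(characters)}


def voice_description(speaker, style="speaks at a moderate pace with clear, close-sounding audio"):
    """Build a text description that names the speaker (example wording only)."""
    if speaker not in SPEAKERS:
        raise ValueError(f"unknown speaker: {speaker}")
    return f"{speaker} {style}."


cast = assign_speakers(["Narrator", "Anna", "Ben"])
for character, speaker in cast.items():
    print(f"{character} -> {voice_description(speaker)}")
```

Keeping one speaker per character for the whole book is the point of the mapping: reusing the same name in every description is what keeps a character's voice consistent across scenes.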