llm-recipes

NotebookLM - Self-hosted Audiobook Generation Platform

Music from Book
Voices from Book

Overview

NotebookLM is a self-hosted platform designed to create, manage, and enhance audiobooks. This platform leverages various tools and models to parse, transcribe, and improve manuscripts, ultimately providing high-quality audio content.

"Audiobook Generator"

Steps

"Script Parser"

Current Status

Module Description Tech Source Status
Script parser Convert to scene structured json LLM Multi-shot Prompt script_parser.py Done
TTS server Parler-tts server to generate speech for dialogs in scenes Parler-tts tts_generator.py Done
AudioGen module Audiocraft/magnet background sound creator Audiocraft/magnet   In progress
Basic audiobook Linear workflow for full audio creation without scene logic python audiobook.py In progress

Documentation

An experiment to build a production grade audiobook content generator system to help publishers build on their IP and reach larger audience.

Inference Speed

GPU Model torch.compile batch generation + chunking Speech Generator (secs) Script Parser (Local) (secs) Script Parser (Online) (secs) Total Time (secs)
GTX 1060 no no 2775.97 - 87.37 2863.34
GTX 1060 no yes 663.81 - 88.96 683.81
RTX 4050 no no 489.93 - 93.82 583.75
RTX 4050 no yes - - -  
RTX 4050 yes   -

Reference