llm-recipes

NotebookLM - Self-hosted Audiobook Generation Platform

Music from Book
Voices from Book

Overview

NotebookLM is a self-hosted platform for creating, managing, and enhancing audiobooks. It uses a set of tools and models to parse, transcribe, and improve manuscripts and to produce high-quality audio content from them.

"Audiobook Generator"

Steps

"Script Parser"

Current Status

| Module | Description | Tech | Source | Status |
|---|---|---|---|---|
| Script parser | Convert to scene-structured JSON | LLM multi-shot prompt | script_parser.py | Done |
| TTS server | Parler-TTS server to generate speech for dialogs in scenes | Parler-TTS | tts_generator.py | Done |
| AudioGen module | Audiocraft/MAGNeT background sound creator | Audiocraft/MAGNeT | | In progress |
| Basic audiobook | Linear workflow for full audio creation without scene logic | Python | audiobook.py | In progress |
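To make the "scene-structured JSON" and "linear workflow" rows more concrete, here is a minimal sketch of how parsed scenes might be walked in order and handed to a speech generator. The JSON layout (`scenes`, `title`, `dialogs`, `speaker`, `text`), the names `load_scenes` and `linear_audiobook`, and the stand-in `fake_tts` are all assumptions made for illustration. They are not taken from script_parser.py, tts_generator.py, or audiobook.py.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Dialog:
    speaker: str
    text: str


@dataclass
class Scene:
    title: str
    dialogs: List[Dialog]


def load_scenes(raw: str) -> List[Scene]:
    """Parse a JSON string in the assumed (hypothetical) scene schema."""
    data = json.loads(raw)
    return [
        Scene(
            title=s["title"],
            dialogs=[Dialog(d["speaker"], d["text"]) for d in s["dialogs"]],
        )
        for s in data["scenes"]
    ]


def linear_audiobook(
    scenes: List[Scene], synthesize: Callable[[str, str], bytes]
) -> List[bytes]:
    """Visit every dialog in document order and synthesize one clip each.

    No scene logic is applied: scenes are simply flattened in sequence.
    """
    clips = []
    for scene in scenes:
        for dialog in scene.dialogs:
            clips.append(synthesize(dialog.speaker, dialog.text))
    return clips


# Hypothetical parser output, written by hand for this example.
EXAMPLE = """
{
  "scenes": [
    {"title": "Arrival",
     "dialogs": [{"speaker": "Narrator", "text": "It was a dark night."},
                 {"speaker": "Ana", "text": "Who is there?"}]},
    {"title": "Departure",
     "dialogs": [{"speaker": "Narrator", "text": "Morning came."}]}
  ]
}
"""


def fake_tts(speaker: str, text: str) -> bytes:
    """Stand-in for a real TTS call; returns labeled bytes instead of audio."""
    return f"{speaker}: {text}".encode("utf-8")


if __name__ == "__main__":
    scenes = load_scenes(EXAMPLE)
    clips = linear_audiobook(scenes, fake_tts)
    print(len(scenes), "scenes,", len(clips), "clips")
```

Passing the synthesizer in as a callable keeps the workflow independent of the TTS backend, so a real speech server client could replace `fake_tts` without changing the loop.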

Documentation

An experiment to build a production-grade audiobook content generation system that helps publishers build on their IP and reach a larger audience.

Inference Speed

| GPU Model | torch.compile | Speech Generator (secs) | Script Parser (Local) (secs) | Script Parser (Online) (secs) | Total Time (secs) |
|---|---|---|---|---|---|
| GTX 1060 | no | 2775.97 | - | 87.37 | 2863.34 |
| RTX 4050 | no | 489.93 | - | 93.82 | 583.75 |
| RTX 4050 | yes | - | | | |
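The total time in each completed row is the speech generation time plus the online script parser time. The short sketch below recomputes those totals and the relative speedup between the two GPUs, using only the figures from the table. The helper names `total_time` and `speedup` are hypothetical and not part of the repository.

```python
def total_time(speech_secs: float, parser_secs: float) -> float:
    """Sum the two measured stages, rounded to the table's two decimals."""
    return round(speech_secs + parser_secs, 2)


def speedup(baseline_secs: float, candidate_secs: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_secs / candidate_secs


# (speech generator secs, online script parser secs), torch.compile off.
runs = {
    "GTX 1060": (2775.97, 87.37),
    "RTX 4050": (489.93, 93.82),
}

if __name__ == "__main__":
    for gpu, (speech, parser) in runs.items():
        print(f"{gpu}: total {total_time(speech, parser):.2f} s")

    speech_gain = speedup(runs["GTX 1060"][0], runs["RTX 4050"][0])
    total_gain = speedup(
        total_time(*runs["GTX 1060"]), total_time(*runs["RTX 4050"])
    )
    print(f"Speech generation speedup: {speech_gain:.2f}x")
    print(f"End-to-end speedup: {total_gain:.2f}x")
```

The end-to-end gain is smaller than the speech generation gain because the online script parser time does not depend on the local GPU.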

Reference