2026 · Open-source personal project
WakeTheBook
Local-first audiobook studio for EPUB and PDF files
A local-first full-stack tool that turns EPUB and PDF books into narrated audiobooks with chapter review, voice profiles, and resumable rendering. Built to show product thinking, AI workflow design, and practical TTS engineering without relying on cloud APIs.
FastAPI · React 19 · TypeScript · SQLite · Tailwind CSS 4 · XTTS v2 · VoxCPM2 · PyMuPDF
- Rendering model: Local-first, no cloud audio generation
- Workflow: 5-step import → extract → review → render
- Voice stack: XTTS v2 + VoxCPM2
Overview
WakeTheBook was built as a local audiobook production tool, not just a file converter. It gives users one guided flow to import a book, inspect extracted chapters, adjust the structure, choose a voice, and render final audio without sending source files to external services.
Key Highlights
- Supports EPUB and PDF import with extraction, chapter detection, and cleanup before audio rendering.
- Lets the user review, edit, split, merge, or skip chapters before committing to the final render.
- Uses local voice workflows with XTTS voice cloning and VoxCPM2 support instead of a cloud TTS dependency.
- Handles rendering as resumable chapter jobs, so failures do not force a full restart.
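The resumable rendering idea in the last highlight can be illustrated with a small sketch. It is not WakeTheBook's actual code: `ChapterJob`, `render_pending`, the status strings, and the `synthesize` callback are hypothetical names chosen for illustration, and the real project may track job state differently. The point is only that each chapter carries its own status, so a second run retries failed or pending chapters and leaves finished ones alone.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ChapterJob:
    """Hypothetical per-chapter render record (not the project's real schema)."""
    index: int
    text: str
    status: str = "pending"  # one of: pending, done, failed, skipped
    error: Optional[str] = None


def render_pending(
    jobs: list[ChapterJob],
    synthesize: Callable[[str], bytes],
) -> dict[int, bytes]:
    """Render every chapter that is not already done or skipped.

    A failure is recorded on the job and the loop moves on, so calling
    this function again only retries the chapters that still need work.
    """
    rendered: dict[int, bytes] = {}
    for job in jobs:
        if job.status in ("done", "skipped"):
            continue  # finished or user-skipped chapters are never re-rendered
        try:
            rendered[job.index] = synthesize(job.text)
            job.status = "done"
            job.error = None
        except Exception as exc:  # keep going; one bad chapter should not stop the rest
            job.status = "failed"
            job.error = str(exc)
    return rendered
```

In a real pipeline `synthesize` would wrap a local TTS call and the statuses would be persisted between runs; here it is any function that turns chapter text into audio bytes.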
Technical Notes
- FastAPI backend orchestrates ingest, extraction, analysis, voice handling, render jobs, and filesystem-backed project storage.
- React + TypeScript frontend wraps the pipeline in a 5-step wizard and output view for non-technical use.
- SQLite stores project state while generated files live under per-project directories for easy inspection and recovery.
- The app keeps the whole processing loop local, including voice previews, chapter audio, manifests, and logs.
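As a rough illustration of the storage split described above, the sketch below keeps a project row in SQLite and creates a matching directory on disk for generated files. The table layout, the `create_project` helper, and the subdirectory names (`chapters`, `previews`, `logs`) are assumptions made for this example, not the project's actual schema or folder structure.

```python
import sqlite3
from pathlib import Path


def create_project(root: Path, conn: sqlite3.Connection, name: str) -> Path:
    """Insert a project row and create its on-disk directory (hypothetical layout)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS projects ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, directory TEXT NOT NULL)"
    )
    # Insert first so the directory name can include the generated row id.
    cur = conn.execute(
        "INSERT INTO projects (name, directory) VALUES (?, ?)", (name, "")
    )
    project_dir = root / f"project_{cur.lastrowid}"
    # Assumed subfolders for chapter audio, voice previews, and logs.
    for sub in ("chapters", "previews", "logs"):
        (project_dir / sub).mkdir(parents=True, exist_ok=True)
    conn.execute(
        "UPDATE projects SET directory = ? WHERE id = ?",
        (str(project_dir), cur.lastrowid),
    )
    conn.commit()
    return project_dir
```

Keeping state in the database and artifacts in plain folders means a user can open a project directory and inspect or recover chapter audio without going through the app.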
Results
- Shows end-to-end product thinking: file ingest, AI-assisted review, local inference, and UX around long-running jobs.
- Demonstrates a practical AI product that is useful on its own and publishable as an open-source portfolio piece.
- Positions the work as both a user-facing tool and a technically credible local AI pipeline.