Offline AI music generation suite. Generate songs, compose MIDI, synthesize vocals, separate stems, create SFX, and master tracks, all locally on your machine.
git clone https://github.com/SysAdminDoc/SlunderStudio.git
cd SlunderStudio
python main.py # Auto-installs dependencies on first run
Python 3.10+ required. Core dependencies install automatically. AI models are downloaded on demand from HuggingFace via the built-in Model Hub.
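The first-run bootstrap can be pictured roughly as follows. This is a minimal sketch, not the code in main.py: the function names `missing_packages` and `install` and the two example package names are assumptions for illustration.

```python
import importlib.util
import subprocess
import sys


def missing_packages(import_names):
    """Return the names from import_names that cannot currently be imported."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


def install(packages):
    """Install packages with the same interpreter that is running this script."""
    if packages:
        subprocess.check_call([sys.executable, "-m", "pip", "install", *packages])


# Hypothetical dependency list; the real one lives in requirements.txt.
needed = missing_packages(["numpy", "sounddevice"])
print("Missing:", needed or "nothing")
# install(needed)  # Uncomment to actually install what is missing.
```

Calling pip through `sys.executable` keeps the install inside the active interpreter or virtual environment. The sketch also assumes the import name matches the pip package name, which is not true for every package.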
| Module | Description | AI Engine |
|---|---|---|
| Song Forge | Full song generation from lyrics + style tags | ACE-Step |
| Lyrics Engine | AI-powered lyrics writing with 33 genre templates | Llama 3.2 1B |
| MIDI Studio | Piano roll editor + text-to-MIDI composition | MIDI-LLM |
| Vocal Suite | Singing synthesis, voice conversion, voice cloning | DiffSinger, RVC v2, GPT-SoVITS |
| Stem Separation | Isolate vocals, drums, bass, and other instruments | Demucs (htdemucs) |
| SFX Generator | Text-to-sound-effect generation | Stable Audio Open |
| Mixer | Multi-track mixing with smart mastering (8 presets) | Built-in DSP |
| AI Producer | One prompt to full song; auto-chains all modules | Orchestrator |
| Model Hub | Download, manage, and switch AI models | HuggingFace Hub |
| Projects | Save/load projects with version history and asset tracking | - |
┌──────────────┐     ┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  AI Producer │────>│ Lyrics Engine│────>│  Song Forge  │────>│  MIDI Studio │
│ (One Prompt) │     │  (33 genres) │     │  (ACE-Step)  │     │ (Piano Roll) │
└──────────────┘     └──────────────┘     └──────────────┘     └──────┬───────┘
                                                                      │
┌──────────────┐     ┌──────────────┐     ┌──────────────┐     ┌──────▼───────┐
│    Export    │<────│    Mixer     │<────│ SFX Generator│     │  Vocal Suite │
│  (WAV/FLAC)  │     │  (Mastering) │     │(Stable Audio)│     │ (DiffSinger) │
└──────────────┘     └──────────────┘     └──────────────┘     └──────────────┘
Every module can route audio to any other module. Generate a song in Song Forge, separate stems in Vocal Suite, add SFX, mix everything in the Mixer, and export a mastered track.
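The chaining idea behind the AI Producer can be illustrated with a tiny pipeline in which each stage receives the state produced by the previous one. This is only a sketch under assumptions: the stage functions are placeholders that build strings instead of audio, and none of the names below come from the SlunderStudio codebase.

```python
from typing import Callable, Dict, List

# A stage takes the shared project state and returns an updated copy.
Stage = Callable[[Dict[str, str]], Dict[str, str]]


def write_lyrics(state: Dict[str, str]) -> Dict[str, str]:
    # Placeholder for the Lyrics Engine step.
    return {**state, "lyrics": f"lyrics for: {state['prompt']}"}


def generate_song(state: Dict[str, str]) -> Dict[str, str]:
    # Placeholder for the Song Forge step; consumes the lyrics produced above.
    return {**state, "song": f"audio({state['lyrics']})"}


def master(state: Dict[str, str]) -> Dict[str, str]:
    # Placeholder for the Mixer's mastering step.
    return {**state, "master": f"mastered({state['song']})"}


def run_pipeline(prompt: str, stages: List[Stage]) -> Dict[str, str]:
    """Feed one prompt through each stage in order."""
    state = {"prompt": prompt}
    for stage in stages:
        state = stage(state)
    return state


result = run_pipeline("rainy-day jazz", [write_lyrics, generate_song, master])
print(result["master"])  # mastered(audio(lyrics for: rainy-day jazz))
```

Because every stage has the same signature, stages can be reordered or skipped by editing the list, which mirrors the "route audio to any other module" idea.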
| Preset | Target LUFS | Character |
|---|---|---|
| Balanced | -14.0 | Neutral, general purpose |
| Loud / Radio | -11.0 | Compressed, bright, competitive loudness |
| Warm / Analog | -14.0 | Enhanced lows, rolled-off highs, narrow stereo |
| Bright / Crisp | -14.0 | Enhanced highs, mid presence, wide stereo |
| Hip-Hop / Trap | -12.0 | Heavy sub-bass, punchy compression |
| Cinematic | -16.0 | Dynamic range, wide stereo, gentle compression |
| Lo-Fi | -16.0 | Rolled-off highs, heavy compression, narrow |
| Streaming (Spotify) | -14.0 | Optimized for streaming platform normalization |
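To make the Target LUFS column concrete, the sketch below computes the static gain needed to move a mix from its measured loudness to a preset's target. It is an illustration only: the function names are assumptions, and a real mastering chain (including the one in core/mastering.py, whose internals are not shown here) would also apply EQ, compression, and limiting rather than gain alone.

```python
# Target loudness per preset, copied from the table above (in LUFS).
PRESET_TARGET_LUFS = {
    "Balanced": -14.0,
    "Loud / Radio": -11.0,
    "Warm / Analog": -14.0,
    "Bright / Crisp": -14.0,
    "Hip-Hop / Trap": -12.0,
    "Cinematic": -16.0,
    "Lo-Fi": -16.0,
    "Streaming (Spotify)": -14.0,
}


def gain_db(preset: str, measured_lufs: float) -> float:
    """Static gain in dB that moves measured_lufs to the preset's target."""
    return PRESET_TARGET_LUFS[preset] - measured_lufs


def db_to_linear(db: float) -> float:
    """Convert a dB gain to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


# A mix measured at -18 LUFS needs +7 dB to reach the Loud / Radio target.
g = gain_db("Loud / Radio", -18.0)
print(g, round(db_to_linear(g), 3))
```

A large positive gain can push peaks past full scale, which is why loudness targets are normally paired with a limiter.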
Models are downloaded on-demand through the Model Hub. Nothing downloads until you need it.
| Model | Size | Module | Required |
|---|---|---|---|
| ACE-Step | ~3 GB | Song Forge | Recommended |
| Llama 3.2 1B | ~2 GB | Lyrics Engine | Recommended |
| DiffSinger (ONNX) | ~500 MB | Vocal Suite | Optional |
| RVC v2 | ~200 MB/voice | Vocal Suite | Optional |
| Demucs (htdemucs) | ~300 MB | Stem Separation | Optional |
| Stable Audio Open | ~3 GB | SFX Generator | Optional |
All models run entirely on your local machine. No cloud APIs, no subscriptions, no data leaves your computer.
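For planning disk space, the approximate sizes in the table above can be added up before downloading anything. The helper below is a hypothetical sketch, not part of the Model Hub, and it treats the listed sizes as rough estimates.

```python
# Approximate download sizes in GB, taken from the table above.
MODEL_SIZE_GB = {
    "ACE-Step": 3.0,
    "Llama 3.2 1B": 2.0,
    "DiffSinger (ONNX)": 0.5,
    "Demucs (htdemucs)": 0.3,
    "Stable Audio Open": 3.0,
}
# RVC v2 is listed per voice, so it is multiplied by the number of voices.
RVC_GB_PER_VOICE = 0.2


def disk_needed_gb(models, rvc_voices=0):
    """Estimate the disk space for the chosen models plus any RVC voices."""
    return sum(MODEL_SIZE_GB[m] for m in models) + RVC_GB_PER_VOICE * rvc_voices


# The two recommended models only.
print(disk_needed_gb(["ACE-Step", "Llama 3.2 1B"]))
# Every model in the table plus one RVC voice.
print(round(disk_needed_gb(MODEL_SIZE_GB, rvc_voices=1), 1))
```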
| Component | Minimum | Recommended |
|---|---|---|
| OS | Windows 10 / Linux / macOS | Windows 11 / Ubuntu 22.04+ |
| Python | 3.10 | 3.11+ |
| RAM | 8 GB | 16 GB+ |
| GPU | None (CPU mode) | NVIDIA 8GB+ VRAM (CUDA) |
| Disk | 2 GB (app only) | 20 GB+ (with models) |
GPU acceleration requires PyTorch with CUDA support. The app runs on CPU without any GPU, but generation will be slower.
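A quick way to see which mode applies on a given machine is to ask PyTorch directly. The function name `pick_device` is an assumption for illustration; the check itself uses the standard `torch.cuda.is_available()` call.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA-enabled PyTorch build sees a GPU, else "cpu"."""
    try:
        import torch  # Imported lazily so the check also works before PyTorch is installed.
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print("Running on:", pick_device())
```

If this reports `cpu` on a machine with an NVIDIA card, the installed PyTorch build is likely the CPU-only variant.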
Settings are stored in ~/.config/SlunderStudio/ (Linux/macOS) or %APPDATA%/SlunderStudio/ (Windows).
SlunderStudio/
├── settings.json       # App preferences
├── voice_bank.json     # Voice model profiles
├── projects/           # Saved projects with version history
├── models/             # Downloaded AI models
├── voices/             # Voice model files
└── generations/        # All generated outputs
    ├── songs/          # Song Forge output
    ├── lyrics/         # Lyrics Engine output
    ├── midi_studio/    # MIDI generation output
    ├── midi_renders/   # FluidSynth renders
    ├── vocals/         # DiffSinger output
    ├── voice_convert/  # RVC output
    ├── voice_clone/    # GPT-SoVITS output
    ├── stems/          # Demucs separation output
    ├── sfx/            # SFX Generator output
    └── ai_producer/    # AI Producer pipeline output
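The location and layout described above could be resolved and created along these lines. This is a sketch under assumptions: `config_dir` and `ensure_layout` are hypothetical names, the per-module output folders are assumed to sit under generations/, and the demo writes to a temporary directory rather than the real settings folder.

```python
import os
import sys
import tempfile
from pathlib import Path

# Sub-folders of generations/, one per module output listed above.
GENERATION_DIRS = [
    "songs", "lyrics", "midi_studio", "midi_renders", "vocals",
    "voice_convert", "voice_clone", "stems", "sfx", "ai_producer",
]


def config_dir(platform=None, env=None, home=None) -> Path:
    """Resolve the SlunderStudio settings directory for the given platform."""
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if platform.startswith("win"):
        # Fall back to the conventional Roaming folder if APPDATA is unset.
        base = Path(env.get("APPDATA", home / "AppData" / "Roaming"))
    else:
        base = home / ".config"
    return base / "SlunderStudio"


def ensure_layout(root: Path) -> None:
    """Create the data folders under root if they do not exist yet."""
    for name in ("projects", "models", "voices"):
        (root / name).mkdir(parents=True, exist_ok=True)
    for name in GENERATION_DIRS:
        (root / "generations" / name).mkdir(parents=True, exist_ok=True)


print(config_dir())

# Demonstrate the layout in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    ensure_layout(Path(tmp))
    created = sorted(p.name for p in (Path(tmp) / "generations").iterdir())
print(created)
```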
Create a standalone executable with PyInstaller:
pip install pyinstaller
python build/build.py # One-folder distribution
python build/build.py --onefile # Single .exe (Windows)
Output lands in dist/SlunderStudio/.
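The contents of build/build.py are not shown here, but a wrapper of this kind typically assembles a PyInstaller command line. The sketch below is an assumption about what such a script might do: the function name is hypothetical, while `--onefile`, `--onedir`, `--name`, `--windowed`, and `--noconfirm` are standard PyInstaller options.

```python
def pyinstaller_args(onefile: bool = False):
    """Build the argument list for a PyInstaller run of main.py."""
    mode = "--onefile" if onefile else "--onedir"
    return ["main.py", "--name", "SlunderStudio", "--windowed", "--noconfirm", mode]


# Print the equivalent shell command instead of running a build.
print("pyinstaller " + " ".join(pyinstaller_args(onefile=True)))
```

With `--onedir` and `--name SlunderStudio`, PyInstaller writes its output to a folder named after the app under dist/, which matches the output location mentioned above.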
SlunderStudio/
├── main.py                   # Entry point with auto-bootstrap
├── core/                     # Core infrastructure
│   ├── audio_engine.py       # Playback engine (sounddevice)
│   ├── audio_export.py       # WAV/FLAC/MP3 export
│   ├── lyrics_db.py          # Lyrics database with search
│   ├── mastering.py          # DSP mastering chain
│   ├── midi_utils.py         # MIDI I/O (pretty_midi wrapper)
│   ├── model_manager.py      # HuggingFace model downloads
│   ├── project.py            # Project save/load/versioning
│   ├── settings.py           # Persistent settings
│   ├── voice_bank.py         # Voice profile management
│   └── workers.py            # Background inference workers
├── engines/                  # AI engine wrappers
│   ├── ace_step_engine.py    # ACE-Step song generation
│   ├── ai_producer.py        # One-prompt pipeline orchestrator
│   ├── audio_analyzer.py     # BPM/key/loudness analysis
│   ├── demucs_engine.py      # Stem separation
│   ├── diffsinger_engine.py  # Singing voice synthesis
│   ├── fluidsynth_engine.py  # MIDI-to-audio rendering
│   ├── lyrics_engine.py      # LLM lyrics generation
│   ├── lyrics_templates.py   # 33 genre template definitions
│   ├── midi_llm_engine.py    # Text-to-MIDI generation
│   ├── rvc_engine.py         # RVC + GPT-SoVITS voice engines
│   ├── sfx_engine.py         # Stable Audio Open SFX
│   └── style_tags.py         # ACE-Step style tag database
├── ui/                       # PySide6 interface
│   ├── main_window.py        # Main window with sidebar navigation
│   ├── theme.py              # Catppuccin Mocha dark theme
│   ├── onboarding.py         # First-run wizard
│   ├── song_forge_view.py    # Song generation page
│   ├── lyrics_view.py        # Lyrics writing page
│   ├── lyrics_editor.py      # Rich lyrics editor
│   ├── midi_studio_view.py   # MIDI composition page
│   ├── piano_roll.py         # QGraphicsView piano roll
│   ├── midi_mixer.py         # MIDI track mixer
│   ├── vocal_suite_view.py   # Vocal synthesis page
│   ├── stem_mixer.py         # Demucs stem mixer
│   ├── sfx_view.py           # SFX generation page
│   ├── mixer_view.py         # Multi-track mixer + mastering
│   ├── ai_producer_view.py   # AI Producer page
│   ├── project_manager.py    # Project browser
│   ├── model_hub.py          # Model download manager
│   ├── settings_view.py      # Settings page
│   ├── waveform_widget.py    # Audio waveform display
│   ├── mood_curve_editor.py  # Mood/energy curve editor
│   ├── reference_panel.py    # Reference audio panel
│   ├── seed_explorer.py      # Seed variation explorer
│   ├── batch_view.py         # Batch generation
│   └── toast.py              # Toast notifications
├── assets/templates/         # 33 genre JSON templates
├── build/build.py            # PyInstaller packaging
├── requirements.txt          # Dependencies
└── LICENSE                   # MIT License
Q: Do I need a GPU?
No. Everything runs on CPU. A CUDA-capable NVIDIA GPU (8GB+ VRAM) dramatically speeds up AI generation but is not required.
Q: How much disk space do models need?
About 5 GB for the recommended models (ACE-Step at ~3 GB plus Llama 3.2 1B at ~2 GB). The full model suite is approximately 10 GB. Models download on demand; nothing installs until you request it.
Q: Can I use my own voice models?
Yes. Import RVC .pth models or GPT-SoVITS checkpoints through the Voice Bank. The app auto-detects models in standard directories.
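Auto-detection of this kind usually amounts to scanning a folder for known file extensions. The sketch below shows the idea for RVC `.pth` files only; the function name and the folder layout in the demo are assumptions, since the "standard directories" the app scans are not listed here.

```python
import tempfile
from pathlib import Path


def find_rvc_models(root: Path):
    """Return all .pth files under root, sorted for a stable listing."""
    return sorted(p for p in root.rglob("*.pth") if p.is_file())


# Demo in a throwaway directory with one model file and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    voices = Path(tmp) / "voices"
    (voices / "singer_a").mkdir(parents=True)
    (voices / "singer_a" / "model.pth").touch()
    (voices / "notes.txt").touch()
    found = [p.name for p in find_rvc_models(voices)]
print(found)  # ['model.pth']
```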
Q: Is any data sent to the cloud?
No. All processing is local. Apart from the dependency install on first run, the only network traffic is model downloads from HuggingFace, which you initiate manually.
MIT License. See LICENSE for details.
Built by SysAdminDoc with Slunder.
