One-click Qwen3.6-27B inference on Windows. 158 tok/s on RTX 5090, 72 tok/s on RTX 3090. Native, no WSL, no Docker, no telemetry.
vLLM patcher for Qwen3.6 on consumer NVIDIA – Qwen3.6-35B-A3B-FP8 (192 tok/s, +68% over stock) + Qwen3.6-27B-int4-AutoRound + 256K context. 126 patches: TurboQuant k8v4 KV, MTP/DFlash spec-decode, FULL cudagraph, hybrid GDN streaming, structured boot summary, one-command installer, 1958 tests. v7.72.2.
First public benchmark of llama.cpp speculative decoding on Qwen3.6-35B-A3B with a single RTX 3090 (post PR #19493 merge, 2026-04-19). 19 configurations covering ngram-cache, ngram-mod, and classic draft with vocab-matched Qwen3.5-0.8B. Finding: no variant achieves net speedup on Ampere + A3B MoE. Raw JSON, plots, full reproducibility.
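The "no net speedup" finding above matches the standard speculative-decoding cost model: with per-token draft acceptance probability a, draft length k, and a draft-to-target per-token cost ratio c, the expected speedup is (1 − a^(k+1)) / ((1 − a)·(kc + 1)). A minimal sketch of that formula (the numbers below are illustrative assumptions, not values from the benchmark):

```python
def expected_speedup(a: float, k: int, c: float) -> float:
    """Expected speedup of speculative decoding over plain decoding.

    a: per-token draft acceptance probability (0 < a < 1)
    k: draft tokens proposed per verification step
    c: draft-model cost per token relative to the target model
    """
    # Expected tokens committed per step: geometric series 1 + a + ... + a^k
    expected_tokens = (1 - a ** (k + 1)) / (1 - a)
    # Cost per step: k draft tokens plus one target verification pass
    cost_per_step = k * c + 1
    return expected_tokens / cost_per_step

# A cheap draft with high acceptance wins:
print(round(expected_speedup(a=0.8, k=4, c=0.1), 2))  # → 2.4
# A relatively costly draft with modest acceptance loses (< 1.0 = slowdown),
# which is the regime the benchmark reports for Ampere + A3B MoE:
print(round(expected_speedup(a=0.5, k=4, c=0.5), 2))  # → 0.65
```

On a MoE target like Qwen3.6-35B-A3B the per-token verification pass is already cheap relative to dense models, which raises the effective c and pushes every configuration toward the slowdown regime.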
The bare metal in my basement
ElizaOS v1.x agent running Gemma 3 27B locally via Ollama on an RTX 3090, dogfooding @thecolony/elizaos-plugin against The Colony (thecolony.cc).
2.28× faster Claude Code on a local Qwen3.6-27B int4 (RTX 3090) – turbo-64k + long-100k profiles, MTP, tool calling, corruption guards.
100% local voice assistant with Tool Calling, neural TTS, and streaming responses. Runs on RTX 3090 with Ollama + Kokoro TTS + FastAPI. Privacy-first AI.
Lightweight GPU & CPU system tray monitor for NVIDIA GPUs (RTX 5090, RTX 6000, RTX 4090, RTX 3090, Tesla, TCC mode). Real-time power, temperature, VRAM & CPU usage badges. Works where HWMonitor, GPU-Z & MSI Afterburner fail.
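Tray monitors like the one above typically poll NVML (e.g. via pynvml), which reports power in milliwatts and memory in bytes. A minimal sketch of turning such raw readings into a compact badge string (`format_badge` is a hypothetical helper, not the project's API):

```python
def format_badge(power_mw: int, temp_c: int, vram_used: int, vram_total: int) -> str:
    """Render NVML-style readings as a compact tray badge.

    power_mw:   power draw in milliwatts (NVML convention)
    temp_c:     GPU temperature in degrees Celsius
    vram_used / vram_total: memory in bytes (NVML convention)
    """
    gib = 1024 ** 3
    return (f"{power_mw / 1000:.0f}W {temp_c}°C "
            f"{vram_used / gib:.1f}/{vram_total / gib:.0f}GiB")

# Readings shaped like nvmlDeviceGetPowerUsage / nvmlDeviceGetMemoryInfo output:
print(format_badge(285_000, 62, 18 * 1024**3, 24 * 1024**3))
# → 285W 62°C 18.0/24GiB
```

Keeping the unit conversions in one place like this is what lets such a tool work in TCC mode too, where display-oriented utilities (HWMonitor, MSI Afterburner) often see no device at all.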
Benchmark speculative decoding performance for Qwen3.6-35B-A3B on an RTX 3090 GPU using llama.cpp to evaluate model throughput and structural regressions.
Local agentic coding stack: Hermes Agent + Qwen3.5-27B + GLM-4.7-Flash on dual RTX 3090s. Daily-driver agentic work, no cloud, no metering. Companion to blog.zacharycangemi.com.