Organizations

@1000Memories @nko2 @wealthsimple

Review-first terminal diff viewer for agentic coders

TypeScript 3,870 80 Updated May 12, 2026

Pure Rust Inference Engine

Rust 299 32 Updated May 11, 2026

2.24x decode TPS increase On Qwen 3.6 27B @ temp 0.6 | Native MTP Speculative Decoding On Apple Silicon With No External Drafter.

Python 345 12 Updated May 13, 2026

DeepSeek 4 Flash local inference engine for Metal and CUDA

C 8,236 655 Updated May 13, 2026

📜 Entire CLI hooks into your Git workflow to capture AI agent sessions as you work. Sessions are indexed alongside commits, creating a searchable record of how code was written in your repo.

Go 4,288 333 Updated May 13, 2026

CLI to control iOS and Android devices for AI agents

TypeScript 2,037 119 Updated May 13, 2026

Pure Go implementation of the WebRTC API

Go 16,446 1,841 Updated May 5, 2026

Flash-MoE sidecar slot-bank runtime for large GGUF MoE models on Apple Silicon (llama.cpp fork)

C++ 98 11 Updated May 11, 2026

πŸŽ™οΈ Give your apps, CLIs, and agents a voice. VoiPi is a universal, zero-dependency, free text-to-speech library for JavaScript.

TypeScript 208 18 Updated Apr 26, 2026

ArtifactFS is a filesystem driver designed to mount large git repos as quickly as possible, hydrating file contents on-the-fly instead of blocking on the initial clone. It's ideal for agents, sandb…

Go 798 32 Updated May 7, 2026

Agent Skill to help convert transformer LLMs to mlx-lm

Python 22 1 Updated Apr 16, 2026

KV cache compression via block-diagonal rotation. Beats TurboQuant: better PPL (6.91 vs 7.07), 28% faster decode, 5.3x faster prefill, 44x fewer params. Drop-in llama.cpp integration.

Python 977 82 Updated Apr 23, 2026

LLM inference server with continuous batching & SSD caching for Apple Silicon, managed from the macOS menu bar

Python 13,872 1,179 Updated May 12, 2026

Official implementation of Paper "System-Aware 4-Bit KV-Cache Quantization for Real-World LLM Serving"

Shell 17 3 Updated Apr 17, 2026

Lucebox optimization hub: hand-tuned LLM inference, built for specific consumer hardware.

C++ 2,017 188 Updated May 13, 2026

Lossless DFlash speculative decoding for MLX on Apple Silicon

Python 681 52 Updated May 12, 2026
Python 75 11 Updated May 13, 2026

RTX 6000 Pro Wiki: Running Large LLMs (Qwen3.5-397B, Kimi-K2.5, GLM-5) on PCIe GPUs without NVLink

Python 320 23 Updated May 10, 2026

:octocat: Static checker for GitHub Actions workflow files

Go 3,865 218 Updated Apr 19, 2026

Create stunning demos for free. Open-source, no subscriptions, no watermarks, and free for commercial use. An alternative to Screen Studio.

TypeScript 36,029 2,456 Updated May 10, 2026

The Modular Platform (includes MAX & Mojo)

Mojo 26,128 2,824 Updated May 13, 2026

Capability-based sandboxes with fine-grained policies. The next-generation isolation primitive, brokering access directly within the agent's operating context, with zero setup and zero latency.

Rust 2,380 160 Updated May 13, 2026

Full-screen TUI worktree manager in Rust

Rust 1 1 Updated Apr 1, 2026
Go 3 Updated Mar 29, 2026

Replace port numbers with stable, named local URLs. For humans and agents.

TypeScript 9,293 279 Updated May 8, 2026

Dark mode PDFs without destroying your images.

JavaScript 125 3 Updated May 4, 2026