AI Blogs

Enabling Speculative Speculative Decoding on MI300X

An introduction to the speculative speculative decoding method. We enable it on AMD Instinct MI300X GPUs and report the results.

./artificial-intelligence/ssd_mi300x/README.html

May 25, 2026

AI Inference on AMD Ryzen™ AI Max Processor

Hands-on: run Qwen3.5 9B–122B on Ryzen™ AI Max+ with 128GB UMA and Ollama, with generation benchmarks and a clear UMA setup path on Ubuntu/ROCm.

./artificial-intelligence/ryzen-uma-llm/README.html

May 22, 2026

From Build to Benchmark: ONNX Model Serving with Triton Inference Server on AMD GPUs

Step-by-step guide to building, deploying, and benchmarking ONNX models with Triton Inference Server and MIGraphX on AMD GPUs.

./software-tools-optimization/triton-server-onnx/README.html

May 20, 2026

Diffusion-based Atmospheric Downscaling on AMD Instinct GPUs

Learn the theory behind downscaling models and how to run one such model, CorrDiff, on AMD GPUs.

./artificial-intelligence/corrdiff-inference/README.html

Ecosystems & Partners

April 24, 2026

Styled Text Image Generation with Eruku on AMD

Hands-on, reproducible guide to training and running Eruku on the LUMI supercomputer, powered by AMD Instinct MI250X GPUs.

./ecosystems-and-partners/eruku-genai/README.html

February 13, 2026

Elevate Your LLM Inference: Autoscaling with Ray, ROCm 7.0.0, and SkyPilot

Learn how to use multi-node and multi-cluster autoscaling in the Ray framework on ROCm 7.0.0 with SkyPilot.

./ecosystems-and-partners/ray-rocm7/README.html

February 09, 2026

Building Robotics Applications with Ryzen AI and ROS 2

This post walks through deploying a robotics application on an AI PC integrated with ROS, the Robot Operating System. We use the Ryzen AI CVML Library for perception tasks such as depth estimation and develop a custom ROS 2 node that integrates easily with the ROS ecosystem and its standard components.

./ecosystems-and-partners/ryzenai-cvml-ros/README.html

January 20, 2026

Quickly Developing Powerful Flash Attention Using TileLang on AMD Instinct MI300X GPU

Learn how to use TileLang to develop your own kernels and make full use of AMD GPUs.

./ecosystems-and-partners/rocm-tilelang-kernel/README.html

Applications & Models

May 20, 2026

QuickReduce FP4 Quantization and Benchmarking on MI355

Learn how QuickReduce uses FP4 quantization to accelerate all-reduce communication and evaluate its performance on AMD Instinct MI355 GPUs.

./artificial-intelligence/quick-reduce-2/README.html

May 15, 2026

Semantic Fencing of Video Streams Using Embedding Splits from Vision Foundation Models

Learn how to semantically split vision datasets using foundation model embeddings on AMD GPUs to reduce leakage and improve evaluation.

./artificial-intelligence/semantic-fencing/README.html

May 14, 2026

Further Accelerating Kimi-K2.5 on AMD Instinct™ MI325X: W4A8 & W8A8 Quantization with AMD Quark

Quantize Kimi-K2.5 to W4A8 and W8A8 using AMD Quark and serve on MI325X with FlyDSL and AITER for further inference acceleration.

./artificial-intelligence/kimi-k2.5-w4a8/README.html

May 11, 2026

Accelerating ComfyUI Workflows on AMD Instinct™ MI355X GPUs with ROCm

We show that the MI355X delivers better performance than the B200 for ComfyUI after enabling PyTorch attention for gfx950.

./artificial-intelligence/comfyui/README.html

Software Tools & Optimizations

May 07, 2026

vLLM-ATOM: Unlocking Native AMD Performance in the vLLM Ecosystem

Use ATOM as an out-of-tree vLLM plugin to keep vLLM compatibility while enabling AMD-optimized attention, model execution, and multi-model support including Kimi-K2.5.

./software-tools-optimization/vllm-atom/README.html

April 27, 2026

TraceLens: Democratizing AI Performance Analysis

Explore how TraceLens automates profiler trace analysis to pinpoint bottlenecks and optimize AI workloads.

./software-tools-optimization/tracelens/README.html