Real-Time Diffusion Models Thrive on Apple M3 Ultra with New Optimizations

Published on May 19, 2026

Real-time image generation with diffusion models has become standard on NVIDIA platforms, but progress on non-CUDA systems, particularly Apple Silicon, has been limited. Researchers have now set out to close this gap, focusing on the Apple M3 Ultra.

In a recent study, the researchers systematically tested optimization techniques for real-time image-to-image (img2img) generation. Over ten phases, they evaluated methods including CoreML conversion and quantization, with the goal of making full use of the M3 Ultra’s 60-core GPU.

The work culminated in real-time performance of 22.7 FPS at 512×512 resolution. The key was combining CoreML conversion with a three-thread camera pipeline. The study also found that strategies effective on NVIDIA platforms did not translate directly to Apple’s architecture.
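
To make the idea of a three-thread camera pipeline concrete, here is a minimal Python sketch. The article does not say how the study divides work among its three threads, so the split into capture, inference, and display stages is an assumption, as are the names run_pipeline and put_latest and the policy of dropping stale frames. The camera, model, and screen are replaced by plain stand-ins.

```python
import queue
import threading

_STOP = object()  # sentinel that tells the next stage to shut down


def put_latest(q, item):
    """Put item into a size-1 queue, discarding an unread older item if needed."""
    while True:
        try:
            q.put_nowait(item)
            return
        except queue.Full:
            try:
                q.get_nowait()  # drop the stale frame so the newest one wins
            except queue.Empty:
                pass  # the consumer took it first; retry the put


def run_pipeline(frames, transform):
    """Run capture, inference, and display stages on three threads.

    Returns the results that reached the display stage, in order.
    (Hypothetical helper for illustration; not the study's code.)
    """
    raw = queue.Queue(maxsize=1)   # capture -> inference
    done = queue.Queue(maxsize=1)  # inference -> display
    shown = []

    def capture():
        for frame in frames:  # stand-in for reading a camera
            put_latest(raw, frame)
        raw.put(_STOP)  # blocking put: never evicts the last frame

    def infer():
        while (frame := raw.get()) is not _STOP:
            put_latest(done, transform(frame))  # stand-in for the model call
        done.put(_STOP)

    def display():
        while (result := done.get()) is not _STOP:
            shown.append(result)  # stand-in for drawing to the screen

    threads = [threading.Thread(target=t) for t in (capture, infer, display)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shown


if __name__ == "__main__":
    out = run_pipeline(range(100), lambda x: x * 2)
    print(f"displayed {len(out)} of 100 frames; last = {out[-1]}")
```

The size-1 queues keep a slow stage from building up a backlog: an unread frame is replaced by a newer one, so the display stays close to live. How many frames get through varies from run to run, but the final frame always does. For scale, 22.7 FPS leaves a budget of roughly 44 ms per frame (1000 / 22.7).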

The findings refine the picture of how to optimize diffusion models on Apple Silicon. In particular, some standard techniques, including quantization and parallel inference, proved ineffective on this hardware. The study therefore offers practical guidelines for future work in this area.
