
Microsoft has officially launched the public preview of Windows ML, a cutting-edge runtime designed to optimize on-device machine learning model inference and simplify deployment for developers.
Announced as the foundation of Windows AI Foundry, this new framework aims to revolutionize how developers create AI-powered applications across the diverse Windows hardware ecosystem.
Windows ML is built to leverage the most suitable silicon for a given workload on any device: an NPU (Neural Processing Unit) for efficient, sustained inference, a GPU for raw processing power, or a CPU for maximum flexibility.
The runtime provides a unified framework that allows developers to confidently target any Windows 11 PC available today.
At its core, Windows ML is powered by the ONNX Runtime Engine (ORT), enabling developers to use familiar ORT APIs. The system uses ONNX as its native model format and supports converting PyTorch models to that intermediate representation, ensuring seamless integration with existing models and workflows.

Microsoft has developed Windows ML in close partnership with major silicon providers, including AMD, Intel, NVIDIA, and Qualcomm, each of which has created execution providers (EPs) optimized for their hardware. These partnerships ensure maximum performance across the full spectrum of CPUs, GPUs, and NPUs from day one.
Key features of Windows ML include:
- Simplified deployment through infrastructure APIs that eliminate the need for multiple app builds to target different silicon types
- Advanced silicon targeting capabilities that optimize for power efficiency or performance
- Performance improvements of up to 20% compared to other model formats
- Guaranteed conformance and compatibility across hardware
To complement the runtime, Microsoft is also introducing the AI Toolkit for VS Code, which supports model and app preparation through conversion, quantization, optimization, compilation, and profiling tools.
Early adopters have reported significant benefits. Filmora converted a complex AI feature to Windows ML in just three days, while Powder integrated models three times faster than before. Topaz Labs noted that Windows ML will dramatically reduce their installer size "from gigabytes to megabytes."
The Windows ML public preview is available starting today on all Windows 11 machines worldwide. It includes high-level APIs for runtime initialization and dependency management, plus low-level ONNX Runtime APIs for fine-grained control of on-device inference.
Microsoft plans to make Windows ML generally available later this year, positioning it as a cornerstone of the next wave of AI innovation on Windows platforms.