MLPerf Client 1.0 Released: Graphical User Interface, Expanded Model and Task Coverage, and Broader Hardware Acceleration Support
The MLCommons consortium has announced the release of MLPerf Client 1.0, an AI benchmark that measures the performance of locally run AI models on client devices equipped with GPUs and NPUs. The benchmark was developed by the MLPerf Client working group in collaboration with major hardware and software vendors.
MLPerf Client 1.0 expands on prior versions by adding more AI models, supporting hardware acceleration on a wider range of devices from more vendors, and testing a broader spectrum of user interactions with language models. It also gains a user-friendly graphical interface that makes the benchmark accessible to casual users.
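Interactive language-model benchmarks of this kind are typically scored on two latency-oriented metrics: time to first token (TTFT, how quickly a response begins) and tokens per second (TPS, how quickly it continues). The sketch below shows one minimal way to measure both around a streaming generator; the token stream passed in is a hypothetical stand-in, not an MLPerf Client API.

```python
import time
from typing import Iterable


def measure_llm_latency(token_stream: Iterable[str]) -> dict:
    """Measure time to first token (TTFT) and decode throughput (TPS)
    around any streaming token generator. `token_stream` is a
    hypothetical iterator of generated tokens, not an MLPerf Client API."""
    start = time.perf_counter()
    first_token_time = None
    token_count = 0

    for _token in token_stream:
        now = time.perf_counter()
        if first_token_time is None:
            first_token_time = now  # prompt processing ends here
        token_count += 1

    if first_token_time is None:
        raise ValueError("generator produced no tokens")

    end = time.perf_counter()
    ttft = first_token_time - start
    # Throughput is conventionally computed over the decode phase only,
    # i.e. the tokens produced after the first one.
    decode_time = end - first_token_time
    tps = (token_count - 1) / decode_time if decode_time > 0 else 0.0
    return {"ttft_s": ttft, "tokens_per_s": tps}
```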
The benchmark supports testing with AI models including Meta’s Llama 2 7B Chat and Llama 3.1 8B Instruct, Microsoft’s Phi 3.5 Mini Instruct, and, as an experimental workload, the Phi 4 Reasoning 14B model, spanning a range of parameter sizes and capabilities.
MLPerf Client 1.0 enables hardware acceleration across more devices and vendor platforms, including the GPUs and neural processing units (NPUs) commonly found in client devices. It can therefore test execution paths optimized for diverse accelerators, reflecting a rapidly evolving client AI landscape in which both the workloads and the hardware best suited to run them vary widely.
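To make the idea of "execution paths" concrete: runtimes such as ONNX Runtime, one of the backends MLPerf Client builds on (see the support matrix below), expose accelerators through pluggable execution providers. The following is an illustrative sketch using the standard onnxruntime Python API, not MLPerf Client's own code; the model path is a placeholder.

```python
import onnxruntime as ort

# Providers actually available depend on the installed onnxruntime build
# (e.g. the onnxruntime-directml package exposes "DmlExecutionProvider").
print(ort.get_available_providers())

# Request an accelerator first and fall back to the CPU. ONNX Runtime
# walks this list in order and uses the first provider it can load.
session = ort.InferenceSession(
    "model.onnx",  # placeholder path, not an MLPerf Client artifact
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())  # providers actually in use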
Some experimental workloads in MLPerf Client 1.0 require a GPU with at least 16GB of VRAM, allowing the benchmark to stress higher-end hardware. Client AI hardware and software stacks remain in flux, and the ways to accelerate AI workloads locally are correspondingly numerous.
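A back-of-envelope calculation shows why a 14B-parameter model pushes against that limit. The figures below are generic estimates based only on parameter count and numeric precision (an assumption; the benchmark's actual quantization choices are not stated here), and they ignore KV cache and activation overhead.

```python
# Rough weight-memory footprint of an N-parameter model: N * bytes/param.
# Generic estimates only; KV cache, activations, and runtime overhead
# add further to the real VRAM requirement.
PARAMS = 14e9  # Phi 4 Reasoning 14B

for label, bytes_per_param in [("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    gib = PARAMS * bytes_per_param / 2**30
    print(f"{label}: ~{gib:.1f} GiB for weights alone")

# FP16: ~26.1 GiB   INT8: ~13.0 GiB   INT4: ~6.5 GiB
# Even at INT8, weights plus cache exceed common 8-12 GB cards,
# which is consistent with a 16GB VRAM floor for this workload.
```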
MLPerf Client 1.0 is available as a free download from GitHub. Hardware support spans: Intel GPUs via ONNX Runtime GenAI-DirectML; Intel NPUs via Microsoft Windows ML with the OpenVINO execution provider; NVIDIA GPUs via Llama.cpp-CUDA; AMD GPUs and NPUs via ONNX Runtime GenAI and the Ryzen AI SDK; and Apple Mac GPUs via Llama.cpp-Metal and MLX.
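As a small taste of what one of those paths looks like outside the benchmark, the snippet below drives Llama.cpp through its llama-cpp-python bindings, offloading all layers to the GPU (CUDA on NVIDIA or Metal on Apple, depending on how the library was built). This illustrates the underlying engine, not MLPerf Client's own harness, and the model file name is a placeholder.

```python
from llama_cpp import Llama  # third-party: pip install llama-cpp-python

# n_gpu_layers=-1 offloads every transformer layer to the GPU; which
# backend that means (CUDA, Metal, ...) is fixed when llama.cpp is built.
llm = Llama(
    model_path="llama-3.1-8b-instruct-q4_k_m.gguf",  # placeholder file
    n_gpu_layers=-1,
    n_ctx=4096,
)

out = llm("Summarize what an NPU is in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```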
With MLPerf Client 1.0, users can see at a glance which benchmarks their hardware supports and choose among them. The GUI version also provides real-time monitoring of a system's hardware resources, making it a handy tool for anyone tuning the performance of client AI workloads.
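For readers curious what real-time resource monitoring involves, the loop below samples CPU and memory utilization with the third-party psutil package. It is a generic sketch of the technique, assuming psutil is installed; MLPerf Client's GUI has its own implementation and also covers accelerator counters, which psutil does not expose.

```python
import time

import psutil  # third-party: pip install psutil


def monitor(duration_s: float = 10.0, interval_s: float = 1.0) -> None:
    """Print CPU and RAM utilization once per interval."""
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        cpu = psutil.cpu_percent(interval=interval_s)  # blocks one interval
        mem = psutil.virtual_memory()
        print(f"CPU {cpu:5.1f}%  RAM {mem.percent:5.1f}% "
              f"({mem.used / 2**30:.1f} GiB used)")


monitor()
```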