How do you measure and improve the performance and efficiency of an AI workload? The answer is to create a repeatable process that others can reproduce. The process can be subdivided into tasks that can be measured and improved independently, and repeating it continuously yields a better understanding of which tasks need improvement. For consistency, the datasets used should also be part of the process. This process is called benchmarking, and it relies on methodologies that glue the components together and reduce the complexity of running a benchmark.
MLCommons is an organization that aims to accelerate machine learning innovation to benefit everyone. The three pillars of MLCommons are:
- MLCube is a set of best practices and standard conventions for creating ML software that can “plug-and-play” on many different systems. MLCube makes it easier for researchers to share innovative ML models, for developers to experiment with different models, and for software companies to create infrastructure for models.
- MLPerf is the benchmarking component and provides consistent accuracy, speed, and efficiency measurements.
- Datasets are the raw materials for all of machine learning. Models are only as good as the data they are trained on.
In this episode, host Stephen Foskett and co-host Frederic Van Haren are joined by David Kanter, Executive Director of MLCommons. David outlines the challenges of benchmarking and why it is essential to use benchmarks.
Join us on the podcast and hear more about benchmarking AI workloads.
A link to this “Utilizing AI” podcast episode can be found here, or click on the video below.