Testing AI Model Performance: A New Benchmark
On Monday, MLCommons, an artificial intelligence benchmarking group, released results from new tests that measure how quickly high-end hardware can run AI models.
Nvidia Corp's chip was the top performer in the large language model tests, with a semiconductor made by Intel Corp a close second.
The new MLPerf benchmark is based on a large language model with 6 billion parameters that summarizes CNN news articles. The benchmark simulates the “inference” part of AI data crunching, which powers the software behind generative AI tools.
Nvidia’s top-performing inference submission was built around eight of its flagship H100 chips. Nvidia has dominated the market for training AI models, but has yet to conquer the inference market.
“You’ll see that we’re delivering leadership performance across the board, and again, delivering that leadership performance across all workloads,” said Dave Salvator, Nvidia’s director of marketing for accelerated computing.
Intel’s success is based on its Gaudi2 chips, produced by the Habana Labs unit it acquired in 2019. The Gaudi2 system was about 10% slower than the Nvidia system.
“We are very proud of the inference results, as they demonstrate the cost-effectiveness advantage of Gaudi2,” said Eitan Medina, Habana’s chief operating officer.
Intel says its system is cheaper than Nvidia’s — about the price of Nvidia’s last-generation A100 systems — but declined to discuss the exact price of the chip.
Nvidia declined to discuss the price of its chip. On Friday, Nvidia said it plans to roll out a software update soon that would double its performance on the MLPerf benchmark.
Alphabet’s Google unit previewed the performance of the latest version of the custom chip it announced at its August cloud computing conference. (Reporting by Max A. Cherney in San Francisco; Editing by Leslie Adler)