Alibaba Cloud Claims 82% GPU Reduction With New AI Pooling Tech


Alibaba Cloud Claims 82% GPU Reduction With New AI Pooling Tech

📅 - Alibaba Cloud has unveiled details of a new GPU pooling technology that it claims can reduce the number of Nvidia graphics processing units required to run its large language model (LLM) workloads by as much as 82 percent.

Alibaba outlined the innovation in a research paper titled "Aegaeon: Effective GPU Pooling for Concurrent LLM Serving on the Market," presented this week at the 31st Symposium on Operating Systems Principles (SOSP) in Seoul, South Korea. The paper describes how Alibaba's system, named Aegaeon, dynamically allocates GPU resources across multiple AI inference tasks, enabling far greater efficiency and utilization across its infrastructure.

In beta testing within Alibaba [...][... Check source for end of article ...]
Tags: AI

Reads: 38 | Category: General | Source: Hosting Jurnalist : Hosting Jurnalist | Author:
URL source: https://hostingjournalist.com/news/alibaba-cloud-claims-82-gpu-reduction-with-new-ai-pooling-tech
Want to add a website news or press release ? Just do it, it's free! Use add web hosting news!