Alluxio Helps AI Teams Get More from Every GPU

Alluxio's distributed data platform eliminates data bottlenecks with sub-millisecond data access and terabyte-per-second throughput

Fireworks AI achieves up to 1 TB/s throughput and 10x faster model load times

SAN MATEO, Calif., June 04, 2026 (GLOBE NEWSWIRE) -- Alluxio, the developer of a leading large-scale caching solution for AI, today announced a solution designed to help organizations maximize GPU utilization and improve the efficiency of AI workloads on Oracle Cloud Infrastructure (OCI). By combining Alluxio’s data acceleration capabilities with OCI’s high-performance AI infrastructure, organizations can reduce data bottlenecks and keep GPUs continuously fed with data for training and inference.

As organizations increasingly rely on object storage as the foundation for AI, they often face tradeoffs between maintaining data in place and achieving high-performance access. Traditional approaches can require moving large datasets to align with compute resources, increasing operational complexity and cost. Alluxio helps address these challenges by enabling high-throughput, low-latency data access without requiring data migration, allowing organizations to run AI workloads more efficiently.

Alluxio can be deployed alongside GPU environments on OCI, aggregating local NVMe storage into a distributed caching layer that provides sub-millisecond latency and terabytes per second of aggregate throughput. This approach lets AI workloads access data efficiently while maintaining flexibility across storage environments.
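As a rough illustration of the idea of a caching layer spread across compute nodes, the following Python sketch models a read-through cache in front of an object store. It is a toy under stated assumptions, not Alluxio's implementation or API: the class name `ReadThroughCache`, the node names, the object keys, and the simple modulo-hash placement are all hypothetical, and plain dictionaries stand in for both per-node NVMe and the remote object store.

```python
import hashlib


class ReadThroughCache:
    """Toy model: several nodes each hold a local cache in front of one object store."""

    def __init__(self, node_names, object_store):
        self.object_store = object_store                 # stand-in for remote object storage
        self.nodes = {name: {} for name in node_names}   # stand-in for per-node NVMe
        self.hits = 0
        self.misses = 0

    def _owner(self, key):
        # Pick a node deterministically from the key (simple modulo hashing,
        # chosen for brevity; a hypothetical placement scheme).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        names = sorted(self.nodes)
        return names[int.from_bytes(digest[:8], "big") % len(names)]

    def read(self, key):
        local = self.nodes[self._owner(key)]
        if key in local:
            self.hits += 1
            return local[key]
        self.misses += 1
        data = self.object_store[key]   # slow path: fetch from the source of truth
        local[key] = data               # keep a copy near compute; the source is untouched
        return data


# Hypothetical object keys and node names, for illustration only.
store = {"models/a.bin": b"weights-a", "models/b.bin": b"weights-b"}
cache = ReadThroughCache(["gpu-node-1", "gpu-node-2", "gpu-node-3"], store)

first = cache.read("models/a.bin")    # miss: served from the object store
second = cache.read("models/a.bin")   # hit: served from the owning node's cache
print(cache.hits, cache.misses)
```

The first read of an object goes to the backing store and the second is served from the node that owns the key, while the data in the store itself is never moved or rewritten.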

Organizations using Alluxio capabilities on OCI can benefit from:

Improved GPU Utilization: Helps reduce data access bottlenecks and enable GPUs to sustain utilization levels above 90 percent
Enhanced Cost Efficiency: Helps keep GPUs more consistently utilized, improving overall resource efficiency
High-Performance Data Access: Provides high-throughput access to data at sub-millisecond latency through a distributed caching layer
Zero Data Migration: Enables access to data stored in OCI Object Storage or S3-compatible environments without copying or reformatting data
Seamless Integration: Supports standard interfaces such as POSIX and S3, allowing existing AI pipelines to run with minimal modification

By reducing the need for manual data movement and complex replication strategies, the solution helps simplify operations for organizations running AI workloads at scale.

Fireworks AI Demonstrates Large-Scale AI Performance
Fireworks AI, an inference cloud platform delivering more than 10 trillion tokens per day, uses Alluxio to support high-performance data access across distributed GPU environments, including OCI.

Operating GPU infrastructure across heterogeneous environments, Fireworks requires extremely fast data distribution to keep large-scale inference clusters fully utilized. By deploying Alluxio as a distributed data layer alongside GPU clusters, Fireworks has built a high-performance infrastructure capable of delivering massive datasets to compute environments at unprecedented speed.

“To deliver fast, reliable inference at scale, we needed a more efficient way to manage data across our GPU infrastructure,” said Chenyu Zhao, cofounder at Fireworks AI. “With Alluxio, we’ve reduced data access times and improved overall system performance while maintaining flexibility across environments. Our infrastructure spans heterogeneous GPU environments, and we rely on efficient data access to maintain performance. By using Alluxio alongside GPU clusters—including those on OCI—we’ve built a distributed system capable of serving more than 2 PB of data daily, reducing replica download times for large models from 20 minutes to 2 minutes, and achieving up to 1 TB/s in aggregate throughput. This architecture allows us to maintain industry-leading inference performance without the operational burden of constantly moving data.”
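For readers who want to relate the figures in the quote to one another, the short Python calculation below works through the arithmetic. It assumes decimal units (1 PB = 10^15 bytes, 1 TB = 10^12 bytes) and treats "more than 2 PB" and "up to 1 TB/s" as exactly 2 PB and 1 TB/s, so the results are rough bounds rather than measured values.

```python
SECONDS_PER_DAY = 24 * 60 * 60
PB = 10**15   # decimal petabyte (assumption; the source does not state the unit basis)
TB = 10**12   # decimal terabyte (same assumption)
GB = 10**9

# Replica download time for large models: 20 minutes before, 2 minutes after.
speedup = 20 / 2

# 2 PB served per day, averaged evenly over 24 hours.
avg_bytes_per_s = 2 * PB / SECONDS_PER_DAY
avg_gb_per_s = avg_bytes_per_s / GB

# How far the quoted 1 TB/s peak sits above that daily average.
peak_to_avg = (1 * TB) / avg_bytes_per_s

print(f"speedup: {speedup:.0f}x")
print(f"average rate: {avg_gb_per_s:.1f} GB/s")
print(f"peak / average: {peak_to_avg:.1f}")
```

Under these assumptions the download-time reduction corresponds to the 10x figure in the headline, and the quoted peak throughput is roughly 43 times the daily average rate of about 23 GB/s, which is consistent with bursty demand such as loading model replicas.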

Supporting Efficient AI Infrastructure on OCI
“The goal is simple: maximize the value of every GPU,” said Haoyuan Li, CEO at Alluxio. “OCI provides some of the best GPU price-performance in the industry. By pairing that infrastructure with Alluxio’s distributed data acceleration layer, AI teams can keep GPUs fully utilized and scale compute wherever innovation demands.”

“Oracle Cloud Infrastructure is designed to deliver the performance, scalability, and cost efficiency required for today’s most demanding AI workloads,” said Sachin Menon, Vice President of Cloud Engineering at Oracle Cloud Infrastructure. “By working with partners like Alluxio, we can help customers reduce bottlenecks and run AI training and workloads with more consistent performance.”

About Alluxio
Alluxio is the leading large-scale caching solution for AI that enables organizations to bring data closer to compute for AI, analytics, and cloud workloads through a distributed caching and metadata management layer. Visit www.alluxio.io for more information.

About Fireworks AI
Fireworks AI is the global AI inference cloud and infrastructure platform that enables teams like Cursor, Uber, DoorDash, and Shopify to build, tune, and scale highly optimized generative AI applications. Fireworks provides deep support for hundreds of state-of-the-art open models in text, image, audio, embedding, and multi-modal formats globally. Visit https://fireworks.ai/ for more information.

Contact
Amelia Wong
amelia@alluxio.com
