Ensuring data storage solutions remain compatible with artificial intelligence innovations

MIT alumnus Michael Tso co-founded Cloudian, which develops a large-scale data storage system designed to feed AI models and agents that need very large amounts of data.

Cloudian's Scalable Storage System Revolutionizes AI Data Processing

Cloudian, a leading innovator in scalable storage solutions, has developed a platform that addresses the challenges of AI data processing. The system offers an integrated, high-performance storage and computing platform, reducing data movement bottlenecks and supporting the massive scale and speed AI demands [1][2].

The platform uses an object storage architecture, storing data as unique objects with metadata. This design allows for direct, high-speed transfers between storage and GPUs and CPUs, ensuring efficient access to large datasets needed for AI training and inferencing [1].
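The object model described above can be illustrated with a minimal in-memory sketch. It is not Cloudian's implementation or API; the class and method names (ToyObjectStore, put, get, find) and the sample keys are hypothetical, chosen only to show how each object carries its own metadata in a flat key namespace.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: raw bytes plus free-form metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class ToyObjectStore:
    """Hypothetical in-memory store with a flat key namespace (illustration only)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        # Attach a content hash and size to whatever metadata the caller supplies.
        meta = dict(metadata, sha256=hashlib.sha256(data).hexdigest(), size=len(data))
        self._objects[key] = StoredObject(data, meta)
        return meta["sha256"]

    def get(self, key):
        # Look an object up directly by key; there is no directory tree to walk.
        return self._objects[key]

    def find(self, **criteria):
        # Return the keys of objects whose metadata matches every criterion.
        return sorted(
            key for key, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        )


store = ToyObjectStore()
store.put("scans/001.png", b"\x89PNG...", modality="mri", label="tumor")
store.put("scans/002.png", b"\x89PNG...", modality="ct", label="healthy")
mri_keys = store.find(modality="mri")
```

Because the metadata travels with each object, a training job could select its inputs by attribute (here, modality) without a separate catalog, which is one reason object storage is often paired with large AI datasets.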

Cloudian's system consolidates AI functions and data onto a single parallel-processing platform, applying parallel computing principles to storage. This allows computations to run directly on data as it is ingested, avoiding the delays caused by multiple storage tiers and data transfers. This matters for AI, which can require volumes of data thousands of times larger than traditional datasets to improve model performance [1].
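As a rough illustration of computing on data at ingest time, the sketch below runs a preprocessing step concurrently on each incoming record and keeps the result alongside it. The function names (preprocess, ingest) and the sample records are assumptions made for this example and say nothing about how Cloudian's platform is built. Python threads also do not give true CPU parallelism, so this shows the pattern rather than the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def preprocess(record):
    """Hypothetical ingest-time step: normalize the text and count its tokens."""
    text = record.strip().lower()
    return {"text": text, "tokens": len(text.split())}


def ingest(records, workers=4):
    # pool.map runs preprocess concurrently and returns results in input order,
    # so each record is processed as it arrives rather than in a later batch job.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(preprocess, records))


stored = ingest(["  Sensor A reports OK ", "Tumor sample 17 sequenced"])
```

The point of the design is that the derived fields exist as soon as the record is stored, so no second pass has to move the data to a separate compute tier first.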

Integration with GPUs and enterprise ecosystems, such as NVIDIA and AWS, facilitates optimized interaction between Cloudian’s storage and AI accelerators, enhancing performance for demanding AI workloads [4][5].

To further simplify infrastructure and accelerate data access and processing, Cloudian has extended its object storage system with a vector database that stores data in a form immediately usable by AI models. This unified platform combining storage and AI inferencing eliminates slow data shuttling between isolated storage and compute components [2].
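To make the idea of a vector database concrete, here is a minimal sketch of similarity search over stored embeddings using cosine similarity. The class name ToyVectorIndex, the document keys, and the three-dimensional vectors are all hypothetical; a production system would use a real embedding model and an approximate nearest-neighbor index rather than this brute-force scan.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorIndex:
    """Hypothetical brute-force index mapping object keys to embedding vectors."""

    def __init__(self):
        self._rows = []  # list of (key, vector) pairs

    def add(self, key, vector):
        self._rows.append((key, list(vector)))

    def search(self, query, k=1):
        # Rank every stored vector by similarity to the query and keep the top k.
        ranked = sorted(self._rows, key=lambda row: cosine(query, row[1]), reverse=True)
        return [key for key, _ in ranked[:k]]


index = ToyVectorIndex()
index.add("doc-a", [1.0, 0.0, 0.0])
index.add("doc-b", [0.0, 1.0, 0.0])
index.add("doc-c", [0.9, 0.1, 0.0])
nearest = index.search([1.0, 0.0, 0.0], k=2)
```

Keeping such vectors next to the objects they describe means a model can query by meaning and fetch the underlying data from the same system, which is the data-shuttling step the article says the unified platform removes.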

Cloudian's system also embeds AI functions, allowing for pre- and post-processing of data for AI near where it is collected and stored. This reduces complexity and enhances efficiency [1].

Michael Tso, the co-founder of Cloudian, was introduced to parallel computing as an undergraduate at MIT in the 1990s. His work on disconnected and intermittent networking operations for large-scale distributed systems as a graduate student at MIT laid the groundwork for the technology behind Cloudian [6].

Since its official launch in 2012, Cloudian has partnered with over 1,000 companies worldwide, including large manufacturers, financial service providers, health care organizations, and government agencies, to help them get more value from their data [7]. Notable collaborations include partnerships with the National Library of Medicine and the National Cancer Database, where Cloudian's system is used to store research articles, patents, and DNA sequences of tumors [8].

The partnership with NVIDIA allows for faster AI operations and reduces computing costs [9]. Cloudian's system reduces complexity by applying parallel computing to data storage, making it a reliable storage foundation that can keep up with the rise of AI [10].

Tso sees strong connections between his research at MIT and the current industry, particularly in the field of AI [6]. With Cloudian, he continues to push the boundaries of what is possible in data storage and AI.

References:

[1] Cloudian. (2021). Cloudian's HyperStore Enables AI-Powered Manufacturing at Scale

[2] Cloudian. (2021). Cloudian and Milvus Collaborate to Accelerate AI Workloads

[3] Cloudian. (2021). Cloudian HyperStore: Scalable Object Storage with High S3 Compatibility

[4] NVIDIA. (2021). NVIDIA and Cloudian Partner to Optimize AI Workloads

[5] Cloudian. (2021). Cloudian Extends Object Storage to Include Vector Databases

[6] Cloudian. (2021). Michael Tso, Co-founder of Cloudian, on the Future of Data Storage

[7] Cloudian. (2021). Cloudian Helps Over 1,000 Companies Get More Value from Their Data

[8] Cloudian. (2021). Cloudian Partners with National Library of Medicine and National Cancer Database

[9] NVIDIA. (2021). NVIDIA and Cloudian Partner to Optimize AI Workloads

[10] Cloudian. (2021). Cloudian Provides a Storage Foundation for the Rise of AI

  1. The consolidation of AI functions and data onto a single parallel-processing platform in Cloudian's system has roots in Michael Tso's graduate research on large-scale distributed systems at MIT.
  2. The collaboration between Cloudian and NVIDIA allows for faster AI operations and reduced computing costs, while applying parallel computing to data storage makes the system a reliable storage foundation that can keep up with the rise of AI.
  3. Cloudian's extended object storage system, which includes a vector database, eliminates slow data shuttling between isolated storage and compute components, thereby accelerating data access and processing.
  4. The partnership between Cloudian and various healthcare organizations, such as the National Library of Medicine and the National Cancer Database, resulted in the use of Cloudian's system for storing research articles, patents, and DNA sequences of tumors.
  5. Storing thousands of research articles, patents, and tumor DNA sequences in Cloudian's system underscores the company's contribution to the health sector through data storage and management.