![](https://oscimg.oschina.net/oscnet/50c9358f-02b3-43f1-916f-32df816454fb.png)
Alluxio's distributed caching system recently completed compatibility testing with XSKY's XEOS V6.4 object storage, with the aim of solving data management and acceleration challenges. The two companies carried out in-depth product integration and joint development, combining Alluxio's distributed cache with the features of XEOS object storage to launch a joint storage solution that better supports data management and acceleration needs in AI scenarios.
![](https://oscimg.oschina.net/oscnet/0295522a-fb79-4dc9-8110-946eed52250c.jpg)
In recent years, as AI and big data have matured, and especially as large AI models (LLMs, multimodal models, text-to-video, and so on) have been widely adopted, storing and accessing extremely large volumes of data has become a major technical and cost challenge for enterprises.
Object storage scales well, is secure, and keeps costs controllable, which has made it the most popular choice for mass data storage today. However, object storage also faces challenges in data access performance. In AI model training in particular, data access performance has become a key factor in improving GPU utilization. The market urgently needs an integrated solution that works closely with object storage and combines its advantages with excellent performance.
This cooperation between the Alluxio data platform and XEOS will provide users with higher-performance, lower-cost, and more flexible data management and acceleration, and is expected to promote the application and development of AI technology in various fields.
Alluxio Enterprise AI and XEOS join forces
Alluxio Enterprise AI is a data platform for AI workloads. It supports seamless access to, and management and operation of, data and AI workloads in on-premises, cloud, hybrid, or multi-cloud environments.
![](https://oscimg.oschina.net/oscnet/fbd1ef37-0988-4b3e-a394-eeb35c0a0c06.png)
Intelligent caching capability: Alluxio Enterprise AI provides a high-performance distributed cache, so compute applications such as AI engines can improve data I/O by reading and writing through the Alluxio cache instead of going directly to relatively slow object storage. Its caching strategy is tailored to the I/O patterns of AI and similar workloads, providing high throughput and low latency across the whole compute workflow. According to Alluxio, it can raise GPU utilization to more than 90%, keep data delivery in step with GPU cycles, and accelerate model training and model serving.
No data copies required: Alluxio loads data on demand rather than copying it to local storage first. This removes the data-loading bottleneck on compute performance and eliminates redundant copies.
Cost savings: Alluxio can be deployed flexibly close to the compute side according to actual computing needs, making full use of idle resources to provide transparent data access acceleration. This improves GPU/CPU utilization on the compute side and achieves better results at lower cost.
No need to rewrite applications: Alluxio standardizes the data technology stack through a unified namespace, provides a unified access mode across various storage systems, and can provide various API capabilities such as S3/HDFS/POSIX/RESTful. Application developers no longer need to think about where data is stored, and can decouple compute and storage without having to rewrite applications.
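To make the unified-namespace idea concrete, here is a minimal sketch of how a mount table can map one logical path space onto several underlying storage systems. It is an illustration only, not Alluxio's implementation: the `MountTable` class, the bucket names, and the HDFS address are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MountTable:
    """Toy mount table: maps logical path prefixes to under-storage URIs."""
    mounts: dict = field(default_factory=dict)

    def mount(self, prefix: str, ufs_uri: str) -> None:
        # Normalize so "/datasets/" and "/datasets" name the same mount point.
        self.mounts[prefix.rstrip("/") or "/"] = ufs_uri.rstrip("/")

    def resolve(self, path: str) -> str:
        # Choose the longest mounted prefix that contains the path.
        best = None
        for prefix in self.mounts:
            if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
                if best is None or len(prefix) > len(best):
                    best = prefix
        if best is None:
            raise KeyError(f"no mount covers {path}")
        relative = path[len(best):].lstrip("/")
        base = self.mounts[best]
        return f"{base}/{relative}" if relative else base


# Hypothetical mounts: two storage systems behind one namespace.
table = MountTable()
table.mount("/datasets", "s3://xeos-train-bucket")
table.mount("/datasets/archive", "hdfs://namenode:8020/archive")

print(table.resolve("/datasets/images/cat.jpg"))
# s3://xeos-train-bucket/images/cat.jpg
print(table.resolve("/datasets/archive/2023/log.txt"))
# hdfs://namenode:8020/archive/2023/log.txt
```

The application only ever sees paths under `/datasets`; which storage system serves a given path is decided by the mount table, which is why the application does not need to change when data moves.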
XEOS is an enterprise-grade object storage product from XSKY. It supports seamless expansion, unified management across heterogeneous storage systems, high-performance access, intelligent data management, and other functions, helping enterprises build secure, reliable, high-performance, low-cost object storage platforms that meet the growing need for massive data management.
![](https://oscimg.oschina.net/oscnet/4f2efdb8-83f3-4117-9d57-2777ce6a8d73.png)
As a comprehensive object storage solution, XEOS is an ideal base for data lakes, especially suitable for data storage and management in AI scenarios. As the underlying storage of the data lake, XEOS has the following advantages:
Unlimited scalability: XEOS supports distributed metadata and storage nodes, and can be easily expanded to hold hundreds of billions of objects to meet the needs of massive data accumulation;
High-performance access: Unified metadata services, intelligent multi-level caching and other technologies ensure fast data access performance and meet the needs of various applications in the data lake;
Powerful data management functions: XEOS provides rich data lifecycle management, storage tiering, compression, and other functions, effectively improving storage efficiency and cost-effectiveness;
Excellent data security: XEOS uses mechanisms such as erasure coding (EC), replicas, and fault domains, as well as technologies such as encryption, snapshots, and recycle bins, to ensure continuous high availability and security of data;
Intelligent ecosystem support: XEOS integrates closely with big data, machine learning, and other applications, providing graphical tools, custom metadata, data flow, and other functions that help build end-to-end intelligent data applications.
XEOS and Alluxio deeply integrated: a joint solution that fully utilizes the advantages of both
The integration draws on the capabilities of both XEOS and Alluxio to deliver higher-performance, lower-cost, and more flexible data management and acceleration. After completing the basic integration of the two products, the two companies not only verified basic functionality but also explored deeper joint development of Alluxio and XEOS for AI scenarios.
Alluxio and XEOS have deeply integrated and tuned their metadata interfaces, enabling high-performance data requests over tens to hundreds of billions of objects and supporting millions of low-latency IOPS.
XEOS provides a distributed metadata service, and Alluxio provides stateless, scalable metadata storage and serving, so both sides scale well. In integrating the metadata-related interfaces, Alluxio and XEOS also cut unnecessary interface calls and unnecessary data transfers, greatly improving the performance of interactions between the metadata interfaces.
Because Alluxio scales statelessly, each Alluxio node independently handles a considerable volume of metadata requests and caching, and uses XEOS's own high-performance metadata interface behind it. This expands metadata serving capacity without sacrificing metadata request performance. Under highly concurrent requests for massive numbers of small files in particular, it both relieves the pressure of concurrent metadata requests on XEOS and greatly improves metadata request performance.
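The effect of caching metadata close to the compute side can be sketched with a small read-through cache. This is a simplified model under stated assumptions, not either product's interface: `CountingStore` stands in for an object store's metadata call, and both class names are hypothetical.

```python
class CountingStore:
    """Stand-in for an object store metadata interface (hypothetical)."""

    def __init__(self, objects):
        self.objects = dict(objects)  # key -> metadata dict
        self.head_calls = 0           # how many requests reached the store

    def head(self, key):
        self.head_calls += 1
        return self.objects[key]


class MetadataCache:
    """Read-through cache: only the first lookup of a key hits the store."""

    def __init__(self, store):
        self.store = store
        self.entries = {}

    def stat(self, key):
        if key not in self.entries:
            self.entries[key] = self.store.head(key)
        return self.entries[key]


# Many small files, each looked up repeatedly by concurrent training readers.
store = CountingStore({f"img/{i}.jpg": {"size": 1024 + i} for i in range(3)})
cache = MetadataCache(store)
keys = list(store.objects)
for i in range(1000):
    cache.stat(keys[i % len(keys)])

print(store.head_calls)  # 3: the other 997 lookups were served from the cache
```

A real deployment also needs invalidation when objects change, which this sketch leaves out.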
Alluxio and XEOS can make full use of the data set message notification capabilities provided by XEOS. When XEOS senses data changes, it will push notifications to Alluxio in real time. Alluxio can use the message notification mechanism to accurately and quickly determine which data needs to be warmed into the cache.
![](https://oscimg.oschina.net/oscnet/6b01d49d-f2ca-4b72-9d8f-40ec34d3ecd7.png)
This notification-based cache warming mechanism keeps the Alluxio cache fresh and accurate. Without any manual intervention from the application, Alluxio automatically detects data changes and quickly updates and prefetches cache entries, which greatly improves cache timeliness, safeguards data access performance, and greatly reduces the impact of warming on computation.
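The notification-driven warming flow described here can be sketched as a small event handler. This is a minimal illustration under assumptions: the event shape (`type` and `key` fields) and the `WarmCache` class are hypothetical, and a plain dictionary stands in for the object store.

```python
class WarmCache:
    """Toy cache kept fresh by change notifications from the store."""

    def __init__(self, store):
        self.store = store  # dict-like: key -> bytes (stand-in for the object store)
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def on_event(self, event):
        # Hypothetical event shape: {"type": "created|updated|deleted", "key": ...}
        kind, key = event["type"], event["key"]
        if kind in ("created", "updated"):
            data = self.store.get(key)
            if data is not None:
                self.cache[key] = data   # prefetch the new version
        elif kind == "deleted":
            self.cache.pop(key, None)    # evict the stale entry

    def read(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1                 # cold read falls through to the store
        self.cache[key] = self.store[key]
        return self.cache[key]


store = {"train/a.bin": b"v1"}
wc = WarmCache(store)
wc.on_event({"type": "created", "key": "train/a.bin"})
print(wc.read("train/a.bin"), wc.hits, wc.misses)  # b'v1' 1 0

store["train/a.bin"] = b"v2"
wc.on_event({"type": "updated", "key": "train/a.bin"})
print(wc.read("train/a.bin"))  # b'v2'
```

Because the cache is updated when the notification arrives, the first application read is already a hit and never returns the old version.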
This deep integration not only improves the cache hit rate, but also minimizes unnecessary data movement, greatly optimizing the overall I/O performance and resource utilization efficiency. Through the collaboration of Alluxio and XEOS, AI applications can obtain an excellent data access experience.
The append writing and random writing capabilities provided by XEOS can be highly integrated with the data writing capabilities of the Alluxio cache layer to provide more efficient data writing capabilities.
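As Alluxio's persistent storage layer, XEOS not only provides massive storage capacity but also supports a rich set of write modes, such as append writes and random writes. Alluxio can take full advantage of these write features to optimize cache write behavior and performance. For hot data that is updated frequently, Alluxio can write directly to XEOS using append or random writes, avoiding the performance bottleneck of traditional object storage, which supports only overwrites. This greatly improves the write efficiency of the Alluxio cache layer.
Through the deep integration of Alluxio and XEOS, unnecessary data movement between the two is minimized. This provides more efficient data writes, improves overall write performance, significantly reduces Alluxio's own write overhead, and optimizes the system's resource utilization.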
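A back-of-the-envelope model shows why append support matters for frequently updated objects. The sketch below assumes the simplest overwrite-only behavior, where every update re-uploads the whole object; real systems may soften this with techniques such as multipart uploads, so treat the numbers as illustrative only. The function names are hypothetical.

```python
def bytes_written_overwrite(chunk_sizes):
    """Overwrite-only store: each update re-uploads the whole object so far."""
    total = 0
    object_size = 0
    for chunk in chunk_sizes:
        object_size += chunk
        total += object_size  # full object is written again
    return total


def bytes_written_append(chunk_sizes):
    """Append-capable store: each update writes only the new chunk."""
    return sum(chunk_sizes)


chunks = [1] * 100  # 100 updates of 1 unit each (unit is arbitrary, e.g. MiB)
print(bytes_written_overwrite(chunks))  # 5050
print(bytes_written_append(chunks))     # 100
```

Under this model the overwrite cost grows quadratically with the number of updates, while the append cost grows linearly.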
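With its strong data lifecycle management and data flow capabilities, XEOS provides powerful support for Alluxio. In the latest 6.4 release, XEOS further strengthens these core advantages by opening up its data flow API:
Through XEOS's comprehensive data flow API, applications can easily move data across different storage media, including cloud vendors, NAS, and Blu-ray/tape. Alluxio only needs to issue data flow rules, and XEOS takes care of executing the actual data replication, tiering, archiving, and other operations. This deep integration greatly reduces the overhead of Alluxio writing data back to the under file system (UFS) after reading it, lowering the overall data write overhead.
XEOS also provides rich data management features such as replication, tiering, verification, and QoS. By working closely with Alluxio, the two can optimize data storage and access policies according to actual needs, which not only improves data access performance but also greatly improves management efficiency across the entire data lifecycle.
The innovations in data flow and lifecycle management in XEOS 6.4 provide a solid foundation for Alluxio to build a high-performance, efficient, and intelligent AI data management platform.
Advantageous scenarios for AI:
Object storage as the foundation, with high-performance file access required
The joint Alluxio and XEOS solution is especially valuable for customers, particularly in AI scenarios, who use object storage as the foundation and need high-performance access over file protocols. The value shows in the following four points:
Together, Alluxio and XEOS combine the high-performance object storage interface of XEOS with Alluxio's high-performance shared cache layer close to the compute side, giving compute applications high-performance data access.
XEOS provides massive data storage capacity, and Alluxio provides a stateless, scalable shared cache. Both can scale as data storage and data access grow, supporting massive storage while also delivering high-performance data access at scale.
As a mature object storage product, XEOS uses a range of techniques to store massive data more economically, effectively lowering enterprise storage costs. Alluxio, through flexible deployment strategies, makes effective use of high-performance storage space on the enterprise's compute side to provide a shared cache close to compute, building high-performance data access without adding extra hardware cost.
XEOS provides a secure, reliable, high-performance, low-cost object storage platform, and Alluxio builds an efficient data caching and data access platform on top of it. In a sense, Alluxio acts as an extended client for XEOS, producing a 1+1>2 architecture.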
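√ Alluxio's unified namespace can provide unified access across multiple XEOS clusters, and across XEOS and other file/object storage systems.
√ Alluxio + XEOS can provide more efficient POSIX and S3 protocol interfaces.
√ Alluxio + XEOS can provide more complete security capabilities, including integration with Kerberos, Ranger, and others.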
This article is shared from the WeChat official account Alluxio (Alluxio_China).