Outlook
- AI Foundation Models: AI foundation models require more efficient data collection and preprocessing, higher-performance data loading and storage, and accurate inference knowledge bases. A new AI data paradigm, represented by near-memory computing and vector storage, is rapidly gaining momentum.
- Big Data: Big data applications are moving towards supporting real-time, precise, and intelligent decision-making. The new data paradigm represented by near-memory computing will greatly improve the analytical efficiency of lakehouse big data platforms.
- Distributed Databases: Open-source distributed databases are serving more mission-critical enterprise applications. A new high-performance, highly reliable architecture is being built on distributed databases and shared storage.
- Cloud Native: Multi-cloud has become the new normal for enterprise data centers. Data storage needs to support new cloud-native applications. The STaaS business model is expanding beyond public clouds into enterprise data centers.
- Unstructured Data: By 2025, the global data volume will reach 180 ZB, of which over 80% is unstructured data. Unstructured data is widely used in enterprises and is becoming production and decision-making data.
- Intrinsic Resilience of Storage: AI foundation model applications aggregate massive amounts of private domain data, exposing enterprises to increasing data resilience risks. There is an urgent need for a comprehensive data resilience system built on the intrinsic resilience of storage.
- All-Scenario Inclusive All-Flash Storage: All-flash storage features high performance, superb reliability, and optimal TCO. It can be used to build all-flash data centers to replace both high-performance and large-capacity HDDs.
- Data-Centric Architecture: AI foundation models are transforming the compute-storage architecture of data centers from CPU-centric to data-centric. A new system architecture and ecosystem are being built.
- AI-Powered Storage: AI technologies are becoming increasingly integrated into data storage products and their management, greatly improving the service level of data infrastructure.
- Energy-Saving Storage: Energy-saving initiatives for data storage are seeing wider implementation. Data storage typically accounts for over 30% of a data center's energy consumption. Energy consumption indicators are now being integrated into construction standards.
Recommendations
- Adopt a data lake construction model for foundation models and upgrade the performance of existing data lake storage.
- Build forward-looking data infrastructures that include all-flash storage, new data paradigms, and intrinsic resilience of storage.
- Set up a team to design joint solutions for big data platforms and storage, and establish a mechanism for regular collaboration.
- Explore the new data paradigm to achieve real-time (T+0) decision-making as big data platforms evolve towards lakehouses on the basis of decoupled storage-compute architecture.
- Consistently promote the decoupled storage-compute architecture for distributed databases.
- Encourage the database team and storage team to jointly incubate a new data paradigm.
- Migrate innovative services that have uncertainties to public clouds while retaining core services in on-prem data centers.
- Build agile and highly reliable container platforms and develop best practices for containerized applications.
- Develop an open and decoupled architecture for cloud construction.
- Strengthen the enterprise IT team's capabilities for processing mass unstructured data.
- Choose professional scale-out storage to build a foundation for mass unstructured data.
- Include storage resilience capabilities in enterprise construction plans.
- Enhance the software and hardware resilience capabilities of storage devices.
- Prioritize the deployment of data resilience capabilities such as encryption and ransomware protection.
- Tailor all-flash storage plans to current and future enterprise data volumes and requirements.
- Seize opportunities to replace legacy storage with all-flash models.
- Keep pace with the evolution of server and storage architectures, make timely adjustments, and seek opportunities from new storage.
- Clearly define service model indicators and SLA requirements, and develop new evaluation standards when new platforms and technologies are introduced.
- Leverage the AI capabilities provided by storage vendors and work together on continuous AI capability improvement.
- Shift focus from the current power consumption of a single device to carbon emissions throughout the device's entire lifecycle.
- Comply with unified energy efficiency evaluation standards.
- Encourage storage vendors to innovate for lower power consumption.
Trend Analysis
AI has developed rapidly, far beyond expectations, and AI foundation models are evolving into industry-specific models. Data determines the development of AI. As the carrier of data, data storage has become critical infrastructure for AI foundation models and is essential at every stage of their lifecycle: data collection, preprocessing, training, and inference.
Suggestions
1. Build a reliable foundation model infrastructure that attaches equal importance to compute and storage.
2. Adopt a data lake construction model for foundation models that share the same data sources as HPC and big data, and upgrade the performance of existing data lake storage.
3. Build forward-looking data infrastructures that include all-flash storage, data-centric architecture, data fabric, new data paradigms (vector storage and near-memory computing), and intrinsic resilience of storage.
4. Consider a one-stop training/inference HCI appliance for enterprise segments.
5. Build a professional technical team with strong AI foundation model skills, particularly in storage.