Striding Towards the Intelligent World White Paper

Data Storage

New Data Paradigm Unleashes the Power of AI

AI foundation models have surpassed our wildest imaginations, propelling us into a world of unparalleled intelligence. The three elements of AI are computing power, algorithms, and data. Computing power and algorithms are tools used to serve AI foundation models, while the scale and quality of data truly determine the height of AI. Data storage enables disparate information to be collated into corpuses and knowledge bases. Together with computing, data storage is becoming the most important part of infrastructure for AI foundation models.

Outlook

  • AI Foundation Models: AI foundation models require more efficient data collection and preprocessing, higher-performance data loading and storage, and accurate inference knowledge bases. A new AI data paradigm, represented by near-memory computing and vector storage, is rapidly gaining momentum.
  • Big Data: Big data applications are now moving towards supporting real-time, precise, and intelligent decision-making. The new data paradigm represented by near-memory computing will greatly improve the analytical efficiency of the lakehouse big data platform.
  • Distributed Databases: Open-source distributed databases are serving more mission-critical enterprise applications. A new architecture that features high performance and reliability is being created based on distributed databases and shared storage.
  • Cloud Native: Multi-cloud has become the new normal for enterprise data centers. Data storage needs to be able to support new cloud-native applications. The STaaS business model is expanding its reach beyond public clouds and into enterprise data centers.
  • Unstructured Data: By 2025, the global data volume will reach 180 ZB, of which over 80% will be unstructured data. Unstructured data is widely used in enterprises and is becoming production and decision-making data.
  • Intrinsic Resilience of Storage: AI foundation model applications aggregate massive amounts of private domain data, exposing enterprises to increasing data resilience risk. There is an urgent need for a comprehensive data resilience system with intrinsic resilience of storage.
  • All-Scenario Inclusive All-Flash Storage: All-flash storage features high performance, superb reliability, and optimal TCO. It can be used to build all-flash data centers to replace both high-performance and large-capacity HDDs.
  • Data-Centric Architecture: AI foundation models are transforming the compute-storage architecture of data centers from CPU-centric to data-centric. A new system architecture and ecosystem are being built.
  • AI-Powered Storage: AI technologies are becoming increasingly integrated into data storage products and their management, greatly improving the service level of data infrastructure.
  • Energy-Saving Storage: Energy-saving initiatives for data storage are seeing wider implementation. Data storage typically accounts for over 30% of a data center's energy consumption. Energy consumption indicators are now being integrated into construction standards.

Recommendations

AI Foundation Models
  • Adopt data lake construction mode for foundation models and upgrade the performance of the current data lake storage.
  • Build forward-looking data infrastructures that include all-flash storage, new data paradigms, and intrinsic resilience of storage.
Big Data
  • Set up a team to design joint solutions for big data platforms and storage and develop a mechanism for regular teamwork.
  • Explore the new data paradigm to achieve real-time (T+0) decision-making as big data platforms evolve towards lakehouses on the basis of decoupled storage-compute architecture.
Distributed Databases
  • Consistently promote the decoupled storage-compute architecture for distributed databases.
  • Encourage the database team and storage team to jointly incubate a new data paradigm.
Cloud Native
  • Migrate innovative services that have uncertainties to public clouds while retaining core services in on-prem data centers.
  • Build agile and highly reliable container platforms and develop best practices for containerized applications.
  • Develop an open and decoupled architecture for cloud construction.
Unstructured Data
  • Enterprise IT teams should strengthen their mass unstructured data processing capabilities.
  • Choose professional scale-out storage to build a foundation for mass unstructured data.
Intrinsic Resilience of Storage
  • Include storage resilience capabilities in enterprise construction plans.
  • Enhance the software and hardware resilience capabilities of storage devices.
  • Prioritize the deployment of data resilience capabilities such as encryption and ransomware protection.
All-Scenario Inclusive All-Flash Storage
  • Tailor all-flash storage plans to current and future enterprise data volumes and requirements.
  • Seize opportunities to replace legacy storage with all-flash models.
Data-Centric Architecture
  • Keep pace with the evolution of server and storage architectures, make timely adjustments, and seek opportunities from new storage.
AI-Powered Storage
  • Clearly define service model indicators and SLA requirements, and develop new evaluation standards systems once new platforms and technologies are introduced.
  • Leverage the AI capabilities provided by storage vendors and work together on continuous AI capability improvement.
Energy-Saving Storage
  • Shift focus from the current power consumption of a single device to carbon emissions throughout the device's entire lifecycle.
  • Comply with unified energy efficiency evaluation standards.
  • Encourage storage vendors to innovate for lower power consumption.

Trend Analysis

AI Foundation Models

Trend Analysis

AI has developed rapidly and far beyond expectations. AI foundation models are evolving into industry-specific models. Data determines the development of AI. Data storage serves as the carrier of data and has become a form of critical infrastructure for AI foundation models. In addition, data storage is essential for data collection, preprocessing, training, and inference of AI foundation models.

Suggestions

1. Build a reliable foundation model infrastructure that attaches equal importance to compute and storage.

2. Adopt data lake construction mode for foundation models that share the same data sources as HPC and big data, and upgrade the performance of the current data lake storage.

3. Build forward-looking data infrastructures that include all-flash storage, data-centric architecture, data fabric, new data paradigms (vector storage and near-memory computing), and intrinsic resilience of storage.

4. For enterprise segments, a one-stop training/inference HCI appliance is recommended.

5. Create a professional technical team with enhanced skills in AI foundation models, particularly in storage.
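
The suggestions above name vector storage as part of the new data paradigm for inference knowledge bases. As a minimal sketch of the underlying idea, and not a description of any vendor's product, the example below keeps embedding vectors in memory and retrieves the closest entries by cosine similarity. The class name `TinyVectorStore`, the keys, and the three-dimensional vectors are hypothetical and chosen only for illustration; real systems use learned embeddings and approximate indexes.

```python
import math


class TinyVectorStore:
    """Toy in-memory vector store (hypothetical, for illustration only)."""

    def __init__(self):
        self._items = []  # list of (key, unit-length vector) pairs

    @staticmethod
    def _normalize(vec):
        # Scale the vector to unit length so a dot product equals cosine similarity.
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0:
            raise ValueError("zero vector cannot be normalized")
        return [x / norm for x in vec]

    def add(self, key, vec):
        self._items.append((key, self._normalize(vec)))

    def search(self, query, k=1):
        # Brute-force scan: score every stored vector, return the k best keys.
        q = self._normalize(query)
        scored = [(sum(a * b for a, b in zip(q, v)), key) for key, v in self._items]
        scored.sort(reverse=True)
        return [key for _, key in scored[:k]]


store = TinyVectorStore()
store.add("storage-faq", [0.9, 0.1, 0.0])   # made-up embeddings
store.add("hr-policy", [0.0, 0.2, 0.9])
store.add("backup-guide", [0.7, 0.6, 0.1])

top = store.search([1.0, 0.2, 0.0], k=2)
print(top)
```

The brute-force scan is the simplest correct form of the lookup; it is used here only to show what a vector store is asked to do, not how it should be scaled.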

Big Data

Trend Analysis

Data lake storage is key to enabling big data applications to support real-time, precise, and intelligent decision-making, and to driving big data platforms towards a lakehouse architecture.

Access for diverse workloads is the basic feature of new data lake storage. Data lake storage supports near-data computing, and the new data paradigm allows big data to support applications more efficiently.

Suggestions

1. Enterprises should focus on innovative collaboration between big data platforms and storage to promote real-time data analytics.

2. Set up a team to design joint solutions for big data platforms and storage and develop a mechanism for regular teamwork.

3. Explore the new data paradigm to achieve real-time (T+0) decision-making as big data platforms evolve towards lakehouses on the basis of decoupled storage-compute architecture.

Distributed Databases

Trend Analysis

Distributed databases built on an open-source ecosystem are replacing traditional core systems in order to better suit service changes, achieve higher efficiency at a lower cost, and facilitate long-term technology evolution.

Stability is the top consideration for core databases. Performance, functionality, and energy efficiency are also important appraisal criteria. The decoupled storage-compute architecture has become the de facto standard for distributed databases. Distributed databases are driving the development of a new data paradigm.

Suggestions

1. Consistently promote the decoupled storage-compute architecture for distributed databases.

2. Encourage the database team and storage team to jointly incubate a new data paradigm.

Cloud Native

Trend Analysis

The multi-cloud architecture has become the new normal for enterprise IT. As container-based cloud-native applications are increasingly used in key services, storage support for containers will become a necessity.

As enterprises continue pursuing multi-cloud construction and increasingly demand optimal services with lower cost and higher efficiency, cloud-native infrastructures are becoming more open and decoupled.

The business model of clouds is shifting from CAPEX-based to OPEX-based, which is simultaneously reshaping the business model of data storage for enterprises. Data storage will strike a balance between CAPEX and OPEX business models.

Suggestions

1. Migrate innovative services that have uncertainties, along with emerging services like OA, to public clouds while retaining core services in on-premises data centers.

2. Container platform teams should collaborate with storage teams to build agile and highly reliable container platforms and develop best practices for containerized applications.

3. Develop an open and decoupled architecture for cloud construction.

4. Select suitable business models based on enterprise and service requirements.

Unstructured Data

Trend Analysis

New applications will give rise to mass unstructured data, and AI foundation models will accelerate the use of unstructured data in production and decision-making systems.

A growing number of industries are looking for professional-grade scale-out storage solutions for enterprise data centers to efficiently and securely store unstructured data.

Suggestions

1. Enterprise IT teams should strengthen their mass unstructured data processing capabilities.

2. Choose professional scale-out storage to build a foundation for mass unstructured data.

Intrinsic Resilience of Storage

Trend Analysis

The risks associated with a lack of resilience continue to grow as we move to the AI era and begin to aggregate massive amounts of data. Consequently, enterprises are embracing both data resilience and network resilience as key parts of their protection systems.

The intrinsic resilience of storage is built upon an inherently resilient architecture. It has been designed to enhance both storage device resilience and data resilience.

Suggestions

1. Include storage resilience capabilities in enterprise construction plans when working on network resilience protection projects.

2. Enhance the software and hardware resilience capabilities of storage devices to improve their overall protection capabilities.

3. Prioritize the deployment of data resilience capabilities such as encryption and ransomware protection for storage devices.

All-Scenario Inclusive All-Flash Storage

Trend Analysis

In 2022, SSDs had more than double the market share (over 65%) and shipments of HDDs, a strong sign that enterprises are embracing all-flash storage.

The use of mass unstructured data in production and decision-making systems signals a new era of all-flash storage. Higher-performance all-flash storage has a clear TCO advantage and significantly improves enterprise efficiency and service experience.

Suggestions

1. Tailor all-flash storage plans to current and future enterprise data volumes and requirements.

2. Seize opportunities to replace legacy storage with all-flash models.

Data-Centric Architecture

Trend Analysis

In the future, applications such as AI and big data will require higher performance and lower latency, while CPU performance growth may slow down. With the development of the composable server architecture, the storage architecture will also evolve into a data-centric composable architecture to greatly improve the performance of storage systems. The processors, memory pools, and flash pools of storage systems will be interconnected through new data buses, so that data access is no longer slowed by limited CPU performance.

Suggestions

1. Keep pace with the evolution of server and storage architectures, make timely adjustments, and seek opportunities from new storage.

AI-Powered Storage

Trend Analysis

Storage vendors are adopting diverse disruptive innovations to optimize storage SLA management.

AI capabilities need to be fully unlocked to enable an AI management architecture. AI technologies are no longer limited to the monitoring and O&M of storage devices; they are also adding intelligence to storage products from the bottom up. Storage vendors prioritize product intelligence to optimize storage device efficiency and reliability.

Suggestions

1. Clearly define service model indicators and SLA requirements, and develop new evaluation standards systems once new platforms and technologies are introduced.

2. Leverage the AI capabilities provided by storage vendors and work together on continuous AI capability improvement.

3. Update the capability model of enterprise IT teams and provide comprehensive pre-training for employees.

Energy-Saving Storage

Trend Analysis

Organizations around the world are working to meet carbon peak and neutrality goals, and this starts with data centers. More than 30% of a data center's energy consumption goes to storage. Therefore, to build sustainable data centers, we need to focus on reducing the energy consumption of IT equipment, particularly storage devices, in addition to lowering power usage effectiveness (PUE).

Innovations in energy-saving storage technologies are a catalyst for the low-carbon development of data centers.
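
The relationship between PUE and IT equipment energy can be written out to show why both levers matter. PUE is the ratio of total facility energy to IT equipment energy; the symbols below are introduced here for illustration and do not appear in the white paper.

```latex
\[
\mathrm{PUE} = \frac{E_{\text{facility}}}{E_{\text{IT}}}
\quad\Longrightarrow\quad
E_{\text{facility}} = \mathrm{PUE} \times E_{\text{IT}}
\]
```

Under the simplifying assumption that PUE stays constant, every kilowatt-hour saved in IT equipment such as storage reduces total facility energy by PUE kilowatt-hours, so cutting storage energy complements, rather than replaces, efforts to lower PUE.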

Suggestions

1. Shift focus from the current power consumption of a single device to carbon emissions throughout the device's entire lifecycle.

2. Comply with unified energy efficiency evaluation standards.

3. Encourage storage vendors to innovate for lower power consumption.