Bank of China: Pioneering Deployment of New-gen Storage
Bank of China (BOC) provides an excellent reference case for the latest generation of intelligent lossless storage networks.
Huawei Tech Issue 93
From mobile payments and digital currency to online banking, digital technologies have reshaped everyday financial transactions. Fintech is a key driver of the digital transformation that is changing business models in the banking sector. New technologies like cloud computing, AI, big data, 5G, and blockchain open up new opportunities for banks.
Back in 2018, BOC released the Technology-led Digital Development Strategy, placing digital technology at the top of its new strategic agenda. The purpose of the strategy was to build a digital bank that supports excellent user experience, a diverse ecosystem, online and offline collaboration, flexible product innovation, efficient operation management, and intelligent risk control.
The IT infrastructure underpinning fintech has become increasingly important. The demand for digital, personal services at the front end has grown considerably, as has the need for stronger computing power and higher stability and flexibility at the back end. Satisfying this demand entails restructuring financial IT infrastructure, including the deployment of an all-cloud, distributed architecture and mainframe offload.
To overhaul financial IT infrastructure, investment in new technologies like financial cloud, big data, and IT isn’t enough. Core business systems such as accounting systems and online transaction systems also need to be upgraded.
The need for stable, fast-response data storage, real-time disaster recovery, and intra-city data backup necessitates building highly available financial IT systems. Following a comprehensive review of its IT infrastructure architecture, BOC recognized that it needed to transform its legacy data storage system, which was built on Fibre Channel (FC).
This decision to upgrade was driven by three key factors:
Service growth: In the first nine months of 2020, BOC saw considerable growth in loans to micro and small enterprises and in personal financial services, while growth in the number of large corporate customers remained steady. Across the 62 countries and regions in which BOC operates, online transactions like mobile banking were growing rapidly, mainly due to the bank's efforts to empower its business using technology. This growth also reflected a shift in BOC's business model.
As services grew, the demand for computing and storage servers also grew exponentially, and the daily data generated by transaction systems was rising by terabytes. The average daily intra-city data backup of BOC's service systems increased from 500 terabytes in 2018 to several petabytes in 2020, representing year-on-year growth of over 30% for three straight years. BOC's multi-site, multi-center distributed architecture and high-speed data exchange within and between data centers posed great challenges to its existing Fibre Channel storage area network (FC-SAN), which could only deliver a rate of 8 or 16 Gbit/s.
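To give a feel for why 8 or 16 Gbit/s links struggle with petabyte-scale daily volumes, the following minimal sketch estimates how long one link would need to move one petabyte. It is a back-of-the-envelope illustration, not a measurement from BOC: it assumes a single link running at its ideal line rate with no protocol overhead, and a decimal petabyte (10^15 bytes). The function name `transfer_hours` is hypothetical.

```python
def transfer_hours(data_bytes: int, link_gbit_per_s: int) -> float:
    """Hours needed to move data_bytes over one link at ideal line rate.

    Assumption: no protocol overhead, no congestion, decimal units.
    """
    bits = data_bytes * 8
    seconds = bits / (link_gbit_per_s * 10**9)
    return seconds / 3600


ONE_PETABYTE = 10**15  # decimal petabyte, chosen for illustration

# Link rates mentioned in the article (Gbit/s).
for name, rate in [("8G FC", 8), ("16G FC", 16), ("25GE", 25), ("100GE", 100)]:
    print(f"{name:>6}: {transfer_hours(ONE_PETABYTE, rate):7.1f} h per PB")
```

Under these assumptions, a single 16 Gbit/s link would need roughly 139 hours per petabyte, against about 22 hours for a single 100 Gbit/s link, so daily petabyte-scale backup can only be met by running many links in parallel.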
Technological advances: This included advances in storage media and storage protocols.
For many applications, the superior performance of solid state drives (SSDs) has seen them replace hard disk drives (HDDs). Enterprises making the switch from HDDs to SSDs typically see a 100-fold increase in read/write IOPS, a decrease in latency from 2 ms to 0.2 ms, a five-fold reduction in annual failure rate, and energy savings of 87%.
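One way to relate the latency and IOPS figures above is Little's law, which says throughput equals the number of requests in flight divided by the time each request takes. The sketch below applies it to the article's latency numbers. The article does not say how its figures were measured, so the single-request HDD baseline and the function name `iops` are assumptions made purely for illustration.

```python
def iops(outstanding_requests: int, latency_us: int) -> float:
    """Little's law: throughput = requests in flight / time per request."""
    return outstanding_requests * 1_000_000 / latency_us


HDD_LATENCY_US = 2000  # 2 ms, from the article
SSD_LATENCY_US = 200   # 0.2 ms, from the article

# Assumed baseline: one request in flight at a time.
hdd = iops(1, HDD_LATENCY_US)
ssd_serial = iops(1, SSD_LATENCY_US)

# Requests that must be in flight for the SSD to reach 100x the HDD baseline.
target = 100 * hdd
needed_in_flight = target * SSD_LATENCY_US / 1_000_000

print(hdd, ssd_serial, target, needed_in_flight)
```

Under these assumptions, the drop in latency alone accounts for a 10-fold gain. The rest of a 100-fold gain would have to come from serving about ten requests in parallel, which is the kind of concurrency that a parallel protocol is designed to expose.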
The emergence of this storage medium has driven the shift from the serial SCSI protocol to the high-speed, parallel NVMe protocol. To deliver the high throughput and low latency that all-flash NVMe storage promises, the latest-generation storage network must guarantee zero packet loss while supporting remote direct memory access (RDMA).
Simpler storage network management: In the long run, all financial institutions will end up migrating their IT infrastructure to the cloud. Currently, services like Internet finance, mobile finance, and big data analytics have been cloudified, but core services such as online transactions have not. This is because FC-SAN storage networks run on independent systems and protocols, making it costly and challenging to migrate these services to the cloud. In any given city, the hundreds of 8G/16G FC links between a bank's different data centers occupy expensive WDM transmission channels, making FC a more costly option.
Another problem was that BOC sourced FC-SAN products from only two US companies, which meant that construction and maintenance costs were high and the bank found it difficult to have autonomy over its key business systems with FC-SAN.
Built on standard Ethernet IP networks and the open RoCE (RDMA over Converged Ethernet) protocol, BOC's RoCE-SAN is the banking industry's first deployment of a latest-generation intelligent lossless storage network.
As shown in Figure 1, RoCE-SAN uses Huawei CloudEngine data center switches and the OceanStor Dorado all-flash architecture. Tailored to BOC's usage needs, the solution incorporates several major technological breakthroughs, including intelligent cache management, precise per-flow speed control, and high-availability failover in under a second, satisfying financial companies’ needs for highly available storage networks.
Figure 1: Architecture of the latest-generation intelligent lossless storage network
On November 20, 2020, BOC's application project management platform and emergency O&M management platform went live.
Built on the RoCE-SAN architecture, the upgraded storage network provides three major improvements to the bank's business systems (including online transaction systems):
The new storage network currently supports 25GE access and 100GE uplink, and can evolve to support 100GE access and 400GE uplink. The network's larger capacity and bandwidth mean that the potential of all-flash storage, which provides millions of IOPS, can be fully unleashed, preparing BOC to handle petabytes of data every day. The new storage network can deliver lossless long-haul transmission between data centers that are more than 50 km apart. The intelligent algorithms built into the network have also increased the bandwidth utilization of the private lines that connect different data centers, cutting costs. All else being equal, the new storage system can deliver 85% higher throughput than the FC-SAN.
Figure 2: Performance comparison between RoCE and FC
With its intelligent traffic identification and proactive, differentiated control, the new network can increase bandwidth utilization and lower latency during congestion. Compared with the FC-SAN that BOC used previously, the network can reduce the latency during congestion caused by traffic surges by up to 50% (see Figure 3).
Figure 3: Changes in latency of a 256 KB data block
When a network fault occurs, the switch on the network can quickly detect changes in host, storage device, and network status, and notify the host of multipath failover. The whole process takes less than one second, ensuring high system reliability.
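The failover flow described above can be pictured from the host's side with a minimal sketch: the host keeps several paths to the same storage device and, when the network reports a fault on one of them, redirects I/O to a healthy path. This is an illustrative model only, not Huawei's or BOC's implementation. The class names `Path` and `MultipathDevice`, the method `on_fault_notification`, and the path names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    healthy: bool = True


@dataclass
class MultipathDevice:
    """Hypothetical host-side view of one storage device reachable over several paths."""
    paths: list
    active: int = 0  # index of the path currently carrying I/O

    def on_fault_notification(self, failed_path: str) -> str:
        """Handle a fault notification and return the name of the path now in use."""
        # Mark the reported path as down.
        for p in self.paths:
            if p.name == failed_path:
                p.healthy = False
        # Switch only if the path carrying I/O is the one that failed.
        if not self.paths[self.active].healthy:
            for i, p in enumerate(self.paths):
                if p.healthy:
                    self.active = i
                    break
            else:
                raise RuntimeError("no healthy path left")
        return self.paths[self.active].name


dev = MultipathDevice([Path("fabric-A"), Path("fabric-B")])
print(dev.on_fault_notification("fabric-A"))  # I/O moves to the surviving path
```

In this model the host reacts to a notification pushed by the network rather than waiting for its own I/O timeouts to expire, which is one plausible reason a notification-driven design can fail over quickly.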
The new storage network was built on universal Ethernet switches and runs on the IP and RoCE common storage network protocols. This makes the core business systems more open and standardized, while allowing the bank to have full autonomy over its core business.
The network can interoperate with the operating systems commonly used in the banking industry, enabling the bank to develop scenario-specific services with third parties based on open and standard APIs, contributing to a thriving ecosystem.
The latest-generation intelligent lossless storage network allows for seamless interconnection between the SAN in a data center and the LAN in common service scenarios and concurrent operations of both. This reduces management complexity and O&M costs while cloudifying storage servers, improving the automation and service agility of the entire IT infrastructure.
The new network provides high-performance networking with 100G access/400G forwarding, which can support the large-scale deployment of tens of thousands of storage servers on the distributed cloud platform.
In the future, BOC will continue to drive the evolution of its storage network to make it more stable, reliable, intelligent, and efficient. By building remote long-haul transmission and online data compression into the network, BOC will be able to develop a comprehensive geo-redundant disaster recovery system.
BOC's RoCE-SAN intelligent lossless storage network has already proven its value. However, this kind of network is a completely new technical domain, and more work is needed to further improve BOC's storage network.
BOC has pioneered the use of latest-generation intelligent lossless storage networks in the financial industry, a major breakthrough for the bank. The successful deployment of this storage network offers a clear example that others in the financial and banking sector can follow when they upgrade their own IT architecture.
Building on this network, BOC will consolidate the data foundation that supports its financial service systems and build itself into a world-class bank while continuing to pursue digital and intelligent transformation.