SoftCOM AI: Hard on the competition with zero faults
Huawei’s SoftCOM AI solution introduces AI to All Cloud Networks. Designed to create a self-driving network architecture, SoftCOM AI is a transformative solution that can help operators compete with OTT companies by using predictive AI to minimize network faults.
Competition drives innovation
Telecom services are divided into three tiers: device, network and IT infrastructure, and upper-layer applications. However, in today’s telecom landscape, cross-sector competition is threatening telco revenue models. Thanks to dramatic increases in network speeds, IT and Internet companies are offering cloud services in traditional telco territory: backbone networks, some MANs, IT infrastructure, and IT applications. If operators can provide top-tier cloud services, they can compete with the likes of AWS in the trillion-dollar cloud service market. If not, they will lose many of their traditional services, in particular data center leased lines.
Operators also face efficiency and cost challenges. OPEX for telecom equipment maintenance is about three times higher than CAPEX. Moreover, higher network complexity often means that engineers lack specialist knowledge and capabilities, resulting in 70 percent of major network faults being attributable to human error.
As Frank Qing, Chief Wireless Architect for Canada’s TELUS, once said, "We're using 21st century 4G networks, but network O&M is somehow like being stuck in the 18th century. Machine manufacturing has become automated but the telco industry is still using manual labor."
Product innovation alone isn’t enough to overcome the challenges facing operators. Boosting competitiveness requires innovations in system architecture, products, and business models.
What is system architecture innovation?
In cloud computing, it isn’t a breakthrough in a server or storage product. Instead, it’s system-level innovation based on new distributed systems that increase resource utilization. Innovations in products, system architecture, and business models are mutually reinforcing. To meet customer needs in the new era, Huawei has designed an innovation system covering these three areas.
On the product side, the principle behind Huawei's network equipment design is Olympic Spirit: high capacity and low latency. For system architecture, Huawei is looking at self-driving networks that are agile, automated, and smart. And we have two goals for business model innovation: one, for operators and Huawei to form one of the Five Clouds, and, two, to build an Online Smart Service model.
The aim of AI-driven autonomous networks is to create a self-driving network model with three features: agile devices, intelligent control, and intelligent analysis.
In telecom networks, the lower layer is network equipment and the upper layer is the control layer. For network-wide control and O&M, AI and segmented autonomous functions can achieve E2E functionality through the upper-layer operating system, thus enabling the entire network to become autonomous.
The biggest change realized by autonomous networks is that maintenance personnel are no longer involved in the entire service process. The entire network is self-driving in that it’s automated, self-optimizing, and self-healing.
The growth of the Industrial Internet has changed the business models of industrial giants such as Boeing, which now provides digital services like predictive maintenance, fuel management, and flight management. A similar service concept can be applied to telecommunications. Future networks will be fully automated at the operator side with Huawei providing fully automated AI-based online services. These services will be based on a continuous iterative AI model that’s available as a continually improving Model-as-a-Service.
Introducing AI to networks will bring new value from predictability. Telecom network management and the control center are based on device southbound interfaces and data collection. Various strategies and rules enable network-wide management and scheduling to fulfill three conditions for network automation: network reachability, SLA requirements, and resource efficiency.
However, as the network becomes increasingly complex, this isn’t enough. Online AI reasoning and data analysis are required to predict traffic, quality, and faults. Scheduling the network based on predictions of future conditions avoids faults before they occur, optimizes quality before it deteriorates, and adjusts traffic before congestion occurs. Thus, the core value offered by AI is zero faults.
Five phases, three areas
Developing a self-driving network is a long-term process that we’ve divided into five phases:
One: AI knows "what happened."
Two: AI can determine "why it happened."
Three: AI can predict "what will happen" supported by manual judgments and decisions.
Four: AI judges "what measures need to be taken", which are then carried out manually.
Five: Full automation enables self-healing.
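The five phases can be read as a progressive handover of work from people to AI. The sketch below is illustrative only: the names AutonomyPhase and human_role are hypothetical, and the human role shown for phases one and two is an assumption, since the list above only states the manual steps for phases three and four.

```python
from enum import IntEnum


class AutonomyPhase(IntEnum):
    """The five phases listed above (identifier names are hypothetical)."""
    WHAT_HAPPENED = 1     # AI knows what happened
    WHY_IT_HAPPENED = 2   # AI determines why it happened
    WHAT_WILL_HAPPEN = 3  # AI predicts; people judge and decide
    WHAT_TO_DO = 4        # AI chooses measures; people carry them out
    SELF_HEALING = 5      # full automation


def human_role(phase: AutonomyPhase) -> str:
    """Return the work still done manually at a given phase."""
    if phase <= AutonomyPhase.WHY_IT_HAPPENED:
        # Assumption: before AI can predict, people predict, decide, and act.
        return "predict, decide, and act"
    if phase == AutonomyPhase.WHAT_WILL_HAPPEN:
        return "decide and act"
    if phase == AutonomyPhase.WHAT_TO_DO:
        return "act"
    return "none"


if __name__ == "__main__":
    for phase in AutonomyPhase:
        print(phase.value, phase.name, "->", human_role(phase))
```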
Autonomous networks and Model-as-a-Service will provide end users with a minute-level ROADS experience, optimal network connections at all times, and networks with zero downtime. Operators will benefit from a doubling in efficiency in three areas: O&M, resource efficiency, and energy efficiency.
Doubling O&M efficiency: There are three levels of development in O&M. The first is Run-to-Failure (R2F). With R2F, O&M personnel rush to fix sudden faults when they occur in network operations. This is the lowest level of O&M. The second level is Preventive Maintenance (PvM). This involves routine inspections in which each item of equipment is checked to prevent failures, a method that is extremely inefficient. The third level is Predictive Maintenance (PdM), where the probability that a certain device will fail in the future can be predicted and targeted maintenance carried out.
With PdM, we hope to reduce alarm compression and fault location in networks by 90 percent and achieve a 90 percent prediction rate for key component failures and deterioration, taking a further step towards network self-healing.
More than 70 percent of network faults are caused by passive equipment problems such as fiber bends, device aging, and loose ports. AI can learn the characteristics of signal changes when problems like these occur and use them to drive predictive maintenance.
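As a rough illustration of the idea behind predictive maintenance, the sketch below fits a straight line to received optical power readings and estimates how long remains before the signal drops below an alarm threshold. This is a simple trend extrapolation standing in for a learned model, not the method SoftCOM AI uses. The function name, the sample readings, and the -14 dBm threshold are all assumptions made for the example.

```python
def hours_until_threshold(samples, threshold_dbm):
    """Estimate hours from the last sample until power falls to threshold_dbm.

    samples: list of (hour, power_dbm) readings in time order.
    Returns None when there is too little data or no downward trend.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_p = sum(p for _, p in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    # Ordinary least-squares slope and intercept.
    slope = sum((t - mean_t) * (p - mean_p) for t, p in samples) / var_t
    if slope >= 0:
        return None  # signal is stable or improving
    intercept = mean_p - slope * mean_t
    crossing_hour = (threshold_dbm - intercept) / slope
    return max(0.0, crossing_hour - samples[-1][0])


if __name__ == "__main__":
    # Hypothetical readings: power degrades by 0.5 dB per hour.
    readings = [(0, -10.0), (1, -10.5), (2, -11.0), (3, -11.5)]
    print(hours_until_threshold(readings, threshold_dbm=-14.0))
```

A maintenance visit can then be scheduled inside the estimated window rather than after the fault has occurred.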
Doubling resource efficiency: Currently, networks are constructed before data traffic begins to flow, sometimes leading to poor resource utilization. If the problem is approached the other way round, with network scheduling based on flow direction, resource utilization would be much higher.
AI can build traffic prediction models to accurately predict traffic and thus identify the best network topologies, in which network paths are determined by traffic direction rather than physical connections.
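A minimal sketch of prediction-driven path selection follows. It assumes simple exponential smoothing as the traffic forecast, which is a stand-in for the prediction models described here, and it picks the path whose busiest link has the lowest predicted utilization. The function names, link identifiers, and utilization figures are hypothetical.

```python
def forecast_next(history, alpha=0.5):
    """Forecast the next utilization value by simple exponential smoothing."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def pick_path(paths, link_history, alpha=0.5):
    """Choose the path with the lowest predicted bottleneck utilization.

    paths: dict mapping a path name to the list of link ids it crosses.
    link_history: dict mapping a link id to its past utilization (0.0 to 1.0).
    """
    def bottleneck(links):
        return max(forecast_next(link_history[link], alpha) for link in links)

    return min(paths, key=lambda name: bottleneck(paths[name]))


if __name__ == "__main__":
    history = {
        "a": [0.2, 0.4, 0.8],  # load rising
        "b": [0.5, 0.5, 0.5],  # load steady
        "c": [0.9, 0.6, 0.3],  # load falling
    }
    candidate_paths = {"north": ["a"], "south": ["b", "c"]}
    print(pick_path(candidate_paths, history))
```

The selection depends on forecast load rather than on the shortest physical route, which is the sense in which paths follow traffic direction.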
Doubling energy efficiency: To achieve this, bits can manage watts; that is, network traffic can determine energy consumption. In equipment rooms and base stations, each system has dozens of parameters. AI can be trained to generate cooling, environment, and service load models to optimize efficiency for lighting, temperature, equipment generators, solar power, and batteries.
At the equipment level, dynamic energy delivery can be based on service loads. When there’s no traffic, methods such as timeslot shutoff, RF deep sleep, and carrier frequency shutoff can reduce power consumption, coupled with dynamic energy-saving management for data center servers and other equipment.
On the network system side, accurate service load prediction models can optimize all network traffic for optimal energy efficiency.
Building and training autonomy
SoftCOM AI is Huawei's target architecture for self-driving networks based on introducing AI tech and capabilities in three layers: device and cloud infrastructure; network management and control center; and network O&M system. These three layers will achieve the E2E smartification and automation of network planning, deployment, operation, maintenance, optimization, and business operations.
Huawei is also planning an AI training platform for operators that can train AI using data from network equipment that is sent to the platform. Models will be continuously updated and optimized to help improve the level of automation in the network system.
Example: SoftCOM AI in an optical network
SoftCOM AI can enable the whole service development process. The first requirement is the data foundation, which determines what kind of data is needed. For an optical network, this includes fiber optic data, optical signal data, and optical routing data. The next requirement is enabling technology, or AI algorithms, including algorithms for data cleansing, integrating information, machine learning modeling, and deep learning.
A large number of models also need to be built to enable a "self-driving" optical network. These include fiber optic and filter models.
The final requirement is service application scenarios. These include the initial automated inspection of optical fiber, service provisioning, network optimization, fault location, and automated resource scheduling. Models will be able to find optimal approaches, enabling fast provisioning, simpler O&M, and smart operations. Smartification will improve network scheduling efficiency. With zero-wait, zero-touch, and zero-experience, people won't even feel the network is there.
The future is intelligence. But network smartification won't be achieved overnight. SoftCOM AI is a part of Huawei's All Intelligence strategy in the telecoms sector to help operators create automated networks that never fail and act as a springboard for digital transformation.