Future Technologies
AI in the 5G-A Era: Scenarios, Key Technologies, and Evolution Trends
This paper explores the evolution trends of AI, analyzes its key values in 5G-A networks, and discusses emerging application scenarios.

By Research Dept, WN, Huawei: Yingpei Lin, Yan Chen, Yi Qin, Yan Sun, Rui Xu, Yuwen Yang, Zhengming Zhang, Jiaxuan Chen, Yang Tian, Youlong Cao, Xiaomeng Chai, Hongzhi Chen, Hong Qi, Xu Pang
1 Introduction
The development of communications technologies has always played an important role in driving social transformation. 5G-A networks, the latest generation of communications networks, provide a broad platform for various emerging technologies thanks to their high throughput, low latency, and high reliability. With the integration of artificial intelligence (AI) and 5G-A networks, a revolutionary leap in the communications field is now possible.
Featuring powerful data processing capabilities and intelligent decision-making, AI has become the core driving force behind today's technological revolution. Its influence is continuously expanding from basic theoretical research to extensive industry applications. In the mobile communications field, AI applications are already reshaping the network architecture and service mode. This paper delves into the key roles and development trends of AI technologies in the 5G-A era, analyzes how AI helps improve 5G-A network performance, and explores the potential application of AI technologies in the future communications field.
In this paper, we review the development of AI models, computing power, data, and application paradigms, summarize the AI evolution trend (see Section 2), and analyze the key values of AI in 5G-A networks (see Section 3). We also discuss emerging application scenarios like artificial intelligence generated content (AIGC) and embodied intelligence in the AI era. Then, we identify the key requirements and challenges of these application scenarios for 5G-A networks (see Section 4), and describe how to address these challenges by leveraging AI to improve 5G-A network performance and enabling 5G-A networks to provide high-quality AI services (see Section 5). Finally, we conclude this paper by considering the deep integration of AI and 5G-A networks in the future and envisioning a new era of intelligent and personalized communications (see Section 6).
2 AI Evolution Trends
In addition to being the core driving force behind today's technological revolution, AI also has a significant and far-reaching impact on various industries by extracting high-value information from data. AI is widely used in various fields and, as a multi-disciplinary subject, it involves research into basic theories and cutting-edge technologies, creating significant influence and business value. Breakthroughs in AI technologies, especially in industries such as the mobile Internet, have not only redefined what is possible, but also brought about profound changes in society and the economy. Incredible breakthroughs, skepticism, and constant evolution form the history of AI. As we prepare for the further development of AI, we are fully aware of its infinite potential. And with the advent of the 5G-A era, the explosive development of AI requires us to actively embrace this technology in order to achieve continuous breakthroughs and progress.
2.1 AI Models
AI models represented by neural network models were first proposed by Warren Sturgis McCulloch and Walter Pitts in 1943. Then in 1958, following years of exploration and development, perceptron-based AI models were proposed by Frank Rosenblatt, setting off a new wave of development. Since then, machine hardware has undergone rapid iterations and updates, while both computing capability and storage space have been significantly improved. At the same time, AI models have gradually been adopted in various fields and attracted attention from all sectors of industry. Multi-layer perceptrons, created by adding more layers to the perceptron, also emerged; they could run on hardware and be used for image recognition.
Researchers found that multi-layer perceptrons had strong potential in solving complex problems such as image recognition and began inventing new AI models to handle data like text. In 1997, the long short-term memory (LSTM) recurrent neural network was proposed and had a profound impact on subsequent AI research. Thanks to iterative upgrades of computing and storage devices, such networks became practical to train. The LSTM architecture is built around memory cells controlled by three gates (an input gate, a forget gate, and an output gate), which determine what information is added to, removed from, and output from the memory. As a representative of recurrent neural networks, LSTM shows great ability in dealing with long sequence problems. It has become a classical neural network architecture for sequence tasks, such as text classification, sentiment analysis, speech recognition, image caption generation, and machine translation. However, this type of AI model requires large-scale parameters and high computing costs to achieve optimal results, and it cannot meet requirements when computing power is limited.
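For reference, one standard formulation of the LSTM cell update, with $\sigma$ denoting the sigmoid function and $\odot$ element-wise multiplication, is:
$$
\begin{aligned}
f_t &= \sigma\left(W_f\left[h_{t-1}, x_t\right]+b_f\right), \quad i_t=\sigma\left(W_i\left[h_{t-1}, x_t\right]+b_i\right), \quad o_t=\sigma\left(W_o\left[h_{t-1}, x_t\right]+b_o\right) \\
\tilde{c}_t &= \tanh \left(W_c\left[h_{t-1}, x_t\right]+b_c\right), \quad c_t=f_t \odot c_{t-1}+i_t \odot \tilde{c}_t, \quad h_t=o_t \odot \tanh \left(c_t\right)
\end{aligned}
$$
Here, the forget, input, and output gates $f_t$, $i_t$, and $o_t$ control what the cell state $c_t$ retains, absorbs, and exposes as the hidden output $h_t$.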
In 2006, Geoffrey Hinton proposed the deep belief network (DBN), built from stacked restricted Boltzmann machines (RBMs), to train multi-layer neural networks, and the training of such multi-layer neural networks became known as deep learning (DL). It was at this moment that AI models entered the deep neural network (DNN) era. AI models with more layers and smarter structures based on DNNs emerged and achieved exciting results. For example, AlexNet, which consists of five convolutional layers (with max pooling) and three fully connected layers followed by a softmax output, won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012. The industry quickly realized that deep convolutional neural networks (DCNNs) are adept at handling visual recognition tasks. Thus ensued an era of large-scale development and application of convolutional neural networks (CNNs). The most representative models, the visual geometry group (VGG) network and ResNet, were proposed successively and improved the performance of computer vision tasks to an unprecedented level. AI models had evolved from simple two-layer neural networks to VGG and ResNet models with more than 10 and 50 convolutional layers, respectively. Additionally, the capability of AI models had progressed from simple tasks such as optical character recognition (OCR) to advanced tasks like semantic segmentation and instance segmentation. However, researchers noticed that these CNNs still had certain limitations in natural language processing (NLP) tasks, and AI models underperformed in understanding natural language.
With the emergence of the Transformer, AI models have made significant breakthroughs in natural language understanding. The Transformer is a neural network model based on the attention mechanism and uses neither recurrence nor convolution. This type of model consists of multi-head attention, residual connections, layer normalization, fully connected layers, and positional encoding. It mines the correlation of sequence information while retaining the sequence order. The Transformer has revolutionized NLP and has quickly become a core AI model that dominates other fields (e.g., computer vision). In NLP, this type of AI model is often used for machine translation, text summarization, speech recognition, text completion, and document search. It was a precursor to large language models (LLMs) such as ChatGPT. Released by OpenAI in November 2022, ChatGPT has become a phenomenal application of generative AI. This AI model is based on the GPT-3.5 architecture and fine-tuned through reinforcement learning from human feedback. It marks the transformation of AI models from image recognition to image understanding and then to generative AI. The number of parameters in such AI models has also grown from thousands to millions or even billions.
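The core operation behind multi-head attention is scaled dot-product attention, which, for query, key, and value matrices $Q$, $K$, and $V$ with key dimension $d_k$, is computed as
$$\operatorname{Attention}(Q, K, V)=\operatorname{softmax}\left(\frac{Q K^{\mathrm{T}}}{\sqrt{d_{k}}}\right) V$$
Each attention head applies this operation with its own learned projections, and the heads' outputs are concatenated and projected again.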
2.2 AI Computing Power
Computing power is a key driving force for AI development and progress. Over the past decade, the amount of computing power used to train AI models has increased by 350 million times. Many of the advances in AI stem from the significant increase in the computing power used to train and run AI models. Developers have harnessed immense computing power to train LLMs, AlphaGo, protein folding models, and autonomous driving models on huge datasets, adopting a different approach from the previous training and deployment of small-scale AI models. As a result, AI models are now capable of solving increasingly complex problems. In many AI fields, researchers have found a scaling law: performance on the training objective (such as predicting the next word) improves predictably as the amount of computing power used to train the model increases.
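One commonly cited empirical form of this scaling law expresses the training loss $L$ as a power law in the training compute $C$ (the constants are task- and model-dependent; the form is shown here for illustration only):
$$L(C) \approx\left(\frac{C_{0}}{C}\right)^{\alpha}, \quad \alpha>0$$
where $C_0$ is a fitted constant, so doubling the compute yields a predictable relative reduction in loss.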
Thanks to hardware improvements, the computing power of end devices, base stations, and clouds has been improved to an unprecedented level. With the development of advanced computing units such as CPUs, GPUs, and TPUs, computing is not limited to traditional data centers. Instead, it is gradually moving to the edge, especially in full-stack scenarios. Thanks to the deployment of intelligent devices with computing capabilities, clouds, edges, and devices can jointly provide computing power for AI, effectively improving the computing efficiency and performance. Clouds, which store the collected data in data centers, perform computing and processing at central points, while base stations are edge computing devices that offer powerful data computing and processing capabilities. With end devices becoming more intelligent and AI models being iteratively upgraded, the performance of intelligent device processors has been significantly improved, and the computing power of end devices has been enhanced.
2.3 AI Data
Training AI models requires high-quality and large-scale data, which is one of the decisive factors for AI success. In the early stage of AI development, researchers manually built extremely limited datasets for AI model training and evaluation. However, as AI models grew in size and capabilities, researchers found that training data had become a bottleneck for achieving higher-performance AI models. The demand for high-quality training data has become a major challenge for AI development. For example, existing large-scale AI language models are built using text obtained from the Internet. Such text includes scientific research, news reports, and Wikipedia entries, and is further decomposed into lexical elements (tokens). According to researchers, the training data used by the GPT-4 model contains up to 12 trillion lexical elements. If AI models continue to grow at the same pace, it is possible that training will require 60 to 100 trillion lexical elements. To address issues with data exhaustion, the AI field demands stronger data acquisition capabilities.
2.4 AI Application Paradigms
When AI first emerged, AI applications mainly focused on function fitting. That is, neural network models were used to simulate specific functions or tasks, such as classification, prediction, and optimization. These applications were usually based on rules or statistical models, with the goal of improving efficiency and accuracy. Nowadays though, with the development of big data technologies, AI applications are shifting to data-driven models, which rely on large amounts of data to train algorithms and implement more accurate function fitting.
AI foundation models have developed rapidly in recent years, enabling AI to process data more efficiently and create new data and content. For instance, generative AI can creatively synthesize and augment data by digging into the underlying distribution of the input data, understanding the joint probability distribution of data and labels, and generating content similar to training data. AIGC expands the scope of AI applications and transforms content generation from a manual-based to an AI-based process, making AIGC a promising development direction in the future.
AIGC can generate text, voice, and video content. The great potential of AI in creative text generation is demonstrated by advances in natural language generation technologies, such as ChatGPT launched by OpenAI. In addition to generating answers based on the patterns and statistical rules learned in the pre-training phase, ChatGPT can also complete a variety of tasks such as writing papers, emails, and scripts, performing copywriting, and undertaking translation and coding. In the image and video fields, DL technologies, such as generative adversarial networks (GANs) and diffusion models, are also used to generate realistic image and video content, including animations, simulated scenarios, and special effects. OpenAI's Sora demonstrates AI's ability to generate a complex 60-second video with multiple characters, specific types of motion, precise themes, and background details based on text instructions.
In summary, the AI application paradigms have evolved from simple function simulations to complex intelligent systems capable of understanding, learning, and creating. As a new chapter in the development of AI, AIGC brings new possibilities to the AI field, and provides innovative tools and solutions for various industries, signifying the arrival of a new era full of imagination and creativity.
3 Key Values of AI in 5G-A Networks
3.1 Improving Network Performance
AI's technological breakthroughs in language, audio, and image processing demonstrate the performance advantages of data-driven methods in multimodal information feature extraction and problem solving. With the rapid development of AI technologies, the convergence of wireless networks and AI is deepening. Improving wireless network performance with AI technologies has become a popular research topic in the communications field. Research in both academic and industry circles shows that AI technologies can be used to enhance wireless networks from multiple dimensions, such as air interface and core network intelligence. In particular, the design of wireless intelligent air interface technology is a key direction in improving network performance and promoting standardization.
- Network performance improvement: Traditional communication modules are designed using the model-driven method. That is, the communications system is simplified and modeled based on assumptions such as Gaussianity and linearity, and consists of modules such as modulation and demodulation, encoding and decoding, and channel measurement. AI technologies, adopting a data-driven design method, learn the input-output mapping relationships of different communication modules from training data rather than from the idealized assumptions used in system modeling. This makes it possible to enhance the performance of different communication modules with data-driven AI technologies, for example, through improved channel measurement precision and enhanced signal demodulation performance. Ultimately, the overall communication network performance is enhanced, including improved user-perceived rate and communication coverage.
- Deterministic service assurance: The time-varying nature of a communications environment leads to unstable performance of communication links. Ensuring deterministic transmission of services in a changing communications environment has always been an important research direction of wireless networks. Data-driven AI technologies learn the change characteristics of the communications environment and adaptively adjust communication policies, making them an important technical direction for achieving deterministic service assurance.
The evolution of AI-based intelligent air interface design has also promoted the discussion of radio access network (RAN) AI standards. Research into the design of AI-based air interfaces was initiated in 3GPP Release 18. This research project explored the impact of AI-based design on the overall wireless network framework, the performance of some typical use cases, and their impact on standardization. The project defined basic AI concepts, simulation and verification methodologies, and base station and end device cooperation modes. It also studied each phase in the lifecycle management process, including model/function registration, data transmission, model transmission, model/function selection, and model/function activation and deactivation.
3.2 Providing High-Quality AI Services
The development of AI-based wireless networks can significantly improve network performance. And as a system with powerful communication connections, distributed computing power deployment, distributed data processing, and AI algorithms, wireless networks can provide broad possibilities for the extensive construction of high-quality AI services such as AI-based image enhancement, gaming, extended reality (XR), immersive communication, and other experience-oriented services. However, such services pose extremely high requirements on network performance.
With the design and application of foundation models in the AI field becoming a development trend, the exponential growth of the foundation model scale also brings important challenges to the implementation of AI services. The technical architecture of distributed training and inference has emerged to address such challenges and is considered a fundamental feature of the next-generation AI architecture. The edge AI computing power deployed on base stations, combined with device-network convergence, makes it possible to build distributed training and inference capabilities in wireless networks. An advantage of this architecture is that, by offloading AI computing requirements from the central cloud server to network devices closer to users, it can effectively optimize the latency and energy consumption of AI services, thereby improving user experience.
As services expand, the data processing capabilities of wireless networks become richer, providing efficient support for end-to-end data collection, transmission, storage, and sharing. These capabilities are deeply integrated with distributed training and inference capabilities in wireless networks, enabling larger-scale, more intelligent model optimization, training, and inference. However, providing data to internal or external network functions in a secure and efficient manner is still a subject that needs to be further studied.
Looking ahead, the ubiquitous edge computing power of networks will provide powerful support for AI services. With the platform advantages of network-integrated communication, sensing, and computing, we can open up a new market space for network participation in AI services, providing a strong driving force for achieving prosperity in the AI era. Specifically, a RAN has natural advantages in transmission, collaboration, and sensing, and can meet the requirements of AI services for low latency, high intelligence, wide coverage, and low power consumption based on the technology roadmap of "communication-computing convergence and sensing-computing convergence", accommodating the explosive growth of AI services in the future.
4 5G-A AI Use Cases
4.1 Use Case Analysis
4.1.1 AIGC Applications Based on Artificial General Intelligence Devices
AIGC's multimodal processing capability can significantly improve production efficiency and reduce labor costs. As one of the most popular artificial general intelligence (AGI) applications, AIGC is gradually changing the way we create and consume content. In this section, we describe the capabilities and functions of AIGC in three scenarios — e-commerce livestreaming, cloud gaming, and video calling — from the perspective of 5G-A networks.
- E-commerce livestreaming: AIGC can generate attractive livestreaming content, including automatically generating product descriptions, answering frequently asked questions (FAQs), and even creating virtual streamers for 24/7 livestreaming. By analyzing audience interaction and feedback, AIGC can also adjust livestreaming content in real time, improving user engagement and purchase conversion rate. Automated content generation saves human resources, and provides a more personalized and diversified shopping experience.
- Cloud gaming: AIGC provides personalized game recommendations and dynamically generated game content in cloud gaming services. It can generate customized game levels or tasks according to players' gaming history and preferences (e.g., it can customize diverse NPC designs) to bring a unique gaming experience to players and increase game diversity and interest. AIGC further enables AI game battles and provides enjoyable gaming for players who pursue fiercer competition.
- Video calling: AIGC can improve call quality and provide functions such as real-time background replacement, voice quality enhancement, and sentiment analysis. By analyzing call content, AIGC automatically generates minutes of meeting (MOM), keyword tags, or emotional feedback, helping users better understand and review call content. This makes video calling more intelligent and efficient, especially in remote work and online collaboration scenarios.
The core advantage of AIGC is its high degree of automation and intelligence. With DL models, AIGC can analyze large amounts of data, learn human creative habits and styles, and independently generate high-quality content. From writing news reports and stories, to designing visual arts, or even producing music and videos, AIGC can deliver a unique perspective and creativity.
Furthermore, AIGC's customizability provides users with great flexibility to set different parameters and conditions according to their requirements. This allows AIGC to generate content with a specific theme, style, or emotion. Such customized services are especially popular in the advertising, marketing, and entertainment industries. Innovation is another highlight of AIGC. It is not restricted by traditional thinking modes or creative boundaries. Instead, it can explore unknown fields, create new forms of content, bring new horizons for artistic creation, scientific research, and education, and stimulate infinite imagination and creativity.
As technologies develop, we can expect AIGC to deliver more exciting applications and achievements in the future. For example, Apple Intelligence, demonstrated at WWDC 2024, integrates GPT-4o into iOS, completely transforming Siri into the "ultimate virtual assistant." Additionally, Apple is preparing to develop it as "the most powerful killer AI application." Personalized interaction and intelligence implemented through collaborative on-device processing and private cloud computing will become the standard configuration of the iPhone 16 and subsequent models. The on-device inference and edge training of AGI devices also extend the subject of traffic consumption from humans to machines, posing urgent requirements for faster and more reliable communication pipes. AIGC may be the next super application of operators' ToC services.
4.1.2 Embodied Intelligence Applications Based on Robots
With the rapid development of AI technologies, robots have moved from science fiction to reality. They have shown great potential and value in multiple fields, including industry, healthcare, services, and the home. Embodied intelligence, as a new trend in the AI field, embeds intelligent systems into physical machines so that they can directly interact with the environment. Achieving such intelligence relies on the integration of multiple disciplines, including mechanical engineering, electronics, computer science, and cognitive science. Based on this integration, intelligent systems can learn how to adapt to the environment, optimize their own behavior, and make decisions in complex scenarios.
Advanced AI models like GPT-4o have achieved multimodal interaction of text, audio, and images, and possess certain sentiment analysis capabilities. However, the key to embodied intelligence lies in the ability of intelligent systems to interact with the physical world. This represents the transition of AI from relying on manual prompts to running in a more autonomous and intelligent form. The intelligent agents can be robots, drones, unmanned vehicles, or other forms of automation equipment. They sense the outside world through an integrated sensor network and translate the sensing data into an understanding and response to the environment.
In practical applications, embodied intelligence has shown a wide range of potential. It plays an increasingly important role in improving production efficiency in industrial automation, providing more personalized customer experience in the service industry, or performing high-risk tasks in exploring unknown fields. As the AGI industry and foundation models develop, and as technologies such as computer vision, computer graphics, NLP, and cognitive science become mature, embodied intelligence is rapidly evolving from theory to practice and from labs to daily life.
In the 5G-A+ era, ToC embodied intelligence robots serving as personal/home assistants will be used in a series of application scenarios, such as parcel pickup, household shopping, and care assistance.
- Companion robots are a typical application of embodied intelligence in households. These robots can talk with humans, recognize their emotional state through facial expressions, voice intonation, body language, and the like, and understand and respond to human emotions and needs. For the elderly, companion robots monitor their health status, remind them to take medicine, or offer help in case of an emergency. And for children, the robots provide educational content (to enable online learning) and entertainment content, such as games, music, and storytelling, to serve as a friend and teacher.
- Transportation robots are playing an increasingly important role in the logistics and express delivery industries. These robots use sensor information and map data for navigation, obstacle avoidance, and optimal path planning. This improves delivery efficiency, reduces logistics costs, and provides convenient and fast service experience for users.
The high speed, low latency, and high reliability features of 5G-A networks will enable more accurate environmental perception for embodied intelligence devices and provide highly reliable end-to-end deterministic latency services for indoor and outdoor robot applications. This will have a far-reaching impact on society and usher in a more intelligent and personalized era.
4.2 Key Requirements and Challenges
With the continuous progress of AI technologies, the application fields of AIGC and intelligent robots are expanding. This poses new requirements and challenges to wireless networks in order to support richer and more complex content generation and interaction experience.
4.2.1 Requirements and Challenges of AIGC Applications
AIGC technology has strict requirements on low latency for 5G-A network transmission, especially in application scenarios that require real-time interaction, such as online gaming, virtual reality (VR), and remote control. Low latency ensures real-time content generation and interaction, significantly improving user experience. For example, in virtual livestream shopping, the total latency from when a user sends a request to when the content is displayed must be within 70 ms to 100 ms. Additionally, the one-way latency of the air interface must be within 5 ms to 10 ms.
Because AIGC applications may involve the transmission of large amounts of data, including HD images, video streams, and complex model parameters, a high-bandwidth network is critical to support fast transmission of the data. This is necessary to meet AIGC's data processing requirements. For example, the typical upload bit rate of 1080p videos ranges from 5 Mbit/s to 8 Mbit/s, while the downlink rate of AIGC-generated content may reach 100 Mbit/s. Additionally, wireless networks need to support efficient transmission of multimodal data, including text, images, audio, and videos, and provide differentiated QoS assurance for AIGC applications, ensuring that necessary network resources are allocated to mission-critical applications.
4.2.2 Requirements and Challenges of Embodied Intelligence
Thanks to the rapid iteration of AI technologies, especially LLMs, intelligent robots are becoming increasingly smart and able to quickly and reliably execute tasks in unstructured environments. However, the uncertainty of such environments and the diversity of tasks pose a series of challenges.
First, robots need relatively high computing power and will consume a lot of power if they perform all computation locally. However, lightweight design is an important requirement for intelligent robots working in real-world environments, making it impossible to equip such robots with numerous CPUs/GPUs or large-capacity batteries. Second, robots have relatively limited sensing data. For targets outside the field of view, relying only on this data may result in a low task success rate or a long completion time.
A key solution to addressing these challenges is to offload computing to networks. The multimodal data collected by robot sensors is transmitted to the network together with task instructions. The network performs inference to generate the final output, such as target detection and path planning results, and then sends the output back to the robot for execution. And thanks to its superior sensing capability, the network can provide comprehensive environmental information to assist the robot in task execution, such as path planning.
In these scenarios, the end-to-end inference latency must be within 200 ms, and the inference accuracy must be at least 90% [22]. Furthermore, the network must be capable of accommodating a sufficient number of intelligent robots while meeting the latency and accuracy requirements of the AI services. Specifically, each cell must support the stable running of at least 30 intelligent robots.
5 Key 5G-A AI Technologies
5.1 Key Technologies for Improving Network Performance
Based on Shannon information theory, wireless air interface transmission technology has undergone long-term development since the 1950s and 1960s. It has been split into various sub-fields, such as modulation and demodulation, pilot and channel estimation, channel measurement, and waveform, which have been widely used in commercial communications systems of 2G to 5G cellular networks.
AI/machine learning (ML) has also undergone long-term development and accumulation. For example, Turing proposed the famous Turing test in 1950, and the concept of "artificial intelligence (AI)" was first proposed at the Dartmouth Conference in 1956. AI then underwent two waves of technical development. After 2006, with DL algorithms and large datasets emerging as a new breakthrough point, the third wave of AI development quickly swept through various fields.
Combining AI with 5G-A networks to improve network performance is a cross-field technology that connects communication theories and AI methods. By effectively combining and extending the mathematical model, system architecture, and algorithm design in the two fields, we can build an intelligent kernel for wireless communications networks, providing better transmission performance, higher O&M efficiency, and tailored user experience. In this section, we describe the key technologies used by AI to improve network performance in terms of constellation design, flexible pilots, TDD-oriented high-precision channel measurement, and coverage enhancement waveforms.
5.1.1 AI-based Constellation Design
Constellation modulation is a digital modulation technology that carries information bits on carrier signals. Modulated signals can be vividly represented on a 2D plane by using a constellation diagram. In existing wireless communications systems, quadrature amplitude modulation (QAM) is usually used, that is, amplitude modulation is performed on two quadrature carriers I and Q. QAM can be further classified into N-QAM based on the number of constellation points in the constellation diagram, where N is the QAM order. Each modulation symbol can carry log2N bits. Under normal circumstances, the candidate sets of amplitude modulation on quadrature carriers I and Q are the same. Therefore, N is usually an even power of 2, for example, 16QAM, 64QAM, 256QAM, or 1024QAM, as shown in Figure 1.

Figure 1 Constellations of 16QAM, 64QAM, and 256QAM
QAM is a regular modulation technology, with which the relationship between each constellation point and its corresponding information bit can be represented using the same formula. For example, the correspondence between the 16QAM constellation point and 4 bits can be represented as
$$s=\frac{1}{\sqrt{10}}\left[\left(1-2 b_{1}\right)\left(2-\left(1-2 b_{3}\right)\right)+j\left(1-2 b_{2}\right)\left(2-\left(1-2 b_{4}\right)\right)\right]$$
which makes QAM modulation and demodulation easier to implement. However, this regularity also restricts the performance of QAM. In an additive white Gaussian noise (AWGN) channel, theoretically, the closer the constellation distribution is to a Gaussian distribution, the closer the performance is to the Shannon channel capacity. The QAM constellation does not follow a Gaussian distribution, resulting in a gap between its performance and the Shannon channel capacity. This gap becomes larger as the modulation order increases, as shown in Figure 2.
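As a quick illustration of this regular mapping, the following Python sketch (illustrative only, not part of the original text) enumerates all 16 four-bit combinations, applies the formula above, and confirms that the resulting constellation has unit average power:

```python
import itertools
import numpy as np

def qam16_map(b1, b2, b3, b4):
    """Map 4 bits to one 16QAM symbol using the formula above."""
    i = (1 - 2 * b1) * (2 - (1 - 2 * b3))   # in-phase amplitude: -3, -1, 1, or 3
    q = (1 - 2 * b2) * (2 - (1 - 2 * b4))   # quadrature amplitude: -3, -1, 1, or 3
    return (i + 1j * q) / np.sqrt(10)       # 1/sqrt(10) normalizes the average power

symbols = np.array([qam16_map(*bits) for bits in itertools.product([0, 1], repeat=4)])
print(np.unique(np.round(symbols.real * np.sqrt(10))))  # [-3. -1.  1.  3.]
print(np.mean(np.abs(symbols) ** 2))                    # ~1.0 (unit average power)
```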

Figure 2 Channel capacity of QAM
Conventional irregular constellation modulation is classified into geometric shaping and probabilistic shaping. Geometric shaping changes the position coordinates of constellation points in the constellation diagram so that these points tend to follow a Gaussian distribution. Probabilistic shaping, although it does not change the geometric shape of the constellation diagram (e.g., the QAM constellation diagram), changes the occurrence probability of the constellation points of transmitted signals, making their distribution closer to Gaussian.
Theoretically, both methods can make the distribution of the constellation diagram closer to Gaussian distribution. In reality though, given a non-ideal receiver, the optimal constellation distribution may not be Gaussian distribution, because it varies under different channel conditions (such as SNR). Consequently, it is theoretically difficult to provide an optimal constellation design.
An optimal constellation design that best adapts to the target channel conditions can be obtained by AI with end-to-end training. Compared with QAM, geometric shaping, and probabilistic shaping, AI-based constellation design is more flexible. As shown in Figure 3, in an AWGN channel, the geometric shaping, bit mapping, and corresponding demodulator of a constellation diagram can be jointly trained in an end-to-end manner, making the performance of the constellation diagram approach the Shannon channel capacity. At the transmit end, the AI model outputs the constellation diagram, including the positions (values of carriers I and Q) of all constellation points (modulation symbols), and bit mapping is represented by the sequence of constellation points. For example, for a 4-order constellation diagram, bit 00 corresponds to the first constellation point, bit 01 corresponds to the second constellation point, and so on. At the receive end, the AI model is fed with noisy modulation symbols and outputs a log-likelihood ratio (LLR) for each bit. The loss function can be a binary cross entropy (BCE) function, so that the LLRs output by the AI demodulator approximate the transmitted information bits. After the received signal passes through the channel equalization module, most of the impact of the channel and multi-user interference has been eliminated, so from the demodulator's perspective the symbols can be approximately regarded as having passed through an AWGN channel. Therefore, in fading channel and multi-user scenarios, constellation diagrams can be obtained through training in AWGN channels.
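A minimal PyTorch sketch of this end-to-end training loop is given below, assuming a 16-point learnable constellation, an AWGN channel at a fixed SNR, and a small MLP demodulator that outputs one logit (LLR) per bit; it is a simplified illustration of the training idea, not the actual system design.

```python
import torch
import torch.nn as nn

M, BITS = 16, 4                                  # 16-point constellation, 4 bits per symbol
bit_table = torch.tensor(
    [[(i >> k) & 1 for k in range(BITS)] for i in range(M)], dtype=torch.float32)

points = nn.Parameter(torch.randn(M, 2))         # learnable I/Q coordinates of the constellation
demod = nn.Sequential(nn.Linear(2, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, BITS))        # outputs one LLR-like logit per bit
opt = torch.optim.Adam([points] + list(demod.parameters()), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
snr_db = 12.0
noise_std = (10 ** (-snr_db / 10) / 2) ** 0.5     # per-dimension noise std at unit symbol energy

for step in range(5000):
    idx = torch.randint(0, M, (512,))                          # random symbol indices
    const = points / points.pow(2).sum(dim=1).mean().sqrt()    # normalize to unit average power
    rx = const[idx] + noise_std * torch.randn(512, 2)          # AWGN channel
    loss = bce(demod(rx), bit_table[idx])                      # end-to-end BCE on the bit labels
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After training, the normalized points form an irregular constellation adapted to the chosen SNR, and the trained network can serve as the corresponding AI demodulator.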

Figure 3 Training process of AI-based constellation design
Figure 4 shows two examples of irregular constellation diagrams designed by AI. AI-designed constellation diagrams not only approximate a Gaussian distribution much more closely than QAM constellation diagrams do, but are also more flexible than conventional geometric shaping and can easily adapt to various channel conditions.

Figure 4 AI-based constellation design
5.1.2 AI-based Flexible Pilots
In 5G systems, there are various reference signals with different functions. Demodulation reference signals (DMRSs) are one such type of reference signal and are used to estimate the channel response of the time-frequency resources occupied by data. However, there is a trade-off between channel estimation accuracy and DMRS density/overhead. If the channel exhibits strong frequency selectivity (i.e., the channel changes significantly in the frequency domain), the DMRS density in the frequency domain should be increased. Similarly, if the channel changes significantly in the time domain, more resources need to be occupied in the time domain to deploy DMRSs. After determining the DMRS density in both the time and frequency domains, we need to further consider the positions of DMRSs in time-frequency resource blocks (RBs). For example, when the channel is stable, DMRSs can be evenly allocated in the frequency and time domains in order to reduce interpolation errors and implementation complexity. Because DMRSs do not carry any data that is useful to users, they need to be allocated at an appropriate density to maximize throughput.
In existing protocols and solutions, DMRSs and data signals are still allocated orthogonal time-frequency resources. Although this guarantees channel estimation performance, more resources need to be reserved for DMRSs (especially in mobility scenarios), leading to a contradiction between channel estimation performance and available data resources. A more flexible DMRS design can be achieved using AI-based flexible pilots, which superimpose DMRSs and data signals for transmission and thereby achieve the following:
- Providing a larger optimization space, releasing DMRS resources, and maximizing resource efficiency.
- Breaking the dependency on orthogonal DMRSs and the limitation on the number of DMRS ports.

Figure 5 Transmission pattern of data and pilot symbols in RBs in superimposed pilot mode
- Resolving complexity issues by introducing AI to allocate power between data signals and DMRSs in superposition transmission mode and jointly design receivers (using AI to integrate multiple receive modules, including at least channel estimation and equalization modules).
- Implementing DMRS coverage on all available time-frequency resources, and forming an irregular DMRS pattern through power allocation to improve the system's adaptability to the communication environment.
Specifically, when each resource element (RE) carries both modulation symbols and reference signals, power normalization is still required. Figure 5 shows the pattern of pilot and data symbols on each RE of a single RB in watermark pilot transmission mode. In the figure, the blue lines indicate the power proportion of data symbols, and the yellow lines indicate the power proportion of pilot symbols. The receive end can perform channel estimation and data demodulation on superimposed pilots using a corresponding AI receiver.
AI is deployed on base stations in uplink scenarios, and deployed on UEs in downlink scenarios. The power allocation factor is obtained through AI training, and base stations allocate power based on the number of paired layers. The carried pilot sequence can reuse the sequence generation manner in existing standards.
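A minimal numpy sketch of the superposition principle is given below, assuming a flat channel across the block, a fixed pilot power fraction, and a simple correlation-based receiver in place of the AI receiver described above (the values and the receiver are illustrative assumptions only):

```python
import numpy as np

rng = np.random.default_rng(0)
n_re = 168                    # REs in the example block (illustrative)
rho = 0.2                     # fraction of power given to the pilot on every RE

pilot = np.exp(1j * 2 * np.pi * rng.random(n_re))              # known unit-power pilot sequence
data = (rng.integers(0, 2, n_re) * 2 - 1 +
        1j * (rng.integers(0, 2, n_re) * 2 - 1)) / np.sqrt(2)  # QPSK data symbols

tx = np.sqrt(rho) * pilot + np.sqrt(1 - rho) * data            # pilot and data share every RE
h = (rng.normal() + 1j * rng.normal()) / np.sqrt(2)            # flat channel over the block
rx = h * tx + 0.05 * (rng.normal(size=n_re) + 1j * rng.normal(size=n_re))

# Naive receiver: correlate with the known pilot to estimate h, then cancel the pilot.
h_hat = np.mean(rx * np.conj(pilot)) / np.sqrt(rho)
data_hat = (rx - h_hat * np.sqrt(rho) * pilot) / (h_hat * np.sqrt(1 - rho))
print(abs(h - h_hat), np.mean(np.abs(data - data_hat) ** 2))   # estimation error, data MSE
```

In the actual design described above, the power allocation factor and the joint channel-estimation/equalization receiver are learned by AI rather than fixed.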
5.1.3 TDD-oriented High-Precision Measurement
To improve the coverage and performance of wireless networks, it is necessary to expand the antenna scale. However, this will lead to a significant increase in the air interface resources occupied by high-precision channel measurement. Implementing high-precision channel measurement in a massive MIMO system with limited air interface overhead has become a key challenge for the 5G-A time division duplex (TDD) system.
To address this challenge, it is critical to effectively use the limited channel information obtained from different reference signals to restore full-dimensional channel information. For example, UEs obtain downlink channel information from downlink reference signals, and base stations obtain uplink channel information from uplink reference signals. Because the channel information obtained from different reference signals originates from the same communication environment, these measurements can be regarded as observations of the same environment from different perspectives or in different modalities.
By utilizing AI technologies that fuse multimodal information, we can effectively integrate channel information obtained from different reference signals, and implement high-precision restoration for the communication environment and channels.

Figure 6 High-precision restoration for the communication environment and channels using AI technologies that fuse multimodal information
5.1.4 AI-based Coverage Enhancement Waveforms
A New Radio (NR) system currently has two types of uplink waveforms: the multi-carrier orthogonal frequency division multiplexing (OFDM) waveform and the single-carrier OFDM waveform. The latter is formed by adding a discrete Fourier transform (DFT) before the conventional OFDM processing and is therefore also called the DFT-spread OFDM (DFT-s-OFDM) waveform. The multi-carrier OFDM waveform superposes the signals of multiple subcarriers; when these signals add up in phase, very high peak values are generated. In contrast, the single-carrier OFDM waveform has a lower peak-to-average power ratio (PAPR) and therefore better coverage performance. However, for UEs requiring ultimate deep coverage (e.g., cell edge UEs or UEs subject to heavy building penetration loss), the PAPR of the DFT-s-OFDM waveform is still too high. It is necessary to design a new waveform with better coverage performance than the DFT-s-OFDM waveform for these UEs.
Conventional technologies used to reduce the PAPR of OFDM waveforms mainly include filtering and clipping. Filtering reduces the PAPR by designing a proper frequency domain filter to change the time domain waveform, whereas clipping reduces the PAPR by scaling down the signals that exceed a threshold. Although both technologies can optimize the PAPR of OFDM waveforms to a certain extent, they do so at the cost of losing some data transmission throughput.
AI-based coverage waveform design optimizes the waveform coverage performance through AI training. This approach, when compared with the conventional waveform PAPR reduction technology, achieves a trade-off between coverage and throughput through multi-objective joint optimization, improving coverage while reducing throughput loss. A prime example of this is tone reservation (TR) technology based on AI.
In TR technology, several subcarriers, in addition to those used for data transmission, are reserved. As shown in Figure 7, the signals on these reserved subcarriers are optimized through AI training to change the waveform shape and reduce the PAPR. The input of the AI model is the time domain signal of the original data symbols, and the output is the signals on the reserved subcarriers. To avoid wasting power on the reserved subcarriers, the subcarrier power can be constrained during training.
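The following numpy sketch illustrates the TR principle only; the AI optimizer described above is replaced here by a naive, power-constrained random search over the reserved-tone signals, and all parameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_fft, n_data, n_reserved = 256, 224, 16
data_idx = np.arange(n_data)                          # subcarriers carrying data
res_idx = np.arange(n_data, n_data + n_reserved)      # reserved peak-reduction tones

def papr_db(x):
    return 10 * np.log10(np.max(np.abs(x) ** 2) / np.mean(np.abs(x) ** 2))

freq = np.zeros(n_fft, dtype=complex)                 # one OFDM symbol in the frequency domain
freq[data_idx] = (rng.choice([-1, 1], n_data) + 1j * rng.choice([-1, 1], n_data)) / np.sqrt(2)
baseline = papr_db(np.fft.ifft(freq))                 # PAPR without tone reservation

best = baseline
for _ in range(2000):                                 # naive stand-in for the AI optimizer
    trial = freq.copy()
    trial[res_idx] = 0.5 * (rng.normal(size=n_reserved) + 1j * rng.normal(size=n_reserved))
    best = min(best, papr_db(np.fft.ifft(trial)))     # keep the lowest PAPR found

print(f"PAPR without TR: {baseline:.2f} dB, with TR (random search): {best:.2f} dB")
```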

Figure 7 AI-based TR design
5.2 Key Technologies for Providing High-Quality AI Services
5.2.1 Distributed Inference
The increasing scale of parameters used in large models requires more powerful hardware for training and inference, pushing AI models to be deployed closer to the edge. Base stations, being closest to UEs, can detect channel changes in real time and fully utilize the deep coupling of communication and computing resources to optimize AI service inference performance.

Figure 8 Dynamic scheduling of air interface resources based on joint communication and computing
Currently, there are four types of techniques for optimizing edge AI distributed inference:
- Sparsification, quantization, and structure freezing: These techniques optimize inference by reducing the computing workload and memory usage. For example, quantization compresses model parameters from high-precision floating-point numbers (such as 32-bit) to low-precision integers (such as 8-bit) or fixed-point types, reducing the model size and computing workload (a minimal quantization sketch is given after this list). To achieve the same objectives, we can freeze parts of the model that do not need to be updated (such as pre-trained layers), preventing them from participating in training and inference. Sparsification removes unnecessary connections or parameters from a model to reduce its complexity, thereby reducing the computing workload and storage space. Common sparsification methods include weight pruning and structured sparsification.
- Pipeline serialism: Different layers of a model are allocated to different edge devices for serial inference. For example, the image preprocessing layer is allocated to a device with low power consumption, the convolutional layer is allocated to a device with higher performance, and finally the classification layer is allocated to a base station or cloud. This method fully utilizes the advantages of different devices to improve inference efficiency.
- Tensor parallelism: Model computing tasks are allocated to multiple edge devices for parallel inference. For example, a large-scale matrix multiplication is decomposed into multiple sub-operations that are allocated to different devices for execution, and the results are finally combined. This method fully uses the parallel computing capability of multi-core processors to accelerate model inference.
- Batch processing: Multiple inference requests are combined for one-time processing to improve inference efficiency. For example, image recognition requests of multiple users are combined into a batch for inference, and the results are separately returned to the users. This method effectively reduces the inference delay and improves resource utilization.
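Below is a minimal numpy sketch of the symmetric int8 post-training quantization mentioned in the first item above; real deployments typically use per-channel scales and calibration data, so this is an illustration only:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric post-training quantization of a weight tensor to int8."""
    scale = np.max(np.abs(w)) / 127.0               # one scale per tensor (per-channel is also common)
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)    # a toy FP32 weight matrix
q, scale = quantize_int8(w)
print(w.nbytes, "->", q.nbytes)                     # 262144 -> 65536 bytes (4x smaller)
print(np.max(np.abs(w - dequantize(q, scale))))     # worst-case rounding error (about scale/2)
```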
Implementing dynamic sparsification, pipeline serialism, tensor parallelism, and batch processing based on dynamic changes of communications and computing resources is challenging from the perspective of air interfaces. By utilizing the unique advantages of communication-computing integration of base stations, we can design a dynamic device-network computing power collaboration solution, opening up a new space for future networks to participate in the AI computing field.
5.2.2 Retrieval Augmentation
In the world of rapidly developing wireless communications, the processing power, storage, and computing capabilities of end devices and access network devices often limit their performance. These limitations pose a challenge for large-scale model training. However, the emergence of distributed training effectively addresses this challenge. It enables computing tasks to be shared across multiple devices, thereby reducing the load of a single device while enabling each device to use its own data.
Distributed training for wireless communications systems involves several key technologies that constitute the technical pillar of this field:
- Model parallelism: It is an effective method for enabling large-scale model training. When dealing with large numbers of model parameters that cannot be carried by a single device, model parallelism splits the model into multiple parts and distributes them to different computing devices. This reduces the memory requirements for each device and significantly improves computing efficiency. Model segmentation is a core step in this process and can be performed based on the model hierarchy or with reference to the functions and features of end devices, access network devices, and cloud servers.
- Distributed knowledge base: Building a distributed knowledge base is crucial given the varying information that can be obtained by different end devices and access network devices. This process includes data collection, preprocessing, and knowledge representation design. For example, a base station with global sensing capabilities must collect data from various environments and preprocess it in a unified manner to ensure the consistency of its format and quality. We then need to determine how to represent this knowledge in the knowledge base, potentially involving the representation of features, model parameters, or intermediate results.

Figure 9 Building a distributed knowledge base

Figure 10 Success rate of robot navigation planning with the help of the base station's point cloud data
Figure 9 depicts how a distributed knowledge base is built. In the figure, the base station collects point cloud data of the environment to build a point cloud knowledge base, which is then shared with the LLM for further analysis and use. When executing a navigation task, the robot collects environmental information (video) in real time and sends this information to the LLM together with a navigation request. The LLM comprehensively analyzes the point cloud data and the video information to formulate an accurate navigation path. Finally, the LLM sends the path back to the robot, guiding the robot to complete the navigation task.
In indoor robot navigation planning tasks, the success rate is heavily dependent on the percentage of global environmental point cloud data provided by the base station. As shown in Figure 10, as the percentage of point cloud data increases, the robot's success rate in performing navigation planning tasks indoors also increases. This shows that detailed environmental data is critical to improving the accuracy and reliability of robot navigation. Specifically, the richness of point cloud data directly affects a large model's cognition of the environment, helps the large model optimize its path planning algorithm, and enables the robot to complete the navigation task more efficiently.
- High-precision small model design: It is an important technique for improving the overall system performance. Multiple small models are designed based on the features of base stations and cloud servers, and are trained using high-quality labeled data. During training, advanced techniques such as transfer learning can be used to improve model learning efficiency. Additionally, small model prediction and large model verification techniques can also be utilized. The prediction result of a small model needs to be converted into a format suitable for the input of a large model, for example, a vector or encoding format. The large model then further verifies and adjusts the result of the small model to ensure the accuracy of the final output.
Through comprehensive application of these techniques, distributed training in wireless communications systems can overcome the performance limitations of devices while maximizing each device's advantages to implement more efficient and accurate model training.
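Purely as an illustration of the retrieval-augmented flow in Figure 9, the self-contained toy sketch below stands in for the real components: the point cloud knowledge base is a dictionary, the LLM is replaced by a trivial planning function, and all names and values are hypothetical.

```python
# Toy stand-ins only: a real system would use base-station point clouds and an actual LLM.
point_cloud_kb = {                                        # knowledge base built from base-station sensing
    "floor_3": {"obstacles": [(2.0, 1.5), (4.0, 3.0)], "exits": [(6.0, 0.0)]},
}

def llm_plan(prompt):
    """Stand-in for the large model: routes straight to the first known exit."""
    goal = prompt["retrieved"]["exits"][0]
    return {"waypoints": [(0.0, 0.0), (3.0, 0.5), goal]}

def plan_navigation(area_id, robot_video, goal):
    retrieved = point_cloud_kb[area_id]                   # retrieval step: fetch global environment data
    prompt = {"task": "navigate", "goal": goal,
              "local_view": robot_video, "retrieved": retrieved}
    return llm_plan(prompt)                               # the resulting path is sent back to the robot

print(plan_navigation("floor_3", robot_video=b"...", goal=(6.0, 0.0)))
```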
5.2.3 Feature-based Communication
Feature-based communication is a new communication paradigm that goes beyond the bit streams of data to analyze and understand the meaning that the data represents. By deeply understanding the semantic content of the transmitted data, it helps the receive end identify the most valuable information for obtaining the intent of the transmit end, thereby improving communication efficiency and accuracy.
Semantic-aware air interface communication technology integrates semantic information understanding and processing into the communication process, and optimizes the transmission policy based on semantic features. This technology is reflected in two aspects: (1) differentiated transmission of feature flows over air interfaces, and (2) air interface communication with the fault tolerance of feature flows. The following describes these two aspects.
- Differentiated transmission of feature flows over air interfaces
In conventional communications systems, data packets are transmitted and received based on technical indicators such as signal strength and error rate. However, feature-based communication can process data more intelligently by introducing semantic understanding. For example, when a data packet containing important information is transmitted over the air interface, its transmission priority can be increased, ensuring that this packet arrives at the destination quickly and reliably.
Additionally, feature-based communication can dynamically adjust the encoding and modulation policy based on the contribution of different feature flows to semantic recovery at the receive end.
Figure 11 shows an example of differentiated transmission of feature flows over air interfaces based on semantic guidance. Data on the signal source side is divided into multiple feature flows through semantic conversion. Before transmission, these feature flows are prioritized according to their semantic importance. Different channel encoding policies are then used based on these priorities. Take feature flows that can contribute more to semantic recovery at the receive end as an example. In such a case, more transmission resources are allocated, or a more reliable modulation and encoding policy is used. This ensures that the whole system for differentiated transmission of feature flows over air interfaces based on semantic guidance can select the most appropriate transmission manner according to semantic importance. Furthermore, this optimizes the use of spectrum resources while also guaranteeing the communication quality.
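As a toy illustration of this mapping, the Python sketch below ranks assumed feature flows by an assumed importance score and assigns a more robust modulation and coding scheme to the most important flows (the scores and the MCS table are illustrative assumptions, not values from the original text):

```python
flows = [                                   # (feature flow, assumed importance score in [0, 1])
    ("object_contours", 0.95),
    ("texture_details", 0.60),
    ("background_residual", 0.20),
]
mcs_table = [                               # (min importance, modulation, code rate) - illustrative
    (0.8, "QPSK", 0.33),                    # most important flows get the most robust transmission
    (0.5, "16QAM", 0.50),
    (0.0, "64QAM", 0.75),                   # least important flows get the highest spectral efficiency
]

for name, score in sorted(flows, key=lambda f: -f[1]):
    mod, rate = next((m, r) for thr, m, r in mcs_table if score >= thr)
    print(f"{name:20s} importance={score:.2f} -> {mod}, code rate {rate}")
```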
- Air interface communication with the fault tolerance of feature flows
This technique can resist wireless channel instability and possible errors by recovering the semantic content of original information when some feature information is damaged or lost.
As shown in Figure 12, conventional communication techniques focus on the correct transmission of bit streams. If part of the bit transmission fails, the receive end cannot recover the intent or information of the transmit end during image recovery. In this case, errored bits appear as mosaics, and artifacts appear on the entire image.

Figure 11 Differentiated transmission of feature flows over air interfaces based on semantic guidance

Figure 12 Air interface communication with the fault tolerance of feature flows
In contrast, air interface communication with the fault tolerance of feature flows ensures that the receive end can recover or infer the intent or information of the transmit end, even if part of the bit transmission fails, meaning that the image can be recovered. To sum up, by leveraging fault tolerance of feature flows, we can significantly improve the tolerance and recovery capabilities of air interface communication.
6 Conclusion and Outlook
This paper discussed the evolution trends, key technologies, and application prospects of AI in 5G-A networks. It covered the evolution of AI models, the rapid progress of computing power and data, and the key values and use cases of AI in 5G-A networks, comprehensively demonstrating the opportunities and challenges faced by 5G-A AI technologies. In particular, AI's key role in improving network performance, providing high-quality AI services, and enabling embodied intelligence and AIGC applications underscores its core position in future communication and network development.
The in-depth integration of 5G-A and AI will herald the arrival of a new intelligent era. As technology advances, the potential of AI will become more apparent in more fields, especially in NLP, image and video generation, and multimodal interaction. This will enable more personalized and intelligent services to improve user experience and provide innovative solutions for various industries. Furthermore, AI's huge demand for computing power and data will continuously drive the upgrade of hardware and network infrastructures to empower more efficient distributed training and inference. Key technologies in improving wireless network performance, such as AI-based constellation design, flexible pilot, high-precision channel measurement, and coverage enhancement waveform, will further optimize the use of network resources and improve communication efficiency.
Although AI has a promising future, its development in the 5G-A era will be shaped by the following trends and challenges:
- New breakthroughs in network capabilities: The continuous evolution of 5G-A technologies will further improve the network speed, connection density, and latency.
- New network O&M mode: Intelligent and automated tools will change the network O&M mode and improve efficiency and accuracy.
- Popularization of intelligent services: AI technologies will be widely used in various industries to significantly improve production efficiency and user experience.
- Technological innovation and integration: New technologies such as edge computing and network slicing will be deeply integrated with AI to provide customized services for specific application scenarios.
- Data security and privacy protection: As the amount of data processed by AI increases, technologies and regulations that ensure data security and user privacy need to be strengthened.
- AI explainability: It is necessary to improve the transparency and explainability of AI decision-making to ensure the fairness and security of the system.
In summary, AI in the 5G-A era marks a new beginning, and its development will have a profound influence on the future social structure and human life.