Future Technologies
Performance Requirements and Evaluation Methodology for AI and Communication in 6G
This paper describes the "AI and Communication" scenario and the general principles for performance definition and provides an evaluation methodology.

By Huawei Wireless Technology Lab: Gongzheng Zhang, Jian Wang, Jun Wang, Rong Li, Jianglei Ma, Peiying Zhu

By Research Dept, WN, Huawei: Yan Chen, Jiafeng Shao, Hui Lin
1 Introduction
With the rapid development of artificial intelligence (AI) technologies, especially deep learning and large pre-trained models, AI will become an essential part of almost all systems in industries and daily life. Mobile communication systems, which are already deployed on a massive scale, could be the best choice as a unified infrastructure that integrates communication and AI and that delivers ubiquitous AI services to all connected people and machines. This will drive the revolution of mobile communication systems.
To promote the development of next-generation mobile communication systems, the International Telecommunication Union – Radiocommunication Sector (ITU-R) has identified six typical usage scenarios for IMT-2030 and beyond. In addition to the three usage scenarios enhanced from IMT-2020, AI and sensing are newly included as two beyond-communication services that are expected to be provided by 6G networks. This will boost new capabilities and corresponding performance indicators. As such, it is necessary to study the corresponding technologies, performance requirements, and evaluation methodology. While most recent works focus on the technologies, further study is needed for the performance requirements and evaluation methodology.
This paper aims to provide guidelines for designing next-generation mobile communication systems and ensure users receive guaranteed AI services. To achieve this, the paper will introduce the performance requirements and evaluation methodology for AI and communication in 6G. Specifically, this paper will first describe the "AI and Communication" scenario defined in the IMT-2030 framework, with a particular focus on the typical AI services and capability requirements of this new usage scenario for 6G. Next, the paper summarizes the current status of the performance indicators, and then introduces the design principles and the proposed qualitative and quantitative performance requirement definitions. Finally, the paper provides the corresponding evaluation methodology and an example, followed by the conclusions.
2 AI and Communication
6G, a next-generation mobile communication system, aims to make intelligence inclusive by providing artificial intelligence as a service (AIaaS). This will enable easy training, fast distribution, and accurate inference of large AI models distributed in wireless networks. And by utilizing the data and resources of distributed intelligent terminals, 6G will be able to provide AI model training services, made possible through local training at distributed terminals and model interaction between them over the network — this can also effectively protect users' data privacy. Furthermore, 6G can provide high-accuracy inference services for resource-constrained terminals by joint scheduling of communication and AI resources. This will drive AIaaS to become a typical application scenario of 6G. This section will describe the corresponding standardization progress in ITU and typical services.
2.1 AI and Communication in the IMT-2030 Framework
To facilitate the development of IMT-2030 and beyond, the Working Party (WP) 5D in the ITU-R has approved a new framework and overall objectives, which include identifying motivations, applications, technology trends, spectrum, usage scenarios, and capabilities for next-generation mobile systems. A key application trend and enabling technology is ubiquitous intelligence: AI can enhance the performance of wireless interfaces and enable automation of wireless networks and intelligent network services, and one of the key objectives of the IMT system design is to efficiently support AI services within wireless networks.
Among the six usage scenarios identified by the ITU-R, as shown in Figure 1, "AI and Communication" is one that IMT-2030 specifies as providing beyond-communication services. This usage scenario will support distributed computing and AI applications, including data collection, local and distributed computation offloading, and distributed AI model training and inference. Typical use cases include IMT-2030 assisted automated driving, autonomous collaboration between devices for medical assistance applications, offloading of heavy computation operations across devices and networks, and the creation of and prediction with digital twins.

Figure 1 Usage scenarios of IMT-2030
To support the new usage scenarios, IMT-2030 should include AI- and sensing-related capabilities in addition to traditional communication capabilities, as listed in Table 1. The "AI and Communication" usage scenario would require high area traffic capacity and user experienced data rates, as well as low latency and high reliability, depending on the specific use case. In addition to the communication aspects, this usage scenario is expected to include a set of new capabilities related to the integration of AI functionalities into IMT-2030. Such capabilities include data acquisition, preparation and processing from different sources, distributed AI model training, model sharing and distributed inference across IMT systems, and computing resource orchestration and chaining. The following subsections will describe the AI capabilities and performance requirements based on the typical AI services.
Table 1 Capabilities of IMT-2030

2.2 Typical Services in the "AI and Communication" Scenario
IMT-2030 will efficiently support AI applications in an end-to-end manner, connecting distributed intelligence to provide ubiquitous AI services (e.g., AI model training, inference, deployment, and more). To achieve this goal, IMT-2030 can build a distributed and efficient AI service platform by utilizing the connection, data, and model resources and capabilities within the network. AI applications include providing intelligent capabilities for network optimization, i.e., using end-to-end AI algorithms in customized optimization and automated operation and maintenance (O&M) for wireless interfaces and networks. Such applications also include providing intelligent capabilities to the users, i.e., serving as a distributed learning infrastructure and moving the centralized intelligence on the cloud to deep edge ubiquitous intelligence through the network's native integration of communication and AI capabilities.
2.2.1 An Exemplary AI Application Served by IMT-2030
Collaborative robots are widely recognized as a future 6G application scenario that requires AI services with low latency and high learning and inference accuracy. In this exemplary use case, multiple robots work together to accomplish complex tasks in an industrial environment. Each robot is equipped with cameras and other sensors and powered by AI capabilities to achieve partial autonomy. For full autonomy and complex tasks, the collaborative robot system should sense, perceive, plan, and control towards the ultimate goal of the task. For example, when someone issues a new voice request for some goods, the command in natural language should first be understood, and the subtasks for each robot should be planned. Both aspects need to be achieved through efficiently trained large (language) models that require huge computing and memory resources. And through local vision or control models, the robots will be able to detect objects from the sensed images and plan the path trajectory with corresponding control decisions for the subtasks. This makes it possible for the AI-enabled robots to collaborate with the network in order to utilize its powerful connected AI capabilities for planning complex tasks. These robots can also cooperate with each other over the network to improve the performance of local models via collaborative training, sharing and learning from each other's experience. Two typical services in this exemplary use case, model inference and model training, are described in the following sections.

Figure 2 AI applications for collaborative robots
2.2.2 Model Inference Service
AI model inference is a fundamental function for AI applications. It takes inputs, runs the AI models, and produces the expected outputs. Through ubiquitous connectivity, the 6G network with native intelligence could provide real-time model inference capabilities that meet different requirements. In the distributed AI model inference service, the 6G network jointly utilizes the communication and AI capabilities to provide high-accuracy model inference services in real time for users with limited capabilities through model collaboration. Figure 3 illustrates a typical AI model inference service. In this service, a large model may be split into two parts, which are deployed on the network and user sides and work together. The part with high resource requirements is deployed on the network side, enabling powerful network AI capabilities to provide the joint model inference service for end users.

Figure 3 AI model inference service
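The split deployment described above can be sketched in a few lines. This is a toy illustration, not a real model: the cut point, the per-part computations, and the lossy-uplink hook are all hypothetical; only the UE-part/BS-part structure comes from the service description.

```python
# Toy sketch of the split inference service in Figure 3: the UE runs the
# lightweight front half of a model, uploads the intermediate data, and the
# network side runs the resource-heavy back half to produce the result.

def ue_part(x):
    """Hypothetical lightweight front end on the resource-limited UE."""
    return [round(v, 1) for v in x]          # toy feature extraction

def bs_part(z):
    """Hypothetical resource-heavy back end on the network side."""
    return max(range(len(z)), key=lambda i: z[i])   # toy classifier head

def split_inference(x, uplink=lambda z: z):
    """UE part -> (possibly lossy) transmission -> BS part."""
    z = ue_part(x)            # intermediate data generated at the UE
    z_rx = uplink(z)          # e.g. quantization applied for transmission
    return bs_part(z_rx)      # inference result produced by the network

label = split_inference([0.12, 0.83, 0.05])   # → 1 (index of largest score)
```

The `uplink` hook is where a transmission model (quantization, coding, channel errors) would be plugged in for the evaluation described in Section 4.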
2.2.3 Model Training Service
AI model training is key for obtaining a model with high accuracy. The 6G network with native intelligence could provide suitable algorithms and resources for model training orchestration based on different user and network characteristics, thereby improving the speed and accuracy of model training. In the large-scale distributed AI model training service, the network serves as a management platform to provide high-speed data channels and efficient scheduling mechanisms for exchanging data or model parameters between distributed terminals. This supports fast model aggregation and distribution while also ensuring user privacy protection. Figure 4 illustrates a typical distributed training service. In each round, the distributed terminals use local data to train models locally and upload the updated models to the network. The network aggregates these updated local models to obtain a global model, which it then distributes to the terminals. The aggregation and distribution procedures are iterated, enabling joint learning while also protecting users' raw data.

Figure 4 Distributed AI model training service
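One aggregation round of this procedure can be sketched as follows. Weighted averaging of the uploaded models (as in federated averaging) is one common choice of aggregation rule; the service description does not mandate a specific one, so the rule and the toy weight vectors below are illustrative assumptions.

```python
# One aggregation round of the distributed training service in Figure 4:
# terminals train locally and upload updated weights; the network averages
# them (weighted by local data size) into a global model for redistribution.

def aggregate(local_models, n_samples):
    """Average the locally trained weight vectors, weighted by data size."""
    total = sum(n_samples)
    global_model = [0.0] * len(local_models[0])
    for weights, n in zip(local_models, n_samples):
        for j, w in enumerate(weights):
            global_model[j] += w * n / total
    return global_model

# Three terminals upload toy 2-parameter models; the network aggregates
# them and would then distribute the result back for the next round.
global_model = aggregate([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
                         n_samples=[10, 10, 20])   # → [3.5, 4.5]
```

Only the model updates cross the air interface in each round; the terminals' raw data never leaves the device, which is the privacy property the text highlights.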
3 Performance Requirements for the "AI and Communication" Scenario
System design is driven primarily by performance requirements, which shape the evolution, and sometimes the revolution, of each generation of mobile communication systems. Existing mobile communication systems are mainly designed for connection-oriented data transmissions — the key performance indicators (KPIs) of such networks mainly include the transmission rate and latency of connections. However, AI services involve not only transmissions but also AI-related resources, meaning that AI model learning/inference accuracy and latency become KPIs. From a communication perspective, the 6G network should provide high traffic capacity, especially in the uplink direction, in order to meet the requirements of data or model exchange in model training and inference. And from an AI perspective, the 6G network should support large-scale distributed learning and real-time inference. This is why the design of the 6G network should take into account both AI and communication in an integrated manner from the beginning. The following subsections will describe the current status and then detail the principles and architectures for performance requirement definitions in the "AI and Communication" usage scenario.
3.1 Current Status
Previous mobile communication systems, from 2G to 5G, have mainly provided communication services, with data transmission being almost the only objective. Support for AI/machine learning (ML) operations has been studied in Release 18 of 5G. Three typical AI/ML operations have been identified and reported in 3GPP TR 22.874 [2], namely, split inference, model/data distribution, and distributed/federated learning. Various applications have also been defined, for example, image recognition, real-time media editing, split inference and control among robots, and collaborative learning among multiple agents. However, all the AI/ML operations are expected to be executed in cloud servers, and the 5G system still provides only communication services to transmit the data between users and cloud servers, potentially leading to higher data rate requirements.
For the next-generation mobile communication systems that introduce new capabilities beyond communication (e.g., AI-related capabilities), supporting AI services is commonly considered in 6G-related research groups. In their published white papers [3, 4], the China IMT-2030 Promotion Group and Hexa-X, two main 6G research organizations in China and Europe respectively, both identify AI services provided by 6G as a key factor in the design of next-generation networks. They also suggest including new capabilities to identify the performance requirements of AI services. For both AI-enabled air interfaces and AI services, the two organizations propose AI-related performance indicators in addition to the traditional communication performance indicators, including AI model inference accuracy and latency. However, these performance indicators are not clearly specified, and no details are defined for the requirements and evaluation methodology in 6G.
The computer science community has defined some training and inference KPIs to evaluate the capabilities of the AI hardware and software systems. For example, the MLPerf benchmark [5] defined by MLCommons builds a method to measure the AI performance for both model training and inference via reference applications, models, and datasets. However, because these KPIs are used to measure the hardware or software capabilities in a centralized way, they cannot be used to measure the capabilities of distributed AI services in 6G networks that involve communication.
3.2 Principles for Performance Definition for AI and Communication in 6G
Leveraging resources such as connections, models, and data within 6G networks, 6G AIaaS provides AI capabilities that adapt to different application scenarios. Unlike conventional mobile networks, 6G networks need resources beyond only connections in order to provide high-performance AI services for users. Accordingly, 6G AIaaS needs to consider integrating communications capabilities and AI capabilities in order to build comprehensive performance indicators and evaluation methods oriented to AI services. This will provide guidance for the 6G network design and network resource configuration.
The main principles of performance definition for AI-related capabilities are as follows.
- End-to-end AI capabilities. AI services should use end-to-end performance as indicators in order to guarantee user-experienced service quality. The AI service quality depends on both communication and AI capabilities. However, the existing performance indicators and evaluation methods only focus on the communication capabilities and therefore cannot guarantee the AI service quality. This is why the IMT-2030 system needs to consider how to integrate communication capabilities with AI capabilities.
- Typical services. The IMT-2030 system is the key to realizing ubiquitous intelligence. By utilizing the AI capabilities within the network, this system should provide a platform for large-scale distributed model training and unified high-accuracy model inference to diverse users via collaboration. Accordingly, AI-related capability indicators need to be defined based on typical services, such as training and inference.
- Core performance. The goal of AI and communication integration is to enable AI services efficiently, including model training and real-time high-accuracy model inference. To ensure that AIaaS is acceptable to billions of users, it is crucial to focus on the key factors that impact user experience. Amidst the plethora of performance indicators available for AI services, the IMT-2030 system must prioritize the core indicators to accomplish this goal.
3.3 Proposed Performance Requirements for AI and Communication in 6G
The KPIs for AI and communication are defined from the perspective of services (including AI model training and inference) provided by 6G networks. The performance of such services depends on the AI model capabilities provided by the system's AI resources and the communication capabilities connecting users and networks. The proposed AI service performance requirements include a group of functionality requirements and three quantitative requirements, described below. The functionality requirements can be evaluated via inspection, and the quantitative requirements need to be evaluated via simulation.
- AI service functionality requirements
The functionality requirements for AI-related capabilities are that the candidate radio interface technologies (RITs) or sets of radio interface technologies (SRITs) shall have mechanisms and/or signaling related to the functionalities (e.g., distributed data processing, distributed learning, AI computing, AI model execution, and AI model inference) that are exposed as capabilities to external applications, or any other functionalities that the proponent(s) of candidate RITs/SRITs consider relevant to better support AI-enabled applications.
- AI service accuracy (or AI service quality)
AI service accuracy is defined as the accuracy of the AI inference/learning service. Specifically, it is the degree to which the outputs from the AI service are the same as the true values for the given inputs within the given service latency requirements, or relative to the reference case. For a given AI task, the AI service accuracy depends on the task characteristics, AI model deployment method, and AI-related data transmissions. Different applications may have different requirements on AI service accuracy. For example, the accuracy requirement for object recognition in autonomous driving applications is much higher than that of a consumer application for identifying flowers. Consequently, the minimum performance requirements for the 6G network can be defined for specific applications with certain accuracy requirements. If the requirements can be met, all applications with lower requirements can be supported by the network. The minimum requirement for AI service accuracy in 6G can therefore be defined as higher than a certain accuracy within a given time for the deployment environment, assuming a given AI inference/learning task.
- AI service latency
AI service latency is defined as the time taken from the start to the end of the AI inference/learning service. It is the sum of the communication time for AI-related data transmissions and the processing time of the AI model, where the processing time depends on the devices and implementations. Similar to AI service accuracy, different applications may have different requirements on AI service latency. As such, the minimum performance requirements for the 6G network can be defined for specific applications with certain latency requirements. If the requirements can be met, all applications with lower requirements can be supported by the network. The minimum requirement for AI service latency in 6G can therefore be defined as less than a certain time with a given inference accuracy for the deployment environment, assuming a given AI inference/learning task.
- AI service density
AI service density is defined as the number of AI services that meet given AI service accuracy and AI service latency requirements supported by the network simultaneously per unit area. It is a system capacity indicator of the IMT-2030 system. For different application requirements (i.e., accuracy or latency), the system can support different AI service densities. This means that the minimum performance requirements for the 6G network can be defined for specific applications or combinations of applications with certain accuracy and latency requirements. The minimum requirement for AI service density in the 6G network can therefore be defined as the number of services per km² for the deployment environment, assuming a given AI inference/learning task.
4 Evaluation Methodology and Example
The previous section defines quantitative performance requirements for the typical distributed AI model training and inference services. Service performance is determined by both communication and AI resources and should therefore be evaluated with certain communication and AI assumptions. This section will describe the evaluation methodology first and then present an example with detailed assumptions and results.
4.1 Evaluation Methodology
The performance requirements can be derived from two essential KPIs, namely, AI service accuracy and latency. AI service accuracy is defined as the degree to which the outputs from the AI service are the same as the true values for the given inputs, which depends on both the AI model and AI-related data or model transmissions. AI service latency is defined as the sum of AI model processing time and data transmission time, which also depends on both the AI model and AI-related data or model transmissions. These two definitions hold true for both the model training and inference services, which share similar radio resources but exchange models and data, respectively.
The performance evaluation can follow the service procedures. Figure 5 shows the AI service performance evaluation system. The performance evaluation includes the following key components:

Figure 5 AI service performance evaluation system
- Resource assumptions: The evaluation should be done in a test environment similar to the definition in communication performance evaluations [6]. Within the test environment, the radio configurations should be provided, including the bandwidth, number of antennas at the user equipment (UE) and base station (BS), and so on. This is necessary to reflect the radio resources. In addition, AI tasks should be defined with AI-related configurations, including datasets consisting of inputs and corresponding target outputs with accuracy calculation methods.
- AI service procedures: The entire procedure starts from AI model processing at the UE, where the intermediate data (model output or model weights) is generated. Then, this data is transmitted from the UE to the BS under the assumed radio configurations. Next, the BS receives the intermediate data and uses the AI model to process it in order to get the service results, which are then used to calculate performance indicators.
- AI service performance calculation: The AI service accuracy and latency are calculated based on the service results, AI model processing time, and transmission time. As mentioned earlier, the AI service accuracy is calculated as the degree to which the output of the AI model processing is the same as the target value for each input in the dataset. The degree is defined according to the AI task. The AI service latency is the sum of the AI model processing time at the UE and BS and the transmission time of intermediate data.
For the AI service density evaluation, AI service density is defined as the number of AI services that meet given AI service accuracy and latency requirements. It can be evaluated through AI service accuracy and latency simulation. For example, we can first set the number of served UEs, N, to a minimum value and generate service requests from the UEs. Then, we use the evaluation parameters of the test environment to perform system simulation and collect statistics on the AI service accuracy within the service latency. We gradually increase N and repeat the simulation until the AI service accuracy falls below the requirements, recording the largest value of N that meets them as Nmax. The AI service density is then calculated as C = Nmax / coverage area.
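The search for Nmax can be sketched as follows. Here `simulate_accuracy` is a hypothetical placeholder for the full system-level simulation; the toy model below, in which accuracy degrades as more UEs share the radio resources, is an illustrative assumption.

```python
# Sketch of the AI service density evaluation: increase the number of
# served UEs N until the simulated AI service accuracy (within the
# service latency) falls below the requirement, then divide by area.

def find_n_max(simulate_accuracy, acc_requirement, n_min=1, n_cap=10_000):
    """Largest N for which the simulated accuracy still meets the requirement."""
    n_max = 0
    for n in range(n_min, n_cap + 1):
        if simulate_accuracy(n) < acc_requirement:
            break                       # requirement violated: stop at previous N
        n_max = n
    return n_max

def service_density(n_max, coverage_area_km2):
    """C = Nmax / coverage area, in services per square kilometre."""
    return n_max / coverage_area_km2

# Toy stand-in for the system simulation: accuracy drops 0.5% per extra UE.
toy_sim = lambda n: (120 - n) / 200
n_max = find_n_max(toy_sim, acc_requirement=0.56)          # → 8
density = service_density(n_max, coverage_area_km2=0.5)    # → 16.0
```

In a real evaluation, `simulate_accuracy` would run the full test-environment simulation of Section 4.1 for each value of N.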
4.2 Evaluation Example
In this subsection, we use the distributed AI inference service as an example to illustrate the performance evaluation methodology presented earlier. This methodology can also be used for collaborative training and inference services after the procedures are modified according to the corresponding service procedures (this is left for future work). In a typical future smart factory, AI-enabled robots need to perceive the environment (to detect objects in real time, for example) through cameras. The images these robots collect can be further used to achieve real-time high-accuracy AI model inference.

Figure 6 Distributed AI inference service example
Figure 6 illustrates the procedure involved in the AI inference service for a user. It includes not only the AI model processing on the user and network sides, but also the transmission between the user and the network. To be specific, the AI inference service consists of three steps: 1) the UE uses the UE-side AI model to process the input data in order to obtain intermediate data; 2) the UE transmits this data to the BS; 3) the BS uses the BS-side AI model to process the received intermediate data and obtain the inference results. Note that this example procedure represents a service starting from the UE with the input data and uploading the intermediate data to the BS to get the results. A similar procedure can also be applied in the downlink direction, where the BS first processes the input data and transmits the intermediate data to the UE to get the results. The following evaluation methodology can also be applied for this downlink case.

Figure 7 AlexNet model deployment example
- Evaluation configurations: The evaluation configurations are defined as follows, with examples given in brackets.
- Test environment: [Dense Urban]
- Radio configurations: [same as immersive communication (user experienced data rate: 500 Mbps)]
- AI task: [image recognition]
- AI dataset: [ImageNet-1k validation dataset [7]]
- AI model: [AlexNet [8], the left part is processed by the UE, and the right part is processed by the BS, as shown in Figure 7]
- AI model processing time: [UE: 0.75 ms; BS: 0.45 ms]
- Evaluation procedures
- AI service accuracy: AI service accuracy can be evaluated by simulation. The UE processes the input of each sample \(S_{i}, i=1, \ldots, n\) in the dataset based on the UE-side AI model and obtains the intermediate data \(Z_{i}\). According to the test environment and transmission configurations, the UE sends the intermediate data and the BS receives it. Taking a classical transmission scheme as an example, the intermediate data is first quantized and represented as bits, which are then encoded and modulated to symbols for wireless transmission. The BS processes the received intermediate data \(\tilde{Z}_{i}\) based on the BS-side AI model and obtains the inference result \(\tilde{Y}_{i}\) corresponding to each sample. We can then compare the inference results with the target output or label \(Y_{i}\) of each sample in order to obtain the degree to which the output is the same as the true value, \(acc=\frac{1}{n} \sum_{i=1}^{n}{1_{\left\{ \tilde{Y}_{i}=Y_{i} \right\}}}\), that is, the AI service accuracy.
- For the accuracy of the reference case, we can process each sample \(S_{i}\) in the dataset based on the whole AI model in order to obtain the inference result \(\tilde{Y}_{i}^{'}\). We can then compare the inference result with the label \(Y_{i}\) of each sample. The degree to which the output of the reference case is the same as the true value is \(acc_{ref}=\frac{1}{n} \sum_{i=1}^{n}{1_{\left\{ \tilde{Y}_{i}^{'}=Y_{i} \right\}}}\). The relative AI service accuracy is calculated as \(\frac{acc}{acc_{ref}}\).
- AI service latency: The AI service latency is the sum of the time used for intermediate data transmission, \(t_{comm}\), and the UE- and BS-side AI model processing times, \(t_{proc,UE}\) and \(t_{proc,BS}\). Therefore, the AI service latency is given by \(t_{service}=t_{comm}+t_{proc,UE}+t_{proc,BS}\). In this example, we take the data transmission time to be the number of payload bits divided by the data rate, where the number of payload bits is determined by the number of elements in the intermediate data and the number of quantized bits per element. Other schemes that take new technologies into consideration can also be used.
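As a toy walk-through of the two procedures above, the following computes the top-1 accuracy of the split and reference models, their ratio, and the service latency. The label/prediction lists and the element count of the intermediate data are illustrative assumptions; the 500 Mbps rate and the 0.75 ms/0.45 ms processing times come from the example configuration.

```python
# Toy evaluation of the AI service accuracy and latency formulas.
# Predictions and the intermediate-data size are made-up values, not
# AlexNet/ImageNet results.

def top1_accuracy(predictions, labels):
    """acc = (1/n) * sum_i 1{Y_hat_i == Y_i}."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def ai_service_latency(n_elements, bits_per_element, data_rate_bps,
                       t_proc_ue_ms, t_proc_bs_ms):
    """t_service = t_comm + t_proc,UE + t_proc,BS, in milliseconds."""
    t_comm_ms = n_elements * bits_per_element / data_rate_bps * 1e3
    return t_comm_ms + t_proc_ue_ms + t_proc_bs_ms

# Accuracy of the split model (quantized uplink) vs. the reference
# (unsplit) model on ten toy samples.
labels     = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
split_pred = [0, 1, 2, 3, 4, 5, 6, 0, 0, 0]   # split model over the air
ref_pred   = [0, 1, 2, 3, 4, 5, 6, 7, 0, 0]   # whole model on one device

acc = top1_accuracy(split_pred, labels)            # → 0.7
relative_acc = acc / top1_accuracy(ref_pred, labels)

# Latency with a hypothetical 9216-element intermediate tensor at
# 8 bits per element, 500 Mbps, and the example processing times.
latency_ms = ai_service_latency(9216, 8, 500e6, 0.75, 0.45)
```

Sweeping `bits_per_element` in `ai_service_latency` while re-simulating the accuracy reproduces the latency/accuracy trade-off reported in Table 2.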
Table 2 AI service performance evaluation results

- Evaluation results
The AI service accuracy and latency under different transmission setups (i.e., number of quantized bits per element) are provided in Table 2. As can be seen from the table, there is a trade-off between AI service latency and AI service accuracy due to the intermediate data transmission. If a minimum AI service accuracy (e.g., 56%) and maximum AI service latency (e.g., 2 ms) are required, we need to optimize the transmission configurations (8 bits per element in the table for this example) or improve the transmission technology to meet both requirements.
5 Conclusions
In this paper, we have illustrated the motivations, typical AI services, and performance requirements of the "AI and Communication" usage scenario — a new scenario defined in IMT-2030 for 6G. To provide guidelines for the system design and better support AI services, we proposed new performance indicators that integrate AI and communication capabilities and resources in the network, from both the user experience and network capacity perspectives. We also provided the corresponding evaluation methodology with a detailed example. This is a first step towards 6G moving from vision to technical designs.