This section provides information required to troubleshoot abnormal multi-UE rates. The information includes fault descriptions, background information, possible causes, fault handling method and procedure, and typical cases.
Troubleshooting Guide 8 Troubleshooting Rate Faults
8.1 Definitions of Rate Faults
This section defines rate faults.
The following are rate faults and their definitions:
l No transmission
User equipment (UE) that has accessed a network cannot perform data services.
l Low downlink rate on a single UE
The observed rate of a downlink service, either a User Datagram Protocol (UDP) or Transmission Control Protocol (TCP) service, on a UE is at least 10% lower than the baseline value.
l Downlink rate fluctuation on a single UE
The observed rate of a downlink service, either a UDP or TCP service, on a UE fluctuates by more than 50%.
l Low uplink rate on a single UE
The observed rate of an uplink service, either a UDP or TCP service, on a UE is at least 10% lower than the baseline value.
l Uplink rate fluctuation on a single UE
The observed rate of an uplink service, either a UDP or TCP service, on a UE fluctuates by more than 50%.
l Abnormal rates on multiple UEs
A key performance indicator (KPI) indicates an abnormal rate, or a large number of users complain about their traffic rates. This fault may be caused by a specific single-UE rate fault or a common rate fault on multiple UEs.
l User-recognized abnormal rate
The rate of a data service on a UE is abnormal according to the user's definition. For example, the currently observed rate is noticeably lower than the rate of the previous day or an earlier period, or the observed rate is considerably lower than the rate achieved by equivalent equipment.
These faults can be classified into the following types:
l No transmission
l Low single-UE rate, including uplink and downlink UDP/TCP rates
l Single-UE rate fluctuation, including uplink and downlink UDP/TCP rates
l Abnormal multi-UE rates
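The thresholds in the definitions above can be sketched as a small classifier. This is an illustration only, not a tool from the guide. The function name is hypothetical, and because the guide does not give an exact formula for "fluctuates by more than 50%", the sketch assumes it means that the gap between the highest and lowest samples exceeds 50% of the highest sample.

```python
def classify_single_ue_rate(samples_mbps, baseline_mbps):
    """Classify rate samples from one UE against the definitions in 8.1.

    Assumption: fluctuation is measured as (max - min) / max, which is
    one possible reading of "fluctuates by more than 50%".
    """
    if not samples_mbps or baseline_mbps <= 0:
        raise ValueError("need at least one sample and a positive baseline")
    peak = max(samples_mbps)
    if peak == 0:
        return "no transmission"
    swing = (peak - min(samples_mbps)) / peak
    if swing > 0.5:
        return "rate fluctuation"
    mean = sum(samples_mbps) / len(samples_mbps)
    # "At least 10% lower than the baseline value"
    if mean <= 0.9 * baseline_mbps:
        return "low rate"
    return "normal"
```

The same function can be applied separately to uplink and downlink samples, and to UDP and TCP tests.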
8.2 Background Information
This section provides background information for rate faults. The background information includes the user-plane protocol stack, the restrictions that the protocol stipulates for UEs of different categories, and the method used to calculate the theoretical rates.
LTE User-Plane Protocol Stack
Figure 8-1 shows the LTE user-plane protocol stack. Rate statistics for different layers vary because of headers. Note the header differences during analysis.
Figure 8-1 LTE user-plane protocol stack
The traffic rates of data services can be measured in the following ways:
l The Ethernet-layer rate can be measured by using DU Meter at the server and client.
l The rates at the RLC and MAC layers can be measured at the eNodeB.
l The rates at layers such as RLC and MAC for Huawei user equipment (UE) can be measured by using the Probe.
Protocol-Defined Rates for UE Categories
3GPP TS 36.306 specifies the rates for various UE categories, as listed in Table 8-1 and Table 8-2.
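Table 8-1 Downlink physical layer parameter values for UE categories
UE Category | Maximum Number of DL-SCH Transport Block Bits Received Within a TTI | Maximum Number of Bits of a DL-SCH Transport Block Received Within a TTI | Total Number of Soft Channel Bits | Maximum Number of Supported Layers for Spatial Multiplexing in DL
Category 1 | 10296 | 10296 | 250368 | 1
Category 2 | 51024 | 51024 | 1237248 | 2
Category 3 | 102048 | 75376 | 1237248 | 2
Category 4 | 150752 | 75376 | 1827072 | 2
Category 5 | 302752 | 151376 | 3667200 | 4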
Table 8-2 Uplink physical layer parameter values for UE categories
UE Category Maximum Number of
Bits of a UL-SCH
In LTE networks, the theoretical traffic rate relates to the system bandwidth, modulation scheme, multiple-input multiple-output (MIMO) mode, and parameter settings. Theoretical rate calculation for a cell considers the number of symbols occupied by the physical downlink control channel (PDCCH) in each subframe and the amount of time-frequency resources occupied by the synchronization channel, by reference signals, and by the broadcast channel.
The theoretical rate can be determined based on the number of RBs and modulation order. For details, see 3GPP TS 36.213.
Take a 20 MHz cell as an example. If the only UE in the cell can use 100 RBs and MCS index 28, a TBS of 75376 bits can be selected at the MAC layer for the UE. If MIMO is used, two transport blocks (2 x 75376 = 150752 bits) are transmitted per transmission time interval (TTI), which is 1 ms.
Then, the throughput is 150.752 Mbit/s.
NOTE
The theoretical rate calculated is the protocol-stipulated MAC-layer rate, not the application-layer rate for eNodeBs.
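The arithmetic in the 20 MHz example can be written out as a short sketch. The function and constant names are illustrative, and the only inputs assumed are a transport block size in bits and the number of transport blocks sent per 1 ms TTI.

```python
TTIS_PER_SECOND = 1000  # one TTI lasts 1 ms

def mac_peak_rate_mbps(tbs_bits, transport_blocks_per_tti=1):
    """Return the MAC-layer rate in Mbit/s for a given transport block size."""
    bits_per_tti = tbs_bits * transport_blocks_per_tti
    return bits_per_tti * TTIS_PER_SECOND / 1e6

# 20 MHz cell, 100 RBs, MCS index 28, two transport blocks with MIMO
print(mac_peak_rate_mbps(75376, transport_blocks_per_tti=2))
```

As the note above states, the result is a MAC-layer figure, so an application-layer measurement will be lower because of the headers added at each layer.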
8.3 Troubleshooting Abnormal Single-UE Rates
This section provides information required to troubleshoot abnormal single-UE rates. The information includes fault descriptions, background information, possible causes, fault handling method and procedure, and typical cases.
Fault Description
The observed rate is stable but at least 10% lower than the baseline value.
Figure 8-2 Rate fault 1 - stable but lower than the baseline value
The observed rate fluctuates by more than 50%, as shown in the following figures.
Figure 8-3 Rate fault 2 - fluctuation type 1
Figure 8-4 Rate fault 2 - fluctuation type 2
Related Information
The User Datagram Protocol (UDP) is a simple datagram-oriented transport-layer protocol. UDP provides an unreliable service. It sends datagrams from the application to the IP layer but does
not ensure that the datagrams can arrive at their destinations. However, UDP features a high transmission speed, because a connection does not need to be set up before UDP-based transmission between a client and a server and retransmission upon timeout is not applied.
The Transmission Control Protocol (TCP) provides connection-oriented reliable delivery of a stream of bytes. A client and a server can transmit data between each other only after a TCP connection is set up between them. TCP provides functions such as retransmission upon timeout, discarding of duplicate data, data checking, and flow control for data delivery from one end to the other end.
TCP uses a more complicated control mechanism than UDP. In most cases, a link with a normal TCP rate has a normal UDP rate, but a link with a normal UDP rate does not necessarily have a normal TCP rate. When diagnosing rate faults, ensure normal UDP rates before handling TCP services.
3GPP specifications impose uplink capability constraints on user equipment (UE) categories.
Only UEs of category 5 support 64 quadrature amplitude modulation (64QAM) in the uplink.
Possible Causes
A common way to find a cause is as follows: First, check whether the service involved is a UDP service or a TCP service. If it is a TCP service, inject uplink and downlink UDP packets on a single thread and check whether the uplink and downlink UDP rates can reach their peak values.
The purpose is to "clear the way" for TCP rate fault diagnosis. For example, eliminate rate limiting at the network adapter and rectify radio parameter setting errors before handling TCP rate faults. If the service involved is a UDP service, locate the fault by investigating the link from the server to the UE in an end-to-end manner. Second, if the UDP rate can reach its peak value but the TCP rate cannot, the fault exists in the TCP transmission mechanism.
Abnormal rates have the following possible causes:
l Fault in the data source at the server
l Insufficient traffic into the eNodeB due to transmission problems
l Radio interface faults, such as eNodeB alarms related to the radio interface, signal quality problems, parameter setting errors, problems caused by multiple UEs online, license issues, and uplink interference (required to be checked for abnormal uplink rates)
l Fault in the PC connected to the UE
l TCP parameter setting error, or fault in the TCP transmission mechanism
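The "UDP first, then TCP" approach described under Possible Causes can be sketched as a small decision function. The function name and the returned strings are hypothetical. The sketch only encodes the order of checks and does not perform any measurement.

```python
def triage_rate_fault(service, udp_reaches_peak, tcp_reaches_peak=None):
    """Suggest where to look next, following the UDP-before-TCP approach.

    service: "UDP" or "TCP" (the type of the affected service).
    udp_reaches_peak: result of the single-thread UDP injection test.
    tcp_reaches_peak: result of the TCP test, if one was run.
    """
    service = service.upper()
    if service not in ("UDP", "TCP"):
        raise ValueError("service must be 'UDP' or 'TCP'")
    if not udp_reaches_peak:
        # Covers the server, transmission, radio interface, and client PC.
        return "check the end-to-end link from the server to the UE"
    if service == "TCP" and tcp_reaches_peak is False:
        return "check the TCP transmission mechanism"
    return "no rate fault isolated by this check"
```

For example, `triage_rate_fault("TCP", udp_reaches_peak=True, tcp_reaches_peak=False)` points to the TCP transmission mechanism, which matches the reasoning used in Case 3.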
Fault Handling
None
Fault Handling Procedure
1. Check whether data services run abnormally.
If a UE fails to access any data services, check whether the UE has been connected to or disconnected from the network. Ensure that the UE is connected. Then, check the firewall settings at the PC and the server. Ensure that the firewalls allow access of the data services.
In addition, check whether routes from the server to the evolved packet core (EPC) work properly. On the server, ping the user-plane IP address of the unified gateway (UGW). If the ping operation fails or the delay is excessively long, contact EPC or datacom technical support.
2. Check whether the server malfunctions.
a. On the server, run the following command to set the UDP packet injection volume:
iperf -c x.x.x.x -u -i 1 -t 99999 -b yyym
NOTE
"x.x.x.x" denotes the service IP address of the UE.
"yyym" denotes the UDP packet injection volume, which depends on the UE in use and the cell bandwidth. The value can be greater than the theoretical maximum value as long as the data volume is sufficient.
b. On the PC, run the following command to start receiving packets:
iperf -s -u -i 1
c. (Optional) If the actual output traffic volume from the server does not reach the specified "yyym", run the following command with "-l" added to adjust the UDP packet size:
iperf -c x.x.x.x -u -i 1 -t 99999 -b yyym -l 1000
d. (Optional) If the actual output traffic volume from the server still fails to reach the specified "yyym", replace the server.
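When the injection test in step 2 is repeated for several UEs or bandwidths, the iperf commands can be generated instead of typed by hand. The following is a minimal sketch with hypothetical helper names. It uses only the options that appear in this step (-c, -s, -u, -i, -t, -b, -l), and the IP address in the example is a placeholder.

```python
def iperf_udp_sender(ue_ip, rate_mbps, payload_bytes=None, duration_s=99999):
    """Build the server-side command that injects UDP packets toward the UE."""
    cmd = ["iperf", "-c", ue_ip, "-u", "-i", "1",
           "-t", str(duration_s), "-b", f"{rate_mbps}m"]
    if payload_bytes is not None:
        # Optional UDP packet size, as in substep c.
        cmd += ["-l", str(payload_bytes)]
    return cmd

def iperf_udp_receiver():
    """Build the PC-side command that receives the UDP packets."""
    return ["iperf", "-s", "-u", "-i", "1"]

# Placeholder UE address and injection volume
print(" ".join(iperf_udp_sender("10.0.0.2", 160, payload_bytes=1000)))
```

Returning a list rather than a single string means the result can be passed directly to `subprocess.run` without shell quoting concerns.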
3. Check whether the input traffic volume to the eNodeB is insufficient.
A common reason for the insufficient input traffic volume is a bottleneck transmission bandwidth at an intermediate node. Check whether:
l The bandwidth is correctly set along the transmission link.
Ensure that all network elements and interfaces work at the gigabit level and in auto-negotiation speed mode. The network elements include at least Ethernet ports on the server and all switches and routers on the network.
l The transmission bandwidth on the transmission link is greater than the peak value.
If microwave is used for transmission, ensure that the transmission bandwidth is greater than the peak value.
NOTE
The transmission link refers to the S1 interface from the server to the eNodeB.
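The bandwidth check in step 3 amounts to finding the slowest hop on the path and comparing it with the expected peak rate. A minimal sketch follows, assuming the capacity of each hop has already been collected by hand. The hop names and numbers in the example are hypothetical.

```python
def find_bottleneck(link_capacities_mbps, peak_rate_mbps):
    """Return (hop name, capacity, capacity_is_sufficient) for the slowest hop.

    link_capacities_mbps: mapping of hop name to capacity in Mbit/s.
    peak_rate_mbps: the peak rate that the link is expected to carry.
    """
    if not link_capacities_mbps:
        raise ValueError("at least one hop is required")
    name, capacity = min(link_capacities_mbps.items(), key=lambda kv: kv[1])
    # The guide requires the transmission bandwidth to exceed the peak value.
    return name, capacity, capacity > peak_rate_mbps

# Hypothetical path with one 100 Mbit/s microwave hop
hops = {"server NIC": 1000, "core switch": 1000, "microwave": 100}
print(find_bottleneck(hops, peak_rate_mbps=110))
```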
4. Check whether the radio channel quality is unsatisfactory.
l Check whether the downlink signal quality is poor.
Use the software matching the UE type to measure signal quality parameters, such as the reference signal received power (RSRP) and signal to interference plus noise ratio (SINR). The RSRP and SINR must fulfill certain conditions to meet rate requirements.
For example, to enable the actual maximum rate to approach the theoretical peak value, ensure that the RSRP and SINR stay above -85 dBm and 26 dB, respectively.
l Check whether the block error rate (BLER) is excessively high on the radio interface.
Monitor the BLER on the M2000 client. If the BLER is higher than 10%, the channel condition is poor. Improve the channel condition for better downlink signal quality.
l (Optional) Check whether uplink interference exists.
When a cell is unloaded in the uplink (all UEs are powered off and there is no service in the cell), check the received signal strength indicator (RSSI) across the uplink band.
In a normal case, the RSSI on each resource block (RB) is about -120 dBm when the cell is unloaded. If the RSSI is 3 dB to 5 dB higher than the normal value, uplink interference exists. Locate the interference source, and mitigate the interference.
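The thresholds quoted in step 4 can be collected into one check. This is a sketch under stated assumptions: the function name is hypothetical, the RSRP and SINR limits are the example values given for approaching the theoretical peak, and an unloaded RSSI that is 3 dB or more above -120 dBm is treated as suspected interference.

```python
NORMAL_UNLOADED_RSSI_DBM = -120  # per-RB value quoted for an unloaded cell

def radio_quality_findings(rsrp_dbm, sinr_db, bler, unloaded_rssi_dbm=None):
    """Return a list of findings; an empty list means no threshold was hit.

    bler is a fraction (0.10 means 10%). unloaded_rssi_dbm is optional and
    should be measured only when the cell carries no uplink traffic.
    """
    findings = []
    if rsrp_dbm <= -85:
        findings.append("RSRP is not above -85 dBm")
    if sinr_db <= 26:
        findings.append("SINR is not above 26 dB")
    if bler > 0.10:
        findings.append("BLER is higher than 10%")
    if unloaded_rssi_dbm is not None:
        rise_db = unloaded_rssi_dbm - NORMAL_UNLOADED_RSSI_DBM
        if rise_db >= 3:  # assumption: 3 dB is used as the lower trigger
            findings.append("uplink interference suspected")
    return findings
```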
5. Check whether the basic information about the data services or the parameter settings are incorrect.
This check is twofold:
l Check whether the basic information about the data services is incorrect.
In this step, check the user's subscription information and the UE's capability. Specifically, check whether the user is subscribed to the correct QCI, whether the MBR and AMBR of the UE are set as expected, and whether the UE has the expected capabilities.
l Check whether the basic information about the parameter settings is incorrect.
The parameter settings refer to the settings for the eNodeB. Changes to algorithm settings can cause severe drops in the traffic rate. Export the eNodeB parameter settings, and compare them with the baseline values. If the values are inconsistent, confirm whether the settings are customized for the operator or have been changed to incorrect values. If the settings have been changed to incorrect values, inform the operator immediately.
6. Check whether the number of users in the cell is excessively large.
Check the number of users in the cell and the downlink RB usage by performing Users Statistics Monitoring and Usage of RB Monitoring tasks, respectively, under cell performance monitoring. If an excessively large number of users have accessed the cell and RBs are exhausted when a UE accesses the cell, the traffic rate on each UE will not be high, and low-priority users will experience even lower traffic rates.
7. Check whether license information is incorrect.
Run the LST LICENSE command to query license information, and observe whether:
l The license has expired, or limitation is imposed on functions related to the data services.
l The licensed throughput capability is correct.
8. Check whether the client works abnormally.
Client faults may exist in the UE or in the PC connected to the UE.
l Check for faults in the UE.
If spare UEs are available, replace the UE and check whether the rate fault disappears.
If it disappears, the fault exists in the UE.
l Check for faults in the PC connected to the UE.
Investigate the software installed and running on the PC. You are advised to remove or close all programs except those required by the test. In addition, close the Windows firewall and firewalls of antivirus programs.
Check the central processing unit (CPU) usage. If the CPU usage exceeds 80%, the CPU is heavily loaded. Close unused software or services, or replace the PC with a more powerful one.
9. Check for TCP errors.
TCP fault diagnosis varies depending on the symptom. If the throughput is maintained at a level lower than the peak value, check parameter settings and the round trip time (RTT).
If the throughput can reach the peak value but is not stable, check for packet loss and severe packet misordering.
l Check the TCP rate status.
Use a multi-thread download program (for example, FlashGet or FileZilla) or open multiple Windows command line windows to download data. If the rate is higher than the single-thread rate, perform further TCP checks. If the rate is equal to or even lower than the single-thread rate, go back to the previous steps to recheck for possible faults.
l Check basic TCP parameter settings.
Ensure that the basic TCP parameters are correctly set. The parameters include the receive window, send window, and maximum transmission unit (MTU).
l Check the RTT.
Ping the server by using 32-byte packets and MSS-byte packets (MSS is short for maximum segment size), and take the average RTT value for the two types as the calculated RTT. Typically, the RTT value is required to be less than or equal to 50 ms.
Link optimization is required if the RTT value is greater than 50 ms.
l Check for packet loss and severe packet misordering.
On the PC side, trace packet headers or use the TCP fault diagnosis module to check for packet loss and severe packet misordering. If packet loss or severe packet misordering occurs, contact datacom personnel for handling.
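The window and RTT checks in step 9 are linked: a single TCP connection cannot move more than one window of data per round trip, so the receive window must be at least the bandwidth-delay product of the target rate. The following sketch shows that arithmetic. The function names are illustrative, and the calculation ignores header overhead and packet loss.

```python
def calculated_rtt_ms(rtt_32_byte_ms, rtt_mss_ms):
    """Average the RTTs measured with 32-byte and MSS-byte pings."""
    return (rtt_32_byte_ms + rtt_mss_ms) / 2

def min_window_bytes(target_rate_mbps, rtt_ms):
    """Bandwidth-delay product: window needed to sustain the target rate."""
    return target_rate_mbps * 1e6 / 8 * rtt_ms / 1000

def max_tcp_rate_mbps(window_bytes, rtt_ms):
    """Upper bound on single-connection throughput for a given window."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6

rtt = calculated_rtt_ms(40, 60)            # 50 ms
print(min_window_bytes(150.752, rtt))      # window needed for the MAC peak
print(max_tcp_rate_mbps(65535, rtt))       # bound with a 64 KB window
```

With a 50 ms RTT, a 65535-byte window limits one connection to roughly 10.5 Mbit/s. This is one reason a multi-thread download can be faster than a single-thread download when the window is too small.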
10. If the fault persists, contact Huawei technical support.
Typical Cases
l Case 1: The downlink rate was low with microwave transmission.
Fault Description
On network X in a country, the cell bandwidth was 15 MHz. In a downlink File Transfer Protocol (FTP) throughput test using a single UE in a single cell, it was found that all eNodeBs connected to a 100 Mbit/s microwave transport network had their downlink throughput not exceeding 30 Mbit/s, but eNodeBs connected to a 1 Gbit/s optical transport network had their downlink throughput as high as 80 Mbit/s.
Fault Diagnosis
A UDP test found that the UDP throughput was 100 Mbit/s at the sender but dropped to only 80 Mbit/s at the receiver (eNodeB). Severe packet loss occurred. Due to TCP congestion control, the throughput of 30 Mbit/s was normal, so the fault did not exist in the eNodeB. The operator requested operation and maintenance (OM) personnel to locate the packet loss point based on the following assumption: The throughput of 80 Mbit/s on the optical transport network did not reach 100 Mbit/s, so congestion should not occur in the microwave transport network.
The microwave transmission media were replaced with an Ethernet cable for a direct connection between the eNodeB and the S-GW. The FTP transfer rate remained at 30 Mbit/s. The segment-by-segment check found that packet loss occurred at a position between the input and output ports on a switch before packets entered the microwave network. The operator traced the input and output ports and confirmed that packet loss occurred. The operator further found that the fault was caused by a small buffer size that was set for the port on the switch.
Fault Handling
The operator extended the buffer size and tested again. The test result indicated that the downlink rate could reach the expected value. The extended buffer size helps enhance anti-burst capability, reduce the tail drop probability, and increase the FTP transfer rate.
l Case 2: UDP services were functional, but FTP services were unavailable.
Fault Description
Operator T in country D stated that no FTP service was available on eNodeBs operating in the 1800 MHz band but all cells operated properly with UEs normally accessing the cells, being released, and performing UDP services.
Fault Diagnosis
Based on the feedback from the operator, a check for TCP errors was performed directly, only to find that the FTP transfer rate dropped to zero and the server could not be pinged.
Because UDP services ran normally in the downlink, it was almost ascertained that the fault was down link disconnection.
A check on an 800 MHz eNodeB connected to the same transport network found that FTP services ran normally. Therefore, it was highly likely that the 1800 MHz eNodeBs were faulty. Due to the severe impact of the fault, data configurations were immediately restored for the 1800 MHz eNodeBs by using the backup data configuration files. The fault was rectified.
The faulty configuration files were compared with baseline data configurations. The comparison result indicated that a key radio parameter for downlink and uplink
transmission was set to a value different from the baseline value. The fault was caused by the incorrect parameter setting.
Fault Handling
Parameter settings were changed to baseline values for all faulty eNodeBs.
l Case 3: The traffic rate occasionally reached the peak value using the E398 but never reached the peak value using Samsung UEs.
Fault Description
In a single cell under an eNodeB on network Y in country P, a single Samsung UE could reach only 80 Mbit/s unexpectedly in both single-thread and multi-thread (using FileZilla) TCP download. Huawei E398 could occasionally reach 100 Mbit/s in both single-thread and multi-thread TCP download. Both the Samsung UE and Huawei E398 experienced rate drops.
Fault Diagnosis
A UDP packet injection test was performed, only to find that Huawei E398 and Samsung UE could both reach the peak values. Therefore, the fault should exist in the TCP
transmission mechanism. In this fault case, rate drops occurred, which was evidence of packet loss. The fault symptoms on Huawei E398 and Samsung UE were different, so there must be causes other than packet loss.
The analysis of TCP/IP headers using a third-party tool indicated that packet loss occurred on the radio interface. It was found from the configuration file for the eNodeB that the QoS