5G/NR - URLLC  

URLLC (Ultra-Reliable Low Latency Communication)

As the name implies, the goal is to guarantee traffic delivery with low latency and very high reliability at the same time. It has to satisfy the two most challenging requirements (latency and reliability) simultaneously.

How Low and How Reliable Should It Be?

According to RP-191584 and TR 38.913 (section 7.9), this requirement is specified as follows (I think this is for a single hop (e.g., between UE and gNB), probably at the MAC/RLC layer):

  • Reliability : residual error rate down to 10^-6
  • Latency : 0.5 ~ 1 ms

In reality, I think the more important performance indicator is the end-to-end latency for each application and use case. The following table summarizes the latency requirements for each application/use case, collected from the various references listed in the reference section.

| Application | Latency | Service Area | Description |
|---|---|---|---|
| VR |  |  |  |
| 360 VR Video | 20 ~ 40 ms |  | 4K (3840x2160), 60 FPS, 20~40 Mbps |
|  | 90 ~ 130 ms |  | 8K (7680x4320), 90 FPS, 90~130 Mbps |
|  | ~10 ms |  | 12K 3D (11520x6480), 120 FPS, 500~700 Mbps |
| CG VR Video | ~20 ms |  | 2K (2560x1440), 70 FPS, 30~50 Mbps |
|  | ~16 ms |  | 4K (3840x1920), 90 FPS, 50~200 Mbps |
|  | ~10 ms |  | 8K (7680x3840), 120 FPS, 200~800 Mbps |
| V2X | 3~10 ms (E2E) | few meters |  |
| Remote Driving | 5 ms (E2E) |  | 1 Mbps DL, 20 Mbps UL, speed ~250 km/h |
| Collective Perception | 3 ms (E2E) | 200 m | Exchange of real-time information among vehicles |
|  | 10 ms (E2E) | 500 m |  |
|  | 50 ms (E2E) | 1000 m |  |
| Industrial Automation |  |  |  |
| Motion Control | 0.5 ~ 2 ms (E2E) |  |  |
| Discrete Automation | 10 ms (E2E) |  | ~10 Mbps |
| Process Automation | 100 ms (E2E) |  | 1~100 Mbps |
<  22.261 v17.11 Table 7.6.1-1 KPI Table for additional high data rate and low latency service >

What Is It For?

Why do we need this kind of challenging feature? It seems that they had use cases like the following in mind when they investigated this. The following use cases are specified in RP-191584:

  • Release 15 enabled use case improvements
    • Such as AR/VR (Entertainment industry)
  • New Release 16 use cases with higher requirements
    • Factory automation
    • Transport Industry, including the remote driving use case
    • Electrical Power Distribution

How to Reduce Latency?

To reduce the end-to-end latency, we need to reduce the latency on both the RAN side and the core network side. Let's briefly think about how we can reduce the latency on each side. Some common technologies for reducing latency on the RAN and core network side are summarized in the illustration below. RAN-side latency reduction is relatively well described in the 3GPP specifications (mostly in Release 16), but core-network-side technology seems to be left more to implementation. Personally, I think core network latency will play the bigger role in end-to-end latency, and it will take a while and a huge investment to restructure and optimize the overall architecture of the core network.
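To get a feel for how the components add up, here is a minimal sketch of an end-to-end latency budget. All the component values below are illustrative assumptions, not numbers from any specification or measurement:

```python
# Hypothetical one-way latency budget in microseconds.
# Every value here is an illustrative assumption, not a spec figure.
budget_us = {
    "UE processing (APP/SDAP/PDCP/RLC/MAC/PHY)": 100,
    "Air interface (slot alignment + transmission)": 500,
    "gNB processing": 300,
    "Transport / backhaul": 400,
    "Core network (UPF forwarding)": 200,
}

total_ms = sum(budget_us.values()) / 1000

for component, us in budget_us.items():
    print(f"  {component}: {us} us")
print(f"Total one-way latency: {total_ms} ms")
```

Swapping in measured values for each component shows immediately which side (RAN or core) dominates the end-to-end budget for a given deployment.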

As the name implies, URLLC is not only about latency reduction. Reliability improvement is also important to meet the super high standards of URLLC. I think this part may be even more challenging than reducing the latency. A common trick to increase reliability on the RAN side is to use an MCS table with low code rates and repetitive data transmission. Unfortunately this results in throughput reduction, but it is not easy to think of any other means to increase reliability without sacrificing some throughput performance. In addition, we would need some solutions to increase the reliability on the core side (i.e., reliability over the core-side data path), but as of now (Mar 2021) I don't have any specific knowledge on how to improve core-side reliability. I will update this as I learn further.
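The repetition trick above can be quantified with a simple model: if each transmission fails independently with probability p (the per-transmission BLER), the residual error after K repetitions is p^K, while the effective throughput is divided by K. A sketch under that independence assumption (real links have correlated fading, so this is optimistic):

```python
def residual_error(per_tx_bler: float, repetitions: int) -> float:
    """Residual error probability after K independent repetitions: p^K."""
    return per_tx_bler ** repetitions

def throughput_cost(repetitions: int) -> float:
    """Fraction of the single-shot throughput that remains after K repetitions."""
    return 1.0 / repetitions

# With a typical 10% initial BLER target, six repetitions bring the residual
# error down to roughly the 10^-6 URLLC target, at 1/6 of the throughput.
for k in range(1, 7):
    print(f"K={k}: residual={residual_error(0.1, k):.1e}, "
          f"throughput={throughput_cost(k):.0%}")
```

A lower-code-rate MCS makes the same trade in a different place: more redundancy bits per transmission instead of more transmissions.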

What Causes the Delay?

As the term implies, the critical component of URLLC (Ultra-Reliable Low Latency Communication) is delay… more strictly, it's all about reducing delay. In order to reduce delay, we need to understand the sources of that delay in the first place.

Why is minimizing latency so crucial?  Well, in many next-generation applications, even milliseconds matter. Think of remote surgery, where a surgeon needs real-time feedback from robotic instruments, or autonomous vehicles, where split-second decisions can be the difference between safety and danger.  In these scenarios, traditional network communication just won't cut it.

URLLC aims to tackle these delays head-on, employing a range of techniques to ensure lightning-fast and incredibly reliable communication. In this blog post, we'll dive deep into the world of URLLC, exploring its key technologies, applications, and the challenges that lie ahead.

In this note, we will look into some of the major sources of delay. The initial list came from "Ultra-Reliable Low-Latency in 5G: A Close Reality or a Distant Goal?" (Arman Maghsoudnia), and further details will be added as I learn more.

Sources of the Delay

There are several different sources of latency that hinder the achievement of Ultra-Reliable Low-Latency Communication (URLLC) in 5G. Each source contributes uniquely to the overall latency, and their interplay is critical in determining system performance.

URLLC is a key feature introduced in 5G to support mission-critical applications such as autonomous vehicles, remote surgery, industrial automation, and augmented reality, which demand extremely low latency (as low as 0.5 ms) and high reliability (99.999%). Despite years of theoretical research and advancements, practical implementations of URLLC often fall short of meeting these stringent requirements due to bottlenecks in system design, protocol inefficiencies, and hardware/software limitations.

Processing Latency

  • Processing latency refers to the time required for decision-making and data processing within the different layers of the 5G system. This latency is distributed across the User Equipment (UE), the gNB (next-generation NodeB), and the core network.
  • Examples:
    • UE Side: When a UE sends a Scheduling Request (SR), it first processes the data through layers such as the Application (APP), Service Data Adaptation Protocol (SDAP), Packet Data Convergence Protocol (PDCP), Radio Link Control (RLC), Medium Access Control (MAC), and Physical (PHY) layers. Each layer adds a certain amount of processing time.
      • Example: Encrypting data in the PDCP layer or reassembling packets in the RLC layer introduces delays.
    • gNB Side: The gNB receives and processes the uplink (UL) data through the PHY, MAC, and higher layers. This includes decoding the PHY layer, scheduling resource allocation in the MAC layer, and reassembling data in the RLC layer.
      • Example: Table 2 of the referenced paper shows the mean processing times for gNB layers, with the RLC queue (RLC-q) contributing the highest latency at 484.20 µs on average.
  • Key Insights:
    • Software-based implementations can exacerbate processing latency due to non-deterministic Operating System (OS) scheduling.
    • Resource-heavy operations like encryption, modulation, and demodulation consume significant time.

Protocol Latency

  • Protocol latency arises from mechanisms and configurations required by the 5G protocols. These include scheduling, resource allocation, and the inherent delays in the transmission procedure.
    • Examples:
      • Scheduling Delays: In grant-based uplink transmissions, the UE must send an SR and wait for the gNB to allocate resources (UL Grant) before transmitting data. This two-step process significantly increases latency.
    • Example: In a TDD system, the UE has to wait for the next uplink slot before transmitting the SR, and again for the next slot to send the UL data after receiving the grant. This introduces protocol-induced delays of up to one TDD period (e.g., 0.5 ms for a DDDU configuration).
  • Slot Duration: Protocol constraints dictate that data must fit within predefined time slots. If a packet arrives just after a slot is allocated, it must wait until the next available slot.
    • Example: In the FR1 band (<6 GHz), a minimum slot duration of 0.25 ms is required for low-latency configurations. However, packets arriving just after a scheduling event experience an additional 0.25 ms delay.
  • Key Insights:
    • Grant-free uplink transmissions can reduce protocol latency by pre-allocating resources, but they are less scalable for networks with many UEs.
    • Mini-slot configurations provide finer-grained scheduling but increase signaling overhead and network coordination complexity.
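The slot-alignment delay described above follows directly from NR numerology: the slot duration is 1 ms / 2^µ, where the subcarrier spacing is 15·2^µ kHz, and in TDD a packet may additionally wait for the next uplink slot. A sketch (the DDDU pattern and the wait helper below are illustrative, not 3GPP-defined quantities):

```python
def slot_duration_ms(scs_khz: int) -> float:
    """NR slot duration = 1 ms / 2^mu, where SCS = 15 * 2^mu kHz."""
    mu = {15: 0, 30: 1, 60: 2, 120: 3}[scs_khz]
    return 1.0 / (2 ** mu)

def worst_case_ul_wait_slots(tdd_pattern: str) -> int:
    """Worst-case number of slots until the next 'U' slot in a repeating
    TDD pattern, over all possible arrival slots (illustrative helper)."""
    n = len(tdd_pattern)
    return max(
        next(d for d in range(1, n + 1) if tdd_pattern[(start + d) % n] == "U")
        for start in range(n)
    )

# 60 kHz SCS gives the 0.25 ms slot mentioned above; with 120 kHz SCS and a
# DDDU pattern, the worst-case wait spans the whole 4-slot pattern:
# 4 * 0.125 ms = 0.5 ms.
wait_ms = worst_case_ul_wait_slots("DDDU") * slot_duration_ms(120)
```

This is why higher numerologies and mini-slots help: they shrink the granularity of every alignment wait in the protocol.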

Radio Latency

  • Radio latency encompasses delays in the radio hardware and its interaction with the PHY layer. It includes analog-to-digital (A/D) and digital-to-analog (D/A) conversions, queuing delays on interface buses, and transmission delays.
    • Examples:
      • A/D and D/A Conversion: Converting signals from analog to digital (or vice versa) at the Radio Head (RH) introduces latency.
    • Example: The RH in the testbed (connected via USB) added 500 µs of radio latency, making it impossible to meet URLLC requirements in certain configurations.
  • Bus Transmission Delays: The time required to transfer data over interfaces like PCIe, Ethernet, or USB affects latency.
    • Example: USB 3.0 performs better than USB 2.0 for submitting radio samples but still experiences spikes in latency due to OS scheduling delays.
  • Queuing Delays: Queuing data on the radio interface or during inter-layer communication can bottleneck performance.
    • Example: The queuing delay at the gNB’s RLC layer averages 484.20 µs, as shown in Table 2.
  • Key Insights:
    • Radio latency is highly dependent on the hardware interface. SDR-based systems offer flexibility but introduce higher latencies compared to ASIC-based implementations.
    • Sub-millisecond latency is challenging in non-line-of-sight scenarios, especially in mmWave bands.

Deterministic vs Non-deterministic latencies

Latency in 5G systems can be broadly understood in terms of deterministic and non-deterministic latencies, which represent the predictability of delays during data transmission and processing.

Deterministic Latency

Deterministic latency refers to delays that are predictable and consistent. These latencies occur in processes or components where timing behavior is predefined and not subject to significant variation.

Characteristics:

  • Predictability: Deterministic latency is consistent for a given operation or under specific conditions.
  • Controllable: Can be managed and optimized through system design, such as scheduling algorithms or hardware configurations.
  • Examples in 5G:
    • Slot Duration: The time required to transmit data in one slot is fixed based on the numerology and configuration (e.g., 0.25 ms, 0.5 ms).
    • PHY Layer Processing: Tasks such as encoding and modulation have predictable durations when executed on dedicated hardware like ASICs.
    • Scheduling Cycles: Periodic scheduling in the MAC layer, where resources are allocated in well-defined time slots.

Non-Deterministic Latency

Non-deterministic latency, on the other hand, arises from processes or components where delays are variable and unpredictable. These latencies are often caused by dynamic factors, such as system load, shared resources, or environmental conditions.

  • Characteristics:
    • Variability: Non-deterministic latency fluctuates, making it harder to predict.
    • Dependency on External Factors: Influenced by factors like OS scheduling, hardware interfaces, and wireless channel conditions.
    • Examples in 5G:
      • OS Scheduling Delays: In software-based implementations, the operating system may prioritize other tasks, delaying critical 5G processes.
      • Queuing Delays: Data packets waiting in buffers (e.g., at the RLC or MAC layer) experience variable delays based on traffic load and resource availability.
      • Radio Conditions: Wireless channel conditions, such as fading, interference, or mobility, introduce unpredictable delays.
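The practical consequence of non-deterministic latency is a long tail: the mean may look fine while the high percentiles, which are what a 99.999%-reliability target actually constrains, are far worse. A toy simulation (fixed base delay plus exponentially distributed queuing jitter; both numbers are purely illustrative):

```python
import random

random.seed(42)

# Illustrative model: 50 us fixed processing + exponential queuing jitter
# with an 80 us mean. Neither number comes from a spec or a measurement.
samples_us = sorted(50 + random.expovariate(1 / 80) for _ in range(100_000))

mean_us = sum(samples_us) / len(samples_us)
p50_us = samples_us[len(samples_us) // 2]
p999_us = samples_us[int(0.999 * len(samples_us))]

print(f"mean   : {mean_us:7.1f} us")
print(f"median : {p50_us:7.1f} us")
print(f"p99.9  : {p999_us:7.1f} us")  # the tail is several times the mean
```

This is why URLLC evaluation focuses on tail percentiles rather than average latency: a deterministic pipeline with a slightly higher mean can still beat a jittery one at the 99.9th percentile.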

PHY/MAC Enhancement

Even though I haven't seen many real implementations (UE and network) or real network deployments of Release 15 as of now (Jan 2021), there are features in Release 15 that were introduced with URLLC in mind. These features are well summarized in the 5G Americas white paper as follows:

As you may guess, we need a lot of PHY/MAC enhancements to make this possible. These enhancements are described in RP-191584 and 38.824 V16.0.0 (2019-03). As you can see, they require enhancements across all physical channels and MAC scheduling.

  • PDCCH enhancements
    • DCI format(s) with configurable sizes for some fields, with a minimum DCI size targeting a reduction of 10~16 bits relative to Rel-15 DCI format 0_0/1_0 and a maximum DCI size that can be larger than Rel-15 DCI format 0_0/1_0, and provide the possibility to align with the size of the DCI format 0_0/1_0 (including possible zero padding if any)
    • Increased PDCCH monitoring capability on at least the maximum number of non-overlapped CCEs per slot for channel estimation for at least one SCS, subject to restrictions including, but not necessarily limited to, those identified in TR 38.824. Enhancements for PDCCH monitoring capability on the maximum number of monitored PDCCH candidates per slot (with potential restrictions) can be further considered.
  • UCI enhancements
    • More than one PUCCH for HARQ-ACK transmission within a slot
    • At least two HARQ-ACK codebooks simultaneously constructed, intended for supporting different service types for a UE
  • PUSCH enhancements for both grant-based PUSCH and configured grant based PUSCH
    • For a transport block, one dynamic UL grant or one configured grant schedules two or more PUSCH repetitions that can be in one slot, or across slot boundary in consecutive available slots
  • Scheduling/HARQ enhancement
    • Out-of-order HARQ-ACK associated with PDSCHs with different HARQ process IDs
    • Out-of-order PUSCH scheduling associated with different HARQ process IDs, including overlapping PUSCHs and non-overlapping PUSCHs in time-domain
    • Methods to handle DL data/data resource conflicts for overlapping PDSCHs in time-domain, scheduled by dynamic DL assignments
  • Inter UE Tx prioritization/multiplexing Enhancement
    • UL cancelation scheme
    • Enhanced UL power control scheme 
  • UL configured grant transmission Enhancement
    • Multiple active configured grant type 1 and type 2 configurations for a given BWP of a serving cell

Presentations on YouTube

[1] URLLC Standardization and Use Cases | ITN WindMill Project

[2] Can todays Internet protocols deliver URLLC?

[3] Teaser: 5G Remote Surgery - Fact or Fiction?

[4] The Okay, But How? Show, Episode 2: 5G and low latency

[5] Wireless Access for URLLC | ITN WindMill Project

[6] neXt Curve Webcast: URLLC Myths and Realities

[7] Deterministic Networking for Real-Time Systems (Using TSN) - Henrik Austad, Cisco Systems

[8] AWS Wavelength - Edge Computing for 5G Networks

[9] What is Edge Computing and its Impact on 5G?

[10] Keynote: Network of Tomorrow - 5G and Edge - Rajesh Gadiyar, Intel

[11] 5G Connected Edge Cloud - Keynote by Andre Fuetsch, AT&T

[12] Aether - Managed 5G Edge Cloud Platform

[13] 5G network slicing: automation, assurance and optimization of 5G transport slices (Nokia)

[14] AREA Webinar | The Challenges of Enterprise AR Implementation

[15] Technical Challenges of Effective VR 360 Delivery

[16] B5: Overcoming the Technical Challenges of Delivering VR Video

[17] Why Virtual Reality Is STRUGGLING

[18] AR VR Women - Silicon Valley @ Nvidia - Technical Challenges for VR and AR - 2016 06 15

[19] R&S Thirty-Five: C-V2X and end-to-end application testing

[20] Intermediate: Vehicle to Everything (V2X) Introduction  

[21] V2X Technology and Deployment : As of Sep. 2020

[22] What is V2X - V2X Communciation Explained Part 1

[23] DSRC or 5G?

[24] How Can C-V2X Create an Environment that Improves Quality of Life for Everyone? (GSMA)  

[25] Dr. Javier Gozalvez - V2X Networks for Connected and Automated Driving - IEEE LCN 2019 Keynote (IEEE)

[26] What is TSN Part 1: Where We Are Today

[27] What is TSN Part 2: What is Time-Sensitive Networking?

[28] What is TSN Part 3: Why is Time-Sensitive Networking Important for Automation?

[29] 5G and the Future of Connected Cars

Video Demo :

[1] Taking factory automation to the next level with Audi through URLLC

[2] Field Trials on 5G mMTC and uRLLC in Japan

[3] 5G Smart Factories

[4] Nokia 5G Demonstration Video 5G: driving the automation of everything

[5] Part 10: 5G Use cases - 5G for Absolute Beginners

[6] Powering safer vehicle production with 5G

[7] 5G for Robotics: Ultra-Low Latency Control of Distributed Robotic Systems

[8] Low-latency communication for Industrial IoT Demo

[9] Inside Ericsson King's College London 5G Tactile Internet Showroom

[10] Achieving industrial automation protocols with 5G URLLC (Ericsson)

[11] 5G: A Game Changer for Augmented Reality, Virtual Reality and Mixed Reality (Verizon)

[12] 5G enabled VR GLASSES are COMING in 2021 | HAND TRACKING SUPPORT! | HP REVERB G2 images 'Leaked'

[13] NTT DoCoMo demos VR control of robots over 5G

[14] Best THEATER Apps For The Oculus Quest

[15] SKTelecom 5G]SK Telecom Uses 5G AR to Bring Fire-Breathing Dragon to Baseball Park (SKTelecom)

[16] IMR Wireless VR streaming over 5G

[17] Oculus Quest XR2 - 5G, 8K Resolution, and SEVEN Cameras!?

[18] XR - The Merging of Augmented Reality AR, Virtual Reality VR and Mixed Reality in 2020

[19] Why Microsoft Uses Virtual Reality Headsets To Train Workers

[20] Inside Imogen Heaps cutting-edge VR concert | The Future of Music with Dani Deahl

[21] Wave | Virtual Concerts (Mixed Reality (XR))

[22] Experience VR concerts and move around freely! - NOYS VR gameplay

[23] CES 2021 brings INSANE new VR Technology

[24] HoloLens 2 AR Headset: On Stage Live Demonstration

[25] HoloLens 2: inside Microsoft's new headset

[26] What the heck is MEC?

[27] Demo: Network Slicing with Blue Planet 5G Automation

[28] Ericsson showed live end to end 5G network slicing at Mobile World Congress LA

[29] ZTE 5G End to End Network Slicing Demo

[30] Oculus Quest 2 Review - 100 Hours Later

[31] The Evolution of Virtual Reality by 2025

[32] V2X Technology and Deployment : As of Sep. 2020

[33] 5G C-V2X Technology Evolution Simulation (Qualcomm)

[34] Introducing Cellular V2X (Qualcomm)  

[35] Honda V2X Communications and automated driving (Honda)

[36] Cellular V2X paving the path to 5G (Qualcomm)

[37] 5G and the Future of Connected Cars

[38] Autotalks in a major live C-V2X testing on the streets of Shanghai  

[39] Boneworks - Next Gen VR Gameplay!

[40] Is Your Internet FAST Enough?

[41] What Is the Best Internet Speed for Gaming? [Simple Guide]

[42] Testing Google Stadia at 3 Different Connection Speeds!

[43] Oculus Quest Virtual Desktop Ultimate Low Latency Setup - Best Hardware For 20ms Latency   

[44] Virtual Desktop Settings: Ultimate Performance, Graphics & Low Latency: Streamer App & Streaming Tab

[45] Wireless VR Sim Racing is here! (90hz low latency step by step guide for Oculus Quest 2)

[46] Qualcomm Technologies and Siemens Drive the Future of Industrial Manufacturing with 5G (Qualcomm)

[47] Qualcomm Snapdragon XR2 5G Platform turns the ordinary into extraordinary XR (Qualcomm)

[48] Qualcomm 5G NR Industrial IoT Demo (Qualcomm)

[49] Qualcomm, Bosch highlight 5G as an industrial enabler (Qualcomm)

Papers/Whitepaper