Data Throughput - Overview

 

 

 

I have been being asked to troubleshot the throughput issue so many times. Unfortunately my experience says "There is no clear/logical/deterministic way to troubleshoot for throughput test".

Then what are we supposed to do ? Are we supposed to rely on "Hit and Miss" strategy everytime we do the throughput test ? Is this process totally random ?

No at least we are not in such a worst case, fortunately. I think we can set some guidelines at least.

 

 

 

 

What to expect ?

 

Almost every time when a new technology comes, what we we hear most frequently would be 'you can download a movie in less than a minuite (in early LTE case)' or 'you can download a movie in a few seconds (in early 5G case)'.

Usually the most common reaction to this kind of (marketing ?) statement would be

  • Do we really need such an unbelievable throughput ?
  • Why I don't see the same throughput on my phone ?

I don't think I need to answer to the first question. Everybody says 'who would need such a high speed ?', but nobody complains when it really comes and start complaining about the low throughput a few years later. However high speed you get, a lot of new applications or use cases comes that let such a big pipe look limiting. In my personal experience, I think I've heard the same question (i.e, 'who would need such a high speed) since early HSDAP which is just a few Mbps) and I am still hearing complaints about the low throughput at the period of tens of Mbps. So.. my answer to this question is 'don't worry, you will soon want to get the big pipe'.

 

What I am trying to do in this section is more about answering the second question : Why I don't see the same throughput on my phone ?

In short, it is very unlikely that you would see the same throughput on your phone if you are trying it in live network as a regular subscriber. Why not ? There are two common reason as below

  • The speed that they are talking is from the test with very special condition that is very unlikely available to you
  • Often people (especially media) talks about the speed that will be available when the technology gets fully evolved, but not now.

For example, if you are talking about around 6 Gbps speed for 5G(NR), it would belong to the first case. If you are talking about 20 Gbps, it would belong to the second case at the time of writing this (Nov 2020).

I am not going to talk more about the second case. I will be focusing more on the first case.... i.e, what would be the special condition that let such a tremendous speed come true ?

 

There are several factors that is critical to determine the throughput can be summarized in an illustration as shown below. This logic applies at least to 4G and 5G. 3G throughput (i.e, HSPA) is determined by a little bit different set of factors(parameters) but I would focus more on 4G/5G throughput. If you are interested in 3G throughput, refer to this note.

Let me comment a little bit about this illustration this illustration. The comments written below is just written description of the illustration.. meaning the description and the illustration represents a same thing.

  • The first main factor you can think of is 'how many carriers are used for the data traffic'. Assuming that the bandwidth of each carrier is same, the more carriers are used the higher throughput you can expect (NOTE : In reality, there are cases where the bandwidth of each carrier would not be same. In this case, total bandwidth (called 'aggregated bandwidth) can be the determining factor.
  • Next factor you should consider is that how many resource blocks within each of the carrier are used for the traffic. In case of estimating the max throughput (theoretical throughput), we assume that every resource blocks of each of the carriers are used for the traffic.
  • Then you should think of what kind of modulation scheme is used for the traffic. Simply put, the data rate goes higher as the modulation schedule progress in QPSK -> 16QAM -> 64QAM -> 256QAM (NOTE : in reality, not only modulation scheme but also coderate affects the data rate, a specific parameter defined in 3GPP spec called MCS (modulation coding scheme) determines both modulation scheme and coderate).
  • Next major factor is how many layers are used for the traffic. You can think of this factor as 'what type of MIMO is used'. Roughly put, you can think of SISO as single Layer, 2x2 MIMO as 2 layers and 4x4 as 4 layer etc. Multiple layer imply multiple data streams are transmitted simultaneously meaning higher data throughput.
  • One last think you should consider especially in real life traffic is 'how many UEs are connected to the same network at the same time'. The more UEs are connected to the network, the less throughput you would get on your device since the physical resources for the traffic should be split among multiple devices.

 

 

 

 

Typical condition for LTE max throughput

 

At the point of writing this part (Nov 2020), I think the typical condition for achieving the max throughput in LTE can be illustrated as follows.

  • Number of Carriers : 5
  • Bandwidth of each carrier : 20 Mhz (this is max BW for one LTE channel defined in 3GPP)
  • Number of RB for each carrier : 100
  • Modulation Scheme : 256 QAM, corresponding to MCS 27
  • Number of layers : 4 (4x4 MIMO)

 

 

 

 

Typical Condition for 5G/NR max throughput

 

At the point of writing this part (Nov 2020), I think the typical condition for achieving the max throughput in 5G/NR can be illustrated as follows.

  • Frequency Range : FR2
  • Number of Carrier : 8
  • Bandwidth of Each Carrier : 100 Mhz
  • Number of RBs for each carrier : 66
  • Modulation Scheme : 256 QAM (MCS 27)
  • Number of Layers : 2 (2x2 MIMO)

 

 

As far as I know as of now (Nov 2020), I don't see anything that supports the speed shown below, but this would be realizable at least in terms of 3GPP specification.

  • Frequency Range : FR2
  • Number of Carrier : 8
  • Bandwidth of Each Carrier : 400 Mhz
  • Number of RBs for each carrier : 264
  • Modulation Scheme : 256 QAM (MCS 27)
  • Number of Layers : 2 (2x2 MIMO)

NOTE : In NR TDD case, there are ofther factors affecting the achievable throughput. They are the scheduling ratio of Downlink and Uplink, and the number of symbols assigned for each slots. It is assumed that slot and symbols are scheduled for downlink as much as possible.

 

 

 

 

First thing to remember for throughput troubleshooting

 

One sentence. "Throughput troubleshooting is not simple at all.", "Don't expect it to be simple.". If I solved the problem with single shot, I would say "I was just lucky, It is not because I am technically competent".

 

Even troubleshooting with wired communication is not easy. Think about how many more factors would get involved in the data path.

 

That's all for the first thing. Now let's move to the second important thing for this issue. What is the second thing ?

 

It's "Don't give up. You will eventually find the solution!" -:). It is just matter of time and depend on how much dedicated you are during the troubleshoot.

 

Now the third things comes (Many people think this is the first thing since it sound more technical, but I don't think it is the case).

What I want you to do as the third item is "list up all the nodes from the data transmitter to the reciever, and follow all the steps without skipping anything.". One example I can give you is  (this is an example where you use a Network Emulator for the test).

    i) IP Application Software on PC (e.g, iperf, FileZilla)

    ii) TE port on PC (e.g, Ethernet Card).

    iii) TE port on throughput test equipment (e.g, Data packet port on Network Emulator)

    iv) PDCP layer on test equipment

    v) RLC layer on test equipment

    vi) MAC layer on test equipment

    vii) L1 (Transport and PHY) layer on test equipment

    viii) L1 (Transport and PHY) layer on UE (mobile phone or data card)

    ix) MAC layer on UE

    x) RLC Layer on UE

    xi) PDCP layer on UE

    xii) TE port on UE (e.g, Modem connector)

    xiii) TE port on PC (e.g, USB port the UE is connected to)

    xiv) IP Application Software on PC to which the UE is connected.

 

The more you understand on each of these items, the better position you are in for troubleshooting. (If you really enjoy your job as engineer, one of the topic I would recommend you is to try with throughput troubleshoot or optimization. To me it looks like an art at the same time being a technology).

 

Now you would ask "Which component on the list is most important, most critical factor for the throughput ?". I wish I had a simple/clear answer to this, but my experience says "the answer varies depending on the situation". Especially it would differ depending on what kind of radio technoloty your device is using. (e.g, Is it R99 WCDMA Device, HSDPA, HSPA+, LTE ?)

 

In addition to the major technical factors listed above, sometimes very simple things as follows make you spend several hours to several weeks for troubleshoot if you are in bad luck.

    i) LAN Cable type (Sometimes you have to use 'direct cable' and sometimes you have to use 'cross over' cable).

    ii) Category of LAN cable. (Is it Cat 5 cable or Cat 6 cable ?)

    iii) Ethernet Port Capability (Is it only for 10/100 M, or Gigabit ethernet ?)

    iv) Firewall setting on your PC (I will go back to this later in a separate section).

 

I will try to go through each type of radio technology and try to point out the important factor for that specific technology. (Try to memorize all the steps listed above sicne I will talk about the steps for each of the following sections).

 

 

 

What Number you want to get ?

 

I often see two extreme opinion on the result of throughput test result. Followings are those two with example LTE Cat3 MIMO download throughput.

 

Opinion 1 : I know the ideal max throughput is 100 Mbps, but I think it doesn't make much sense at least for a mobile device because in live network, you would never be in such a situation where a network allow such a huge resource allocation for any single UE and Radio Signal Quality also would not be good enough to achieve those throughput. so I am happy if the throughput result gives 30~40 Mbps. <== I wrote this comments around 6 years (2011) ago when LTE is at relatively early stage of deployment. But now (Jun 2018), the device supporting 1Gbps is not uncommon and we are talking about 1.6 Gbps and even 2.0 Gbps device. Nobody think 100 Mbps is too high throughput. My point is that all the technology evolve like this. When a technology comes out, many people think it is 'too much' but just in a few years it becomes 'too little'.

 

Opinion 2 : The 3GPP sepecfication says the Max throughput for LTE Cat 3 is 100 Mbps, so I want to get exact 100 Mbps throughput displayed on my IP traffic monitoring tool.

 

I think there is problem with both opinion, but I would not say much on Opinion 1. Just think you are lucky if your customer has this kind of opinion -:).

 

I will talk something about Opinion 2 in this section. What would be the problem of this opinion ?

First he should not expect to get the same number on IP traffic monitor as 3GPP spec sheet shows, because what 3GPP spec sheets shows is the physical layer throughput, not the IP layer throughput. Between physical layer and IP layer, there are various kinds of overhead getting inserted. So it is natural to see a little bit less throughput on IP traffic monitor than the number on 3GPP spec sheets.

Then you may ask.. what if we want to measure only PHY throughput. Will I get the same Max throughput as the 3GPP spec document says ? In WCDMA, HSDPA, HSUPA probably 'Yes', but in LTE you would still have a little low throughput than the 3GPP spec value even in PHY throughput. It is because there is some subframes where you cannot allocate full RBs (100 RBs in case of 20Mhz, Cat 3). These frames are where SIBs are scheduled. Especailly SIB2 is causing a lot of overhead because it is supposed to be transmitted at subframe 5 at every two SFN. The amount of phyiscal layer overhead varies depending on how eNodeB allocate RBs for user data for the subframe where the SIB is transmitted. According to my experience, I saw roughly three different strategies for this case.

 

    Option 1 : eNodeB still allocate RBs for the SIB transmission subframe, but the number of RB is a little bit lower than the Max RB

    Option 2 : eNodeB does not allocate any RBs for user data at SIB transmission subframe.

    Option 3 : eNodeB stop transmitting SIBs when connected state and allocate the MAX RBs even for the SIB transmission subframe.

 

I think live network eNodeB is using Option 1 and I saw most of test equipment is using Option 2 or Option 3. But Option 3 may cause some unexpected side effect and this options is not widely used. So in theory, you may get a little bit higher throughput if you use real eNodeB in 'test lab' (not in live network) comparing to test equipment. (You would get much less throughput in live network because you cannot control the eNodeB as you want and the signal quality is not as good as in the test lab. ).

 

In concolusion, whatever method you use you would not get the 100% same as specified in 3GPP specification. In my personal opinion, it would be considered OK if you can achieve around 90% of the ideal Max throughput without much of packet error. (If the difference between the test throughput and ideal throughput is due to packet area, not much due to overhead.. you'd better investigate further to find the root cause of the problem).

 

 

 

Milestones in the history of throughput evolution

 

I've been involved in throughput testing for Cellular devices since UMTS HSPA and I had seen some stumbling blocks for almost each and every steps of evolution. These stumbling block is not only from the DUT (Cellular device) but also from other components which constitues the test system. In this section, I would list up some of the milestones (stepping stones) that I've gone through. Some of the items list

I will add some troubleshoot tips for each of the milestones, but it may not be a direct solution to the problem that you are facing since there are so many factors get involved in the data path and root cause of a problem may be different for different problem even though the symptom looks similar. However, the factors that I am listing here might be at least something worth considering for your own troubleshooting process.

 

 

< LTE Category 3 : 100 Mbps >

 

This was around 6 or 7 years ago (around 2010/2011). Nobody would think this is any big problem these days and they would classfy this as a very low throughput case. However, when LTE first came out this was pushing the limit not only on DUT(Cellular protocol) but also many other parts a well. Followings are some of the factors that would cause issues.

  • Stablity of Radio Link at MCS 23 with 2x2 MIMO. Since it was early stage of LTE deployment, it was not easy for LTE mobile phone (or test equipment) maintain stable radio quality at MCS 23. Usually this kind of issues require a lot of DL power (sometimes UL power as well) tuning, cable check and in worst case UE / Equipment firmware upgrade.
  • At this point of time, most common Network Interface Card(NIC) in PC was still 10/100 Mbps. 1 Gbps NIC was available, but not every PC has it. It means the required throughput would really hit the physical limit of Network Card of server PC. Of course, this problem can be easily resolved by using the PC with 1Gpbs NIC.
  • Just using 1Gbps does not automatically resolve the bottleneck issue of IP data path. Ethernet Cable often caused the problem. The most common type of Ethernet cable at this time were Category 5 which has 100 Mbps max throughput. So if you don't use very good quality of Category 5 cable, the througput would be sclipped by the cable as well.
  • If you are using any Ethernet Hub or Switches supporting only up to 10/100 Mbps, you have to check whether it really support the required throughput or change it to other ones supporting 1 Gbps.

 

 

< LTE Category 4 : 150 Mbps >

 

  • Stablity of Radio Link at MCS 28 with 2x2 MIMO. This is the max MCS applicable to 64 QAM and super high code rate. So achieving stable radio link was challenging.

  • Since this is obviously out of capability of 10/100 Mbps Ethernet specification, you need upgrade to every Ethernet component (e.g, Switch, Network Interface Card etc).

 

 

< LTE Category 9 : 450 Mbps >

 

  • Stablity of Radio Link at MCS 28 with 2x2 MIMO and 3CC CA. This is no problem at all for the current standard(as of Jun 2018), but maintaining the stable radio link of 3CC CA at the initial phase.

  • Everything should be with Gigabit ethernet. Gigabit ethernet was pretty common to most of the PC at this time.
  • This often caused problem on UE side when trying test with tethering. This is almost ideal max throughput of USB 2.0. So if you don't use very high quality USB 2.0 and well written driver, the throughput would be bottlenecked by USB.

 

 

 

< LTE Category 12 : 1 Gbps >

  • Stablity of Radio Link at DL 256QAM with 4x4 MIMO and Carrier aggregation. You need to make it sure that 256 QAM decoding on UE side goes well and guarantee the very high SNR for each and every antenna of 4x4. If it is with 4x4 Antenna, you would need to get 28 dB or higher SNR for all 4 antenna.
  • This is also ideal max of Gbit ethernet. so even though your PC, etherent cable, Switch claims that they support Gbit ethernet, it would be safer to doublecheck the real performance of those components.
  • If your test equipment and server Network card support jumbo frame, it would worth trying with it.
  • If you use ftp as a throuput server, you would need to use such a ftp server that support multiple download at the same time.
  • If you use iperf, you may need to carefully chose the version of the iperf. In my experience, it was possible to achieve only 500 Mbps with v2.0.5 but was able to achieve around 950 Mbps with v2.0.8 or v3.x.x. However, the specific versions may give you a little different result depending on test setup.
  • If you ard doing the throughpu test with tethering, make it sure that you are using USB v3.0 and the usb driver is well written to support enough throughput.
  • If you are testing using a special App on UE (not tethering), you have to make it sure that the App can handle enough throughput (e.g, supporting multiple download stream simultaneously)

 

 

< LTE Category 18,19,20 : 1.2.1.6,2.0 Gbps >

  • Stablity of Radio Link at DL 256QAM with 4x4 MIMO and at least 4 or more CC CA(16 layer or more). It would not be easy maintain the good radio link quality with these configuration.
  • The biggest challenge would be the ethernet port on test equipment and server PC. Most of them would support only 1Gb throughput which was good enough when LTE first came out.
  • If you are testing this with USB tethering, you have to make it sure that your USB is 3.x and support enough throughput. Even though the USB 3.x specification defines max 5Gbps, but the real implementation would not support this ideal throughput.

 

 

 

< 5G/NR >

  • In this technology, you need to make at least 2 Gbps work from day 1 and very soon need to reach 6 Gbps or higher. So you would need to use 10 Gb Ethernet from the beginning and would consider trying with 40 Gbps ethernet in a couple of years down the road.