Data Throughput - Overview

 

 

 

I have been asked to troubleshoot throughput issues so many times. Unfortunately, my experience says "There is no clear/logical/deterministic way to troubleshoot a throughput test".

Then what are we supposed to do? Are we supposed to rely on a "hit and miss" strategy every time we do the throughput test? Is this process totally random?

No, fortunately we are not in such a worst case. I think we can at least set some guidelines.

 

 

 

 

First thing to remember for throughput troubleshooting

 

One sentence: "Throughput troubleshooting is not simple at all. Don't expect it to be simple." If I solved a problem in a single shot, I would say "I was just lucky. It is not because I am technically competent".

 

Even troubleshooting wired communication is not easy. Think about how many more factors get involved in the data path when the link is wireless.

 

That's all for the first thing. Now let's move to the second important thing for this issue. What is the second thing?

 

It's "Don't give up. You will eventually find the solution!" -:). It is just a matter of time and depends on how dedicated you are during the troubleshooting.

 

Now comes the third thing (many people think this is the first thing since it sounds more technical, but I don't think that is the case).

What I want you to do as the third item is "list all the nodes from the data transmitter to the receiver, and follow all the steps without skipping anything". One example I can give you is as follows (this is an example where you use a Network Emulator for the test).

    i) IP Application Software on PC (e.g, iperf, FileZilla)

    ii) TE port on PC (e.g, Ethernet Card).

    iii) TE port on throughput test equipment (e.g, Data packet port on Network Emulator)

    iv) PDCP layer on test equipment

    v) RLC layer on test equipment

    vi) MAC layer on test equipment

    vii) L1 (Transport and PHY) layer on test equipment

    viii) L1 (Transport and PHY) layer on UE (mobile phone or data card)

    ix) MAC layer on UE

    x) RLC Layer on UE

    xi) PDCP layer on UE

    xii) TE port on UE (e.g, Modem connector)

    xiii) TE port on PC (e.g, USB port the UE is connected to)

    xiv) IP Application Software on PC to which the UE is connected.

 

The more you understand about each of these items, the better position you are in for troubleshooting. (If you really enjoy your job as an engineer, one of the topics I would recommend is throughput troubleshooting or optimization. To me it looks like an art as well as a technology.)

 

Now you would ask "Which component on the list is the most important, most critical factor for the throughput?". I wish I had a simple, clear answer to this, but my experience says "the answer varies depending on the situation". In particular, it differs depending on what kind of radio technology your device is using (e.g, is it an R99 WCDMA device, HSDPA, HSPA+, LTE?).

 

In addition to the major technical factors listed above, sometimes very simple things like the following can cost you several hours to several weeks of troubleshooting if you are unlucky.

    i) LAN Cable type (Sometimes you have to use 'direct cable' and sometimes you have to use 'cross over' cable).

    ii) Category of LAN cable. (Is it Cat 5 cable or Cat 6 cable ?)

    iii) Ethernet Port Capability (Is it only for 10/100 M, or Gigabit ethernet ?)

    iv) Firewall setting on your PC (I will go back to this later in a separate section).
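
Item iii) is easy to check by script before you start a long test. The following is only a minimal sketch, assuming the server PC runs Linux, where the kernel exposes the negotiated link speed (in Mbps) in /sys/class/net/<interface>/speed. The function names and the interface name "eth0" are my own illustrative choices, not part of any tool mentioned on this page.

```python
from pathlib import Path


def read_link_speed_mbps(iface, sysfs_root="/sys/class/net"):
    """Return the negotiated link speed in Mbps, or None if it cannot be read."""
    speed_file = Path(sysfs_root) / iface / "speed"
    try:
        speed = int(speed_file.read_text().strip())
    except (OSError, ValueError):
        # Missing interface, link down, or unexpected file content.
        return None
    # The kernel reports -1 when the speed is unknown.
    return speed if speed > 0 else None


def link_is_fast_enough(iface, required_mbps, sysfs_root="/sys/class/net"):
    """True only if the link speed is known and at least required_mbps."""
    speed = read_link_speed_mbps(iface, sysfs_root)
    return speed is not None and speed >= required_mbps
```

For example, link_is_fast_enough("eth0", 150) would tell you whether the port negotiated at least 150 Mbps. A port that negotiated 100 Mbps because of a bad cable or an old switch would be caught here, before you spend time on the radio side.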

 

I will try to go through each type of radio technology and point out the important factors for that specific technology. (Try to memorize all the steps listed above, since I will refer to them in each of the following sections.)

 

 

 

What number do you want to get?

 

I often see two extreme opinions on the result of a throughput test. The following are those two, with LTE Cat 3 MIMO download throughput as the example.

 

Opinion 1 : I know the ideal max throughput is 100 Mbps, but I think it doesn't make much sense, at least for a mobile device, because in a live network you would never be in a situation where the network allows such a huge resource allocation for a single UE, and the radio signal quality would not be good enough to achieve that throughput either. So I am happy if the throughput result gives 30~40 Mbps. <== I wrote this comment about 6 or 7 years ago (2011), when LTE was at a relatively early stage of deployment. But now (Jun 2018), devices supporting 1 Gbps are not uncommon and we are talking about 1.6 Gbps and even 2.0 Gbps devices. Nobody thinks 100 Mbps is too high a throughput. My point is that all technologies evolve like this. When a technology comes out, many people think it is 'too much', but in just a few years it becomes 'too little'.

 

Opinion 2 : The 3GPP specification says the max throughput for LTE Cat 3 is 100 Mbps, so I want to see exactly 100 Mbps displayed on my IP traffic monitoring tool.

 

I think there are problems with both opinions, but I will not say much about Opinion 1. Just consider yourself lucky if your customer has this kind of opinion -:).

 

I will talk about Opinion 2 in this section. What is the problem with this opinion?

First, you should not expect to see the same number on the IP traffic monitor as the 3GPP spec sheet shows, because what the 3GPP spec sheet shows is the physical layer throughput, not the IP layer throughput. Between the physical layer and the IP layer, various kinds of overhead get inserted. So it is natural to see a slightly lower throughput on the IP traffic monitor than the number in the 3GPP spec sheet.
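
To get a feeling for how much "a little bit less" is, here is a rough sketch of the arithmetic. The header sizes are my own illustrative assumptions (about 7 bytes of PDCP, RLC and MAC header per IP packet, and 40 bytes of IPv4 + TCP header with no options); the real overhead depends on the configuration and on how packets are segmented or concatenated into transport blocks.

```python
def layer_throughput_mbps(phy_mbps, ip_packet_bytes=1500,
                          l2_overhead_bytes=7, ip_tcp_header_bytes=40):
    """Estimate IP-layer and TCP-payload throughput from PHY throughput.

    Assumes (hypothetically) that every IP packet is carried with a fixed
    number of PDCP/RLC/MAC header bytes and that the PHY throughput counts
    those bytes as well.
    """
    on_air_bytes = ip_packet_bytes + l2_overhead_bytes
    ip_mbps = phy_mbps * ip_packet_bytes / on_air_bytes
    tcp_payload_mbps = phy_mbps * (ip_packet_bytes - ip_tcp_header_bytes) / on_air_bytes
    return {"ip": ip_mbps, "tcp_payload": tcp_payload_mbps}


if __name__ == "__main__":
    result = layer_throughput_mbps(100.0)
    print(f"IP layer:    {result['ip']:.1f} Mbps")
    print(f"TCP payload: {result['tcp_payload']:.1f} Mbps")
```

Under these assumptions, a 100 Mbps physical layer throughput gives roughly 99.5 Mbps at the IP layer and roughly 97 Mbps of TCP payload, so which number you see depends on where your monitoring tool counts the bytes.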

Then you may ask: what if we want to measure only the PHY throughput? Will I get the same max throughput as the 3GPP spec document says? In WCDMA, HSDPA and HSUPA, probably 'Yes', but in LTE you would still get a slightly lower throughput than the 3GPP spec value even at the PHY level. This is because there are some subframes where you cannot allocate the full RBs (100 RBs in the case of 20 MHz, Cat 3). These are the subframes where SIBs are scheduled. SIB1 in particular causes a lot of overhead because it is transmitted in subframe 5 of every second radio frame (every even SFN). The amount of physical layer overhead varies depending on how the eNodeB allocates RBs for user data in the subframe where the SIB is transmitted. In my experience, I have seen roughly three different strategies for this case.

 

Option 1 : The eNodeB still allocates RBs for user data in the SIB transmission subframe, but the number of RBs is a little lower than the max RB.

Option 2 : The eNodeB does not allocate any RBs for user data in the SIB transmission subframe.

Option 3 : The eNodeB stops transmitting SIBs in connected state and allocates the max RBs even in the SIB transmission subframe.

 

I think live network eNodeBs use Option 1, and most of the test equipment I have seen uses Option 2 or Option 3. However, Option 3 may cause unexpected side effects, so it is not widely used. So in theory, you may get a slightly higher throughput with a real eNodeB in a 'test lab' (not in a live network) than with test equipment. (You would get much less throughput in a live network because you cannot control the eNodeB as you want and the signal quality is not as good as in the test lab.)
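
The effect of the three options can be estimated with simple arithmetic. The sketch below assumes, as described above, that one subframe out of every 20 (subframe 5 of every second radio frame) carries the SIB, and it assumes for simplicity that throughput scales linearly with the number of RBs allocated. The 80% figure used for Option 1 is purely a hypothetical value for illustration.

```python
def phy_throughput_with_sib(max_mbps, rb_fraction_in_sib_subframe,
                            sib_period_subframes=20):
    """Average PHY throughput when one subframe per period carries a SIB.

    rb_fraction_in_sib_subframe is the share of the max RBs still given to
    user data in the SIB subframe: 0.0 for Option 2, 1.0 for Option 3, and
    something in between for Option 1.
    """
    full_subframes = sib_period_subframes - 1
    average_fraction = (full_subframes + rb_fraction_in_sib_subframe) / sib_period_subframes
    return max_mbps * average_fraction


if __name__ == "__main__":
    for name, fraction in [("Option 1 (assumed 80% of RBs)", 0.8),
                           ("Option 2", 0.0),
                           ("Option 3", 1.0)]:
        print(f"{name}: {phy_throughput_with_sib(100.0, fraction):.1f} Mbps")
```

With a 100 Mbps maximum, Option 2 alone costs about 5% of the PHY throughput under these assumptions, which already explains a good part of the gap between the spec value and what you measure.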

 

In conclusion, whatever method you use, you will not get 100% of the value specified in the 3GPP specification. In my personal opinion, it is OK if you can achieve around 90% of the ideal max throughput without much packet error. (If the difference between the test throughput and the ideal throughput is due to packet errors rather than overhead, you'd better investigate further to find the root cause of the problem.)

 

 

 

Milestones in the history of throughput evolution

 

I've been involved in throughput testing for cellular devices since UMTS HSPA, and I have seen stumbling blocks at almost every step of the evolution. These stumbling blocks come not only from the DUT (cellular device) but also from the other components that constitute the test system. In this section, I will list some of the milestones (stepping stones) that I've gone through. Some of the items list

I will add some troubleshooting tips for each of the milestones, but they may not be a direct solution to the problem you are facing, since so many factors get involved in the data path and the root cause may differ from problem to problem even when the symptoms look similar. However, the factors that I am listing here should at least be worth considering in your own troubleshooting process.

 

 

< Category 3 : 100 Mbps >

 

This was around 6 or 7 years ago (around 2010/2011). Nobody would think of this as a big problem these days, and they would classify it as a very low throughput case. However, when LTE first came out, this was pushing the limit not only of the DUT (cellular protocol) but also of many other parts as well. The following are some of the factors that would cause issues.

  • Stability of the radio link at MCS 23 with 2x2 MIMO. Since it was the early stage of LTE deployment, it was not easy for an LTE mobile phone (or test equipment) to maintain stable radio quality at MCS 23. Usually this kind of issue requires a lot of DL power (sometimes UL power as well) tuning, cable checks and, in the worst case, a UE / equipment firmware upgrade.
  • At that time, the most common Network Interface Card (NIC) in a PC was still 10/100 Mbps. 1 Gbps NICs were available, but not every PC had one. This means the required throughput would really hit the physical limit of the network card of the server PC. Of course, this problem can be easily resolved by using a PC with a 1 Gbps NIC.
  • Just using a 1 Gbps NIC does not automatically resolve the bottleneck of the IP data path. The Ethernet cable often caused problems. The most common type of Ethernet cable at that time was Category 5, which is rated for 100 Mbps. So unless you use a very good quality Category 5 cable, the throughput would be clipped by the cable as well.
  • If you are using any Ethernet hub or switch supporting only up to 10/100 Mbps, you have to check whether it really supports the required throughput, or replace it with one supporting 1 Gbps.
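
The Ethernet items above all come down to the same check: no component in the IP data path may be rated below the throughput you want to push through it. A minimal sketch of that check follows. The component names and ratings are hypothetical examples, and the 0.95 usable fraction is my own rough assumption for framing overhead, not a measured value.

```python
def limiting_components(path_capacities_mbps, required_mbps, usable_fraction=0.95):
    """Return (name, rated_mbps) for each component that cannot carry required_mbps.

    usable_fraction is an assumed share of the rated speed that is actually
    available to IP traffic.
    """
    return [(name, rated)
            for name, rated in path_capacities_mbps.items()
            if rated * usable_fraction < required_mbps]


if __name__ == "__main__":
    # Hypothetical test setup for a Cat 3 (100 Mbps) test.
    path = {
        "server NIC": 1000,
        "Ethernet cable (Cat 5)": 100,
        "Ethernet switch": 100,
        "test equipment data port": 1000,
    }
    for name, rated in limiting_components(path, required_mbps=100):
        print(f"{name} (rated {rated} Mbps) may clip a 100 Mbps test")
```

In this example the 1 Gbps NIC does not help at all, because the cable and the switch are still flagged.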

 

 

< Category 4 : 150 Mbps >

 

  • Stability of the radio link at MCS 28 with 2x2 MIMO. This is the max MCS applicable to 64 QAM, with a very high code rate, so achieving a stable radio link was challenging.

  • Since this is obviously beyond the capability of 10/100 Mbps Ethernet, you need to upgrade every Ethernet component (e.g, switch, Network Interface Card etc).

 

 

< Category 9 : 450 Mbps >

 

  • Stability of the radio link at MCS 28 with 2x2 MIMO and 3CC CA. This is no problem at all by the current standard (as of Jun 2018), but maintaining a stable radio link with 3CC CA was challenging at the initial phase.

  • Everything should be Gigabit Ethernet. Gigabit Ethernet was pretty common on most PCs by this time.
  • This often caused problems on the UE side when testing with tethering. This is almost the ideal max throughput of USB 2.0 (480 Mbps). So unless you use a very high quality USB 2.0 connection and a well written driver, the throughput would be bottlenecked by USB.

 

 

 

< Category 16 : 1 Gbps >

  • Stability of the radio link at DL 256QAM with 4x4 MIMO and carrier aggregation. You need to make sure that 256 QAM decoding on the UE side goes well and guarantee a very high SNR for each and every antenna of the 4x4 setup. With 4x4 antennas, you would need 28 dB or higher SNR on all 4 antennas.
  • This is also the ideal max of Gigabit Ethernet. So even though your PC, Ethernet cable and switch claim to support Gigabit Ethernet, it is safer to double-check the real performance of those components.
  • If your test equipment and server network card support jumbo frames, it would be worth trying them.
  • If you use ftp as the throughput server, you would need an ftp server that supports multiple downloads at the same time.
  • If you use iperf, you may need to choose the version of iperf carefully. In my experience, it was possible to achieve only 500 Mbps with v2.0.5, but around 950 Mbps with v2.0.8 or v3.x.x. However, specific versions may give you somewhat different results depending on the test setup.
  • If you are doing the throughput test with tethering, make sure that you are using USB v3.0 and that the USB driver is well written enough to support the throughput.
  • If you are testing with a special App on the UE (not tethering), you have to make sure that the App can handle enough throughput (e.g, by supporting multiple simultaneous download streams).
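
The figure of around 950 Mbps and the benefit of jumbo frames can both be seen from the Ethernet framing arithmetic. The sketch below assumes TCP over IPv4 with no header options and full-size frames; the per-frame overhead of 38 bytes is the preamble and start delimiter (8), Ethernet header (14), FCS (4) and inter-frame gap (12). The function name is just for illustration.

```python
ETH_OVERHEAD_BYTES = 38   # preamble + SFD 8, header 14, FCS 4, inter-frame gap 12
IP_TCP_HEADER_BYTES = 40  # IPv4 20 + TCP 20, assuming no options


def tcp_goodput_mbps(line_rate_mbps, mtu_bytes=1500):
    """Best-case TCP payload rate on an Ethernet link with full-size frames."""
    payload_bytes = mtu_bytes - IP_TCP_HEADER_BYTES
    on_wire_bytes = mtu_bytes + ETH_OVERHEAD_BYTES
    return line_rate_mbps * payload_bytes / on_wire_bytes


if __name__ == "__main__":
    print(f"MTU 1500: {tcp_goodput_mbps(1000, 1500):.0f} Mbps")
    print(f"MTU 9000: {tcp_goodput_mbps(1000, 9000):.0f} Mbps")
```

With the standard 1500 byte MTU, the best case on Gigabit Ethernet is about 949 Mbps of TCP payload, and with a 9000 byte jumbo frame it rises to about 991 Mbps. So a 1 Gbps physical layer throughput cannot fully pass through a single Gigabit Ethernet port with standard frames, however good the components are.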

 

 

< Category 18, 19, 20 : 1.2, 1.6, 2.0 Gbps >

  • Stability of the radio link at DL 256QAM with 4x4 MIMO and CA of 4 or more CCs (16 layers or more). It would not be easy to maintain good radio link quality with this configuration.
  • The biggest challenge would be the Ethernet ports on the test equipment and the server PC. Most of them support only 1 Gbps, which was good enough when LTE first came out.
  • If you are testing this with USB tethering, you have to make sure that your USB is 3.x and supports enough throughput. Even though the USB 3.x specification defines a max of 5 Gbps, real implementations would not reach this ideal throughput.
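
One reason the 5 Gbps figure is never reached is visible even before any driver or protocol overhead: 5 Gbps USB 3.x signalling uses 8b/10b line coding, which puts 10 bits on the wire for every 8 bits of data. The sketch below only covers that coding step; anything beyond it (protocol overhead, driver and host controller efficiency) is not modelled, and the function names are illustrative.

```python
def usb3_5g_data_ceiling_gbps(signalling_gbps=5.0):
    """Upper bound on data rate after 8b/10b coding (10 line bits per 8 data bits)."""
    return signalling_gbps * 8 / 10


def headroom_ratio(required_gbps, signalling_gbps=5.0):
    """How many times the required throughput fits under the coding ceiling."""
    return usb3_5g_data_ceiling_gbps(signalling_gbps) / required_gbps


if __name__ == "__main__":
    print(f"Ceiling after 8b/10b: {usb3_5g_data_ceiling_gbps():.1f} Gbps")
    for category, gbps in [("Cat 18", 1.2), ("Cat 19", 1.6), ("Cat 20", 2.0)]:
        print(f"{category}: headroom x{headroom_ratio(gbps):.2f}")
```

So the real ceiling is 4 Gbps at best, and for a 2.0 Gbps device the USB link has to run at half of that ceiling or better, which is why the quality of the driver matters so much.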

 

 

 

< 5G/NR >

  • For most readers, it may sound too early to talk about 5G/NR throughput as of now (Jul 2018), but it will be something most readers will be struggling with in about a year.
  • In this technology, you need to make at least 2 Gbps work from day 1. So you would need to use 10 Gb Ethernet from the beginning, and should consider trying 40 Gbps Ethernet a couple of years down the road.