Ido Schimmel edited this page Oct 22, 2018 · 7 revisions
Table of Contents
  1. Qdisc
    1. Qdisc Commands
      1. Show and Statistics
  2. PRIO
    1. Offloading PRIO
    2. PRIO Statistics
    3. Grafting to PRIO
  3. RED
    1. About RED
    2. Offloading RED
    3. RED Statistics
  4. Further Resources

Qdisc

Traffic control in Linux is managed by the TC subsystem. Documentation can be found in the TC man page and in the other references listed under Further Resources.

Qdisc Commands

A qdisc command has the following structure:

$ tc <flags> qdisc <command> dev <dev name> [root | parent <parent ID>] [handle <handle ID>] <qdisc type> [<qdisc params>]

Handle ID is built from two 16-bit numbers of the form MAJOR:MINOR. Any qdisc created by a user has its minor set to 0 and can be referred to as MAJOR:. If a handle ID is not specified, a new handle ID is allocated.
Parent ID identifies where the qdisc is attached. A qdisc can be set as a root qdisc or as a child of another qdisc. To set a qdisc as child X of the qdisc with handle Y, set its parent ID to Y:X.
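
The MAJOR:MINOR layout above can be sketched as a small shell helper that packs a handle into a single 32-bit value (major in the upper 16 bits, minor in the lower 16). The function name tc_handle_u32 is hypothetical, and the sketch assumes that both halves are written in hexadecimal, which is how tc reads them.

```shell
# Hypothetical helper: pack a handle "MAJOR:MINOR" into one 32-bit value.
# Assumption: both halves are hexadecimal and the argument contains a colon.
tc_handle_u32() {
    major=${1%%:*}    # text before the colon
    minor=${1#*:}     # text after the colon (empty for "3:")
    printf '0x%08x\n' $(( (0x${major:-0} << 16) | 0x${minor:-0} ))
}

tc_handle_u32 3:      # prints 0x00030000
tc_handle_u32 10:1    # prints 0x00100001
```

A handle written as 10: therefore has major 0x10 (16 decimal), not 10.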

The operation of linking two qdiscs is called grafting.

Show and Statistics

To see the qdiscs configured on a port, run the show command. Adding the -s flag displays the statistics as well:

$ tc -s qdisc show dev sw1p1

The offloaded flag denotes whether the qdisc is offloaded or not.
The basic statistics are:

  • Sent - Packets and bytes count.
  • dropped - The number of packets that were dropped by this queue.
  • overlimits - Qdisc specific.
  • requeues - Qdisc specific.
  • backlog - The queue's size in bytes and packets. Only the byte count reflects the hardware status.

Some qdiscs might include extra statistics. If the qdisc is offloaded, the statistics reflect both the hardware and software statistics.

For full documentation see the TC man page.

PRIO

The PRIO scheduler sends traffic to its child qdiscs, called bands, according to a mapping from packet priority to band number. For details on how priority is assigned to packets, see Quality of Service.

PRIO enforces strict priority between its bands. Packets are sent from band X only if there are no packets enqueued in bands 0..(X-1). For a full description, refer to the Further Resources section.
It is possible to set a qdisc on each band. Currently, only RED can be offloaded as a child.

Offloading PRIO

PRIO parameters:

  • bands (optional) - Number of child queues. Offloading is supported for up to 8 bands. With a higher number of bands, the qdisc is not offloaded.
  • priomap (optional) - Mapping of a packet's priority to a band.
    Offload is only supported for priorities 0-7. Higher priorities are ignored.

Note: PRIO is only supported on top of physical ports and only as a root qdisc.
Note: Configuring both DCB and PRIO on the same port is not supported.

Example:

$ tc qdisc replace dev sw1p1 root handle 3: prio bands 8 priomap 0 1 2 3 4 5 6 7
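
To illustrate how a priomap is read, the following sketch looks up the band for a given priority: the position in the list is the packet priority and the value at that position is the band. The helper name prio_band is hypothetical and is not part of tc.

```shell
# Hypothetical helper: return the band that a priority maps to.
# Usage: prio_band PRIORITY BAND_FOR_PRIO0 BAND_FOR_PRIO1 ...
prio_band() {
    prio=$1; shift
    i=0
    for band in "$@"; do
        if [ "$i" -eq "$prio" ]; then
            echo "$band"
            return 0
        fi
        i=$((i + 1))
    done
    return 1    # priority is not covered by the given list
}

# With the priomap from the example above, priority 5 goes to band 5.
prio_band 5 0 1 2 3 4 5 6 7    # prints 5
```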

PRIO Statistics

The show command will show the current configuration, including the full priomap.

$ tc -s qdisc show dev sw1p1
qdisc prio 3: root refcnt 2 offloaded bands 8 priomap  0 1 2 3 4 5 6 7 1 1 1 1 1 1 1 1
 Sent 30510403042 bytes 20289261 pkt (dropped 5199870, overlimits 0 requeues 0)
 backlog 222720b 0p requeues 0
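
For scripting, the drop and backlog counters can be pulled out of this output with awk. The following is a minimal sketch, assuming the text layout shown above (in particular, a backlog printed as a plain byte count followed by "b"). The function name qdisc_summary is hypothetical.

```shell
# Hypothetical helper: summarize `tc -s qdisc show` text read from stdin.
# Prints one line per qdisc: kind, handle, dropped packets, backlog in bytes.
qdisc_summary() {
    awk '
        $1 == "qdisc"   { kind = $2; handle = $3 }
        $1 == "Sent"    { d = $7; sub(/,/, "", d) }      # field is "5199870,"
        $1 == "backlog" { b = $2; sub(/b$/, "", b)       # field is "222720b"
                          print kind, handle, "dropped=" d, "backlog_bytes=" b }
    '
}

# Real usage would be: tc -s qdisc show dev sw1p1 | qdisc_summary
qdisc_summary <<'EOF'
qdisc prio 3: root refcnt 2 offloaded bands 8 priomap  0 1 2 3 4 5 6 7 1 1 1 1 1 1 1 1
 Sent 30510403042 bytes 20289261 pkt (dropped 5199870, overlimits 0 requeues 0)
 backlog 222720b 0p requeues 0
EOF
# prints: prio 3: dropped=5199870 backlog_bytes=222720
```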

Children of PRIO are not displayed unless they are configured with a qdisc. Adding the word invisible to the show command will show these children, but not their offloaded counter queue.

The statistics represent the sum of the statistics of all the bands. If there is RED on any of its children, the child drops are counted by the parent as well.
When using PRIO, lower priority bands can be starved as long as there are packets enqueued in higher priority bands. In this situation, packets might be dropped due to timeouts. These drops are not counted as PRIO drops.

To see the backlog of each PRIO band, use the following ethtool command:

$ ethtool -S sw1p1 | grep tc_transmit_queue_tc
	tc_transmit_queue_tc_0: 0
	tc_transmit_queue_tc_1: 0
	tc_transmit_queue_tc_2: 0
	tc_transmit_queue_tc_3: 0
	tc_transmit_queue_tc_4: 0
	tc_transmit_queue_tc_5: 0
	tc_transmit_queue_tc_6: 0
	tc_transmit_queue_tc_7: 0

Note: These traffic classes correspond to PRIO bands according to the following formula: TC = 7 - PRIO band.
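
The formula can be wrapped in a small helper that returns the ethtool counter name for a given band. This is a sketch only. The names band_to_tc and band_counter are hypothetical, and the counter prefix is taken from the output above.

```shell
# Hypothetical helpers implementing TC = 7 - PRIO band.
band_to_tc() {
    echo $((7 - $1))
}

band_counter() {
    echo "tc_transmit_queue_tc_$(band_to_tc "$1")"
}

band_counter 0    # prints tc_transmit_queue_tc_7
band_counter 7    # prints tc_transmit_queue_tc_0

# Real usage would be: ethtool -S sw1p1 | grep "$(band_counter 0):"
```

Band 0, the highest priority band, therefore corresponds to traffic class 7.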

Grafting to PRIO

By default, every PRIO band has an invisible FIFO qdisc. Invisible means that it has a handle of zero, cannot be removed, and appears in the show command only when the optional parameter invisible is passed. Note that (almost) all qdiscs have this invisible qdisc, not only PRIO.
It is possible to graft another qdisc to each band. The parent ID for setting a child on a PRIO qdisc with handle X on band number Y is X:(Y+1).
For example, if we set PRIO as root with handle 3:

$ tc qdisc replace dev sw1p1 root handle 3: prio

To set RED on band 0, we set the parent to 3:1:

$ tc qdisc add dev sw1p1 parent 3:1 handle 4: red limit 1000000 avpkt 10000
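
The X:(Y+1) rule can be sketched as a helper that prints the parent ID for a band. The name prio_parent is hypothetical. The minor is printed in hexadecimal on the assumption that tc reads it that way. For the 8 bands that can be offloaded (minors 1 to 8), hexadecimal and decimal are identical.

```shell
# Hypothetical helper: parent ID for band Y of the PRIO qdisc with handle X.
# Usage: prio_parent HANDLE BAND   (HANDLE may be given as "3" or "3:")
prio_parent() {
    printf '%s:%x\n' "${1%:}" $(( $2 + 1 ))
}

prio_parent 3 0     # prints 3:1
prio_parent 3: 7    # prints 3:8

# Real usage would be:
#   tc qdisc add dev sw1p1 parent "$(prio_parent 3 0)" handle 4: red limit 1000000 avpkt 10000
```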

If PRIO is offloaded and its child can be offloaded as a child of PRIO, the child is offloaded automatically.
The operation of connecting a child to a parent is called grafting.
A qdisc can be relocated, or be set to have more than one parent. Such changes are not allowed on an offloaded qdisc, and cause the offloading of the qdisc to stop.

Changing the priomap of PRIO when there are offloaded qdiscs on its bands is possible but not recommended. Every child qdisc currently set on a changed band loses the hardware statistics accumulated between the last time it was accessed (for reconfiguration or query) and the priomap change.

Note: Deleting a child of PRIO does not set the software queue back to the default FIFO. It disables the queue instead. This does not affect the hardware queue, which is set back to its default state.

RED

About RED

RED is a queueing discipline designed for congestion control.
It drops packets based on statistical probabilities. The probability to drop a packet is zero until the queue's average size reaches the minimum limit. From there, the probability rises linearly until it reaches the maximum drop probability, when the queue's average size reaches the maximum limit. When the queue's average size is above the maximum limit, the probability to drop a packet is 1 (see figure below).
These drops are called early drops.

figure 1
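
The drop curve described above can be written as a small function. This is a sketch of the description only, not of the kernel or hardware implementation. The name red_drop_prob is hypothetical, and min is assumed to be smaller than max.

```shell
# Hypothetical helper: RED drop probability for an average queue size.
# Usage: red_drop_prob AVG MIN MAX MAX_PROBABILITY   (assumes MIN < MAX)
red_drop_prob() {
    awk -v avg="$1" -v min="$2" -v max="$3" -v maxp="$4" 'BEGIN {
        if (avg < min)       p = 0                                     # below min: never drop
        else if (avg <= max) p = maxp * (avg - min) / (max - min)      # linear rise
        else                 p = 1                                     # above max: always drop
        printf "%.4f\n", p
    }'
}

red_drop_prob  400000 500000 1500000 0.1    # prints 0.0000
red_drop_prob 1000000 500000 1500000 0.1    # prints 0.0500
red_drop_prob 1500000 500000 1500000 0.1    # prints 0.1000
red_drop_prob 2000000 500000 1500000 0.1    # prints 1.0000
```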

The RED qdisc has the option to mark packets with the ECN flag instead of dropping them.
This mode only affects packets which indicate that their hosts honor ECN. When set, the queue's size might reach its maximum size. In this case, packets that cannot be enqueued will be tail dropped.

Offloading RED

RED parameters:

  • limit – Hard limit for the queue's size. Not offloaded.
  • avpkt – Average queue size calculation parameter. Not offloaded. 1000 is recommended.
  • min (optional) – The minimum limit.
  • max (optional) – The maximum limit.
  • probability (optional) – The probability to drop a packet when the average queue size is at maximum limit.
  • burst (optional) – Allowed burst size. Not offloaded.
  • bandwidth (optional) – Used for average queue size calculation (and not to enforce queue's bandwidth). Not offloaded.
  • ecn (optional) – If set, indicates ECN mode is on.
  • harddrop (optional) – If set, when the queue's average size is above the maximum limit, packets will be dropped even if they are ECN enabled and ECN mode is on. Not offloaded.

When offloading RED it is recommended to specify min, max and probability and not rely on the default values.

Note: RED is only supported on top of physical ports and only as a root qdisc or as a child of PRIO.
Note: Configuring both DCB and RED on the same port is not supported.

Example:

$ tc qdisc add dev sw1p1 root handle 4: red limit 1000000 avpkt 10000 probability 0.1 min 500000 max 1500000

RED Statistics

The show command will show the current configuration including RED's statistics.

$ tc -s qdisc show dev sw1p1
qdisc red 4: root refcnt 2 limit 1000000b min 500000b max 15782272b offload
 Sent 9962 bytes 29 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  marked 0 early 0 pdrop 0 other 0

Notes about RED statistics:

  • dropped - A packet can be dropped either by early drop or tail drop.
  • overlimits - The number of packets that were early dropped or ECN marked.
  • marked - The number of packets that were ECN marked.
  • early - The number of packets that were early dropped.
  • pdrop - The number of packets that were tail dropped.
  • other - The number of packets that were dropped for other reasons. Not in use.
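
Going by the notes above, overlimits should equal marked plus early. The following sketch extracts the RED counters from the show output and checks that relation. The function name red_counters is hypothetical, and the field positions assume the text layout shown above.

```shell
# Hypothetical helper: print RED counters from `tc -s qdisc show` text on stdin
# and report whether overlimits == marked + early, as the notes above imply.
red_counters() {
    awk '
        $1 == "Sent"   { over = $9 }
        $1 == "marked" { print "overlimits=" over, "marked=" $2, "early=" $4, "pdrop=" $6, (over == $2 + $4 ? "consistent" : "mismatch") }
    '
}

# Real usage would be: tc -s qdisc show dev sw1p1 | red_counters
red_counters <<'EOF'
qdisc red 4: root refcnt 2 limit 1000000b min 500000b max 15782272b offload
 Sent 9962 bytes 29 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  marked 0 early 0 pdrop 0 other 0
EOF
# prints: overlimits=0 marked=0 early=0 pdrop=0 consistent
```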

Further Resources

  1. man tc
  2. man tc-red
  3. man tc-prio
  4. Traffic Control HOWTO