Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
Copyright (c) 2020 Marvell International Ltd.
Contents:

- Overview
- Drivers
- Basic packet flow
- Devlink health reporters
- Quality of service
- RVU representors

The resource virtualization unit (RVU) on Marvell's OcteonTX2 SoC maps HW resources from the network, crypto, and other functional blocks into PCI-compatible physical and virtual functions. Each functional block in turn has multiple local functions (LFs) for provisioning to PCI devices. RVU supports multiple PCIe SR-IOV physical functions (PFs) and virtual functions (VFs). PF0 is called the administrative function (AF) and has the privileges to provision the LFs of RVU functional blocks to each PF/VF.
RVU managed networking functional blocks
RVU managed non-networking functional blocks
Resource provisioning examples
RVU functional blocks are highly configurable as per software requirements.
Firmware sets up the following before the kernel boots
The Linux kernel has multiple drivers registering to different PFs and VFs of RVU. With respect to networking, there are three flavours of drivers.
As mentioned above, RVU PF0 is called the admin function (AF). This driver supports resource provisioning and configuration of functional blocks; it does not handle any I/O. It sets up a few basic things, but most of the functionality is achieved via configuration requests from PFs and VFs.
PFs/VFs communicate with the AF via a shared memory region (mailbox). Upon receiving requests, the AF does resource provisioning and other HW configuration. The AF is always attached to the host kernel, but PFs and their VFs may be used by the host kernel itself, or attached to VMs or to userspace applications such as DPDK. So the AF has to handle provisioning/configuration requests sent by any device from any domain.
AF driver also interacts with underlying firmware to
From a pure networking side, the AF driver supports the following functionality.
This RVU PF handles I/O, is mapped to a physical Ethernet link, and its driver registers a netdev. It supports SR-IOV. As noted above, this driver communicates with the AF through a mailbox. To retrieve information about physical links, the driver asks the AF; the AF gets that information from firmware and responds, i.e., the PF driver cannot talk to firmware directly.
The driver supports ethtool for configuring links, RSS, queue count, queue size, flow control, and ntuple filters, for dumping the PHY EEPROM, for configuring FEC, etc.
There are two types of VFs: VFs that share the physical link with their parent SR-IOV PF, and VFs that work in pairs using internal HW loopback channels (LBK).
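The operations listed above map onto standard ethtool options. The following is a minimal sketch, not taken from the driver itself: the interface name ``eth0``, the queue counts, ring sizes, and the ntuple rule are placeholder assumptions, and whether a given option is accepted depends on the driver and firmware in use. The ``run`` helper is hypothetical; it prints each command and executes it only when ``APPLY=1`` is set.

```shell
# Illustrative only: IFACE and all numeric values below are assumptions.
IFACE="${IFACE:-eth0}"

# Print each command; execute it only when APPLY=1 (needs root and the device).
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

configure_pf() {
    run ethtool -L "$IFACE" rx 4 tx 4            # queue count
    run ethtool -G "$IFACE" rx 1024 tx 1024      # queue (ring) size
    run ethtool -X "$IFACE" equal 4              # spread RSS over 4 Rx queues
    run ethtool -A "$IFACE" rx on tx on          # flow control (pause frames)
    run ethtool -K "$IFACE" ntuple on            # enable ntuple filters
    run ethtool -N "$IFACE" flow-type tcp4 dst-port 80 action 2
    run ethtool --set-fec "$IFACE" encoding rs   # FEC mode
    run ethtool -m "$IFACE"                      # dump module (PHY) EEPROM
}

configure_pf
```

Running the script as-is only lists the commands, which makes it easy to review them before applying anything to a live interface.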
Type1:
Type2:
Except for the I/O channels or links used for packet reception and transmission, there is no other difference between these VF types. The AF driver takes care of I/O channel mapping, hence the same VF driver works for both types of devices.
The NPA reporters are responsible for reporting and recovering the following group of errors:

1. GENERAL events
2. ERROR events
3. RAS events
4. RVU events

Sample Output::

        ~# devlink health
        pci/0002:01:00.0:
          reporter hw_npa_intr
              state healthy error 2872 recover 2872 last_dump_date 2020-12-10 last_dump_time 09:39:09 grace_period 0 auto_recover true auto_dump true
          reporter hw_npa_gen
              state healthy error 2872 recover 2872 last_dump_date 2020-12-11 last_dump_time 04:43:04 grace_period 0 auto_recover true auto_dump true
          reporter hw_npa_err
              state healthy error 2871 recover 2871 last_dump_date 2020-12-10 last_dump_time 09:39:17 grace_period 0 auto_recover true auto_dump true
          reporter hw_npa_ras
              state healthy error 0 recover 0 last_dump_date 2020-12-10 last_dump_time 09:32:40 grace_period 0 auto_recover true auto_dump true

Each reporter dumps the
For example::

        ~# devlink health dump show pci/0002:01:00.0 reporter hw_npa_gen
         NPA_AF_GENERAL:
                 NPA General Interrupt Reg : 1
                 NIX0: free disabled RX
        ~# devlink health dump show pci/0002:01:00.0 reporter hw_npa_intr
         NPA_AF_RVU:
                 NPA RVU Interrupt Reg : 1
                 Unmap Slot Error
        ~# devlink health dump show pci/0002:01:00.0 reporter hw_npa_err
         NPA_AF_ERR:
                NPA Error Interrupt Reg : 4096
                AQ Doorbell Error

The NIX reporters are responsible for reporting and recovering the following group of errors:

1. GENERAL events
2. ERROR events
3. RAS events
4. RVU events

Sample Output::

        ~# ./devlink health
        pci/0002:01:00.0:
          reporter hw_npa_intr
            state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
          reporter hw_npa_gen
            state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
          reporter hw_npa_err
            state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
          reporter hw_npa_ras
            state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
          reporter hw_nix_intr
            state healthy error 1121 recover 1121 last_dump_date 2021-01-19 last_dump_time 05:42:26 grace_period 0 auto_recover true auto_dump true
          reporter hw_nix_gen
            state healthy error 949 recover 949 last_dump_date 2021-01-19 last_dump_time 05:42:43 grace_period 0 auto_recover true auto_dump true
          reporter hw_nix_err
            state healthy error 1147 recover 1147 last_dump_date 2021-01-19 last_dump_time 05:42:59 grace_period 0 auto_recover true auto_dump true
          reporter hw_nix_ras
            state healthy error 409 recover 409 last_dump_date 2021-01-19 last_dump_time 05:43:16 grace_period 0 auto_recover true auto_dump true

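With eight reporters on one device, the ``devlink health`` listing is easier to scan once it is reduced to one line per reporter. The following is a minimal sketch that assumes the text layout shown in the sample output above; the function name ``summarize_health`` is hypothetical and not part of devlink.

```shell
# Read `devlink health` text on stdin and print one line per reporter:
#   <reporter> <error count> <recover count>
summarize_health() {
    awk '
        $1 == "reporter" { name = $2; next }
        $1 == "state" {
            err = rec = "?"
            # Start at field 3 so the state value itself is never mistaken for a key.
            for (i = 3; i < NF; i++) {
                if ($i == "error")   err = $(i + 1)
                if ($i == "recover") rec = $(i + 1)
            }
            print name, err, rec
        }
    '
}

# Demo on two reporters copied from the sample output above.
summarize_health <<'EOF'
pci/0002:01:00.0:
  reporter hw_npa_ras
      state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
  reporter hw_nix_gen
      state healthy error 949 recover 949 last_dump_date 2021-01-19 last_dump_time 05:42:43 grace_period 0 auto_recover true auto_dump true
EOF
```

The demo prints ``hw_npa_ras 0 0`` and ``hw_nix_gen 949 949``. On a live system the same function can be fed directly, for example with ``devlink health | summarize_health``.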
Each reporter dumps the
For example::

        ~# devlink health dump show pci/0002:01:00.0 reporter hw_nix_intr
         NIX_AF_RVU:
                NIX RVU Interrupt Reg : 1
                Unmap Slot Error
        ~# devlink health dump show pci/0002:01:00.0 reporter hw_nix_gen
         NIX_AF_GENERAL:
                NIX General Interrupt Reg : 1
                Rx multicast pkt drop
        ~# devlink health dump show pci/0002:01:00.0 reporter hw_nix_err
         NIX_AF_ERR:
                NIX Error Interrupt Reg : 64
                Rx on unmapped PF_FUNC

OcteonTX2 silicon and the CN10K transmit interface consist of five transmit levels, starting from SMQ/MDQ and continuing from TL4 to TL1. Each packet traverses the MDQ and the TL4 to TL1 levels. Each level contains an array of queues to support scheduling and shaping. Once the user creates tc classes with different priorities, the driver configures the schedulers allocated to each class with the specified priority along with the rate-limiting configuration. The hardware uses the following algorithms, depending on the priority of the scheduler queues:

1. Strict Priority
2. Round Robin

Enable HW TC offload on the interface::

        # ethtool -K <interface> hw-tc-offload on

Create htb root::

        # tc qdisc add dev <interface> clsact
        # tc qdisc replace dev <interface> root handle 1: htb offload

Create tc classes with different priorities::

        # tc class add dev <interface> parent 1: classid 1:1 htb rate 10Gbit prio 1
        # tc class add dev <interface> parent 1: classid 1:2 htb rate 10Gbit prio 7

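Classes only affect traffic that is steered into them. One common way to do this with offloaded HTB is to set the packet priority to the class ID from the ``clsact`` egress hook. The sketch below assumes that approach and is not taken from the driver; the interface name, the TCP port matches, and the ``run`` helper (which prints each command and executes it only when ``APPLY=1`` is set) are illustrative assumptions.

```shell
# Illustrative only: IFACE and the port numbers are assumptions.
IFACE="${IFACE:-eth0}"

# Print each command; execute it only when APPLY=1 (needs root and the device).
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

# $1: TCP destination port to match, $2: HTB class ID (for example 1:1)
steer_to_class() {
    run tc filter add dev "$IFACE" egress protocol ip flower \
        ip_proto tcp dst_port "$1" action skbedit priority "$2"
}

steer_to_class 5001 1:1             # traffic for the prio 1 class
steer_to_class 5002 1:2             # traffic for the prio 7 class
run tc -s class show dev "$IFACE"   # per-class statistics
```

Checking the per-class counters after sending traffic is a quick way to confirm that packets are landing in the intended class.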
Create tc classes with the same priority and different quantum::

        # tc class add dev <interface> parent 1: classid 1:1 htb rate 10Gbit prio 2 quantum 409600
        # tc class add dev <interface> parent 1: classid 1:2 htb rate 10Gbit prio 2 quantum 188416
        # tc class add dev <interface> parent 1: classid 1:3 htb rate 10Gbit prio 2 quantum 32768

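Classes that share a priority are scheduled round robin, and the quantum sets the relative weight of each class. Assuming bandwidth under contention is divided in proportion to the quantum (the usual deficit round robin behaviour; the exact hardware weighting is not stated here), the expected split can be estimated with a small helper. The function name ``quantum_shares`` is hypothetical.

```shell
# Print the proportional share of each quantum given as an argument.
quantum_shares() {
    [ "$#" -gt 0 ] || return 1
    printf '%s\n' "$@" | awk '
        { q[NR] = $1; total += $1 }
        END {
            for (i = 1; i <= NR; i++)
                printf "class %d: quantum %d, share %.1f%%\n", i, q[i], 100 * q[i] / total
        }
    '
}

# Quantum values from the three classes created above.
quantum_shares 409600 188416 32768
```

For the quantum values above this prints shares of 64.9%, 29.9%, and 5.2%.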
The RVU representor driver adds support for creating representor devices for the VFs of RVU PFs in the system. Representor devices are created when the user enables switchdev mode. Switchdev mode can be enabled either before or after setting up SR-IOV numVFs. All representor devices share a single NIXLF, but each has dedicated Rx/Tx queues. The RVU PF representor driver registers a separate netdev for each Rx/Tx queue pair.
Current HW does not have a built-in switch that can do L2 learning and forward packets between a representee and its representor. Hence, the packet path between a representee and its representor is set up by installing appropriate NPC MCAM filters. Transmit packets matching these filters are looped back through the hardware loopback channel/interface (i.e., instead of being sent out of the MAC interface), where they match the installed filters again and are forwarded. This way, both the representee => representor and representor => representee packet paths are achieved. These rules are installed when representors are created and are activated or deactivated based on the representor/representee interface state.
Usage example:
Change device to switchdev mode::
List of representor devices on the system::

        Rpf1vf0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether f6:43:83:ee:26:21 brd ff:ff:ff:ff:ff:ff
        Rpf1vf1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 12:b2:54:0e:24:54 brd ff:ff:ff:ff:ff:ff
        Rpf1vf2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 4a:12:c4:4c:32:62 brd ff:ff:ff:ff:ff:ff
        Rpf1vf3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether ca:cb:68:0e:e2:6e brd ff:ff:ff:ff:ff:ff
        Rpf2vf0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 06:cc:ad:b4:f0:93 brd ff:ff:ff:ff:ff:ff

To delete the representor devices from the system, change the device to legacy mode.
Change device to legacy mode::
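The mode changes described above can be sketched with the standard ``devlink dev eswitch`` commands. This is a hedged illustration, not the original example: the device handle ``pci/0002:1c:00.0`` is assumed from the devlink port listing in this document, and the ``run`` helper is hypothetical; it prints each command and executes it only when ``APPLY=1`` is set.

```shell
# Assumption: devlink handle of the device that owns the representors.
DEV="${DEV:-pci/0002:1c:00.0}"

# Print each command; execute it only when APPLY=1 (needs root and the device).
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

# $1: eswitch mode, either "switchdev" or "legacy"
eswitch_mode() { run devlink dev eswitch set "$DEV" mode "$1"; }

eswitch_mode switchdev                 # creates the representor netdevs
run devlink dev eswitch show "$DEV"    # confirm the current mode
run ip link show                       # representors appear as Rpf<N>vf<M>
eswitch_mode legacy                    # deletes the representor netdevs
```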
RVU representors can be managed using the devlink ports interface
(see :ref:`Documentation/networking/devlink/devlink-port.rst <devlink_port>`).
Show devlink ports of representors::

        pci/0002:1c:00.0/0: type eth netdev Rpf1vf0 flavour physical port 0 splittable false
        pci/0002:1c:00.0/1: type eth netdev Rpf1vf1 flavour pcivf controller 0 pfnum 1 vfnum 1 external false splittable false
        pci/0002:1c:00.0/2: type eth netdev Rpf1vf2 flavour pcivf controller 0 pfnum 1 vfnum 2 external false splittable false
        pci/0002:1c:00.0/3: type eth netdev Rpf1vf3 flavour pcivf controller 0 pfnum 1 vfnum 3 external false splittable false

The RVU representor driver supports function attributes for representors. Port function configuration of the representors is supported through the devlink eswitch port.
The RVU representor driver supports the devlink port function attribute mechanism to set up the MAC address (refer to Documentation/networking/devlink/devlink-port.rst).
To set up the MAC address for port 2::

        pci/0002:1c:00.0/2: type eth netdev Rpf1vf2 flavour pcivf controller 0 pfnum 1 vfnum 2 external false splittable false
          function:
            hw_addr 5c:a1:1b:5e:43:11

The RVU representor driver implements support for offloading tc rules using port representors.
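Output like the above is what ``devlink port show`` reports after a hardware address has been assigned with ``devlink port function set``. The following is a minimal sketch using the port handle and MAC address from the output above; the ``run`` helper is hypothetical and only prints the commands unless ``APPLY=1`` is set.

```shell
# Values taken from the sample output above.
PORT="${PORT:-pci/0002:1c:00.0/2}"
MAC="${MAC:-5c:a1:1b:5e:43:11}"

# Print each command; execute it only when APPLY=1 (needs root and the device).
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

# $1: devlink port handle, $2: MAC address to assign to the port function
set_port_mac() {
    run devlink port function set "$1" hw_addr "$2"
    run devlink port show "$1"      # verify: look for "function: hw_addr ..."
}

set_port_mac "$PORT" "$MAC"
```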
Drop packets with vlan id 3::
Redirect packets with vlan id 5 and IPv4 packets to eth1, after stripping vlan header.::
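The two rules described above can be expressed with ``tc flower`` filters on a representor. This is a sketch under stated assumptions rather than the original commands: the representor name ``Rpf1vf0`` is taken from the device listing in this document, the exact set of matches and actions accepted for offload depends on the driver, and the ``run`` helper is hypothetical (it prints each command and executes it only when ``APPLY=1`` is set).

```shell
# Assumption: representor on which the rules are installed.
REP="${REP:-Rpf1vf0}"

# Print each command; execute it only when APPLY=1 (needs root and the device).
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

# $1: VLAN ID whose IPv4 packets are dropped in hardware (skip_sw)
drop_vlan() {
    run tc filter add dev "$REP" ingress protocol 802.1Q flower \
        vlan_id "$1" vlan_ethtype ipv4 skip_sw action drop
}

# $1: VLAN ID to match, $2: target netdev; the VLAN header is popped first
redirect_vlan() {
    run tc filter add dev "$REP" ingress protocol 802.1Q flower \
        vlan_id "$1" vlan_ethtype ipv4 skip_sw \
        action vlan pop action mirred ingress redirect dev "$2"
}

run tc qdisc add dev "$REP" ingress   # ingress hook must exist before adding filters
drop_vlan 3
redirect_vlan 5 eth1
```

``skip_sw`` asks for the rule to be installed in hardware only, so a rule the device cannot offload is rejected instead of silently falling back to software.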