Routing Family Netlink Library (libnl-route)
Thomas Graf<[email protected]>
version 3.1, Aug 11 2011
1. Introduction
This library provides APIs to the kernel interfaces of the routing family.
2. Addresses
3. Links (Network Devices)
The link configuration interface is part of the NETLINK_ROUTE protocol family and provides the following functionality:
- View and modify the configuration of physical and virtual network devices.
- Create and delete virtual network devices (e.g. dummy devices, VLAN devices, tun devices, bridging devices, …)
- View and modify per link network configuration settings (e.g. net.ipv6.conf.eth0.accept_ra, net.ipv4.conf.eth1.forwarding, …)
Naming Convention (network device, link, interface)
In networking, several terms are commonly used to refer to network devices. While they have distinct meanings, they have been used interchangeably in the past. Within the Linux kernel, the term network device or netdev is commonly used. In user space, the term network interface is very common. The routing netlink protocol uses the term link, and so do the iproute2 utility and most routing daemons.
3.1. Netlink Protocol
This section describes the protocol semantics of the netlink based link configuration interface. The following messages are defined:
Message Type | User → Kernel | Kernel → User |
---|---|---|
RTM_NEWLINK | Create or update virtual network device | Reply to RTM_GETLINK request |
RTM_DELLINK | Delete virtual network device | Notification of link deleted or disappeared |
RTM_GETLINK | Retrieve link configuration and statistics | |
RTM_SETLINK | Modify link configuration | |
See Netlink Library - Message Types for more information on common semantics of these message types.
3.1.1. Link Message Format
All netlink link messages share a common header (struct ifinfomsg) which is appended after the netlink header (struct nlmsghdr).
[Figure: netlink header (upper, gray part) followed by the link header (lower, red part)]
The meaning of each field may differ depending on the message type. A struct ifinfomsg is defined in <linux/rtnetlink.h> to represent the header.
Address Family (8bit): The address family is usually set to AF_UNSPEC but may be specified in RTM_GETLINK requests to limit the returned links to a specific address family.
Link Layer Type (16bit): Currently only used in kernel→user messages to report the link layer type of a link. The value corresponds to the ARPHRD_* defines found in <linux/if_arp.h>. Translation from/to strings can be done using the functions nl_llproto2str()/nl_str2llproto().
Link Index (32bit): Carries the interface index and is used to identify existing links.
Flags (32bit): In kernel→user messages the value of this field represents the current state of the link flags. In user→kernel messages this field is used to change flags or set the initial flag state of new links. Note that in order to change a flag, the flag must also be set in the Flags Change Mask field.
Flags Change Mask (32bit): The primary use of this field is to specify a mask of flags that should be changed based on the value of the Flags field. A special meaning is given to this field when present in link notifications, see TODO.
Attributes (variable): All link message types may carry netlink attributes. They are defined in the header file <linux/if_link.h> and share the prefix IFLA_.
3.1.2. Link Message Types
RTM_GETLINK (user→kernel)
Lookup link by 1. interface index or 2. link name (IFLA_IFNAME) and return a single RTM_NEWLINK message containing the link configuration and statistics or a netlink error message if no such link was found.
Parameters:
- Address family
  - If the address family is set to PF_BRIDGE, only bridging devices will be returned.
  - If the address family is set to PF_INET6, only IPv6 enabled devices will be returned.
Flags:
- NLM_F_DUMP: If set, all links will be returned in form of a multipart message.
Returns:
- EINVAL if neither interface index nor link name is set
- ENODEV if no link was found
- ENOBUFS if allocation failed
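A lookup by link name carries the name as an IFLA_IFNAME attribute behind the link header. The sketch below builds such a request with the attribute macros from the kernel headers. It is an illustration only: struct getlink_request and build_getlink_by_name() are hypothetical names, and the 64 byte attribute area is an arbitrary choice for this example.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request buffer with room for one IFLA_IFNAME attribute. */
struct getlink_request {
    struct nlmsghdr  nh;
    struct ifinfomsg ifi;
    char             attrs[64];
};

/* Build an RTM_GETLINK request by link name. Returns 0, or -1 if the
 * name does not fit into the attribute area. */
static int build_getlink_by_name(struct getlink_request *req, const char *name)
{
    size_t name_len = strlen(name) + 1;   /* include the terminating NUL */
    struct rtattr *rta;

    if (RTA_SPACE(name_len) > sizeof(req->attrs))
        return -1;

    memset(req, 0, sizeof(*req));
    req->nh.nlmsg_len   = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req->nh.nlmsg_type  = RTM_GETLINK;
    req->nh.nlmsg_flags = NLM_F_REQUEST;
    req->ifi.ifi_family = AF_UNSPEC;      /* index 0: identify by name only */

    /* The attribute starts at the aligned end of the headers. */
    rta = (struct rtattr *)((char *)req + NLMSG_ALIGN(req->nh.nlmsg_len));
    rta->rta_type = IFLA_IFNAME;
    rta->rta_len  = RTA_LENGTH(name_len);
    memcpy(RTA_DATA(rta), name, name_len);

    /* Account for the attribute, including its padding. */
    req->nh.nlmsg_len = NLMSG_ALIGN(req->nh.nlmsg_len) + RTA_ALIGN(rta->rta_len);
    return 0;
}
```

For a dump of all links, the same headers would be sent without the attribute and with NLM_F_DUMP added to nlmsg_flags.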
RTM_NEWLINK (user→kernel)
Creates a new or updates an existing link. Only virtual links may be created but all links may be updated.
Flags:
- NLM_F_CREATE: Create link if it does not exist
- NLM_F_EXCL: Return EEXIST if link already exists
Returns:
- EINVAL on malformed message or invalid configuration parameters
- EAFNOSUPPORT if an address family specific configuration (IFLA_AF_SPEC) is not supported
- EOPNOTSUPP if the link does not support modification of parameters
- EEXIST if NLM_F_EXCL was set and the link already exists
- ENODEV if the link does not exist and NLM_F_CREATE is not set
RTM_NEWLINK (kernel→user)
This message type is used in reply to an RTM_GETLINK request and carries the configuration and statistics of a link. If multiple links need to be sent, the messages will be sent in form of a multipart message.
The message type is also used for notifications sent by the kernel to the multicast group RTNLGRP_LINK to inform about various link events. It is therefore recommended to always use a separate link socket for link notifications in order to keep replies and notifications apart.
TODO: document how to detect different notifications
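A dedicated notification socket is joined to the RTNLGRP_LINK group when it is bound. The sketch below only prepares the bind address using the kernel headers. The helper name link_notification_addr() is hypothetical, and the bitmask form used here is the legacy way of selecting groups at bind time.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Prepare a netlink address that subscribes to link notifications only. */
static void link_notification_addr(struct sockaddr_nl *addr)
{
    memset(addr, 0, sizeof(*addr));
    addr->nl_family = AF_NETLINK;
    /* Group N is selected by bit N-1 of the legacy nl_groups bitmask. */
    addr->nl_groups = 1u << (RTNLGRP_LINK - 1);
}
```

The address would then be passed to bind() on a socket created with socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE) that is used for notifications only. When working with libnl sockets, nl_socket_add_membership(sk, RTNLGRP_LINK) should achieve the same subscription.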
RTM_DELLINK (user→kernel)
Lookup link by 1. interface index or 2. link name (IFLA_IFNAME) and delete the virtual link.
Returns:
- EINVAL if neither interface index nor link name is set
- ENODEV if no link was found
- ENOTSUPP if the operation is not supported (not a virtual link)
RTM_DELLINK (kernel→user)
Notification sent by the kernel to the multicast group RTNLGRP_LINK when
- a network device was unregistered (change == ~0)
- a bridging device was deleted (address family will be PF_BRIDGE)
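The two cases listed above can be told apart from the link header of a received RTM_DELLINK notification. The following is a rough sketch based only on the two conditions stated here. The enum and the function classify_dellink() are hypothetical names, and real notifications may require additional checks.

```c
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical classification of an RTM_DELLINK notification. */
enum dellink_reason {
    DELLINK_UNREGISTERED,   /* network device was unregistered */
    DELLINK_BRIDGE,         /* bridging device was deleted */
    DELLINK_OTHER           /* none of the documented cases */
};

static enum dellink_reason classify_dellink(const struct ifinfomsg *ifi)
{
    /* Bridge related deletions carry the bridge address family. */
    if (ifi->ifi_family == PF_BRIDGE)
        return DELLINK_BRIDGE;

    /* An all-ones change mask marks an unregistered device. */
    if (ifi->ifi_change == ~0U)
        return DELLINK_UNREGISTERED;

    return DELLINK_OTHER;
}
```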
3.2. Get / List
3.2.1. Get list of links
To retrieve the list of links in the kernel, allocate a new link cache using rtnl_link_alloc_cache() to hold the links. It will automatically construct and send an RTM_GETLINK message requesting a dump of all links from the kernel and feed the returned RTM_NEWLINK messages to the internal link message parser, which adds the returned links to the cache.
#include <netlink/route/link.h>
/**
* Allocate link cache and fill in all configured links.
* @arg sk Netlink socket.
* @arg family Link address family or AF_UNSPEC
* @arg result Pointer to store resulting cache.
*
* Allocates and initializes a new link cache. A netlink message is sent to
* the kernel requesting a full dump of all configured links. The returned
* messages are parsed and filled into the cache. If the operation succeeds
* the resulting cache will contain a link object for each link configured in the
* kernel.
*
* If \c family is set to an address family other than \c AF_UNSPEC the
* contents of the cache can be limited to a specific address family.
* Currently the following address families are supported:
* - AF_BRIDGE
* - AF_INET6
*
* @route_doc{link_list, Get List of Links}
* @see rtnl_link_get()
* @see rtnl_link_get_by_name()
* @return 0 on success or a negative error code.
*/
int rtnl_link_alloc_cache(struct nl_sock *sk, int family, struct nl_cache **result);
The cache will contain link objects (struct rtnl_link, see Link Object) and can be accessed using the standard cache functions. By setting the family parameter to an address family other than AF_UNSPEC, the resulting cache will only contain links supporting the specified address family.
The following direct search functions are provided to search by interface index and by link name:
#include <netlink/route/link.h>
/**
* Lookup link in cache by interface index
* @arg cache Link cache
* @arg ifindex Interface index
*
* Searches through the provided cache looking for a link with matching
* interface index.
*
* @attention The reference counter of the returned link object will be
* incremented. Use rtnl_link_put() to release the reference.
*
* @route_doc{link_list, Get List of Links}
* @see rtnl_link_get_by_name()
* @return Link object or NULL if no match was found.
*/
struct rtnl_link *rtnl_link_get(struct nl_cache *cache, int ifindex);
/**
* Lookup link in cache by link name
* @arg cache Link cache
* @arg name Name of link
*
* Searches through the provided cache looking for a link with matching
* link name
*
* @attention The reference counter of the returned link object will be
* incremented. Use rtnl_link_put() to release the reference.
*
* @route_doc{link_list, Get List of Links}
* @see rtnl_link_get()
* @return Link object or NULL if no match was found.
*/
struct rtnl_link *rtnl_link_get_by_name(struct nl_cache *cache, const char *name);
Example: Link Cache
struct nl_cache *cache;
struct rtnl_link *link;

if (rtnl_link_alloc_cache(sock, AF_UNSPEC, &cache) < 0) {
    /* error */
}

if (!(link = rtnl_link_get_by_name(cache, "eth1"))) {
    /* link does not exist */
}

/* do something with link */

rtnl_link_put(link);
nl_cache_put(cache);
3.2.2. Lookup Single Link (Direct Lookup)
If only a single link is of interest, the link can be looked up directly without the use of a link cache using the function rtnl_link_get_kernel().
#include <netlink/route/link.h>
int rtnl_link_get_kernel(struct nl_sock *sk, int ifindex, const char *name, struct rtnl_link **result);
It will construct and send an RTM_GETLINK request using the parameters provided and wait for an RTM_NEWLINK or netlink error message sent in return. If the link exists, the link is returned as a link object (see Link Object).
Example: Direct link lookup
struct rtnl_link *link;

if (rtnl_link_get_kernel(sock, 0, "eth1", &link) < 0) {
    /* error */
}

/* do something with link */
rtnl_link_put(link);
3.2.3. Translating interface index to link name
Applications that need to translate an interface index to a link name or vice versa may use the following functions to do so. Both functions require a filled link cache to work with.
/**
* Translate interface index to corresponding link name
* @arg cache Link cache
* @arg ifindex Interface index
* @arg dst String to store name
* @arg len Length of destination string
*
* Translates the specified interface index to the corresponding
* link name and stores the name in the destination string.
*
* @route_doc{link_translate_ifindex, Translating interface index to link name}
* @see rtnl_link_name2i()
* @return Name of link or NULL if no match was found.
*/
char *rtnl_link_i2name (struct nl_cache *cache, int ifindex, char *dst, size_t len);
/**
* Translate link name to corresponding interface index
* @arg cache Link cache
* @arg name Name of link
*
* @route_doc{link_translate_ifindex, Translating interface index to link name}
* @see rtnl_link_i2name()
* @return Interface index or 0 if no match was found.
*/
int rtnl_link_name2i (struct nl_cache *cache, const char *name);
3.3. Add / Modify
Several types of virtual links can be added on the fly using the function rtnl_link_add().
#include <netlink/route/link.h>
/**
* Add virtual link
* @arg sk netlink socket.
* @arg link new link to add
* @arg flags additional netlink message flags
*
* Builds a \c RTM_NEWLINK netlink message requesting the addition of
* a new virtual link.
*
* After sending, the function will wait for the ACK or an eventual
* error message to be received and will therefore block until the
* operation has been completed.
*
* @copydoc auto_ack_warning
*
* @return 0 on success or a negative error code.
*/
int rtnl_link_add(struct nl_sock *sk, struct rtnl_link *link, int flags);
3.4. Delete
The deletion of virtual links such as VLAN devices or dummy devices is done using the function rtnl_link_delete(). The link passed on to the function can be a link from a link cache or it can be constructed with the minimal attributes needed to identify the link.
#include <netlink/route/link.h>
int rtnl_link_delete(struct nl_sock *sk, const struct rtnl_link *link);
The function will construct and send an RTM_DELLINK request message and return any errors returned by the kernel.
Example: Delete link by name
struct rtnl_link *link;

if (!(link = rtnl_link_alloc())) {
    /* error */
}

rtnl_link_set_name(link, "my_vlan");

if (rtnl_link_delete(sock, link) < 0) {
    /* error */
}

rtnl_link_put(link);
3.5. Link Object
A link is represented by the structure struct rtnl_link. Instances may be created with the function rtnl_link_alloc() or via a link cache (see Get list of links) and are freed again using the function rtnl_link_put().
#include <netlink/route/link.h>
struct rtnl_link *rtnl_link_alloc(void);
void rtnl_link_put(struct rtnl_link *link);
3.5.1. Name
The name serves as a unique, human-readable description of the link. By default, links are named based on their type and then enumerated, e.g. eth0, eth1, ethN, but they may be renamed at any time.
Kernels >= 2.6.11 support identification by link name.
#include <netlink/route/link.h>
void rtnl_link_set_name(struct rtnl_link *link, const char *name);
char *rtnl_link_get_name(struct rtnl_link *link);
Accepted link name format: [^ /]*
(maximum length: 15 characters)
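The accepted name format can be checked before a name is handed to rtnl_link_set_name(). The helper below is a small sketch of the rule stated above and is not part of libnl: the name link_name_is_valid() and the LINK_NAME_MAX constant are hypothetical, and rejecting the empty string is an extra assumption, since the pattern itself would accept it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LINK_NAME_MAX 15   /* maximum length stated above */

/* Check a candidate link name against the format [^ /]* with a
 * length limit of 15 characters. Empty names are rejected as well
 * (assumption made for this sketch). */
static bool link_name_is_valid(const char *name)
{
    size_t len = strlen(name);

    if (len == 0 || len > LINK_NAME_MAX)
        return false;

    /* Neither a space nor a slash may appear anywhere in the name. */
    return strpbrk(name, " /") == NULL;
}
```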
3.5.2. Interface Index (Identifier)
The interface index is an integer uniquely identifying a link. If present in any link message, it will be used to identify an existing link.
#include <netlink/route/link.h>
void rtnl_link_set_ifindex(struct rtnl_link *link, int ifindex);
int rtnl_link_get_ifindex(struct rtnl_link *link);
3.5.3. Group
Each link can be assigned a numeric group identifier to group a bunch of links together and apply a set of changes to a group instead of just a single link.
#include <netlink/route/link.h>
void rtnl_link_set_group(struct rtnl_link *link, uint32_t group);
uint32_t rtnl_link_get_group(struct rtnl_link *link);
3.5.4. Link Layer Address
The link layer address (e.g. MAC address).
#include <netlink/route/link.h>
void rtnl_link_set_addr(struct rtnl_link *link, struct nl_addr *addr);
struct nl_addr *rtnl_link_get_addr(struct rtnl_link *link);
3.5.5. Broadcast Address
The link layer broadcast address
#include <netlink/route/link.h>
void rtnl_link_set_broadcast(struct rtnl_link *link, struct nl_addr *addr);
struct nl_addr *rtnl_link_get_broadcast(struct rtnl_link *link);
3.5.6. MTU (Maximum Transmission Unit)
The maximum transmission unit specifies the maximum packet size a network device can transmit or receive. This value may be lower than the capability of the physical network device.
#include <netlink/route/link.h>
void rtnl_link_set_mtu(struct rtnl_link *link, unsigned int mtu);
unsigned int rtnl_link_get_mtu(struct rtnl_link *link);
3.5.7. Flags
The flags of a link enable or disable various link features or inform about the state of the link.
#include <netlink/route/link.h>
void rtnl_link_set_flags(struct rtnl_link *link, unsigned int flags);
void rtnl_link_unset_flags(struct rtnl_link *link, unsigned int flags);
unsigned int rtnl_link_get_flags(struct rtnl_link *link);
IFF_UP | Link is up (administratively) |
IFF_RUNNING | Link is up and carrier is OK (RFC2863 OPER_UP) |
IFF_LOWER_UP | Link layer is operational |
IFF_DORMANT | Driver signals dormant |
IFF_BROADCAST | Link supports broadcasting |
IFF_MULTICAST | Link supports multicasting |
IFF_ALLMULTI | Link supports multicast routing |
IFF_DEBUG | Tell driver to do debugging (currently unused) |
IFF_LOOPBACK | Link loopback network |
IFF_POINTOPOINT | Point-to-point link |
IFF_NOARP | ARP is not supported |
IFF_PROMISC | Status of promiscuous mode |
IFF_MASTER | Master of a load balancer (bonding) |
IFF_SLAVE | Slave to a master link |
IFF_PORTSEL | Driver supports setting media type (only used by ARM ethernet) |
IFF_AUTOMEDIA | Link selects port automatically (only used by ARM ethernet) |
IFF_ECHO | Echo sent packets (testing feature, CAN only) |
IFF_DYNAMIC | Unused (BSD compatibility) |
IFF_NOTRAILERS | Unused (BSD compatibility) |
To translate a link flag to a link flag name or vice versa:
#include <netlink/route/link.h>
char *rtnl_link_flags2str(int flags, char *buf, size_t size);
int rtnl_link_str2flags(const char *flag_name);
3.5.8. Transmission Queue Length
The transmission queue holds packets before packets are delivered to the driver for transmission. It is usually specified in number of packets but the unit may be specific to the link type.
#include <netlink/route/link.h>
void rtnl_link_set_txqlen(struct rtnl_link *link, unsigned int txqlen);
unsigned int rtnl_link_get_txqlen(struct rtnl_link *link);
3.5.9. Operational Status
The operational status has been introduced to provide extended information on the link status. Traditionally the link state has been described using the link flags IFF_UP, IFF_RUNNING, IFF_LOWER_UP, and IFF_DORMANT, which were no longer sufficient for some link types.
#include <netlink/route/link.h>
void rtnl_link_set_operstate(struct rtnl_link *link, uint8_t state);
uint8_t rtnl_link_get_operstate(struct rtnl_link *link);
IF_OPER_UNKNOWN | Unknown state |
IF_OPER_NOTPRESENT | Link not present |
IF_OPER_DOWN | Link down |
IF_OPER_LOWERLAYERDOWN | L1 down |
IF_OPER_TESTING | Testing |
IF_OPER_DORMANT | Dormant |
IF_OPER_UP | Link up |
Translation of operational status code to string and vice versa:
#include <netlink/route/link.h>
char *rtnl_link_operstate2str(uint8_t state, char *buf, size_t size);
int rtnl_link_str2operstate(const char *name);
3.5.10. Mode
Currently known link modes are:
IF_LINK_MODE_DEFAULT | Default link mode |
IF_LINK_MODE_DORMANT | Limit upward transition to dormant |
#include <netlink/route/link.h>
void rtnl_link_set_linkmode(struct rtnl_link *link, uint8_t mode);
uint8_t rtnl_link_get_linkmode(struct rtnl_link *link);
Translation of link mode to string and vice versa:
char *rtnl_link_mode2str(uint8_t mode, char *buf, size_t len);
uint8_t rtnl_link_str2mode(const char *name);
3.5.11. IfAlias
Alternative name for the link, primarily used for SNMP IfAlias.
#include <netlink/route/link.h>
const char *rtnl_link_get_ifalias(struct rtnl_link *link);
void rtnl_link_set_ifalias(struct rtnl_link *link, const char *alias);
Length limit: 256
3.5.12. Hardware Type
#include <netlink/route/link.h>
#include <linux/if_arp.h>
void rtnl_link_set_arptype(struct rtnl_link *link, unsigned int arptype);
unsigned int rtnl_link_get_arptype(struct rtnl_link *link);
Translation of hardware type to character string and vice versa:
#include <netlink/utils.h>
char *nl_llproto2str(int arptype, char *buf, size_t len);
int nl_str2llproto(const char *name);
3.5.13. Qdisc
The name of the queueing discipline used by the link is of informational nature only. It is a read-only attribute provided by the kernel and cannot be modified. The set function is provided solely for the purpose of creating link objects to be used for comparison.
For more information on how to modify the qdisc of a link, see section Traffic Control.
#include <netlink/route/link.h>
void rtnl_link_set_qdisc(struct rtnl_link *link, const char *name);
char *rtnl_link_get_qdisc(struct rtnl_link *link);
3.5.14. Promiscuity
The number of subsystems currently depending on the link being in promiscuous mode. A value of 0 indicates that the link is not in promiscuous mode. It is a read-only attribute provided by the kernel and cannot be modified. The set function is provided solely for the purpose of creating link objects to be used for comparison.
#include <netlink/route/link.h>
void rtnl_link_set_promiscuity(struct rtnl_link *link, uint32_t count);
uint32_t rtnl_link_get_promiscuity(struct rtnl_link *link);
3.5.15. RX/TX Queues
The number of RX/TX queues the link provides. The attribute is writable but will only be considered when creating a new network device via netlink.
#include <netlink/route/link.h>
void rtnl_link_set_num_tx_queues(struct rtnl_link *link, uint32_t nqueues);
uint32_t rtnl_link_get_num_tx_queues(struct rtnl_link *link);
void rtnl_link_set_num_rx_queues(struct rtnl_link *link, uint32_t nqueues);
uint32_t rtnl_link_get_num_rx_queues(struct rtnl_link *link);
3.5.16. Weight
This attribute is unused and obsoleted in all recent kernels.
3.6. Modules
3.6.1. Bonding
Example: Add bonding link
#include <netlink/route/link.h>
struct rtnl_link *link;

link = rtnl_link_bond_alloc();
rtnl_link_set_name(link, "my_bond");

/* requires admin privileges */
if (rtnl_link_add(sk, link, NLM_F_CREATE) < 0) {
    /* error */
}

rtnl_link_put(link);
3.6.2. VLAN
extern char * rtnl_link_vlan_flags2str(int, char *, size_t);
extern int rtnl_link_vlan_str2flags(const char *);
extern int rtnl_link_vlan_set_id(struct rtnl_link *, int);
extern int rtnl_link_vlan_get_id(struct rtnl_link *);
extern int rtnl_link_vlan_set_flags(struct rtnl_link *,unsigned int);
extern int rtnl_link_vlan_unset_flags(struct rtnl_link *,unsigned int);
extern unsigned int rtnl_link_vlan_get_flags(struct rtnl_link *);
extern int rtnl_link_vlan_set_ingress_map(struct rtnl_link *,int, uint32_t);
extern uint32_t * rtnl_link_vlan_get_ingress_map(struct rtnl_link *);
extern int rtnl_link_vlan_set_egress_map(struct rtnl_link *,uint32_t, int);
extern struct vlan_map *rtnl_link_vlan_get_egress_map(struct rtnl_link *,int *);
Example: Add a VLAN device
struct rtnl_link *link;
int master_index, err;

/* lookup interface index of eth0 */
if (!(master_index = rtnl_link_name2i(link_cache, "eth0"))) {
    /* error */
}

/* allocate new link object of type vlan */
link = rtnl_link_vlan_alloc();

/* set eth0 to be our master device */
rtnl_link_set_link(link, master_index);
rtnl_link_vlan_set_id(link, 10);

if ((err = rtnl_link_add(sk, link, NLM_F_CREATE)) < 0) {
    /* error */
}

rtnl_link_put(link);
3.6.3. MACVLAN
extern struct rtnl_link *rtnl_link_macvlan_alloc(void);
extern int rtnl_link_is_macvlan(struct rtnl_link *);
extern char * rtnl_link_macvlan_mode2str(int, char *, size_t);
extern int rtnl_link_macvlan_str2mode(const char *);
extern char * rtnl_link_macvlan_flags2str(int, char *, size_t);
extern int rtnl_link_macvlan_str2flags(const char *);
extern int rtnl_link_macvlan_set_mode(struct rtnl_link *,uint32_t);
extern uint32_t rtnl_link_macvlan_get_mode(struct rtnl_link *);
extern int rtnl_link_macvlan_set_flags(struct rtnl_link *,uint16_t);
extern int rtnl_link_macvlan_unset_flags(struct rtnl_link *,uint16_t);
extern uint16_t rtnl_link_macvlan_get_flags(struct rtnl_link *);
Example: Add a MACVLAN device
struct rtnl_link *link;
struct nl_addr *addr;
int master_index, err;

/* lookup interface index of eth0 */
if (!(master_index = rtnl_link_name2i(link_cache, "eth0"))) {
    /* error */
}

/* allocate new link object of type macvlan */
link = rtnl_link_macvlan_alloc();

/* set eth0 to be our master device */
rtnl_link_set_link(link, master_index);

/* set address of virtual interface */
addr = nl_addr_build(AF_LLC, ether_aton("00:11:22:33:44:55"), ETH_ALEN);
rtnl_link_set_addr(link, addr);
nl_addr_put(addr);

/* set mode of virtual interface */
rtnl_link_macvlan_set_mode(link, rtnl_link_macvlan_str2mode("bridge"));

if ((err = rtnl_link_add(sk, link, NLM_F_CREATE)) < 0) {
    /* error */
}

rtnl_link_put(link);
3.6.4. VXLAN
extern struct rtnl_link *rtnl_link_vxlan_alloc(void);
extern int rtnl_link_is_vxlan(struct rtnl_link *);
extern int rtnl_link_vxlan_set_id(struct rtnl_link *, uint32_t);
extern int rtnl_link_vxlan_get_id(struct rtnl_link *, uint32_t *);
extern int rtnl_link_vxlan_set_group(struct rtnl_link *, struct nl_addr *);
extern int rtnl_link_vxlan_get_group(struct rtnl_link *, struct nl_addr **);
extern int rtnl_link_vxlan_set_link(struct rtnl_link *, uint32_t);
extern int rtnl_link_vxlan_get_link(struct rtnl_link *, uint32_t *);
extern int rtnl_link_vxlan_set_local(struct rtnl_link *, struct nl_addr *);
extern int rtnl_link_vxlan_get_local(struct rtnl_link *, struct nl_addr **);
extern int rtnl_link_vxlan_set_ttl(struct rtnl_link *, uint8_t);
extern int rtnl_link_vxlan_get_ttl(struct rtnl_link *);
extern int rtnl_link_vxlan_set_tos(struct rtnl_link *, uint8_t);
extern int rtnl_link_vxlan_get_tos(struct rtnl_link *);
extern int rtnl_link_vxlan_set_learning(struct rtnl_link *, uint8_t);
extern int rtnl_link_vxlan_get_learning(struct rtnl_link *);
extern int rtnl_link_vxlan_enable_learning(struct rtnl_link *);
extern int rtnl_link_vxlan_disable_learning(struct rtnl_link *);
extern int rtnl_link_vxlan_set_ageing(struct rtnl_link *, uint32_t);
extern int rtnl_link_vxlan_get_ageing(struct rtnl_link *, uint32_t *);
extern int rtnl_link_vxlan_set_limit(struct rtnl_link *, uint32_t);
extern int rtnl_link_vxlan_get_limit(struct rtnl_link *, uint32_t *);
extern int rtnl_link_vxlan_set_port_range(struct rtnl_link *,struct ifla_vxlan_port_range *);
extern int rtnl_link_vxlan_get_port_range(struct rtnl_link *,struct ifla_vxlan_port_range *);
extern int rtnl_link_vxlan_set_proxy(struct rtnl_link *, uint8_t);
extern int rtnl_link_vxlan_get_proxy(struct rtnl_link *);
extern int rtnl_link_vxlan_enable_proxy(struct rtnl_link *);
extern int rtnl_link_vxlan_disable_proxy(struct rtnl_link *);
extern int rtnl_link_vxlan_set_rsc(struct rtnl_link *, uint8_t);
extern int rtnl_link_vxlan_get_rsc(struct rtnl_link *);
extern int rtnl_link_vxlan_enable_rsc(struct rtnl_link *);
extern int rtnl_link_vxlan_disable_rsc(struct rtnl_link *);
extern int rtnl_link_vxlan_set_l2miss(struct rtnl_link *, uint8_t);
extern int rtnl_link_vxlan_get_l2miss(struct rtnl_link *);
extern int rtnl_link_vxlan_enable_l2miss(struct rtnl_link *);
extern int rtnl_link_vxlan_disable_l2miss(struct rtnl_link *);
extern int rtnl_link_vxlan_set_l3miss(struct rtnl_link *, uint8_t);
extern int rtnl_link_vxlan_get_l3miss(struct rtnl_link *);
extern int rtnl_link_vxlan_enable_l3miss(struct rtnl_link *);
extern int rtnl_link_vxlan_disable_l3miss(struct rtnl_link *);
Example: Add a VXLAN device
struct rtnl_link *link;
struct nl_addr *addr;
int err;

/* allocate new link object of type vxlan */
link = rtnl_link_vxlan_alloc();

/* set interface name */
rtnl_link_set_name(link, "vxlan128");

/* set VXLAN network identifier */
if ((err = rtnl_link_vxlan_set_id(link, 128)) < 0) {
    /* error */
}

/* set multicast address to join */
if ((err = nl_addr_parse("239.0.0.1", AF_INET, &addr)) < 0) {
    /* error */
}

/* the VXLAN specific setter takes the address, see the prototypes above */
if ((err = rtnl_link_vxlan_set_group(link, addr)) < 0) {
    /* error */
}

nl_addr_put(addr);

if ((err = rtnl_link_add(sk, link, NLM_F_CREATE)) < 0) {
    /* error */
}

rtnl_link_put(link);
4. Neighbouring
5. Routing
6. Traffic Control
The traffic control architecture allows the queueing and prioritization of packets before they are enqueued to the network driver. To a limited degree it is also possible to take control of network traffic as it enters the network stack.
The architecture consists of three different types of modules:
- Queueing disciplines (qdisc) provide a mechanism to enqueue packets in different forms. They may be used to implement fair queueing, prioritization of differentiated services, enforce bandwidth limitations, or even to simulate network behaviour such as packet loss and packet delay. Qdiscs can be classful, in which case they allow traffic classes described in the next paragraph to be attached to them.
- Traffic classes (class) are supported by several qdiscs to build a tree structure for different types of traffic. Each class may be assigned its own set of attributes such as bandwidth limits or queueing priorities. Some qdiscs even allow borrowing of bandwidth between classes.
- Classifiers (cls) are used to decide which qdisc/class the packet should be enqueued to. Different types of classifiers exist, ranging from classification based on protocol header values to classification based on packet priority or firewall marks. Additionally, most classifiers support extended matches (ematch), which allow extending classifiers by a set of matcher modules, and actions, which allow classifiers to take actions such as mangling, mirroring, or even rerouting of packets.
Default Qdisc
The default qdisc used on all network devices is pfifo_fast. Network devices which do not require a transmit queue, such as the loopback device, do not have a default qdisc attached. The pfifo_fast qdisc provides three bands to prioritize interactive traffic over bulk traffic. Classification is based on the packet priority (diffserv).
Multiqueue Default Qdisc
If the network device provides multiple transmit queues, the mq qdisc is used by default. It will automatically create a separate class for each transmit queue available and will also replace the single per device tx lock with a per queue lock.
Example of a customized classful qdisc setup
The following figure illustrates a possible combination of different queueing and classification modules to implement quality of service needs.
6.1. Traffic Control Object
Each type of traffic control module (qdisc, class, classifier) is represented by its own structure. All of them are based on the traffic control object represented by struct rtnl_tc, which itself is based on the generic object struct nl_object to make it cacheable. The traffic control object contains all attributes, implementation details and statistics that are shared by all of the traffic control object types.
It is not possible to allocate a struct rtnl_tc object directly. Instead, the actual tc object types must be allocated using rtnl_qdisc_alloc(), rtnl_class_alloc(), or rtnl_cls_alloc() and then cast to struct rtnl_tc using the TC_CAST() macro.
Usage Example: Allocation, Casting, Freeing
#include <netlink/route/tc.h>
#include <netlink/route/qdisc.h>
struct rtnl_qdisc *qdisc;
/* Allocation of a qdisc object */
qdisc = rtnl_qdisc_alloc();
/* Cast the qdisc to a tc object using TC_CAST() to use rtnl_tc_ functions. */
rtnl_tc_set_mpu(TC_CAST(qdisc), 64);
/* Free the qdisc object */
rtnl_qdisc_put(qdisc);
6.1.1. Attributes
Handle
The handle uniquely identifies a tc object and is used to refer to other tc objects when constructing tc trees.
void rtnl_tc_set_handle(struct rtnl_tc *tc, uint32_t handle);
uint32_t rtnl_tc_get_handle(struct rtnl_tc *tc);
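Handles are usually written as major:minor, for example 1:0, which this document builds with TC_HANDLE(1, 0). The following minimal sketch shows how such a value can be composed and split, assuming the usual Linux layout of the major number in the upper 16 bits and the minor number in the lower 16 bits. The helper names are hypothetical and not part of libnl.

```c
#include <stdint.h>

/* Hypothetical helper: build a 32 bit handle from major:minor. */
static uint32_t tc_handle_make(uint16_t major, uint16_t minor)
{
	return ((uint32_t)major << 16) | minor;
}

/* Hypothetical helper: extract the major number (upper 16 bits). */
static uint16_t tc_handle_major(uint32_t handle)
{
	return (uint16_t)(handle >> 16);
}

/* Hypothetical helper: extract the minor number (lower 16 bits). */
static uint16_t tc_handle_minor(uint32_t handle)
{
	return (uint16_t)(handle & 0xFFFFu);
}
```

Under this assumption the handle 1:5 corresponds to the value 0x00010005.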
Interface Index
The interface index specifies the network device the traffic object is attached to. The function rtnl_tc_set_link() should be preferred when setting the interface index. It stores the reference to the link object in the tc object and allows retrieving the mtu and linktype automatically.
void rtnl_tc_set_ifindex(struct rtnl_tc *tc, int ifindex);
void rtnl_tc_set_link(struct rtnl_tc *tc, struct rtnl_link *link);
int rtnl_tc_get_ifindex(struct rtnl_tc *tc);
Link Type
The link type specifies the kind of link that is used by the network device (e.g. ethernet, ATM, …). It is derived automatically when the network device is specified with rtnl_tc_set_link(). The default fallback is ARPHRD_ETHER (ethernet).
void rtnl_tc_set_linktype(struct rtnl_tc *tc, uint32_t type);
uint32_t rtnl_tc_get_linktype(struct rtnl_tc *tc);
Kind
The kind character string specifies the type of qdisc, class, or classifier. Setting the kind results in the module specific structure being allocated. Therefore it is imperative to call rtnl_tc_set_kind() before using any type specific API functions such as rtnl_htb_set_rate().
int rtnl_tc_set_kind(struct rtnl_tc *tc, const char *kind);
char *rtnl_tc_get_kind(struct rtnl_tc *tc);
MPU
The Minimum Packet Unit specifies the minimum packet size that will ever be seen by this traffic control object. This value is used for rate calculations. Not all object implementations make use of this value. The default value is 0.
void rtnl_tc_set_mpu(struct rtnl_tc *tc, uint32_t mpu);
uint32_t rtnl_tc_get_mpu(struct rtnl_tc *tc);
MTU
The Maximum Transmission Unit specifies the maximum packet size which will be transmitted. The value is derived from the link specified with rtnl_tc_set_link() if not overwritten with rtnl_tc_set_mtu(). If neither a link nor an MTU is specified, the value defaults to 1500 (ethernet).
void rtnl_tc_set_mtu(struct rtnl_tc *tc, uint32_t mtu);
uint32_t rtnl_tc_get_mtu(struct rtnl_tc *tc);
Overhead
The overhead specifies the additional overhead per packet caused by the network layer. This value can be used to correct packet size calculations if the packet size on the wire does not match the packet size seen by the kernel. The default value is 0.
void rtnl_tc_set_overhead(struct rtnl_tc *tc, uint32_t overhead);
uint32_t rtnl_tc_get_overhead(struct rtnl_tc *tc);
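To make the role of the MPU and the overhead in rate calculations more concrete, the sketch below computes the packet size a rate calculation might account for. It assumes that the overhead is added first and that the result is then raised to the MPU if it is smaller; the exact order used by a given module may differ. The function tc_effective_size() is hypothetical and not part of libnl.

```c
#include <stdint.h>

/*
 * Hypothetical helper: size in bytes charged for a packet of length len.
 * Assumption: overhead is added before the MPU lower bound is applied.
 */
static uint32_t tc_effective_size(uint32_t len, uint32_t overhead, uint32_t mpu)
{
	uint32_t size = len + overhead;

	/* Packets smaller than the MPU are accounted as MPU sized. */
	return size < mpu ? mpu : size;
}
```

For example, with an MPU of 64 and an overhead of 4, a 40 byte packet is accounted as 64 bytes, while a 1500 byte packet with an overhead of 24 is accounted as 1524 bytes.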
Parent
Specifies the parent traffic control object. The parent is identified by its handle. Special values are:
-
TC_H_ROOT: attach tc object directly to network device (root qdisc, root classifier)
-
TC_H_INGRESS: same as TC_H_ROOT but on the ingress side of the network stack.
void rtnl_tc_set_parent(struct rtnl_tc *tc, uint32_t parent);
uint32_t rtnl_tc_get_parent(struct rtnl_tc *tc);
Statistics
Generic statistics, see Accessing Statistics for additional information.
uint64_t rtnl_tc_get_stat(struct rtnl_tc *tc, enum rtnl_tc_stat id);
6.1.2. Accessing Statistics
The traffic control object holds a set of generic statistics. Not all traffic control modules will make use of all of these statistics. Some modules may provide additional statistics via their own APIs.
ID | Type | Description |
---|---|---|
RTNL_TC_PACKETS | Counter | Total # of packets transmitted |
RTNL_TC_BYTES | Counter | Total # of bytes transmitted |
RTNL_TC_RATE_BPS | Rate | Current bytes/s rate |
RTNL_TC_RATE_PPS | Rate | Current packets/s rate |
RTNL_TC_QLEN | Rate | Current length of the queue |
RTNL_TC_BACKLOG | Rate | # of packets currently backlogged |
RTNL_TC_DROPS | Counter | # of packets dropped |
RTNL_TC_REQUEUES | Counter | # of packets requeued |
RTNL_TC_OVERLIMITS | Counter | # of packets that exceeded the limit |
RTNL_TC_RATE_BPS and RTNL_TC_RATE_PPS only return meaningful values if a rate estimator has been configured.
Usage Example: Retrieving tc statistics
#include <netlink/route/tc.h>
uint64_t drops, qlen;
drops = rtnl_tc_get_stat(TC_CAST(qdisc), RTNL_TC_DROPS);
qlen = rtnl_tc_get_stat(TC_CAST(qdisc), RTNL_TC_QLEN);
6.1.3. Rate Table Calculations
6.2. Queueing Discipline (qdisc)
Classless Qdisc
The queueing discipline (qdisc) is used to implement fair queueing, prioritization or rate control. It provides an enqueue() and a dequeue() operation. Whenever a network packet leaves the networking stack over a network device, be it a physical or virtual device, it is enqueued to a qdisc unless the device is queueless. The enqueue() operation is followed by an immediate call to dequeue() for the same qdisc to retrieve a packet which can be scheduled for transmission by the driver. Additionally, the networking stack runs a watchdog which polls the qdisc regularly to dequeue and send packets even if no new packets are being enqueued.
This additional watchdog is required due to the fact that qdiscs may hold on to packets and not return any packets upon dequeue() in order to enforce bandwidth restrictions.
The figure illustrates a trivial example of a classless qdisc consisting of three bands (queues). Use of multiple bands is a common technique in qdiscs to implement fair queueing between flows or prioritize differentiated services.
Classless qdiscs can be regarded as a black box: their inner workings can only be steered using the configuration parameters provided by the qdisc. There is no way to influence the structure of the internal queues.
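The enqueue() and dequeue() operations of such a three band qdisc can be sketched in a few lines. This is a toy model in user space, not kernel or libnl code: all types and names (toy_pkt, toy_qdisc, toy_enqueue(), toy_dequeue()) are hypothetical, the band is assumed to be chosen before enqueueing, and bands are served in strict priority order.

```c
#include <assert.h>
#include <stddef.h>

#define BANDS    3
#define BAND_CAP 8   /* packets per band; arbitrary for this sketch */

struct toy_pkt {
	int id;
	unsigned int band;   /* 0 = highest priority */
};

struct toy_qdisc {
	struct toy_pkt *q[BANDS][BAND_CAP];   /* one ring buffer per band */
	size_t head[BANDS], len[BANDS];
};

/* Enqueue: returns 0 on success, -1 if the band is full (packet dropped). */
static int toy_enqueue(struct toy_qdisc *qd, struct toy_pkt *p)
{
	unsigned int b = p->band;

	assert(b < BANDS);
	if (qd->len[b] == BAND_CAP)
		return -1;
	qd->q[b][(qd->head[b] + qd->len[b]) % BAND_CAP] = p;
	qd->len[b]++;
	return 0;
}

/* Dequeue: serve the lowest numbered non-empty band first; NULL if empty. */
static struct toy_pkt *toy_dequeue(struct toy_qdisc *qd)
{
	for (unsigned int b = 0; b < BANDS; b++) {
		if (qd->len[b] == 0)
			continue;
		struct toy_pkt *p = qd->q[b][qd->head[b]];
		qd->head[b] = (qd->head[b] + 1) % BAND_CAP;
		qd->len[b]--;
		return p;
	}
	return NULL;
}
```

A caller would zero-initialize a struct toy_qdisc, enqueue packets in any order, and observe that packets in band 0 are always returned before those in bands 1 and 2. A rate limiting qdisc would differ mainly in that its dequeue operation may return NULL even though packets are queued.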
Classful Qdisc
Classful qdiscs allow for the queueing structure and classification process to be created by the user.
The figure above shows a classful qdisc with a classifier attached to it which decides whether to enqueue a packet to traffic class 1:1 or 1:2. Unlike classless qdiscs, classful qdiscs allow the classification process and the structure of the queues to be defined by the user. This allows for complex traffic class rules to be applied.
Qdisc | Classful | Description |
---|---|---|
ATM | Yes | FIXME |
Blackhole | No | This qdisc will drop all packets passed to it. |
CBQ | Yes | The CBQ (Class Based Queueing) is a classful qdisc which allows creating traffic classes and enforcing bandwidth limitations for each class. |
DRR | Yes | The DRR (Deficit Round Robin) scheduler is a classful qdisc implementing fair queueing. Each class is assigned a quantum specifying the maximum number of bytes that can be served per round. Unused quantum at the end of the round is carried over to the next round. |
DSMARK | Yes | FIXME |
FIFO | No | FIXME |
GRED | No | FIXME |
HFSC | Yes | FIXME |
HTB | Yes | FIXME |
mq | Yes | FIXME |
multiq | Yes | FIXME |
netem | No | FIXME |
Prio | Yes | FIXME |
RED | Yes | FIXME |
SFQ | Yes | FIXME |
TBF | Yes | FIXME |
teql | No | FIXME |
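The quantum and carry-over behaviour described for DRR in the table above can be sketched as follows. This is a simplified user space model, not the kernel implementation: the struct drr_class layout and the function drr_round() are hypothetical, packets are represented only by their size in bytes, and a class whose queue runs empty is assumed to lose its remaining credit.

```c
#include <stddef.h>

#define DRR_QLEN 16   /* per-class queue capacity; arbitrary for this sketch */

struct drr_class {
	unsigned int quantum;         /* bytes credited per round */
	unsigned int deficit;         /* unused credit carried over */
	unsigned int pkt[DRR_QLEN];   /* queued packet sizes in bytes */
	size_t head, len;
};

/*
 * Run one round over n classes. Sizes of the packets sent are written to
 * out[] (at most max entries); the number of packets sent is returned.
 */
static size_t drr_round(struct drr_class *cls, size_t n,
			unsigned int *out, size_t max)
{
	size_t sent = 0;

	for (size_t i = 0; i < n; i++) {
		struct drr_class *c = &cls[i];

		if (c->len == 0) {
			c->deficit = 0;   /* idle classes do not accumulate credit */
			continue;
		}
		c->deficit += c->quantum;
		/* Send head packets for as long as the credit covers them. */
		while (c->len > 0 && c->pkt[c->head] <= c->deficit && sent < max) {
			unsigned int size = c->pkt[c->head];

			c->deficit -= size;
			c->head = (c->head + 1) % DRR_QLEN;
			c->len--;
			out[sent++] = size;
		}
		if (c->len == 0)
			c->deficit = 0;
	}
	return sent;
}
```

With a quantum of 500 bytes and three queued 300 byte packets, a class sends one packet in the first round and keeps 200 bytes of credit, which lets it send the remaining two packets in the second round.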
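Attribute | C Interface |
---|---|
Allocation / Freeing | rtnl_qdisc_alloc(), rtnl_qdisc_put() |
Addition | rtnl_qdisc_add() |
Modification | rtnl_qdisc_update() |
Deletion | rtnl_qdisc_delete() |
Cache | rtnl_qdisc_alloc_cache(), rtnl_qdisc_get(), rtnl_qdisc_get_by_parent() |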
6.2.1. Retrieving Qdisc Configuration
The function rtnl_qdisc_alloc_cache() is used to retrieve the current qdisc configuration in the kernel. It constructs an RTM_GETQDISC netlink message, requesting the complete list of qdiscs configured in the kernel.
#include <netlink/route/qdisc.h>
struct nl_cache *all_qdiscs;
int err;
/* sock is assumed to be a connected NETLINK_ROUTE socket */
if ((err = rtnl_qdisc_alloc_cache(sock, &all_qdiscs)) < 0) {
	/* error while retrieving qdisc cfg */
	return err;
}
The cache can be accessed using the following functions:
-
Search qdisc with matching ifindex and handle:
struct rtnl_qdisc *rtnl_qdisc_get(struct nl_cache *cache, int ifindex, uint32_t handle);
-
Search qdisc with matching ifindex and parent:
struct rtnl_qdisc *rtnl_qdisc_get_by_parent(struct nl_cache *cache, int ifindex, uint32_t parent);
-
Or any of the generic cache functions (e.g. nl_cache_search(), nl_cache_dump(), etc.)
Example: Search and print qdisc
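struct rtnl_qdisc *qdisc;
struct nl_dump_params dp = {
	.dp_type = NL_DUMP_LINE,
	.dp_fd = stdout,
};
int ifindex;
ifindex = rtnl_link_get_ifindex(eth0_obj);
/* search for qdisc on eth0 with handle 1:0 */
qdisc = rtnl_qdisc_get(all_qdiscs, ifindex, TC_HANDLE(1, 0));
if (!qdisc) {
	/* no such qdisc found */
	return -NLE_OBJ_NOTFOUND;
}
/* print a one-line summary, then drop the reference returned by the lookup */
nl_object_dump(OBJ_CAST(qdisc), &dp);
rtnl_qdisc_put(qdisc);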
6.2.2. Adding a Qdisc
In order to add a new qdisc to the kernel, a qdisc object needs to be allocated. It will hold all attributes of the new qdisc.
#include <netlink/route/qdisc.h>
struct rtnl_qdisc *qdisc;
if (!(qdisc = rtnl_qdisc_alloc())) {
	/* OOM error */
	return -NLE_NOMEM;
}
The next step is to specify all generic qdisc attributes using the tc object interface described in the section Attributes.
The following attributes must be specified:
-
IfIndex
-
Parent
-
Kind
/* Attach qdisc to device eth0 */
rtnl_tc_set_link(TC_CAST(qdisc), eth0_obj);
/* Make this the root qdisc */
rtnl_tc_set_parent(TC_CAST(qdisc), TC_H_ROOT);
/* Set qdisc identifier to 1:0, if left unspecified, a handle will be generated by the kernel. */
rtnl_tc_set_handle(TC_CAST(qdisc), TC_HANDLE(1, 0));
/* Make this a HTB qdisc */
rtnl_tc_set_kind(TC_CAST(qdisc), "htb");
After specifying the qdisc kind (rtnl_tc_set_kind()), the qdisc type specific interface can be used to set attributes which are specific to the respective qdisc implementation:
/* HTB feature: Make unclassified packets go to traffic class 1:5 */
rtnl_htb_set_defcls(qdisc, TC_HANDLE(1, 5));
Finally, the qdisc is ready to be added and can be passed on to the function rtnl_qdisc_add(), which takes care of constructing a netlink message requesting the addition of the new qdisc, sends the message to the kernel and waits for the response by the kernel. The function returns 0 if the qdisc has been added or updated successfully or a negative error code if an error occurred.
The kernel operation for updating and adding a qdisc is the same. Therefore, when calling rtnl_qdisc_add() any existing qdisc with matching handle will be updated unless the flag NLM_F_EXCL is specified.
The following flags may be specified:
NLM_F_CREATE | Create qdisc if it does not exist, otherwise -NLE_OBJ_NOTFOUND is returned. |
NLM_F_REPLACE | If another qdisc is already attached to the same parent and their handles mismatch, replace the qdisc instead of returning -EEXIST. |
NLM_F_EXCL | Return -NLE_EXIST if a qdisc with matching handles exists already. |
The function rtnl_qdisc_add() requires administrator privileges.
/* Submit request to kernel and wait for response */
err = rtnl_qdisc_add(sock, qdisc, NLM_F_CREATE);
/* Return the qdisc object to free memory resources */
rtnl_qdisc_put(qdisc);
if (err < 0) {
fprintf(stderr, "Unable to add qdisc: %s\n", nl_geterror(err));
return err;
}
6.2.3. Deleting a qdisc
#include <netlink/route/qdisc.h>
struct rtnl_qdisc *qdisc;
qdisc = rtnl_qdisc_alloc();
rtnl_tc_set_link(TC_CAST(qdisc), eth0_obj);
rtnl_tc_set_parent(TC_CAST(qdisc), TC_H_ROOT);
rtnl_qdisc_delete(sock, qdisc);
rtnl_qdisc_put(qdisc);
The function rtnl_qdisc_delete() requires administrator privileges.
6.2.4. HTB - Hierarchical Token Bucket
HTB Qdisc Attributes
Default Class
The default class is the fallback class to which all unclassified traffic is directed. If no default class or an invalid default class is specified, packets are transmitted directly to the next layer (direct transmissions).
uint32_t rtnl_htb_get_defcls(struct rtnl_qdisc *qdisc);
int rtnl_htb_set_defcls(struct rtnl_qdisc *qdisc, uint32_t defcls);
Rate to Quantum (r2q)
TODO
uint32_t rtnl_htb_get_rate2quantum(struct rtnl_qdisc *qdisc);
int rtnl_htb_set_rate2quantum(struct rtnl_qdisc *qdisc, uint32_t rate2quantum);
HTB Class Attributes
Priority
uint32_t rtnl_htb_get_prio(struct rtnl_class *class);
int rtnl_htb_set_prio(struct rtnl_class *class, uint32_t prio);
Rate
The rate (bytes/s) specifies the maximum bandwidth an individual class can use without borrowing. The rate of a class should always be greater than or equal to the rate of its children.
uint32_t rtnl_htb_get_rate(struct rtnl_class *class);
int rtnl_htb_set_rate(struct rtnl_class *class, uint32_t rate);
Ceil Rate
The ceil rate specifies the maximum bandwidth an individual class can use. This includes bandwidth that is being borrowed from other classes. Ceil defaults to the class rate, implying that by default the class will not borrow. The ceil rate of a class should always be greater than or equal to the ceil rate of its children.
uint32_t rtnl_htb_get_ceil(struct rtnl_class *class);
int rtnl_htb_set_ceil(struct rtnl_class *class, uint32_t ceil);
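The recommendations for rate and ceil can be checked before a class is submitted to the kernel. The sketch below is a minimal illustration using a hypothetical struct htb_cfg that only models the values discussed here; it is not a libnl type. It additionally assumes that a ceil below the class's own rate is a configuration error, since ceil is the upper bound including borrowed bandwidth.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one HTB class for this sketch. */
struct htb_cfg {
	uint32_t rate;                  /* bandwidth without borrowing, bytes/s */
	uint32_t ceil;                  /* bandwidth including borrowing, bytes/s */
	const struct htb_cfg *parent;   /* NULL for a top-level class */
};

/* Return 1 if the class follows the recommendations, 0 otherwise. */
static int htb_cfg_ok(const struct htb_cfg *c)
{
	if (c->ceil < c->rate)
		return 0;   /* assumed invalid: ceil below the class's own rate */
	if (c->parent &&
	    (c->parent->rate < c->rate || c->parent->ceil < c->ceil))
		return 0;   /* parent must be >= child for both rate and ceil */
	return 1;
}
```

A child with a rate of 400000 bytes/s and a ceil of 800000 bytes/s below a parent limited to 1000000 bytes/s passes the check, while a child whose rate exceeds that of its parent does not.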
Burst
TODO
uint32_t rtnl_htb_get_rbuffer(struct rtnl_class *class);
int rtnl_htb_set_rbuffer(struct rtnl_class *class, uint32_t burst);
Ceil Burst
TODO
uint32_t rtnl_htb_get_cbuffer(struct rtnl_class *class);
int rtnl_htb_set_cbuffer(struct rtnl_class *class, uint32_t burst);
Quantum
TODO
int rtnl_htb_set_quantum(struct rtnl_class *class, uint32_t quantum);
6.3. Class
Handle \ Parent | UNSPEC | TC_H_ROOT | 0:pY | pX:pY |
---|---|---|---|---|
UNSPEC | | | | |
0:hY | | | | |
hX:hY | | | | if pX != hX return -EINVAL |
6.4. Classifier (cls)
TODO
6.5. ClassID Management
TODO
6.6. Packet Location Aliasing (pktloc)
TODO
6.7. Traffic Control Module API
TODO
Version 3.1
Last updated 2014-01-21 20:43:12 CET