Special note: everything below reflects my personal understanding; if anything is wrong, please point it out. Discussion is welcome.
From a performance standpoint, the efficiency of the three modes increases in order (mode 0 < mode 1 < mode 2).
(1) Generic NIC drivers
With a generic (non-PF_RING-aware) NIC driver, only transparent_mode=0 takes effect.
Taking linux-2.6.32.43 as an example, the receive path in this mode is:
NAPI: igb_poll (drivers/net/igb/igb_main.c:4372)
calls igb_clean_rx_irq_adv (drivers/net/igb/igb_main.c:4381)
calls igb_receive_skb (drivers/net/igb/igb_main.c:4760)
calls napi_gro_receive (drivers/net/igb/igb_main.c:4568)
(To be completed.)
The packet then reaches an old friend: netif_receive_skb.
```c
	list_for_each_entry_rcu(ptype, &ptype_all, list) {
		if (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
		    ptype->dev == orig_dev) {
			if (pt_prev)
				ret = deliver_skb(skb, pt_prev, orig_dev);
			pt_prev = ptype;
		}
	}

#ifdef CONFIG_NET_CLS_ACT
	skb = handle_ing(skb, &pt_prev, &ret, orig_dev);
	if (!skb)
		goto out;
ncls:
#endif

	skb = handle_bridge(skb, &pt_prev, &ret, orig_dev);
	if (!skb)
		goto out;
	skb = handle_macvlan(skb, &pt_prev, &ret, orig_dev);
	if (!skb)
		goto out;

	type = skb->protocol;
	list_for_each_entry_rcu(ptype,
			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
		if (ptype->type == type &&
		    (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
		     ptype->dev == orig_dev)) {
			if (pt_prev)
				ret = deliver_skb(skb, pt_prev, orig_dev);
			pt_prev = ptype;
		}
	}

	if (pt_prev) {
		ret = pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
	} else {
		kfree_skb(skb);
		/* Jamal, now you will not able to escape explaining
		 * me how you were going to use this. :-)
		 */
		ret = NET_RX_DROP;
	}
```
Now relate this to PF_RING's device_handler registration function:
```c
void register_device_handler(void)
{
	if (transparent_mode != standard_linux_path)
		return;

	prot_hook.func = packet_rcv;
	prot_hook.type = htons(ETH_P_ALL);
	dev_add_pack(&prot_hook);
}
```
netif_receive_skb iterates over every handler registered with type ETH_P_ALL and invokes the callback packet_rcv.
At this point, PF_RING is able to obtain packets from the kernel.
The flow chart for this mode (transparent_mode=0) is as follows:
(2) PF_RING-aware drivers
For the PF_RING-customized drivers (taking PF_RING 5.4.0 as an example), the flow for modes 1 and 2 is:
NAPI: igb_poll (drivers/PF_RING_aware/intel/igb/igb-3.2.10/src/igb_main.c:5866)
calls igb_clean_rx_irq (drivers/PF_RING_aware/intel/igb/igb-3.2.10/src/igb_main.c:5879)
igb_clean_rx_irq is the key function:
```c
#ifdef HAVE_PF_RING
	{
		if (pf_ring_handle_skb(q_vector, skb) <= 0) {
#endif
#ifdef HAVE_VLAN_RX_REGISTER
			igb_receive_skb(q_vector, skb);
#else
			napi_gro_receive(&q_vector->napi, skb);
#endif
#ifdef HAVE_PF_RING
		}
	}
#endif
```
napi_gro_receive (or igb_receive_skb) is called only when pf_ring_handle_skb returns a value <= 0. So when does pf_ring_handle_skb return <= 0?
Look at its definition (drivers/PF_RING_aware/intel/ixgbe/ixgbe-3.7.17/src/ixgbe_main.c:1697):
```c
#ifdef HAVE_PF_RING
static int pf_ring_handle_skb(struct ixgbe_q_vector *q_vector, struct sk_buff *skb)
{
	int debug = 0;
	struct pfring_hooks *hook = (struct pfring_hooks *)skb->dev->pfring_ptr;

	if (unlikely(debug))
		printk(KERN_INFO "[PF_RING] pf_ring_handle_skb()\n");

	if (hook && (hook->magic == PF_RING)) {
		/* Wow: PF_RING is alive & kickin' ! */
		if (unlikely(debug))
			printk(KERN_INFO "[PF_RING] alive [%s][len=%d]\n",
			       skb->dev->name, skb->len);

		if (*hook->transparent_mode != standard_linux_path) {
			u_int8_t skb_reference_in_use;
			int rc = hook->ring_handler(skb, 1, 1, &skb_reference_in_use,
						    q_vector->rx.ring->queue_index,
						    q_vector->adapter->num_rx_queues);

			if (rc > 0 /* Packet handled by PF_RING */) {
				if (*hook->transparent_mode == driver2pf_ring_non_transparent) {
					/* PF_RING has already freed the memory */
					return(rc);
				}
			}
			return(0);
		} else {
			if (unlikely(debug))
				printk(KERN_INFO "[PF_RING] not present on %s\n",
				       skb->dev->name);
			return(0);
		}
	}
	return(-1);
}
#endif
```
When the mode is standard_linux_path (transparent_mode=0), the function returns 0 without ever calling ring_handler; when it is transparent_mode=1, ring_handler runs but the function still returns 0; only when the mode is driver2pf_ring_non_transparent (transparent_mode=2) can it return a value greater than 0, meaning PF_RING has consumed (and already freed) the packet.
The flow for modes 1 and 2 is shown in the figure below:
From this we can see that in modes 0 and 1 the kernel still processes these packets, whereas in mode 2 the packets are handled by PF_RING alone.
(3) Efficiency comparison
From the analysis above, mode 0 has the longest processing path and therefore the lowest efficiency; mode 1's path is shorter and its efficiency higher than mode 0's, though the kernel still does considerable work; mode 2 has the shortest path, with the least work done in the kernel.