socket flow (16): neigh_hh_output and dev_queue_xmit

2015-12-11
Category: LINUX

Author: gfree.wind@gmail.com
Blog: linuxfocus.blog.chinaunix.com

We have finished reading ip_finish_output2, so now let's look at neigh_hh_output:
static inline int neigh_hh_output(struct hh_cache *hh, struct sk_buff *skb)
{
    unsigned seq;
    int hh_len;

    /*
     * hh is read under a seqlock (see other material for an
     * introduction to seqlocks). The loop retries whenever another
     * thread or process modified hh while it was being read, so the
     * copy that is finally kept is consistent.
     */
    do {
        int hh_alen;

        seq = read_seqbegin(&hh->hh_lock);
        /* Get the length and data of hh and copy them into the
         * headroom in front of skb->data */
        hh_len = hh->hh_len;
        hh_alen = HH_DATA_ALIGN(hh_len);
        memcpy(skb->data - hh_alen, hh->hh_data, hh_alen);
    } while (read_seqretry(&hh->hh_lock, seq));

    /* Adjust the data pointer and len of the sk_buff */
    skb_push(skb, hh_len);
    /* Call hh's output callback to send the sk_buff */
    return hh->hh_output(skb);
}
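To make the retry loop and the skb_push step concrete, here is a minimal user-space sketch of the same pattern. Everything in it is hypothetical (`toy_hh`, `toy_skb`, `toy_hh_update`, `toy_hh_output`); it is not the kernel's seqlock implementation. It is a single-threaded illustration of the protocol only: a real seqlock also relies on the memory barriers inside read_seqbegin/read_seqretry, and this sketch skips the HH_DATA_ALIGN rounding.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

#define TOY_HH_MAX 16   /* hypothetical size of the cached header area */

/* Hypothetical stand-in for struct hh_cache: a cached link-layer header
 * guarded by a sequence counter (an odd value means a writer is active). */
struct toy_hh {
    atomic_uint seq;
    int len;
    unsigned char data[TOY_HH_MAX];
};

/* Hypothetical stand-in for struct sk_buff: data points into buf and
 * leaves headroom in front of the payload. */
struct toy_skb {
    unsigned char buf[64];
    unsigned char *data;
    int len;
};

/* Writer: make the counter odd, update the header, make it even again. */
static void toy_hh_update(struct toy_hh *hh, const unsigned char *hdr, int len)
{
    assert(len >= 0 && len <= TOY_HH_MAX);
    atomic_fetch_add(&hh->seq, 1);
    memcpy(hh->data, hdr, (size_t)len);
    hh->len = len;
    atomic_fetch_add(&hh->seq, 1);
}

/* Reader: retry until a copy was taken with no writer in between, then
 * "push" the header by moving data back and growing len. */
static int toy_hh_output(struct toy_hh *hh, struct toy_skb *skb)
{
    unsigned start;
    int hh_len;

    do {
        /* Wait for an even value: no update is in progress */
        while ((start = atomic_load(&hh->seq)) & 1u)
            ;
        hh_len = hh->len;
        assert(skb->data - skb->buf >= hh_len);   /* enough headroom */
        memcpy(skb->data - hh_len, hh->data, (size_t)hh_len);
    } while (atomic_load(&hh->seq) != start);

    skb->data -= hh_len;   /* what skb_push does to the data pointer ... */
    skb->len += hh_len;    /* ... and to the length */
    return hh_len;
}
```

The point of the design is that readers never block a writer: a reader that raced with an update simply copies the header again.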
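Searching for assignments to hh's hh_output shows that it points to different functions in different situations.
For example, neigh_destroy sets hh->hh_output = neigh_blackhole; as the name suggests, neigh_blackhole is a black hole for neighbour output: the function simply frees the sk_buff.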
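The idea of swapping the output callback can be sketched in a few lines of plain C. The names below (`toy_pkt`, `toy_cache`, `toy_xmit`, `toy_blackhole`, `toy_destroy`, `toy_send`) are made up for illustration, and returning -ENETDOWN from the black hole is my assumption about a sensible error code, not something shown in this article.

```c
#include <errno.h>

/* Hypothetical stand-in for sk_buff; freed records that it was dropped */
struct toy_pkt {
    int freed;
};

/* Hypothetical stand-in for hh_cache: only the output callback matters */
struct toy_cache {
    int (*output)(struct toy_pkt *pkt);
};

/* Normal path: pretend the packet was handed to the device */
static int toy_xmit(struct toy_pkt *pkt)
{
    (void)pkt;
    return 0;
}

/* Black hole: drop the packet and report an error */
static int toy_blackhole(struct toy_pkt *pkt)
{
    pkt->freed = 1;
    return -ENETDOWN;
}

/* On teardown, redirect every later send into the black hole */
static void toy_destroy(struct toy_cache *c)
{
    c->output = toy_blackhole;
}

/* Callers never check the state; they just call through the pointer */
static int toy_send(struct toy_cache *c, struct toy_pkt *pkt)
{
    return c->output(pkt);
}
```

Callers stay free of state checks this way: whoever still holds a stale entry after teardown just gets its packet dropped.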

As far as I can tell, on the normal path hh's hh_output points to dev_queue_xmit.
dev_queue_xmit comes with a long comment block. Nice, I like that :D
/**
 *    dev_queue_xmit - transmit a buffer
 *    @skb: buffer to transmit
 *
 *    Queue a buffer for transmission to a network device. The caller must
 *    have set the device and priority and built the buffer before calling
 *    this function. The function can be called from an interrupt.
 *
 *    A negative errno code is returned on a failure. A success does not
 *    guarantee the frame will be transmitted as it may be dropped due
 *    to congestion or traffic shaping.
 *
 * -----------------------------------------------------------------------------------
 * I notice this method can also return errors from the queue disciplines,
 * including NET_XMIT_DROP, which is a positive value. So, errors can also
 * be positive.
 *
 * Regardless of the return value, the skb is consumed, so it is currently
 * difficult to retry a send to this method. (You can bump the ref count
 * before sending to hold a reference for retry if you are careful.)
 *
 * When calling this method, interrupts MUST be enabled. This is because
 * the BH enable code must have IRQs enabled so that it will not deadlock.
 * --BLG
 */
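The comment tells us that this function sends the sk_buff: it queues the buffer for the network device, and the rest is the driver's job. Victory is in sight. This is the last function in the TCP/IP stack on the UDP send path; the driver work that follows is outside the scope of what I am studying.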
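Since the comment warns that errors can be negative errno values as well as positive queue-discipline codes, a caller cannot just test `rc < 0`. Below is a small sketch of how a return value might be classified. The `TOY_XMIT_*` constants mirror what I believe the NET_XMIT_* values are in include/linux/netdevice.h (0, 1, 2); treat the numbers, and the choice to count "congestion notification" as success, as assumptions to check against your kernel version.

```c
#include <errno.h>

/* Assumed to mirror NET_XMIT_SUCCESS / NET_XMIT_DROP / NET_XMIT_CN */
#define TOY_XMIT_SUCCESS 0x00
#define TOY_XMIT_DROP    0x01
#define TOY_XMIT_CN      0x02

/* Hypothetical classification of a dev_queue_xmit-style return code */
enum toy_tx_result {
    TOY_TX_OK,          /* queued (possibly with a congestion hint) */
    TOY_TX_QDISC_DROP,  /* positive code: the qdisc dropped the packet */
    TOY_TX_ERRNO        /* negative errno such as -ENOMEM or -ENETDOWN */
};

static enum toy_tx_result toy_classify(int rc)
{
    if (rc < 0)
        return TOY_TX_ERRNO;
    if (rc == TOY_XMIT_SUCCESS || rc == TOY_XMIT_CN)
        return TOY_TX_OK;
    return TOY_TX_QDISC_DROP;
}
```

Whichever branch is taken, the buffer has been consumed, so a caller that wants to retry has to hold its own reference before sending.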

So let's take a quick look at dev_queue_xmit.
int dev_queue_xmit(struct sk_buff *skb)
{
    struct net_device *dev = skb->dev;
    struct netdev_queue *txq;
    struct Qdisc *q;
    int rc = -ENOMEM;

    /* Disable soft irqs for various locks below. Also
     * stops preemption for RCU.
     */
    rcu_read_lock_bh();

    /* Pick the transmit queue of the sending device */
    txq = dev_pick_tx(dev, skb);
    q = rcu_dereference_bh(txq->qdisc);

#ifdef CONFIG_NET_CLS_ACT
    skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_EGRESS);
#endif
    if (q->enqueue) {
        /* The qdisc of this device has an enqueue handler */
        rc = __dev_xmit_skb(skb, q, dev, txq);
        goto out;
    }

    /* The device has no queue; in most cases it is a software
     * (virtual?) device */
    /* The device has no queue. Common case for software devices:
     loopback, all the sorts of tunnels...

     Really, it is unlikely that netif_tx_lock protection is necessary
     here. (f.e. loopback and IP tunnels are clean ignoring statistics
     counters.)
     However, it is possible, that they rely on protection
     made by us here.

     Check this and shot the lock. It is not prone from deadlocks.
     Either shot noqueue qdisc, it is even simpler 8)
     */
    if (dev->flags & IFF_UP) {
        int cpu = smp_processor_id(); /* ok because BHs are off */

        if (txq->xmit_lock_owner != cpu) {
            /*
             * This CPU does not already hold the tx lock, so this is
             * not a recursive call. Take the lock to get exclusive
             * use of the device queue.
             */
            HARD_TX_LOCK(dev, txq, cpu);

            if (!netif_tx_queue_stopped(txq)) {
                /* Send the data */
                rc = dev_hard_start_xmit(skb, dev, txq);
                if (dev_xmit_complete(rc)) {
                    HARD_TX_UNLOCK(dev, txq);
                    goto out;
                }
            }
            HARD_TX_UNLOCK(dev, txq);
            if (net_ratelimit())
                printk(KERN_CRIT "Virtual device %s asks to "
                 "queue packet!\n", dev->name);
        } else {
            /* Something went wrong... */
            /* Recursion is detected! It is possible,
             * unfortunately */
            if (net_ratelimit())
                printk(KERN_CRIT "Dead loop on virtual device "
                 "%s, fix it urgently!\n", dev->name);
        }
    }

    rc = -ENETDOWN;
    rcu_read_unlock_bh();

    kfree_skb(skb);
    return rc;
out:
    rcu_read_unlock_bh();
    return rc;
}
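The xmit_lock_owner check is what catches a device that ends up transmitting onto itself. Here is a rough single-threaded sketch of that idea, with hypothetical names throughout (`toy_txq`, `toy_direct_xmit`, `toy_good_driver`, `toy_looping_driver`); there is no real lock here, only the owner bookkeeping.

```c
#include <errno.h>

/* Hypothetical stand-in for struct netdev_queue */
struct toy_txq {
    int lock_owner;   /* CPU id holding the tx lock, -1 when free */
    int calls;        /* how many times the driver callback ran */
};

typedef int (*toy_xmit_fn)(struct toy_txq *q, int cpu);

/* Queue-less transmit path: refuse to re-enter on the same CPU */
static int toy_direct_xmit(struct toy_txq *q, int cpu, toy_xmit_fn xmit)
{
    int rc;

    if (q->lock_owner == cpu)
        return -ENETDOWN;   /* recursion: the "Dead loop" branch */

    q->lock_owner = cpu;    /* plays the role of HARD_TX_LOCK */
    rc = xmit(q, cpu);
    q->lock_owner = -1;     /* plays the role of HARD_TX_UNLOCK */
    return rc;
}

/* A well-behaved driver just sends */
static int toy_good_driver(struct toy_txq *q, int cpu)
{
    (void)cpu;
    q->calls++;
    return 0;
}

/* A misconfigured virtual device that transmits on its own queue again */
static int toy_looping_driver(struct toy_txq *q, int cpu)
{
    q->calls++;
    return toy_direct_xmit(q, cpu, toy_looping_driver);
}
```

With the looping driver the callback runs exactly once: the nested call sees its own CPU as lock owner and bails out instead of deadlocking on a lock it already holds.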
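OK, at this point the UDP send path is basically complete. Oh, I skipped one function: the IP fragmentation function ip_fragment.
I just skimmed through it and found it fairly easy to read, but it is getting late today, so I will leave it for tomorrow.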
