The networking stack supports three types of drivers: native drivers, ported NetBSD drivers, and io-net drivers.
You can tell a native driver from an io-net driver by the name: io-net drivers are named devn-xxx.so, and native drivers are named devnp-xxx.so.
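As a rough illustration of the naming convention, the sketch below classifies a driver file by its prefix. It assumes the usual convention that io-net drivers are named devn-*.so and io-pkt drivers are named devnp-*.so. The helper `classify_driver()` and the `drv_kind` enum are hypothetical names invented for this example; this isn't how the stack itself detects the driver type.

```c
#include <string.h>

/* Hypothetical helper, not part of io-pkt: classify a driver by file name. */
enum drv_kind { DRV_UNKNOWN, DRV_IONET, DRV_IOPKT };

/* Return the part of the path after the last '/', or the whole string. */
const char *driver_base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

enum drv_kind classify_driver(const char *path)
{
    const char *name = driver_base_name(path);

    if (strncmp(name, "devnp-", 6) == 0)
        return DRV_IOPKT;   /* io-pkt style (native or ported) */
    if (strncmp(name, "devn-", 5) == 0)
        return DRV_IONET;   /* legacy io-net driver, needs the shim */
    return DRV_UNKNOWN;
}
```

For example, `classify_driver("/lib/dll/devnp-i82544.so")` yields `DRV_IOPKT`.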
NetBSD drivers aren't as tightly integrated into the overall stack. In the NetBSD operating system, these drivers operate with interrupts disabled and, as such, generally have fewer mutexing issues to deal with on the transmit and receive path. With a straight port of a NetBSD driver, the stack defaults to a single-threaded model, in order to prevent transmit and receive synchronization issues caused by simultaneous execution. If the driver has been carefully analyzed and proper synchronization techniques have been applied, you can set a flag when the driver attaches to indicate that multi-threaded operation is allowed.
If one driver operates in single-threaded mode, all drivers operate in single-threaded mode.
The native and NetBSD drivers all hook directly into the stack in a similar manner. The io-net drivers interface through a “shim” layer that converts the io-net binary interface into the compatible io-pkt interface. We have a special driver, devnp-shim.so, that's automatically loaded when you start an io-net driver.
The shim layer provides binary compatibility with existing io-net drivers. As such, these drivers are also not as tightly integrated into the stack. Features such as dynamically setting media options or jumbo packets, for example, aren't supported for these drivers. Because an io-net driver operates within the io-net design context, it won't perform as well as a native one.
In addition to the packet receive and transmit device drivers, device drivers are also available that integrate hardware crypto acceleration functionality directly into the stack.
For information about specific drivers, see the Utilities Reference.
For information about converting drivers, see the “Porting an io-net driver to io-pkt” technote.
There's a fine line between native and ported drivers. If you do more than the initial “make it run” port, the feature sets of a ported driver and a native driver aren't really any different.
If you look deeper, there are some differences:
Ported NetBSD drivers were written to run with interrupts disabled, so they may lack the locking that multi-threaded operation needs. For this reason, a configuration flag is set by default to indicate that the driver doesn't support multi-threaded access. As a result, the entire stack runs in single-threaded mode (if one driver can't run in multi-threaded mode, no drivers run with multiple threads). You can change this flag once you've carefully examined the driver to ensure that there are no locking issues.
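The all-or-nothing rule can be sketched as a simple check over the attached drivers. The structure and function names here (`drv_info`, `mt_safe`, `stack_can_run_multithreaded()`) are hypothetical and chosen only for illustration; io-pkt's real attachment flag and data structures are different.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-driver record; io-pkt's real structures differ. */
struct drv_info {
    const char *name;
    bool mt_safe;   /* set at attach time only after a locking review */
};

/* The stack may run multi-threaded only if every attached driver allows it. */
bool stack_can_run_multithreaded(const struct drv_info *drv, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (!drv[i].mt_safe)
            return false;   /* one single-threaded driver decides for all */
    }
    return true;
}
```

In this model, a single straight-ported driver with `mt_safe` left at `false` is enough to keep the whole stack single-threaded.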
NetBSD drivers pass delay() a time in microseconds. However, Neutrino's version of delay() takes a time in milliseconds, so calling it as-is from a ported driver could result in very long timeouts. We've defined DELAY() to convert the delay from microseconds to milliseconds, so all ported NetBSD drivers should define delay() to be DELAY().
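To see why the conversion matters: a NetBSD driver that asks for a 1000-microsecond pause would wait a full second if the value were handed unchanged to a millisecond-based delay(). The sketch below shows one plausible microsecond-to-millisecond conversion. The function `usec_to_msec_roundup()` is a hypothetical stand-in, and rounding up is an assumption made here; the actual DELAY() macro in the io-pkt source may be defined differently.

```c
/*
 * Sketch only: one way to convert a NetBSD-style microsecond delay into
 * a millisecond count. The real DELAY() in io-pkt may differ.
 */
unsigned int usec_to_msec_roundup(unsigned int usec)
{
    /* Round up so a short, nonzero delay never becomes a zero-length wait. */
    return usec / 1000u + (usec % 1000u != 0 ? 1u : 0u);
}

/*
 * In a ported driver, the advice in the text amounts to something like:
 *
 *     #define delay(x) DELAY(x)
 *
 * so that existing delay(microseconds) calls go through the conversion.
 */
```

With this helper, a 2500-microsecond request becomes a 3-millisecond wait rather than a 2.5-second one.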
The differences between legacy io-net drivers and other drivers include the following:
You can load drivers into the stack from the command line just as with io-net. For example:
io-pkt-v4-hc -di82544
This command-line invocation works whether the driver is a native driver or an io-net-style driver. The stack automatically detects the driver type and loads the devnp-shim.so binary if the driver is an io-net driver.
Make sure that all drivers are located in a directory that can be resolved by the LD_LIBRARY_PATH environment variable if you don't want to have to specify the fully qualified name of the driver on the command line.
You can also mount a driver in the standard way:
mount -Tio-pkt /lib/dll/devnp-i82544.so
The mount command still supports the io-net option, to provide backward compatibility with existing scripts:
mount -Tio-net /lib/dll/devnp-i82544.so
The standard way to remove a driver from the stack is with the ifconfig iface destroy command. For example:
ifconfig wm0 destroy
For native drivers and io-net drivers, the nicinfo utility is usually the first debug tool that you'll use (aside from ifconfig) when problems with networking occur. Its output tells you whether the driver has properly negotiated at the link layer and whether it's sending and receiving packets.
Ensure that the slogger daemon is running, and then after the problem occurs, run the sloginfo utility to see if the driver has logged any diagnostic information. You can increase the amount of diagnostic information that a driver logs by specifying the verbose command-line option to the driver. Many drivers support various levels of verbosity; you might even try specifying verbose=10.
For ported NetBSD drivers that don't include nicinfo capabilities, you can use netstat -I iface to get very basic packet input / output information. Use ifconfig to get the basic device information. Use ifconfig -v to get more detailed information.
Having different devices sharing a hardware interrupt is kind of a neat idea, but unless you really need to do it — because you've run out of hardware interrupt lines — it generally doesn't help you much. In fact, it can cause you trouble. For example, if your driver doesn't work (e.g. no received packets), check to see if it's sharing an interrupt with another device, and if so, reconfigure your board so it doesn't.
Most of the time, when shared interrupts are configured, there's no good reason for it (i.e. you haven't really run out of interrupts) and this can decrease your performance, because when the interrupt fires, all of the devices sharing the interrupt need to run and check to see if it's for them. If you check the source code, you can see that some drivers do the “right thing,” which is to read registers in their interrupt handlers to see if the interrupt is really for them, and then ignore it if not. But many drivers don't; they schedule their thread-level event handlers to check their hardware, which is inefficient and reduces performance.
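The “right thing” described above can be sketched as follows. The device model is simulated: `struct nic_dev`, its `intr_status` field, and `nic_isr()` are hypothetical names, and real hardware has its own status registers and acknowledge sequence.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device model; register names and bits are illustrative only. */
struct nic_dev {
    volatile uint32_t intr_status;  /* nonzero when this device raised the interrupt */
    unsigned events_scheduled;      /* how many times thread-level work was queued */
};

/* Returns true if the interrupt belonged to this device and work was scheduled. */
bool nic_isr(struct nic_dev *dev)
{
    uint32_t cause = dev->intr_status;   /* one cheap register read */

    if (cause == 0)
        return false;   /* not ours: do nothing and let other handlers run */

    dev->intr_status = 0;       /* acknowledge (real hardware varies) */
    dev->events_scheduled++;    /* stand-in for waking the event handler */
    return true;
}
```

A handler that skips the status check would schedule its event handler on every interrupt on the shared line, which is the inefficient case the text warns about.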
If you're using the PCI bus, use the pci -v utility to check the interrupt allocation.
Sharing interrupts can vastly increase interrupt latency, depending upon exactly what each of the drivers does. After an interrupt fires, the kernel doesn't reenable it until all driver handlers tell the kernel that they've finished handling it. So, if one driver takes a long time servicing a shared interrupt that's masked, then if another device on the same interrupt causes an interrupt during that time period, processing of that interrupt can be delayed for an unknown duration of time.
Interrupt sharing can cause problems: it can reduce performance, increase CPU consumption, and seriously increase latency. Unless you really need to do it, don't. If you must share interrupts, make sure your drivers are doing the “right thing.”
If you've downloaded the source from Foundry27 (http://community.qnx.com/sf/sfmain/do/home), you'll find a technote in the source tree under /trunk/sys/dev_qnx/doc that describes how to write a native driver. Sample driver code is also available under the /trunk/sys/dev_qnx/sample directory.
If you want to use gdb to debug a driver, you first have to make sure that your source is compiled with debugging information included. With your driver code in the correct place in the sys tree (dev_qnx or dev), you can do the following:
# cd sys
# make CPULIST=x86 clean
# make CPULIST=x86 CCOPTS=-O0 DEBUG=-g install
Now that you have a debug version, you can start gdb and set a breakpoint at main() in the io-pkt binary.
Don't forget to specify your driver in the arguments, and ensure that the PATH and LD_LIBRARY_PATH environment variables are properly set up.
After hitting the breakpoint in main(), do a sharedlibrary command in gdb. You should see libc loaded in. Set a breakpoint in dlsym(). When that's hit, your driver should be loaded in, but io-pkt hasn't done the first callout into it. Do a set solib-search-path and add the path to your driver, and then do a sharedlibrary again. The debugger should load the symbols for your driver, and then you can set a breakpoint where you want your debugging to start.
The stack's 802.11 layer can dump debugging information. You can enable and disable the dumping by using sysctl settings. If you do:
sysctl -a | grep 80211
with a Wi-Fi driver, you'll see net.link.ieee80211.debug and net.link.ieee80211.vap0.debug. To turn on the debug output, type the following:
sysctl -w net.link.ieee80211.debug=1
sysctl -w net.link.ieee80211.vap0.debug=0xffffffff
You can then use sloginfo to display the debug log.
Jumbo packets are packets that carry more payload than the normal 1500 bytes. Even the definition of a jumbo packet is unclear; different people use different lengths. For jumbo packets to work, the protocol stack, the drivers, and the network switches must all support jumbo packets.
If you can use jumbo packets with io-pkt, you can see substantial performance gains, because each packet carries more data for the same per-packet header processing overhead.
To configure a driver to operate with jumbo packets, do this (for example):
# ifconfig wm0 ip4csum tcp4csum udp4csum
# ifconfig wm0 mtu 8100
# ifconfig wm0 10.42.110.237
For maximum performance, this example also turns on hardware packet checksumming (for both transmit and receive) and arbitrarily uses a jumbo packet MTU of 8100 bytes. A little detail: io-pkt by default allocates 2 KB clusters for packet buffers. This works well for 1500-byte packets, but when, for example, an 8 KB jumbo packet is received, it ends up spread across four linked clusters. We can improve performance by telling io-pkt (when we start it) that we're going to use jumbo packets, like this:
# io-pkt-v6-hc -d i82544 -p tcpip pagesize=8192,mclbytes=8192
If we pass the pagesize and mclbytes command-line options to the stack, we tell it to allocate contiguous 8 KB buffers (which may end up being two adjacent 4 KB pages, which works fine) for each 8 KB cluster to use for packet buffers. This reduces packet processing overhead, which improves throughput and reduces CPU utilization.
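The cluster arithmetic behind this advice is a ceiling division of the packet length by the cluster size. The helper below is a hypothetical sketch, not an io-pkt function, and it ignores details such as header space inside the first cluster.

```c
#include <stddef.h>

/* Number of fixed-size clusters needed to hold one packet (ceiling division). */
size_t clusters_for_packet(size_t packet_len, size_t cluster_bytes)
{
    if (cluster_bytes == 0)
        return 0;   /* guard against a bad configuration value */
    return (packet_len + cluster_bytes - 1) / cluster_bytes;
}
```

With the default 2 KB clusters, `clusters_for_packet(8192, 2048)` gives 4 linked clusters; with `mclbytes=8192`, the same packet fits in 1.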
If an Ethernet packet is shorter than ETHERMIN bytes, padding can be added to the packet to reach the required minimum length. In the interests of performance, the driver software doesn't automatically pad the packets, but leaves it to the hardware to do so if supported. If hardware pads the packets, the contents of the padding depend on the hardware implementation.
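For comparison, this is roughly what padding in software would look like. It's a minimal sketch under stated assumptions: `pad_frame()` and `SKETCH_MIN_FRAME_NO_CRC` are hypothetical names, and the 60-byte threshold is the 64-byte minimum Ethernet frame minus the 4-byte CRC rather than the stack's own ETHERMIN constant. Zero-filling is a choice made here so that the padding contents are predictable.

```c
#include <stddef.h>
#include <string.h>

/* 60 bytes = 64-byte minimum Ethernet frame minus the 4-byte CRC. */
#define SKETCH_MIN_FRAME_NO_CRC 60u

/*
 * Zero-pad a frame in place up to min_len. Returns the new length, or 0 if
 * the buffer (of capacity cap) is too small to hold the padded frame.
 */
size_t pad_frame(unsigned char *buf, size_t len, size_t cap, size_t min_len)
{
    if (len >= min_len)
        return len;            /* already long enough, nothing to do */
    if (cap < min_len)
        return 0;              /* no room to pad */
    memset(buf + len, 0, min_len - len);
    return min_len;
}
```

The extra memset() on every short packet is the kind of per-packet cost that leaving padding to the hardware avoids.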
Transmit Segmentation Offload (TSO) is a capability provided by some modern NICs (see, for example, http://en.wikipedia.org/wiki/Large_segment_offload). Essentially, instead of the stack being responsible for breaking a large IP packet into MTU-sized packets, the driver does it. This greatly reduces the amount of CPU time required to transmit large amounts of data.
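The work being offloaded is essentially the loop below, which splits one large payload into segment-sized pieces. This is a simplified sketch: `split_into_segments()` is a hypothetical function, it tracks only payload lengths, and it leaves out the header replication and checksum updates that real segmentation involves.

```c
#include <stddef.h>

/*
 * Split payload_len bytes into chunks of at most mss bytes, recording each
 * chunk length in seg_len[]. Returns the number of segments written, which
 * is capped at max_segs.
 */
size_t split_into_segments(size_t payload_len, size_t mss,
                           size_t *seg_len, size_t max_segs)
{
    size_t n = 0;

    if (mss == 0)
        return 0;   /* a zero segment size can't make progress */

    while (payload_len > 0 && n < max_segs) {
        size_t chunk = payload_len < mss ? payload_len : mss;
        seg_len[n++] = chunk;
        payload_len -= chunk;
    }
    return n;
}
```

For instance, with a 1460-byte segment size (a 1500-byte MTU minus 20-byte IPv4 and TCP headers), a 4000-byte payload becomes three segments of 1460, 1460, and 1080 bytes.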
You can tell if a driver supports TSO by typing ifconfig and looking at the capabilities section of the interface output. It will have tso marked as one of its capabilities. To configure the driver to use TSO, type (for example):
ifconfig wm0 tso4
ifconfig wm0 10.42.110.237