ifnet: Replace if_addr_lock rwlock with epoch + mutex

Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host is receiving 64 streams of
netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs     err TCP Est  %CPU syscalls     csw  irq GBfree
  4.98  0.00   4.42  0.00 4235592      33 83.80  4720653 2149771 1235 247.32
  4.73  0.00   4.20  0.00 4025260      33 82.99  4724900 2139833 1204 247.32
  4.72  0.00   4.20  0.00 4035252      33 82.14  4719162 2132023 1264 247.32
  4.71  0.00   4.21  0.00 4073206      33 83.68  4744973 2123317 1347 247.32
  4.72  0.00   4.21  0.00 4061118      33 80.82  4713615 2188091 1490 247.32
  4.72  0.00   4.21  0.00 4051675      33 85.29  4727399 2109011 1205 247.32
  4.73  0.00   4.21  0.00 4039056      33 84.65  4724735 2102603 1053 247.32

After the patch:

InMpps OMpps  InGbs  OGbs     err TCP Est  %CPU syscalls     csw  irq GBfree
  5.43  0.00   4.20  0.00 3313143      33 84.96  5434214 1900162 2656 245.51
  5.43  0.00   4.20  0.00 3308527      33 85.24  5439695 1809382 2521 245.51
  5.42  0.00   4.19  0.00 3316778      33 87.54  5416028 1805835 2256 245.51
  5.42  0.00   4.19  0.00 3317673      33 90.44  5426044 1763056 2332 245.51
  5.42  0.00   4.19  0.00 3314839      33 88.11  5435732 1792218 2499 245.52
  5.44  0.00   4.19  0.00 3293228      33 91.84  5426301 1668597 2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after
the patch.

Reviewed by: gallatin
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15366

sys/dev: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
open source licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

Mechanically convert to if_inc_counter().

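For readers unfamiliar with the conversion: a driver that used to bump a
statistics field directly (for example ifp->if_ierrors++) instead calls the
accessor, as in if_inc_counter(ifp, IFCOUNTER_IERRORS, 1). The user-space
model below is only a sketch of that pattern. Every mock_* and MOCK_* name
is hypothetical and stands in for the kernel types, which cannot be used
outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's counter identifiers. */
enum mock_counter {
	MOCK_IPACKETS,
	MOCK_IERRORS,
	MOCK_OPACKETS,
	MOCK_OERRORS,
	MOCK_NCOUNTERS
};

/* Hypothetical stand-in for struct ifnet: the counters are private. */
struct mock_ifnet {
	uint64_t counters[MOCK_NCOUNTERS];
};

/* Accessor modelled on the shape of if_inc_counter(ifp, cnt, inc). */
static void
mock_if_inc_counter(struct mock_ifnet *ifp, enum mock_counter cnt,
    int64_t inc)
{
	assert((unsigned)cnt < MOCK_NCOUNTERS);
	ifp->counters[cnt] += (uint64_t)inc;
}

/* A receive-error path after conversion: no direct field access. */
static void
mock_rx_error(struct mock_ifnet *ifp)
{
	mock_if_inc_counter(ifp, MOCK_IERRORS, 1);
}
```

Routing every update through one function is what lets the counter storage
change later without touching each driver again.
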
Use define from if_var.h to access a field inside struct if_data,
that resides in struct ifnet.

Sponsored by: Nginx, Inc.

Provide necessary includes.

The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
for this event by adding if_var.h to files that do need it. Also, add all
includes that are currently pulled in only through implicit pollution via
if_var.h.

Sponsored by: Netflix
Sponsored by: Nginx, Inc.

Avoid controller reinitialization, which could be triggered by
dhclient(8) or when alias addresses are added.

Tested by: dcx dcy <[email protected]>

Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags in sys/dev.

- There's no need to overwrite the default device method with the default
  one. Interestingly, these are actually the default for quite some time
  (bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
  since r52045) but even recently added device drivers do this
  unnecessarily.
  Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
  Discussed with: jhb
- Also while at it, use __FBSDID.

- Remove attempts to implement setting of BMCR_LOOP/MIIF_NOLOOP
  (reporting IFM_LOOP based on BMCR_LOOP is left in place though as
  it might prove useful for debugging). For most mii(4) drivers it
  was unclear whether the PHYs driven by them actually support
  loopback or not. Moreover, typically loopback mode also needs to
  be activated on the MAC, which none of the Ethernet drivers using
  mii(4) implements. Given that loopback media has no real use (and
  obviously hardly had a chance to actually work) besides for driver
  development (which just loopback mode should be sufficient for
  though, i.e. one doesn't necessarily need support for loopback
  media) support for it is just dropped as both NetBSD and OpenBSD
  already did quite some time ago.
- Let mii_phy_add_media() also announce the support of IFM_NONE.
- Restructure the PHY entry points to use a structure of entry points
  instead of discrete function pointers, and extend this to include
  a "reset" entry point. Make sure any PHY-specific reset routine is
  always used, and provide one for lxtphy(4) which disables MII
  interrupts (as is done for a few other PHYs we have drivers for).
  This includes changing NIC drivers which previously just called the
  generic mii_phy_reset() to now actually call the PHY-specific reset
  routine, which might be crucial in some cases. While at it, the
  redundant checks in these NIC drivers for mii->mii_instance not being
  zero before calling the reset routines were removed because as soon
  as one PHY driver attaches mii->mii_instance is incremented and we
  hardly can end up in their media change callbacks etc if no PHY driver
  has attached as mii_attach() would have failed in that case and not
  attach a miibus(4) instance.
  Consequently, NIC drivers now no longer should call mii_phy_reset()
  directly, so it was removed from EXPORT_SYMS.
- Add a mii_phy_dev_attach() as a companion helper to mii_phy_dev_probe().
  The purpose of that function is to perform the common steps to attach
  a PHY driver instance and to hook it up to the miibus(4) instance and to
  optionally also handle the probing, addition and initialization of the
  supported media. So all a PHY driver without any special requirements
  has to do in its bus attach method is to call mii_phy_dev_attach()
  along with PHY-specific MIIF_* flags, a pointer to its PHY functions
  and the add_media set to one. All PHY drivers were updated to take
  advantage of mii_phy_dev_attach() as appropriate. Along with these
  changes the capability mask was added to the mii_softc structure so
  PHY drivers taking advantage of mii_phy_dev_attach() but still
  handling media on their own do not need to fiddle with the MII attach
  arguments anyway.
- Keep track of the PHY offset in the mii_softc structure. This is done
  for compatibility with NetBSD/OpenBSD.
- Keep track of the PHY's OUI, model and revision in the mii_softc
  structure. Several PHY drivers require this information also after
  attaching and previously had to wrap their own softc around mii_softc.
  NetBSD/OpenBSD also keep track of the model and revision on their
  mii_softc structure. All PHY drivers were updated to take advantage
  as appropriate.
- Convert the members of the MII data structure to unsigned where
  appropriate. This is partly inspired by NetBSD/OpenBSD.
- According to IEEE 802.3-2002 the bits actually have to be reversed
  when mapping an OUI to the MII ID registers. All PHY drivers and
  miidevs were changed as necessary. Actually this now again allows to
  largely share miidevs with NetBSD, which fixed this problem already
  9 years ago. Consequently miidevs was synced as far as possible.
- Add MIIF_NOMANPAUSE and mii_phy_flowstatus() calls to drivers that
  weren't explicitly converted to support flow control before. It's
  unclear whether flow control actually works with these but typically
  it should and their net behavior should be more correct with these
  changes in place than without if the MAC driver sets MIIF_DOPAUSE.

Obtained from: NetBSD (partially)
Reviewed by: yongari (earlier version), silence on arch@ and net@

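The OUI bit reversal mentioned above can be sketched in user space as
follows. This is an illustration under stated assumptions, not the code
from the commit: the sketch_* names are hypothetical, the bits are assumed
to be reversed within each byte after the two PHY ID registers are
concatenated, and the model and revision fields are taken from bits 9:4
and 3:0 of the second ID register.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the bit order within a single byte. */
static uint8_t
sketch_bitrev8(uint8_t b)
{
	b = (uint8_t)(((b & 0xf0) >> 4) | ((b & 0x0f) << 4));
	b = (uint8_t)(((b & 0xcc) >> 2) | ((b & 0x33) << 2));
	b = (uint8_t)(((b & 0xaa) >> 1) | ((b & 0x55) << 1));
	return (b);
}

/*
 * Recover an OUI from the two PHY ID registers. id1 supplies 16 bits and
 * the top 6 bits of id2 supply the rest; each byte of the combined value
 * is then bit-reversed (assumed convention, see the lead-in).
 */
static uint32_t
sketch_mii_oui(uint16_t id1, uint16_t id2)
{
	uint32_t h = ((uint32_t)id1 << 6) | ((uint32_t)id2 >> 10);

	return (((uint32_t)sketch_bitrev8((h >> 16) & 0xff) << 16) |
	    ((uint32_t)sketch_bitrev8((h >> 8) & 0xff) << 8) |
	    (uint32_t)sketch_bitrev8(h & 0xff));
}

/* Model number: bits 9:4 of the second ID register. */
static unsigned
sketch_mii_model(uint16_t id2)
{
	return ((id2 & 0x03f0u) >> 4);
}

/* Revision: bits 3:0 of the second ID register. */
static unsigned
sketch_mii_rev(uint16_t id2)
{
	return (id2 & 0x000fu);
}
```

With register values 0x0020 and 0x60c0, the sketch yields OUI 0x001018,
model 0x0c and revision 0. Without the per-byte reversal the same
registers would give 0x000818, which is why device tables written against
the unreversed form had to be updated.
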
Correct spelling in comments.

Submitted by: brucec

Convert the PHY drivers to honor the mii_flags passed down and convert
the NIC drivers as well as the PHY drivers to take advantage of the
mii_attach() introduced in r213878 to get rid of certain hacks. For
the most part these were:
- Artificially limiting miibus_{read,write}reg methods to certain PHY
  addresses; we now let mii_attach() only probe the PHY at the desired
  address(es) instead.
- PHY drivers setting MIIF_* flags based on the NIC driver they hang
  off from, partly even based on grabbing and using the softc of the
  parent; we now pass these flags down from the NIC to the PHY drivers
  via mii_attach(). This got us rid of all such hacks except those of
  brgphy() in combination with bce(4) and bge(4), which is way beyond
  what can be expressed with simple flags.

While at it, I took the opportunity to change the NIC drivers to pass
up the error returned by mii_attach() (previously by mii_phy_probe())
and unify the error message used in this case where and as appropriate
as mii_attach() actually can fail for a number of reasons, not just
because of no PHY(s) being present at the expected address(es).

Reviewed by: jhb, yongari

KTR_CTx are long time aliased by existing classes so they can't serve
their purpose anymore. Axe them out.

Sponsored by: Sandvine Incorporated
Discussed with: jhb, emaste
Possible MFC: TBD

The NetBSD Foundation has granted permission to remove clause 3 and 4 from
the software.

Obtained from: NetBSD

Use if_maddr_rlock()/if_maddr_runlock() rather than IF_ADDR_LOCK()/
IF_ADDR_UNLOCK() across network device drivers when accessing the
per-interface multicast address list, if_multiaddrs. This will
allow us to change the locking strategy without affecting our driver
programming interface or binary interface.

For two wireless drivers, remove unnecessary locking, since they
don't actually access the multicast address list.

Approved by: re (kib)
MFC after: 6 weeks

- Use the revamped code from the gem(4) PCI front-end, which doesn't
  require parts of the Expansion ROM to be copied around, for obtaining
  the MAC address on !OFW platforms.
- Don't unnecessarily cache bus space tag and handle nor RIDs in the
  softcs of the front-ends.
- Don't use function calls in initializers.
- Let the SBus front-end depend on sbus(4).

o Disable HMEDEBUG by default.
o Add CTASSERTs ensuring that HME_NRXDESC and HME_NTXDESC are set to
  legal values.
o Use appropriate maxsize, nsegments and maxsegsize parameters when
  creating DMA tags and correct some comments related to them.
o The FreeBSD bus_dmamap_sync(9) supports ored together flags for quite
  some time now so collapse calls accordingly.
o Add missing BUS_DMASYNC_PREREAD when syncing the control DMA maps in
  hme_rint() and hme_start_locked().
o Keep state of the link state and use it to enable or disable the MAC
  in hme_mii_statchg() accordingly as well as to return early from
  hme_start_locked() in case the link is down.
o Introduce a sc_flags and use it to replace individual members like
  sc_pci.
o Add bus_barrier(9) calls to hme_mac_bitflip(), hme_mii_readreg(),
  hme_mii_writereg() and hme_stop() to ensure the respective bit has
  been written before we start polling on it and for the right bits
  to change.
o Rather than just returning in case hme_mac_bitflip() fails and leaving
  us in an undefined state, report the problem and move on; chances are
  the requested configuration will become active shortly after.
o Don't call hme_start_locked() in hme_init_locked() unconditionally but
  only after calls to hme_init_locked() when it's appropriate, i.e. in
  hme_watchdog().
o Add a KASSERT which asserts nsegs is valid also to hme_load_txmbuf().
o In hme_load_txmbuf():
  - use a maximum of the newly introduced HME_NTXSEGS segments instead
    of the incorrect HME_NTXQ, which reflects the maximum TX queue
    length, for loading the mbufs and put the DMA segments back onto
    the stack instead of the softc as 16 should be ok there.
  - use the common errno(2) return values instead of homegrown ones,
  - given that hme_load_txmbuf() is allowed to fail resulting in a packet
    drop for quite some time now implement the functionality of
    hme_txcksum() by means of m_pullup(9), which de-obfuscates the code
    and allows to always retrieve the correct length of the IP header, [1]
  - also add a KASSERT which asserts nsegs is valid,
  - take advantage of m_collapse(9) instead of m_defrag(9) for
    performance reasons.
o Don't bother to check whether the interface is running or whether its
  queue is empty before calling hme_start_locked() in hme_tint(), the
  former will check these anyway.
o In hme_intr() call hme_rint() before hme_tint() as hme_tint() may take
  quite a while to return when it calls hme_start_locked().
o Get rid of sc_debug and just check if_flags for IFF_DEBUG directly.
o Add a shadow sc_ifflags so we don't reset the chip when unnecessary.
o Handle IFF_ALLMULTI correctly. [2]
o Use PCIR_BAR instead of a homegrown macro.
o Replace sc_enaddr[6] with sc_enaddr[ETHER_ADDR_LEN].
o Use the maximum of 256 TX descriptors for better performance as using
  all of them has no additional static cost rather than using just half
  of them.

Reported by: rwatson [2]
Suggested by: yongari [1]
Reviewed by: yongari
MFC after: 1 month

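The compile-time checks on the ring sizes can be pictured with C11
_Static_assert, which plays the same role in user space that CTASSERT plays
in the kernel. This is a sketch only: the SKETCH_* names are hypothetical,
and the set of legal values (RX one of 32, 64, 128 or 256; TX a multiple of
16 between 16 and 256) is an assumption made for illustration, not taken
from the commit. Only the 256 TX descriptor count comes from the message
above.

```c
#include <assert.h>

/* Illustrative ring sizes; the RX value is an assumption. */
#define SKETCH_NRXDESC	128
#define SKETCH_NTXDESC	256

/* Assumed legality rules, for illustration only. */
#define SKETCH_RX_OK(n)	\
	((n) == 32 || (n) == 64 || (n) == 128 || (n) == 256)
#define SKETCH_TX_OK(n)	\
	((n) >= 16 && (n) <= 256 && (n) % 16 == 0)

/* A bad value now breaks the build instead of misbehaving at run time. */
_Static_assert(SKETCH_RX_OK(SKETCH_NRXDESC), "illegal RX descriptor count");
_Static_assert(SKETCH_TX_OK(SKETCH_NTXDESC), "illegal TX descriptor count");
```

Changing SKETCH_NRXDESC to, say, 100 makes the first assertion fail at
compile time, which is the point of the check.
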
Remove invalid BUS_DMA_ALLOCNOW when creating a tag which is used for
a "static" memory allocation only.

o break newbus api: add a new argument of type driver_filter_t to
  bus_setup_intr()
o add an int return code to all fast handlers
o retire INTR_FAST/IH_FAST

For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current

Reviewed by: many
Approved by: re@

- Use the hme_tick() callout instead of if_slowtimo() for driving
  hme_watchdog() in order to avoid races accessing if_timer.
- Use bus_get_dma_tag() so hme(4) works on platforms requiring it.
- Don't bother to set if_mtu to ETHERMTU, ether_ifattach() does that.

Remove the HME_LOCK_ASSERT() in hme_mifinit(), which was added in the
previous revision; it's actually ok when invoking hme_mifinit() from
hme_config() without the lock held.

- In hme_stop() mask all interrupts.
- In hme_eint() print MIF register contents on MIF interrupts.
- In hme_mifinit() don't bother to preserve the previous MIF config.
  This was mainly done in order to preserve the PHY select bit (external
  or internal PHY) but which only needs to be set as appropriate when
  reading from or writing to the desired PHY in
  hme_mii_{read,write}reg(). Similarly don't bother to set the PHY
  select bit in hme_mii_statchg().
- In hme_mii_{read,write}reg() ignore requests to PHYs other than the
  external and internal PHY one.
- Move enabling/disabling the MII drivers of the external transceiver
  from hme_init_locked() and based on the sheer presence of an external
  one to hme_mifinit() and based on the currently selected media,
  defaulting to the internal transceiver when the media hasn't been
  set, yet. Invoke hme_mifinit() from the newly added
  hme_mediachange_locked() so the setting of the MII drivers is updated
  when changing media.
  These changes keep the MII bus from wedging (which manifests in the
  HME and the PHYs no longer being able to communicate with each other)
  when the PHY device drivers isolate the unused PHY in two-PHY
  configurations as present in e.g. Netra t1 100 while changing media,
  either from hme_init_locked() (see also below) or via ifconfig(8).
  They also allow for using both transceivers/PHYs.
- In the newly added hme_mediachange_locked() also reset the PHYs in
  two-PHY configurations before invoking mii_mediachg(). This is
  required for successfully unisolating the previously unused PHY when
  switching between PHYs.
- Now that changing media should no longer cause problems back out
  rev. 1.27 and re-enable setting the current media in hme_init_locked()
  (see the commit message of rev. 1.23 for more info).

These changes are roughly a merge of NetBSD gem.c rev. 1.32 - 1.35 (1.30
was already fixed differently in our 1.36; 1.31 and 1.32 were wrong) with
some parts reworked and things that don't make sense like setting the MII
drivers and restoring the previous MIF and XIF settings in
hme_mii_{read,write}reg() omitted.

MFC after: 2 weeks

Fix invalid reference of mbuf chains.
Use proper pointer dereference to inform modified mbuf chains to
caller.

While I'm here perform checksum offload setup after loading DMA
maps as m_defrag(9) can return new mbuf chains.

In collaboration with: glebius

Fix typo in printf string.

MFC after: 1 week
Approved by: cperciva (mentor)

Backout rev. 1.46. It caused Rx checksum offload breakage on little
endian systems.

Reported by: joerg
Tested by: joerg