|
Revision tags: v22.03, v22.03-rc4, v22.03-rc3, v22.03-rc2, v22.03-rc1 |
|
| #
5569dd7d |
| 20-Jan-2022 |
Tudor Cornea <[email protected]> |
kni: allow configuring thread granularity
The Kni kthreads seem to be re-scheduled at a granularity of roughly 1 millisecond right now, which seems to be insufficient for performing tests involving
kni: allow configuring thread granularity
The Kni kthreads seem to be re-scheduled at a granularity of roughly 1 millisecond right now, which seems to be insufficient for performing tests involving a lot of control plane traffic.
Even if KNI_KTHREAD_RESCHEDULE_INTERVAL is set to 5 microseconds, it seems that the existing code cannot reschedule at the desired granularily, due to precision constraints of schedule_timeout_interruptible().
In our use case, we leverage the Linux Kernel for control plane, and it is not uncommon to have 60K - 100K pps for some signaling protocols.
Since we are not in atomic context, the usleep_range() function seems to be more appropriate for being able to introduce smaller controlled delays, in the range of 5-10 microseconds. Upon reading the existing code, it would seem that this was the original intent. Adding sub-millisecond delays, seems unfeasible with a call to schedule_timeout_interruptible().
KNI_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */ schedule_timeout_interruptible( usecs_to_jiffies(KNI_KTHREAD_RESCHEDULE_INTERVAL));
Below, we attempted a brief comparison between the existing implementation, which uses schedule_timeout_interruptible() and usleep_range().
We attempt to measure the CPU usage, and RTT between two Kni interfaces, which are created on top of vmxnet3 adapters, connected by a vSwitch.
insmod rte_kni.ko kthread_mode=single carrier=on
schedule_timeout_interruptible(usecs_to_jiffies(5)) kni_single CPU Usage: 2-4 % [root@localhost ~]# ping 1.1.1.2 -I eth1 PING 1.1.1.2 (1.1.1.2) from 1.1.1.1 eth1: 56(84) bytes of data. 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.70 ms 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.00 ms 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.99 ms 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.985 ms 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.00 ms
usleep_range(5, 10) kni_single CPU usage: 50% 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.338 ms 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.150 ms 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.123 ms 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.139 ms 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.159 ms
usleep_range(20, 50) kni_single CPU usage: 24% 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.202 ms 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.170 ms 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.171 ms 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.248 ms 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.185 ms
usleep_range(50, 100) kni_single CPU usage: 13% 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.537 ms 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.257 ms 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.231 ms 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.143 ms 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.200 ms
usleep_range(100, 200) kni_single CPU usage: 7% 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.716 ms 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.167 ms 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.459 ms 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.455 ms 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.252 ms
usleep_range(1000, 1100) kni_single CPU usage: 2% 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.22 ms 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.17 ms 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.17 ms 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=1.17 ms 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.15 ms
Upon testing, usleep_range(1000, 1100) seems roughly equivalent in latency and cpu usage to the variant with schedule_timeout_interruptible(), while usleep_range(100, 200) seems to give a decent tradeoff between latency and cpu usage, while allowing users to tweak the limits for improved precision if they have such use cases.
Disabling RTE_KNI_PREEMPT_DEFAULT, interestingly seems to lead to a softlockup on my kernel.
Kernel panic - not syncing: softlockup: hung tasks CPU: 0 PID: 1226 Comm: kni_single Tainted: G W O 3.10 #1 <IRQ> [<ffffffff814f84de>] dump_stack+0x19/0x1b [<ffffffff814f7891>] panic+0xcd/0x1e0 [<ffffffff810993b0>] watchdog_timer_fn+0x160/0x160 [<ffffffff810644b2>] __run_hrtimer.isra.4+0x42/0xd0 [<ffffffff81064b57>] hrtimer_interrupt+0xe7/0x1f0 [<ffffffff8102cd57>] smp_apic_timer_interrupt+0x67/0xa0 [<ffffffff8150321d>] apic_timer_interrupt+0x6d/0x80
This patch also attempts to remove this option.
References: [1] https://www.kernel.org/doc/Documentation/timers/timers-howto.txt
Signed-off-by: Tudor Cornea <[email protected]> Acked-by: Padraig Connolly <[email protected]> Reviewed-by: Ferruh Yigit <[email protected]>
show more ...
|
|
Revision tags: v21.11, v21.11-rc4 |
|
| #
a1b2558c |
| 23-Nov-2021 |
Ferruh Yigit <[email protected]> |
kni: restrict bifurcated device support
To enable bifurcated device support, rtnl_lock is released before calling userspace callbacks and asynchronous requests are enabled.
But these changes caused
kni: restrict bifurcated device support
To enable bifurcated device support, rtnl_lock is released before calling userspace callbacks and asynchronous requests are enabled.
But these changes caused more issues, like bug #809, #816. To reduce the scope of the problems, the bifurcated device support related changes are only enabled when it is requested explicitly with new 'enable_bifurcated' module parameter. And bifurcated device support is disabled by default.
So the bifurcated device related problems are isolated and they can be fixed without impacting all use cases.
Bugzilla ID: 816 Fixes: 631217c76135 ("kni: fix kernel deadlock with bifurcated device") Cc: [email protected]
Signed-off-by: Ferruh Yigit <[email protected]> Acked-by: Igor Ryzhov <[email protected]>
show more ...
|
|
Revision tags: v21.11-rc3, v21.11-rc2, v21.11-rc1, v21.08, v21.08-rc4, v21.08-rc3, v21.08-rc2, v21.08-rc1, v21.05, v21.05-rc4, v21.05-rc3, v21.05-rc2, v21.05-rc1, v21.02, v21.02-rc4, v21.02-rc3, v21.02-rc2, v21.02-rc1, v20.11, v20.11-rc5, v20.11-rc4, v20.11-rc3, v20.11-rc2, v20.11-rc1 |
|
| #
87efaea6 |
| 17-Aug-2020 |
Ferruh Yigit <[email protected]> |
kni: fix build with Linux 5.9
Starting from Linux 5.9 'get_user_pages_remote()' API doesn't get 'struct task_struct' parameter: commit 64019a2e467a ("mm/gup: remove task_struct pointer for all gup c
kni: fix build with Linux 5.9
Starting from Linux 5.9 'get_user_pages_remote()' API doesn't get 'struct task_struct' parameter: commit 64019a2e467a ("mm/gup: remove task_struct pointer for all gup code")
The change reflected to the KNI with version check.
Cc: [email protected]
Signed-off-by: Ferruh Yigit <[email protected]>
show more ...
|
|
Revision tags: v20.08, v20.08-rc4, v20.08-rc3, v20.08-rc2, v20.08-rc1, v20.05, v20.05-rc4, v20.05-rc3, v20.05-rc2, v20.05-rc1, v20.02, v20.02-rc4, v20.02-rc3, v20.02-rc2, v20.02-rc1 |
|
| #
c793dce9 |
| 21-Dec-2019 |
Stephen Hemminger <[email protected]> |
kni: rename variable with namespace prefix
All global variables in kernel should be prefixed by the same to avoid any symbol conflics. Rename dflt_carrier to kni_default_carrier.
Fixes: 89397a01ce4
kni: rename variable with namespace prefix
All global variables in kernel should be prefixed by the same to avoid any symbol conflics. Rename dflt_carrier to kni_default_carrier.
Fixes: 89397a01ce4a ("kni: set default carrier state of interface") Cc: [email protected]
Signed-off-by: Stephen Hemminger <[email protected]> Acked-by: Ferruh Yigit <[email protected]>
show more ...
|
|
Revision tags: v19.11, v19.11-rc4, v19.11-rc3 |
|
| #
d965af9e |
| 20-Nov-2019 |
Ferruh Yigit <[email protected]> |
kni: increase kernel version requirement for VA
A build error reported related to the selected 'get_user_pages_remote()' kernel API:
.../kernel/linux/kni/kni_dev.h:113:8: error: too few arguments
kni: increase kernel version requirement for VA
A build error reported related to the selected 'get_user_pages_remote()' kernel API:
.../kernel/linux/kni/kni_dev.h:113:8: error: too few arguments to function ‘get_user_pages_remote’ ret = get_user_pages_remote(tsk, tsk->mm, iova, 1 ^~~~~~~~~~~~~~~~~~~~~
Currently there are three versions of the 'get_user_pages_remote()' supported, based on kernel version < 4.9, = 4.9, > 4.9.
These version based checks are not working fine with the distro kernels which is the cause of reported build error. The error reported by the kernel version 4.8, but it is using API defined in > 4.9.
To be able to take control of this, and possible more, related build error, increasing the minimum supported kernel version for iova=va with KNI to kernel version 4.9.
This leaves us with single version of the kernel API and more manageable.
Signed-off-by: Ferruh Yigit <[email protected]> Reviewed-by: David Marchand <[email protected]>
show more ...
|
| #
e73831dc |
| 17-Nov-2019 |
Vamsi Attunuru <[email protected]> |
kni: support userspace VA
Patch adds support for kernel module to work in IOVA = VA mode by providing address translation routines to convert userspace VA to kernel VA.
KNI performance using PA is
kni: support userspace VA
Patch adds support for kernel module to work in IOVA = VA mode by providing address translation routines to convert userspace VA to kernel VA.
KNI performance using PA is not changed by this patch. But comparing KNI using PA to KNI using VA, the latter will have lower performance due to the cost of the added translation.
This translation is implemented only with kernel versions starting 4.6.0.
Signed-off-by: Vamsi Attunuru <[email protected]> Signed-off-by: Kiran Kumar K <[email protected]> Reviewed-by: Jerin Jacob <[email protected]>
show more ...
|
|
Revision tags: v19.11-rc2, v19.11-rc1, v19.08, v19.08-rc4, v19.08-rc3, v19.08-rc2, v19.08-rc1 |
|
| #
398d6f94 |
| 24-Jun-2019 |
Stephen Hemminger <[email protected]> |
kni: support minimal ethtool
Some applications use ethtool so add the minimum ethtool ops.
Signed-off-by: Stephen Hemminger <[email protected]> Acked-by: Ferruh Yigit <[email protected]
kni: support minimal ethtool
Some applications use ethtool so add the minimum ethtool ops.
Signed-off-by: Stephen Hemminger <[email protected]> Acked-by: Ferruh Yigit <[email protected]>
show more ...
|
| #
5cb4510c |
| 24-Jun-2019 |
Stephen Hemminger <[email protected]> |
kni: replace void pointer with FIFO types
Using void * instead of proper type is unsafe practice.
Signed-off-by: Stephen Hemminger <[email protected]> Acked-by: Ferruh Yigit <ferruh.yigit@
kni: replace void pointer with FIFO types
Using void * instead of proper type is unsafe practice.
Signed-off-by: Stephen Hemminger <[email protected]> Acked-by: Ferruh Yigit <[email protected]>
show more ...
|
| #
d14e59f9 |
| 24-Jun-2019 |
Stephen Hemminger <[email protected]> |
kni: drop unused fields
Several fields were either totally unused or set and never used.
Signed-off-by: Stephen Hemminger <[email protected]> Acked-by: Ferruh Yigit <[email protected]>
|
| #
2d62049d |
| 24-Jun-2019 |
Stephen Hemminger <[email protected]> |
kni: remove stats from private struct
Since kernel 2.6.28 the network subsystem has provided dev->stats for devices to use statistics handling and is the default if no ndo_get_stats is provided.
Th
kni: remove stats from private struct
Since kernel 2.6.28 the network subsystem has provided dev->stats for devices to use statistics handling and is the default if no ndo_get_stats is provided.
This allow allows for 64 bit (rather than just 32 bit) statistics with KNI.
Signed-off-by: Stephen Hemminger <[email protected]> Acked-by: Ferruh Yigit <[email protected]>
show more ...
|
| #
ee3cac92 |
| 06-Jun-2019 |
Igor Ryzhov <[email protected]> |
kni: remove PCI related information
As there is no ethtool support in KNI anymore, PCI related information is no longer needed.
Fixes: ea6b39b5b847 ("kni: remove ethtool support")
Signed-off-by: I
kni: remove PCI related information
As there is no ethtool support in KNI anymore, PCI related information is no longer needed.
Fixes: ea6b39b5b847 ("kni: remove ethtool support")
Signed-off-by: Igor Ryzhov <[email protected]> Acked-by: Ferruh Yigit <[email protected]>
show more ...
|
| #
ea6b39b5 |
| 24-May-2019 |
Ferruh Yigit <[email protected]> |
kni: remove ethtool support
Current design requires kernel drivers and they need to be probed by Linux up to some level so that they can be usable by DPDK for ethtool support, this requires maintain
kni: remove ethtool support
Current design requires kernel drivers and they need to be probed by Linux up to some level so that they can be usable by DPDK for ethtool support, this requires maintaining the Linux drivers in DPDK.
Also ethtool support is limited and hard, if not impossible, to expand to other PMDs.
Since KNI ethtool support is not used commonly, if not used at all, removing the support for the sake of simplicity and maintenance.
Signed-off-by: Ferruh Yigit <[email protected]> Acked-by: Stephen Hemminger <[email protected]> Acked-by: Neil Horman <[email protected]> Acked-by: Thomas Monjalon <[email protected]>
show more ...
|
|
Revision tags: v19.05, v19.05-rc4, v19.05-rc3, v19.05-rc2, v19.05-rc1 |
|
| #
3c458891 |
| 01-Apr-2019 |
Thomas Monjalon <[email protected]> |
eal: remove exec-env directory
Only one header file (rte_kni_common.h) was in the sub-directory include/exec-env/ This file was installed in a sub-directory of the same name in the makefile-based b
eal: remove exec-env directory
Only one header file (rte_kni_common.h) was in the sub-directory include/exec-env/ This file was installed in a sub-directory of the same name in the makefile-based build. Source and install directories are moved as below:
lib/librte_eal/linux/eal/include/exec-env/ -> lib/librte_eal/linux/eal/include/
build/include/exec-env/ -> build/include/
The consequence is to have a file hierarchy a bit more flat.
Signed-off-by: Thomas Monjalon <[email protected]> Reviewed-by: David Marchand <[email protected]> Tested-by: David Marchand <[email protected]> Acked-by: Ferruh Yigit <[email protected]>
show more ...
|
|
Revision tags: v19.02, v19.02-rc4, v19.02-rc3, v19.02-rc2, v19.02-rc1, v18.11, v18.11-rc5, v18.11-rc4, v18.11-rc3, v18.11-rc2, v18.11-rc1 |
|
| #
89397a01 |
| 24-Oct-2018 |
Dan Gora <[email protected]> |
kni: set default carrier state of interface
Add module parameter 'carrier='on|off' to set the default carrier state for linux network interfaces created by the KNI module. The default carrier state
kni: set default carrier state of interface
Add module parameter 'carrier='on|off' to set the default carrier state for linux network interfaces created by the KNI module. The default carrier state is 'off'.
For KNI interfaces which need to reflect the carrier state of a physical Ethernet port controlled by the DPDK application, the default carrier state should be left set to 'off'. The application can set the carrier state of the KNI interface to reflect the state of the physical Ethernet port using rte_kni_update_link().
For KNI interfaces which are purely virtual, the default carrier state can be set to 'on'. This enables the KNI interface to be used without having to explicity set the carrier state to 'on' using rte_kni_update_link().
Signed-off-by: Dan Gora <[email protected]> Acked-by: Ferruh Yigit <[email protected]>
show more ...
|
|
Revision tags: v18.08, v18.08-rc3, v18.08-rc2, v18.08-rc1, v18.05, v18.05-rc6, v18.05-rc5, v18.05-rc4, v18.05-rc3, v18.05-rc2, v18.05-rc1 |
|
| #
e77fec69 |
| 19-Apr-2018 |
Yangchao Zhou <[email protected]> |
kni: fix possible mbuf leaks and speed up port release
rx_q fifo can only be released by kernel thread. There may be mbuf leaks in rx_q because kernel threads are randomly stopped.
When the kni is
kni: fix possible mbuf leaks and speed up port release
rx_q fifo can only be released by kernel thread. There may be mbuf leaks in rx_q because kernel threads are randomly stopped.
When the kni is released and netdev is unregisterd, convert the physical address mbufs in rx_q to the virtual address in free_q. By the way, alloc_q can be processed together to speed up the release rate in userspace.
In my test, it is improved from 300-500ms with a mempool that has 131072 mbufs to 10ms(regardless of the specifications).
Suggested-by: Ferruh Yigit <[email protected]> Signed-off-by: Yangchao Zhou <[email protected]> Acked-by: Ferruh Yigit <[email protected]>
show more ...
|
| #
acaa9ee9 |
| 22-Feb-2018 |
Hemant Agrawal <[email protected]> |
move kernel modules directories
This patch moves the kernel modules code from EAL to a common place. - Separate the kernel module code from user space code.
Signed-off-by: Hemant Agrawal <hemant.a
move kernel modules directories
This patch moves the kernel modules code from EAL to a common place. - Separate the kernel module code from user space code.
Signed-off-by: Hemant Agrawal <[email protected]> Tested-by: Bruce Richardson <[email protected]>
show more ...
|