.. SPDX-License-Identifier: GPL-2.0

========================================
Debugging advice for driver development
========================================

This document serves as a general starting point and lookup for debugging
device drivers.
While this guide focuses on debugging that requires re-compiling the
module/kernel, the :doc:`userspace debugging guide
</process/debugging/userspace_debugging_guide>` will guide
you through tools like dynamic debug, ftrace and other tools useful for
debugging issues and behavior.
For general debugging advice, see the :doc:`general advice document
</process/debugging/index>`.

.. contents::
    :depth: 3

The following sections show you the available tools.

printk() & friends
------------------

These are derivatives of printf() with varying destinations and support for
being dynamically turned on or off, or lack thereof.

Simple printk()
~~~~~~~~~~~~~~~

The classic, can be used to great effect for quick and dirty development
of new modules or to extract arbitrary necessary data for troubleshooting.

Prerequisite: ``CONFIG_PRINTK`` (usually enabled by default)

**Pros**:

- No need to learn anything, simple to use
- Easy to modify exactly to your needs (formatting of the data (See:
  :doc:`/core-api/printk-formats`), visibility in the log)
- Can cause delays in the execution of the code (beneficial to confirm whether
  timing is a factor)

**Cons**:

- Requires rebuilding the kernel/module
- Can cause delays in the execution of the code (which can cause issues to be
  not reproducible)

For the full documentation see :doc:`/core-api/printk-basics`

Trace_printk
~~~~~~~~~~~~

Prerequisite: ``CONFIG_DYNAMIC_FTRACE`` & ``#include <linux/ftrace.h>``

It is a tiny bit less comfortable to use than printk(), because you will have
to read the messages from the trace file (See: :ref:`read_ftrace_log`)
instead of from the kernel log, but it is very useful when printk() adds
unwanted delays into the code execution, causing issues to be flaky or hidden.

If the processing of this still causes timing issues then you can try
trace_puts().
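
As a minimal sketch, the two calls could be used in a throwaway test module
as shown below. The module and its ``trace_demo`` names are hypothetical and
only serve as an illustration; the messages end up in the trace buffer, not in
the kernel log.

.. code-block:: c

    #include <linux/ftrace.h>
    #include <linux/init.h>
    #include <linux/kernel.h>
    #include <linux/module.h>

    /* Hypothetical demo module, not part of any real driver. */
    static int __init trace_demo_init(void)
    {
        int i;

        /* Formatted output, used like printk() but written to the trace buffer */
        for (i = 0; i < 3; i++)
            trace_printk("trace_demo: iteration %d\n", i);

        /* Plain string without format arguments, cheaper than trace_printk() */
        trace_puts("trace_demo: init done\n");

        return 0;
    }

    static void __exit trace_demo_exit(void)
    {
    }

    module_init(trace_demo_init);
    module_exit(trace_demo_exit);

    MODULE_LICENSE("GPL");
    MODULE_DESCRIPTION("Hypothetical trace_printk() demo");

Both calls are meant for debugging sessions only and should be removed again
before the code is submitted.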

For the full documentation see trace_printk()

dev_dbg
~~~~~~~

A print statement that can be targeted by
:ref:`process/debugging/userspace_debugging_guide:dynamic debug` and that
includes additional information about the device used within the context.

**When is it appropriate to leave a debug print in the code?**

Permanent debug statements have to be useful for a developer to troubleshoot
driver misbehavior. Judging that is a bit more of an art than a science, but
some guidelines are in the :ref:`Coding style guidelines
<process/coding-style:13) printing kernel messages>`. In almost all cases the
debug statements shouldn't be upstreamed, as a working driver is supposed to be
silent.

Custom printk
~~~~~~~~~~~~~

Example::

  #define core_dbg(fmt, arg...) do { \
          if (core_debug) \
                  printk(KERN_DEBUG pr_fmt("core: " fmt), ## arg); \
          } while (0)

**When should you do this?**

It is better to just use a pr_debug(), which can later be turned on/off with
dynamic debug.
Additionally, a lot of drivers activate these prints via a
variable like ``core_debug`` set by a module parameter. However, module
parameters `are not recommended anymore
<https://lore.kernel.org/all/2024032757-surcharge-grime-d3dd@gregkh>`_.

Ftrace
------

Creating a custom Ftrace tracepoint
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A tracepoint adds a hook into your code that will be called and logged when the
tracepoint is enabled. This can be used, for example, to trace hitting a
conditional branch or to dump the internal state at specific points of the code
flow during a debugging session.

Here is a basic description of :ref:`how to implement new tracepoints
<trace/tracepoints:usage>`.

For the full event tracing documentation see :doc:`/trace/events`

For the full Ftrace documentation see :doc:`/trace/ftrace`

DebugFS
-------

Prerequisite: ``CONFIG_DEBUG_FS`` & ``#include <linux/debugfs.h>``

DebugFS differs from the other approaches of debugging, as it doesn't write
messages to the kernel log nor add traces to the code. Instead it allows the
developer to handle a set of files.
With these files you can either store values of variables or make
register/memory dumps or you can make these files writable and modify
values/settings in the driver.

Possible use-cases among others:

- Store register values
- Keep track of variables
- Store errors
- Store settings
- Toggle a setting like debug on/off
- Error injection

This is especially useful when the size of a data dump would be hard to digest
as part of the general kernel log (for example when dumping raw bitstream data)
or when you are not interested in all the values all the time, but still want
the possibility to inspect them.
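
As a minimal sketch, a module could expose a single variable as a read-only
debugfs file as shown below. The names ``my_driver``, ``my_value`` and
``my_variable`` are placeholders, and module init/exit is used here instead of
a real probe/remove pair only to keep the example short.

.. code-block:: c

    #include <linux/debugfs.h>
    #include <linux/init.h>
    #include <linux/module.h>
    #include <linux/types.h>

    /* Placeholder state that the driver wants to expose. */
    static u32 my_variable = 42;
    static struct dentry *my_debugfs_dir;

    static int __init my_driver_init(void)
    {
        /* Creates /sys/kernel/debug/my_driver (debugfs root as parent) */
        my_debugfs_dir = debugfs_create_dir("my_driver", NULL);

        /*
         * Octal mode 0444: readable by user/group/all, not writable.
         * Reading the file returns the current value of my_variable.
         */
        debugfs_create_u32("my_value", 0444, my_debugfs_dir, &my_variable);

        return 0;
    }

    static void __exit my_driver_exit(void)
    {
        /* Removes the directory together with all files inside it */
        debugfs_remove_recursive(my_debugfs_dir);
    }

    module_init(my_driver_init);
    module_exit(my_driver_exit);

    MODULE_LICENSE("GPL");
    MODULE_DESCRIPTION("Hypothetical debugfs demo");

The return value of ``debugfs_create_dir()`` is deliberately not checked in
this sketch, as the debugfs functions are designed to cope with an error
pointer as parent and a driver is not expected to fail because debugfs is
unavailable.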

The general idea is:

- Create a directory during probe (``struct dentry *parent =
  debugfs_create_dir("my_driver", NULL);``)
- Create a file (``debugfs_create_u32("my_value", 0444, parent, &my_variable);``)

  - In this example the file is found in
    ``/sys/kernel/debug/my_driver/my_value`` (with read permissions for
    user/group/all, hence the octal mode ``0444``)
  - Any read of the file will return the current contents of the variable
    ``my_variable``

- Clean up the directory when removing the device
  (``debugfs_remove_recursive(parent);``)

For the full documentation see :doc:`/filesystems/debugfs`.

KASAN, UBSAN, lockdep and other error checkers
----------------------------------------------

KASAN (Kernel Address Sanitizer)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_KASAN``

KASAN is a dynamic memory error detector that helps to find use-after-free and
out-of-bounds bugs. It uses compile-time instrumentation to check every memory
access.

For the full documentation see :doc:`/dev-tools/kasan`.
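
To illustrate the kind of bug this covers, the following sketch of a
hypothetical test module contains a deliberate out-of-bounds write. On a
kernel built with ``CONFIG_KASAN``, loading it is expected to produce a KASAN
report in the kernel log; the ``kasan_demo`` names are made up for this
example and the code must never be used in a real driver.

.. code-block:: c

    #include <linux/compiler.h>
    #include <linux/errno.h>
    #include <linux/init.h>
    #include <linux/module.h>
    #include <linux/slab.h>
    #include <linux/types.h>

    static int __init kasan_demo_init(void)
    {
        u8 *buf = kmalloc(16, GFP_KERNEL);

        if (!buf)
            return -ENOMEM;

        /* Keep the compiler from reasoning about the allocation size */
        OPTIMIZER_HIDE_VAR(buf);

        /* Deliberate bug: writes one byte past the 16-byte allocation */
        buf[16] = 0x42;

        kfree(buf);
        return 0;
    }

    static void __exit kasan_demo_exit(void)
    {
    }

    module_init(kasan_demo_init);
    module_exit(kasan_demo_exit);

    MODULE_LICENSE("GPL");
    MODULE_DESCRIPTION("Hypothetical KASAN out-of-bounds demo");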

UBSAN (Undefined Behavior Sanitizer)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_UBSAN``

UBSAN relies on compiler instrumentation and runtime checks to detect undefined
behavior. It is designed to find a variety of issues, including signed integer
overflow, array index out of bounds, and more.

For the full documentation see :doc:`/dev-tools/ubsan`

lockdep (Lock Dependency Validator)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_DEBUG_LOCKDEP``

lockdep is a runtime lock dependency validator that detects potential deadlocks
and other locking-related issues in the kernel.
It tracks lock acquisitions and releases, building a dependency graph that is
analyzed for potential deadlocks.
lockdep is especially useful for validating the correctness of lock ordering in
the kernel.

PSI (Pressure stall information tracking)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_PSI``

PSI is a measurement tool to identify excessive overcommits on hardware
resources that can cause performance disruptions or even OOM kills.

device coredump
---------------

Prerequisite: ``CONFIG_DEV_COREDUMP`` & ``#include <linux/devcoredump.h>``

Provides the infrastructure for a driver to provide arbitrary data to userland.
It is most often used in conjunction with udev or a similar userland
application to listen for kernel uevents, which indicate that the dump is
ready. Udev has rules to copy that file somewhere for long-term storage and
analysis, because by default the data for the dump is automatically cleaned up
after 5 minutes. That data is analyzed with driver-specific tools or GDB.

A device coredump can be created with a vmalloc area, with read/free
methods, or as a scatter/gather list.

You can find an example implementation at:
`drivers/media/platform/qcom/venus/core.c
<https://elixir.bootlin.com/linux/v6.11.6/source/drivers/media/platform/qcom/venus/core.c#L30>`__,
in the Bluetooth HCI layer, in several wireless drivers, and in several
DRM drivers.

devcoredump interfaces
~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: include/linux/devcoredump.h

.. kernel-doc:: drivers/base/devcoredump.c

**Copyright** ©2024 : Collabora