.. SPDX-License-Identifier: GPL-2.0

========================================
Debugging advice for driver development
========================================

This document serves as a general starting point and lookup for debugging
device drivers.
While this guide focuses on debugging that requires re-compiling the
module/kernel, the :doc:`userspace debugging guide
</process/debugging/userspace_debugging_guide>` walks
you through tools like dynamic debug, ftrace and other tools useful for
debugging issues and behavior.
For general debugging advice, see the :doc:`general advice document
</process/debugging/index>`.

.. contents::
    :depth: 3

The following sections show you the available tools.

printk() & friends
------------------

These are derivatives of printf() with varying destinations and support for
being dynamically turned on or off, or lack thereof.

Simple printk()
~~~~~~~~~~~~~~~

The classic approach. It can be used to great effect for quick and dirty
development of new modules or to extract arbitrary necessary data for
troubleshooting.

Prerequisite: ``CONFIG_PRINTK`` (usually enabled by default)

**Pros**:

- No need to learn anything, simple to use
- Easy to modify exactly to your needs (formatting of the data (See:
  :doc:`/core-api/printk-formats`), visibility in the log)
- Can cause delays in the execution of the code (beneficial to confirm whether
  timing is a factor)

**Cons**:

- Requires rebuilding the kernel/module
- Can cause delays in the execution of the code (which can cause issues to be
  not reproducible)

For the full documentation see :doc:`/core-api/printk-basics`

Trace_printk
~~~~~~~~~~~~

Prerequisite: ``CONFIG_DYNAMIC_FTRACE`` & ``#include <linux/ftrace.h>``

It is a tiny bit less comfortable to use than printk(), because you will have
to read the messages from the trace file (See: :ref:`read_ftrace_log`)
instead of from the kernel log. It is very useful, however, when printk() adds
unwanted delays into the code execution, causing issues to be flaky or hidden.

If the processing of this still causes timing issues then you can try
trace_puts().

For the full documentation see trace_printk()
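
As a rough sketch of how this might look in practice, the kernel fragment
below logs from an interrupt handler into the trace buffer. ``struct my_dev``
and ``my_irq_handler()`` are made-up names for illustration, and the register
layout is an assumption:

.. code-block:: c

  #include <linux/ftrace.h>
  #include <linux/interrupt.h>
  #include <linux/io.h>

  /* Hypothetical driver state, for illustration only */
  struct my_dev {
          void __iomem *regs;
  };

  static irqreturn_t my_irq_handler(int irq, void *data)
  {
          struct my_dev *mdev = data;
          u32 status = readl(mdev->regs);

          /* Formatted message, written to the ftrace ring buffer */
          trace_printk("irq %d status=0x%08x\n", irq, status);

          /* Constant string, no format processing at the call site */
          trace_puts("my_irq_handler: done\n");

          return IRQ_HANDLED;
  }

Both calls are meant for debugging sessions only and should be removed again
before the code is submitted.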

dev_dbg
~~~~~~~

A print statement that contains additional information about the device used
within the context and that can be targeted by
:ref:`process/debugging/userspace_debugging_guide:dynamic debug`.
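
A minimal sketch of such a statement, assuming a platform driver with a
hypothetical probe function named ``my_probe()``:

.. code-block:: c

  #include <linux/device.h>
  #include <linux/platform_device.h>

  /* Hypothetical probe function, for illustration only */
  static int my_probe(struct platform_device *pdev)
  {
          struct device *dev = &pdev->dev;

          /*
           * The message is prefixed with the driver and device name.
           * It stays silent unless it is enabled through dynamic debug
           * or the file is built with DEBUG defined.
           */
          dev_dbg(dev, "probing, id=%d\n", pdev->id);

          return 0;
  }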

**When is it appropriate to leave a debug print in the code?**

Permanent debug statements have to be useful for a developer to troubleshoot
driver misbehavior. Judging that is a bit more of an art than a science, but
some guidelines are in the :ref:`Coding style guidelines
<process/coding-style:13) printing kernel messages>`. In almost all cases the
debug statements shouldn't be upstreamed, as a working driver is supposed to be
silent.

Custom printk
~~~~~~~~~~~~~

Example::

  #define core_dbg(fmt, arg...) do { \
          if (core_debug) \
                  printk(KERN_DEBUG pr_fmt("core: " fmt), ## arg); \
          } while (0)

**When should you do this?**

It is better to just use a pr_debug(), which can later be turned on/off with
dynamic debug. Additionally, a lot of drivers activate these prints via a
variable like ``core_debug`` set by a module parameter. However, module
parameters `are not recommended anymore
<https://lore.kernel.org/all/2024032757-surcharge-grime-d3dd@gregkh>`_.
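
As a sketch of the recommended alternative, the custom macro above could be
replaced by pr_debug() together with a ``pr_fmt`` prefix. The function name
``core_handle_event()`` is hypothetical:

.. code-block:: c

  /* Must be defined before any include that pulls in printk.h */
  #define pr_fmt(fmt) "core: " fmt

  #include <linux/printk.h>

  /* Hypothetical helper, for illustration only */
  static void core_handle_event(int event)
  {
          /* Compiled in, but silent until enabled through dynamic debug */
          pr_debug("handling event %d\n", event);
  }

With ``CONFIG_DYNAMIC_DEBUG`` enabled, such a statement can then be switched
on at runtime by writing a query like ``module my_module +p`` (the module name
is an assumption) to ``/sys/kernel/debug/dynamic_debug/control``, so no module
parameter is needed.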

Ftrace
------

Creating a custom Ftrace tracepoint
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A tracepoint adds a hook into your code that will be called and logged when the
tracepoint is enabled. This can be used, for example, to trace hitting a
conditional branch or to dump the internal state at specific points of the code
flow during a debugging session.
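
The following is a minimal sketch of a trace event definition, assuming a
hypothetical driver called ``my_driver`` and a header placed at
``include/trace/events/my_driver.h``. The event name and its single field are
made up for illustration:

.. code-block:: c

  /* include/trace/events/my_driver.h (hypothetical file) */
  #undef TRACE_SYSTEM
  #define TRACE_SYSTEM my_driver

  #if !defined(_TRACE_MY_DRIVER_H) || defined(TRACE_HEADER_MULTI_READ)
  #define _TRACE_MY_DRIVER_H

  #include <linux/tracepoint.h>

  TRACE_EVENT(my_driver_irq,
          /* Prototype and argument names of trace_my_driver_irq() */
          TP_PROTO(u32 status),
          TP_ARGS(status),
          /* Layout of the record stored in the ring buffer */
          TP_STRUCT__entry(
                  __field(u32, status)
          ),
          /* Copy the arguments into the record */
          TP_fast_assign(
                  __entry->status = status;
          ),
          /* How the record is printed in the trace output */
          TP_printk("status=0x%08x", __entry->status)
  );

  #endif /* _TRACE_MY_DRIVER_H */

  /* This part must be outside of the include guard */
  #include <trace/define_trace.h>

Exactly one C file of the driver would then define ``CREATE_TRACE_POINTS``
before including the header, and any code that includes the header can call
``trace_my_driver_irq(status)`` at the point of interest. A header stored
outside of ``include/trace/events/`` needs additional ``TRACE_INCLUDE_PATH``
and ``TRACE_INCLUDE_FILE`` definitions.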

Here is a basic description of :ref:`how to implement new tracepoints
<trace/tracepoints:usage>`.

For the full event tracing documentation see :doc:`/trace/events`

For the full Ftrace documentation see :doc:`/trace/ftrace`

DebugFS
-------

Prerequisite: ``CONFIG_DEBUG_FS`` & ``#include <linux/debugfs.h>``

DebugFS differs from the other approaches of debugging, as it doesn't write
messages to the kernel log nor add traces to the code. Instead, it allows the
developer to expose a set of files.
With these files you can either store values of variables or make
register/memory dumps or you can make these files writable and modify
values/settings in the driver.

Possible use-cases among others:

- Store register values
- Keep track of variables
- Store errors
- Store settings
- Toggle a setting like debug on/off
- Error injection

This is especially useful when the size of a data dump would be hard to digest
as part of the general kernel log (for example when dumping raw bitstream data)
or when you are not interested in all the values all the time, but still want
to be able to inspect them.

The general idea is:

- Create a directory during probe (``struct dentry *parent =
  debugfs_create_dir("my_driver", NULL);``)
- Create a file (``debugfs_create_u32("my_value", 0444, parent, &my_variable);``)

  - In this example the file is found in
    ``/sys/kernel/debug/my_driver/my_value`` (with read permissions for
    user/group/all)
  - any read of the file will return the current contents of the variable
    ``my_variable``

- Clean up the directory when removing the device
  (``debugfs_remove_recursive(parent);``)
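
The steps above can be sketched as two small helpers that a driver might call
from its probe and remove paths. The helper names are hypothetical, and the
mode is written as the octal constant ``0444`` so that it really means read
permissions for user/group/all:

.. code-block:: c

  #include <linux/debugfs.h>
  #include <linux/types.h>

  /* Hypothetical driver state, for illustration only */
  static struct dentry *my_debugfs_dir;
  static u32 my_variable;

  /* Called from the probe function of the driver */
  static void my_driver_debugfs_init(void)
  {
          /* Creates /sys/kernel/debug/my_driver */
          my_debugfs_dir = debugfs_create_dir("my_driver", NULL);

          /* Read-only file that returns the current value of my_variable */
          debugfs_create_u32("my_value", 0444, my_debugfs_dir, &my_variable);
  }

  /* Called from the remove function of the driver */
  static void my_driver_debugfs_exit(void)
  {
          /* Removes the directory and everything below it */
          debugfs_remove_recursive(my_debugfs_dir);
  }

The return value of debugfs_create_dir() is deliberately not checked here,
since a driver is expected to keep working when debugfs is unavailable.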

For the full documentation see :doc:`/filesystems/debugfs`.

KASAN, UBSAN, lockdep and other error checkers
----------------------------------------------

KASAN (Kernel Address Sanitizer)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_KASAN``

KASAN is a dynamic memory error detector that helps to find use-after-free and
out-of-bounds bugs. It uses compile-time instrumentation to check every memory
access.

For the full documentation see :doc:`/dev-tools/kasan`.

UBSAN (Undefined Behavior Sanitizer)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_UBSAN``

UBSAN relies on compiler instrumentation and runtime checks to detect undefined
behavior. It is designed to find a variety of issues, including signed integer
overflow, array index out of bounds, and more.

For the full documentation see :doc:`/dev-tools/ubsan`

lockdep (Lock Dependency Validator)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_PROVE_LOCKING`` (which selects ``CONFIG_LOCKDEP``)

lockdep is a runtime lock dependency validator that detects potential deadlocks
and other locking-related issues in the kernel.
It tracks lock acquisitions and releases, building a dependency graph that is
analyzed for potential deadlocks.
lockdep is especially useful for validating the correctness of lock ordering in
the kernel.
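
Besides the automatic checks, a driver can state its own locking assumptions
so that lockdep verifies them. The fragment below is a sketch with made-up
names (``struct my_dev``, ``my_dev_queue()``) that uses lockdep_assert_held():

.. code-block:: c

  #include <linux/lockdep.h>
  #include <linux/mutex.h>

  /* Hypothetical driver state, for illustration only */
  struct my_dev {
          struct mutex lock;
          unsigned int pending;
  };

  /* Caller must hold mdev->lock */
  static void my_dev_queue_locked(struct my_dev *mdev)
  {
          /* Warns if the lock is not held; does nothing without lockdep */
          lockdep_assert_held(&mdev->lock);
          mdev->pending++;
  }

  static void my_dev_queue(struct my_dev *mdev)
  {
          mutex_lock(&mdev->lock);
          my_dev_queue_locked(mdev);
          mutex_unlock(&mdev->lock);
  }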

PSI (Pressure stall information tracking)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_PSI``

PSI is a measurement tool to identify excessive overcommits on hardware
resources that can cause performance disruptions or even OOM kills.

device coredump
---------------

Prerequisite: ``CONFIG_DEV_COREDUMP`` & ``#include <linux/devcoredump.h>``

Provides the infrastructure for a driver to provide arbitrary data to userland.
It is most often used in conjunction with udev or a similar userland
application to listen for kernel uevents, which indicate that the dump is
ready. Udev has rules to copy that file somewhere for long-term storage and
analysis, as the data for the dump is automatically cleaned up after a default
of 5 minutes. That data is analyzed with driver-specific tools or GDB.

A device coredump can be created with a vmalloc area, with read/free
methods, or as a scatter/gather list.
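
As a sketch of the vmalloc variant, a driver could copy a snapshot of its
state into a vmalloc'ed buffer and hand it over with dev_coredumpv(). The
helper name ``my_dev_dump_state()`` and its parameters are assumptions made
for illustration:

.. code-block:: c

  #include <linux/devcoredump.h>
  #include <linux/device.h>
  #include <linux/gfp.h>
  #include <linux/string.h>
  #include <linux/vmalloc.h>

  /* Hypothetical helper: hand a snapshot of device state to devcoredump */
  static void my_dev_dump_state(struct device *dev, const void *state,
                                size_t len)
  {
          void *buf = vmalloc(len);

          if (!buf)
                  return;

          memcpy(buf, state, len);

          /* Ownership of buf passes to devcoredump, which frees it later */
          dev_coredumpv(dev, buf, len, GFP_KERNEL);
  }

The driver must not touch or free the buffer after the call, because the
devcoredump core releases it once the dump has been read or has timed out.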

You can find an example implementation at:
`drivers/media/platform/qcom/venus/core.c
<https://elixir.bootlin.com/linux/v6.11.6/source/drivers/media/platform/qcom/venus/core.c#L30>`__,
in the Bluetooth HCI layer, in several wireless drivers, and in several
DRM drivers.

devcoredump interfaces
~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: include/linux/devcoredump.h

.. kernel-doc:: drivers/base/devcoredump.c

**Copyright** ©2024 : Collabora