History log of /linux-6.15/lib/debugobjects.c (Results 1 – 25 of 100)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3
# ff8d523c 13-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Track object usage to avoid premature freeing of objects

The freelist is freed at a constant rate independent of the actual usage
requirements. That's bad in scenarios where usage come

debugobjects: Track object usage to avoid premature freeing of objects

The freelist is freed at a constant rate independent of the actual usage
requirements. That's bad in scenarios where usage comes in bursts. The end
of a burst puts the objects on the free list and freeing proceeds even when
the next burst which requires objects started again.

Keep track of the usage with a exponentially wheighted moving average and
take that into account in the worker function which frees objects from the
free list.

This further reduces the kmem_cache allocation/free rate for a full kernel
compile:

kmem_cache_alloc() kmem_cache_free()
Baseline: 225k 173k
Usage: 170k 117k

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/87bjznhme2.ffs@tglx

show more ...


# 13f9ca72 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Refill per CPU pool more agressively

Right now the per CPU pools are only refilled when they become
empty. That's suboptimal especially when there are still non-freed objects
in the to

debugobjects: Refill per CPU pool more agressively

Right now the per CPU pools are only refilled when they become
empty. That's suboptimal especially when there are still non-freed objects
in the to free list.

Check whether an allocation from the per CPU pool emptied a batch and try
to allocate from the free pool if that still has objects available.

kmem_cache_alloc() kmem_cache_free()
Baseline: 295k 245k
Refill: 225k 173k

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# a201a96b 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Double the per CPU slots

In situations where objects are rapidly allocated from the pool and handed
back, the size of the per CPU pool turns out to be too small.

Double the size of th

debugobjects: Double the per CPU slots

In situations where objects are rapidly allocated from the pool and handed
back, the size of the per CPU pool turns out to be too small.

Double the size of the per CPU pool.

This reduces the kmem cache allocation and free operations during a kernel compile:

alloc free
Baseline: 380k 330k
Double size: 295k 245k

Especially the reduction of allocations is important because that happens
in the hot path when objects are initialized.

The maximum increase in per CPU pool memory consumption is about 2.5K per
online CPU, which is acceptable.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# 2638345d 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Move pool statistics into global_pool struct

Keep it along with the pool as that's a hot cache line anyway and it makes
the code more comprehensible.

Signed-off-by: Thomas Gleixner <t

debugobjects: Move pool statistics into global_pool struct

Keep it along with the pool as that's a hot cache line anyway and it makes
the code more comprehensible.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# f57ebb92 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Implement batch processing

Adding and removing single objects in a loop is bad in terms of lock
contention and cache line accesses.

To implement batching, record the last object in a

debugobjects: Implement batch processing

Adding and removing single objects in a loop is bad in terms of lock
contention and cache line accesses.

To implement batching, record the last object in a batch in the object
itself. This is trivialy possible as hlists are strictly stacks. At a batch
boundary, when the first object is added to the list the object stores a
pointer to itself in debug_obj::batch_last. When the next object is added
to the list then the batch_last pointer is retrieved from the first object
in the list and stored in the to be added one.

That means for batch processing the first object always has a pointer to
the last object in a batch, which allows to move batches in a cache line
efficient way and reduces the lock held time.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# aebbfe07 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Prepare kmem_cache allocations for batching

Allocate a batch and then push it into the pool. Utilize the
debug_obj::last_node pointer for keeping track of the batch boundary.

Signed-o

debugobjects: Prepare kmem_cache allocations for batching

Allocate a batch and then push it into the pool. Utilize the
debug_obj::last_node pointer for keeping track of the batch boundary.

Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# 74fe1ad4 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Prepare for batching

Move the debug_obj::object pointer into a union and add a pointer to the
last node in a batch. That allows to implement batch processing efficiently
by utilizing t

debugobjects: Prepare for batching

Move the debug_obj::object pointer into a union and add a pointer to the
last node in a batch. That allows to implement batch processing efficiently
by utilizing the stack property of hlist:

When the first object of a batch is added to the list, then the batch
pointer is set to the hlist node of the object itself. Any subsequent add
retrieves the pointer to the last node from the first object in the list
and uses that for storing the last node pointer in the newly added object.

Add the pointer to the data structure and ensure that all relevant pool
sizes are strictly batch sized. The actual batching implementation follows
in subsequent changes.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# 14077b9e 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Use static key for boot pool selection

Get rid of the conditional in the hot path.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]

debugobjects: Use static key for boot pool selection

Get rid of the conditional in the hot path.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# 9ce99c6d 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Rework free_object_work()

Convert it to batch processing with intermediate helper functions. This
reduces the final changes for batch processing.

Signed-off-by: Thomas Gleixner <tglx@

debugobjects: Rework free_object_work()

Convert it to batch processing with intermediate helper functions. This
reduces the final changes for batch processing.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# a3b9e191 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Rework object freeing

__free_object() is uncomprehensibly complex. The same can be achieved by:

1) Adding the object to the per CPU pool

2) If that pool is full, move a batch o

debugobjects: Rework object freeing

__free_object() is uncomprehensibly complex. The same can be achieved by:

1) Adding the object to the per CPU pool

2) If that pool is full, move a batch of objects into the global pool
or if the global pool is full into the to free pool

This also prepares for batch processing.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# fb60c004 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Rework object allocation

The current allocation scheme tries to allocate from the per CPU pool
first. If that fails it allocates one object from the global pool and then
refills the pe

debugobjects: Rework object allocation

The current allocation scheme tries to allocate from the per CPU pool
first. If that fails it allocates one object from the global pool and then
refills the per CPU pool from the global pool.

That is in the way of switching the pool management to batch mode as the
global pool needs to be a strict stack of batches, which does not allow
to allocate single objects.

Rework the code to refill the per CPU pool first and then allocate the
object from the refilled batch. Also try to allocate from the to free pool
first to avoid freeing and reallocating objects.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# 96a9a042 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Move min/max count into pool struct

Having the accounting in the datastructure is better in terms of cache
lines and allows more optimizations later on.

Signed-off-by: Thomas Gleixner

debugobjects: Move min/max count into pool struct

Having the accounting in the datastructure is better in terms of cache
lines and allows more optimizations later on.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# 18b8afcb 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Rename and tidy up per CPU pools

No point in having a separate data structure. Reuse struct obj_pool and
tidy up the code.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed

debugobjects: Rename and tidy up per CPU pools

No point in having a separate data structure. Reuse struct obj_pool and
tidy up the code.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# cb58d190 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Use separate list head for boot pool

There is no point to handle the statically allocated objects during early
boot in the actual pool list. This phase does not require accounting, so

debugobjects: Use separate list head for boot pool

There is no point to handle the statically allocated objects during early
boot in the actual pool list. This phase does not require accounting, so
all of the related complexity can be avoided.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# e18328ff 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Move pools into a datastructure

The contention on the global pool lock can be reduced by strict batch
processing where batches of objects are moved from one list head to another
instea

debugobjects: Move pools into a datastructure

The contention on the global pool lock can be reduced by strict batch
processing where batches of objects are moved from one list head to another
instead of moving them object by object. This also reduces the cache
footprint because it avoids the list walk and dirties at maximum three
cache lines instead of potentially up to eighteen.

To prepare for that, move the hlist head and related counters into a
struct.

No functional change.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# d8c6cd3a 07-Oct-2024 Zhen Lei <[email protected]>

debugobjects: Reduce parallel pool fill attempts

The contention on the global pool_lock can be massive when the global pool
needs to be refilled and many CPUs try to handle this.

Address this by:

debugobjects: Reduce parallel pool fill attempts

The contention on the global pool_lock can be massive when the global pool
needs to be refilled and many CPUs try to handle this.

Address this by:

- splitting the refill from free list and allocation.

Refill from free list has no constraints vs. the context on RT, so
it can be tried outside of the RT specific preemptible() guard

- Let only one CPU handle the free list

- Let only one CPU do allocations unless the pool level is below
half of the minimum fill level.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Zhen Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Link: https://lore.kernel.org/all/[email protected]

--
lib/debugobjects.c | 84 +++++++++++++++++++++++++++++++++++++----------------
1 file changed, 59 insertions(+), 25 deletions(-)

show more ...


# 661cc28b 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Make debug_objects_enabled bool

Make it what it is.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.or

debugobjects: Make debug_objects_enabled bool

Make it what it is.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# 49a5cb82 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Provide and use free_object_list()

Move the loop to free a list of objects into a helper function so it can be
reused later.

Signed-off-by: Thomas Gleixner <[email protected]>
Link:

debugobjects: Provide and use free_object_list()

Move the loop to free a list of objects into a helper function so it can be
reused later.

Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# 241463f4 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Remove pointless debug printk

It has zero value.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/a

debugobjects: Remove pointless debug printk

It has zero value.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# 49968cf1 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Reuse put_objects() on OOM

Reuse the helper function instead of having a open coded copy.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <thunder.leizhen@hu

debugobjects: Reuse put_objects() on OOM

Reuse the helper function instead of having a open coded copy.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# a2a70238 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Dont free objects directly on CPU hotplug

Freeing the per CPU pool of the unplugged CPU directly is suboptimal as the
objects can be reused in the real pool if there is room. Aside of

debugobjects: Dont free objects directly on CPU hotplug

Freeing the per CPU pool of the unplugged CPU directly is suboptimal as the
objects can be reused in the real pool if there is room. Aside of that this
gets the accounting wrong.

Use the regular free path, which allows reuse and has the accounting correct.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# 3f397bf9 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Remove pointless hlist initialization

It's BSS zero initialized.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://l

debugobjects: Remove pointless hlist initialization

It's BSS zero initialized.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Zhen Lei <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# 55fb412e 07-Oct-2024 Thomas Gleixner <[email protected]>

debugobjects: Dont destroy kmem cache in init()

debug_objects_mem_init() is invoked from mm_core_init() before work queues
are available. If debug_objects_mem_init() destroys the kmem cache in the
e

debugobjects: Dont destroy kmem cache in init()

debug_objects_mem_init() is invoked from mm_core_init() before work queues
are available. If debug_objects_mem_init() destroys the kmem cache in the
error path it causes an Oops in __queue_work():

Oops: Oops: 0000 [#1] PREEMPT SMP PTI
RIP: 0010:__queue_work+0x35/0x6a0
queue_work_on+0x66/0x70
flush_all_cpus_locked+0xdf/0x1a0
__kmem_cache_shutdown+0x2f/0x340
kmem_cache_destroy+0x4e/0x150
mm_core_init+0x9e/0x120
start_kernel+0x298/0x800
x86_64_start_reservations+0x18/0x30
x86_64_start_kernel+0xc5/0xe0
common_startup_64+0x12c/0x138

Further the object cache pointer is used in various places to check for
early boot operation. It is exposed before the replacments for the static
boot time objects are allocated and the self test operates on it.

This can be avoided by:

1) Running the self test with the static boot objects

2) Exposing it only after the replacement objects have been added to
the pool.

Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# 813fd078 07-Oct-2024 Zhen Lei <[email protected]>

debugobjects: Collect newly allocated objects in a list to reduce lock contention

Collect the newly allocated debug objects in a list outside the lock, so
that the lock held time and the potential l

debugobjects: Collect newly allocated objects in a list to reduce lock contention

Collect the newly allocated debug objects in a list outside the lock, so
that the lock held time and the potential lock contention is reduced.

Signed-off-by: Zhen Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Link: https://lore.kernel.org/all/[email protected]

show more ...


# a0ae9504 07-Oct-2024 Zhen Lei <[email protected]>

debugobjects: Delete a piece of redundant code

The statically allocated objects are all located in obj_static_pool[],
the whole memory of obj_static_pool[] will be reclaimed later. Therefore,
there

debugobjects: Delete a piece of redundant code

The statically allocated objects are all located in obj_static_pool[],
the whole memory of obj_static_pool[] will be reclaimed later. Therefore,
there is no need to split the remaining statically nodes in list obj_pool
into isolated ones, no one will use them anymore. Just write
INIT_HLIST_HEAD(&obj_pool) is enough. Since hlist_move_list() directly
discards the old list, even this can be omitted.

Signed-off-by: Zhen Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Link: https://lore.kernel.org/all/[email protected]

show more ...


1234