History log of /linux-6.15/fs/dlm/midcomms.c (Results 1 – 25 of 66)
Revision    Date    Author    Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2
# f49da8c0 28-May-2024 Alexander Aring <[email protected]>

dlm: remove unused parameter in dlm_midcomms_addr

This patch removes a parameter which is currently not used by
dlm_midcomms_addr().

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>


Revision tags: v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3
# 578acf9a 02-Apr-2024 Alexander Aring <[email protected]>

dlm: use spin_lock_bh for message processing

Use spin_lock_bh for all spinlocks involved in message processing,
in preparation for softirq message processing. DLM lock requests
from user space involve dlm processing in user context, in addition
to the standard kernel context, necessitating bh variants.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>
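
As a rough illustration of the _bh locking pattern described above; the
queue, lock name and function here are placeholders, not the actual
midcomms.c code:

#include <linux/list.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(example_queue_lock);
static LIST_HEAD(example_queue);

/* Process-context callers (e.g. a DLM request coming from user space)
 * share this queue with code that may later run in softirq context, so
 * the _bh variant disables bottom halves while the lock is held. */
static void example_queue_add(struct list_head *entry)
{
        spin_lock_bh(&example_queue_lock);
        list_add_tail(entry, &example_queue);
        spin_unlock_bh(&example_queue_lock);
}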


# 98808644 02-Apr-2024 Alexander Aring <[email protected]>

dlm: remove allocation parameter in msg allocation

Remove the context parameter for message allocations and
always use GFP_ATOMIC. This prepares for softirq message
processing.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>
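
A minimal sketch of the allocation change, using a placeholder helper
rather than the real midcomms allocation function:

#include <linux/slab.h>

/* Illustrative only: the gfp_t argument is gone and the flag is fixed
 * to GFP_ATOMIC, so the same path stays safe once it runs in softirq
 * context where sleeping allocations are not allowed. */
static void *example_msg_alloc(size_t len)
{
        return kmalloc(len, GFP_ATOMIC);
}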


Revision tags: v6.9-rc2
# 609ed5bd 28-Mar-2024 Kunwu Chan <[email protected]>

dlm: Simplify the allocation of slab caches in dlm_midcomms_cache_create

Use the new KMEM_CACHE() macro instead of calling kmem_cache_create()
directly to simplify the creation of SLAB caches.

Signed-off-by: Kunwu Chan <[email protected]>
Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>
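
The difference can be sketched as follows; struct example_msg stands in
for the real midcomms structure, and both calls create an equivalent
cache:

#include <linux/errno.h>
#include <linux/slab.h>

struct example_msg {
        int nodeid;
        int len;
};

static struct kmem_cache *example_msg_cache;

static int example_cache_create(void)
{
        /* Old style: cache name, size and alignment spelled out by hand. */
        example_msg_cache = kmem_cache_create("example_msg",
                                              sizeof(struct example_msg),
                                              0, 0, NULL);
        if (!example_msg_cache)
                return -ENOMEM;
        kmem_cache_destroy(example_msg_cache);

        /* New style: KMEM_CACHE() derives name, size and alignment from
         * the struct type itself. */
        example_msg_cache = KMEM_CACHE(example_msg, 0);
        return example_msg_cache ? 0 : -ENOMEM;
}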


Revision tags: v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6
# 6212e452 10-Oct-2023 Alexander Aring <[email protected]>

dlm: fix no ack after final message

In the case of a final DLM message we must not send an ack out after
the final message. This patch moves the ack message before the
messages are transmitted. If it's the final message and the receiving
node has moved into the DLM_CLOSED state, another ack message being
received would move the receiving node back into DLM_ESTABLISHED
again.

Fixes: 1696c75f1864 ("fs: dlm: add send ack threshold and append acks to msgs")
Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>
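
The ordering can be pictured with the following sketch; every name here
is hypothetical and only mirrors the description above, not the
midcomms.c implementation:

#include <linux/types.h>

struct example_peer {
        bool ack_pending;
};

static void example_queue_ack(struct example_peer *p)
{
        p->ack_pending = false; /* pretend the ack was queued for sending */
}

static void example_transmit(struct example_peer *p, const void *msg)
{
        /* hand the message to the transmit path */
}

static void example_send(struct example_peer *p, const void *msg)
{
        /* Flush a pending ack before the message goes out, so no ack
         * can follow a final message and wake the peer out of the
         * closed state again. */
        if (p->ack_pending)
                example_queue_ack(p);

        example_transmit(p, msg);
}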


# e759eb3e 10-Oct-2023 Alexander Aring <[email protected]>

dlm: be sure we reset all nodes at forced shutdown

In case we are running a forced shutdown in either the midcomms or
lowcomms implementation, make sure we reset all per-node midcomms
information.

Fixes: 63e711b08160 ("fs: dlm: create midcomms nodes when configure")
Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>


# 2776635e 10-Oct-2023 Alexander Aring <[email protected]>

dlm: fix remove member after close call

The idea of commit 63e711b08160 ("fs: dlm: create midcomms nodes when
configure") is to set the midcomms node lifetime when a node joins or
leaves the cluster. Currently we can hit the following warning:

[10844.611495] ------------[ cut here ]------------
[10844.615913] WARNING: CPU: 4 PID: 84304 at fs/dlm/midcomms.c:1263
dlm_midcomms_remove_member+0x13f/0x180 [dlm]

or we can run into a state where the midcomms node usage count goes
negative:

[ 260.830782] node 2 users dec count -1

The first warning happens when a specific node does not exist; it was
probably already removed by dlm_midcomms_close(), which is called when
a node leaves the cluster. The second kernel log message probably
occurs when dlm_midcomms_addr() is called as a node joins the cluster,
but due to fencing the node had left the cluster without being removed
from the lockspace. If the node rejoins the cluster after being
removed due to fencing, the first call triggered by user space is to
remove the node from the lockspaces. In both cases, if the node wasn't
found or its user count is zero, we should skip any additional
midcomms handling in dlm_midcomms_remove_member().

Fixes: 63e711b08160 ("fs: dlm: create midcomms nodes when configure")
Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>
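
The guard described above amounts to something like the following; the
struct and field names are illustrative, not the actual midcomms node
definition:

struct example_mc_node {
        int users;
};

static void example_remove_member(struct example_mc_node *node)
{
        /* Node already removed by a close, or never used by this
         * lockspace: skip the decrement instead of underflowing the
         * usage count. */
        if (!node || node->users == 0)
                return;

        node->users--;
}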


# fe9b619e 10-Oct-2023 Alexander Aring <[email protected]>

dlm: fix creating multiple node structures

This patch looks up existing nodes instead of always creating them
when dlm_midcomms_addr() is called. The idea is to create midcomms
nodes when user space is informed that a node joins the cluster. This
is the case when dlm_midcomms_addr() is called; however, it can be
called multiple times by user space to add several address
configurations to one node, e.g. when using SCTP. Those additional
calls need to be filtered out, and we do that by first looking up
whether the node already exists. Due to the configfs entry it is safe
to assume that this function only gets called once at a time.

Fixes: 63e711b08160 ("fs: dlm: create midcomms nodes when configure")
Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>
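
The lookup-before-create pattern can be sketched like this; the list
and structure names are illustrative, and the real code also has to
deal with locking and reference counting:

#include <linux/list.h>
#include <linux/slab.h>

struct example_node {
        int nodeid;
        struct list_head list;
};

static LIST_HEAD(example_nodes);

static struct example_node *example_node_get(int nodeid)
{
        struct example_node *node;

        /* A second call for the same node (e.g. an additional SCTP
         * address) reuses the existing entry instead of allocating a
         * duplicate. */
        list_for_each_entry(node, &example_nodes, list)
                if (node->nodeid == nodeid)
                        return node;

        node = kzalloc(sizeof(*node), GFP_KERNEL);
        if (!node)
                return NULL;

        node->nodeid = nodeid;
        list_add(&node->list, &example_nodes);
        return node;
}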


Revision tags: v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5
# 63e711b0 01-Aug-2023 Alexander Aring <[email protected]>

fs: dlm: create midcomms nodes when configure

This patch gives a midcomms node the same lifetime as a lowcomms
connection. The lowcomms connection lifetime was changed by commit
6f0b0b5d7ae7 ("fs: dlm: remove dlm_node_addrs lookup list"). In the
future the midcomms node instances can be merged with the lowcomms
connection structure, as the lifetime is the same and the states can
be controlled via values or flags.

Previously, midcomms nodes were generated during version detection.
This is no longer necessary now that the nodes are created when the
cluster manager configures DLM via configfs. When a midcomms node is
created over configfs, its version is set to DLM_VERSION_NOT_SET. This
indicates that the version of the midcomms node is still unknown and
needs to be probed via certain rcom messages.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>


# 11519351 01-Aug-2023 Alexander Aring <[email protected]>

fs: dlm: constify receive buffer

The dlm receive buffer should never be manipulated, as DLM is the last
parsing layer. This patch constifies the whole receive buffer so we
can be sure it never gets manipulated while it's being parsed.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>
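
The effect of constifying the buffer can be shown with a tiny
illustrative parser; this is not the dlm parsing code itself:

#include <linux/errno.h>
#include <linux/types.h>

static int example_parse_u16(const u8 *buf, int len)
{
        if (len < 2)
                return -EINVAL;

        /* buf[0] = 0; would now fail to build: assignment of read-only
         * location. Reads are still fine. */
        return (buf[0] << 8) | buf[1];
}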


# 643f5cfa 01-Aug-2023 Alexander Aring <[email protected]>

fs: dlm: cleanup lock order

This patch cleans up the lock order so that close_lock is taken first
and then the nodes_srcu read lock is held. It will probably never be a
problem, as nodes_srcu is only a read lock preventing the node pointer
from getting freed.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>
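
The intended ordering looks roughly like this; close_lock and
nodes_srcu follow the names in the message above, while the
surrounding function is illustrative:

#include <linux/spinlock.h>
#include <linux/srcu.h>

static void example_ordered_section(spinlock_t *close_lock,
                                    struct srcu_struct *nodes_srcu)
{
        int idx;

        /* Take close_lock first, then the SRCU read lock ... */
        spin_lock(close_lock);
        idx = srcu_read_lock(nodes_srcu);

        /* ... work on the midcomms node under both ... */

        /* ... and release in reverse order. */
        srcu_read_unlock(nodes_srcu, idx);
        spin_unlock(close_lock);
}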


Revision tags: v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5
# 1696c75f 29-May-2023 Alexander Aring <[email protected]>

fs: dlm: add send ack threshold and append acks to msgs

This patch changes when we send an ack back to tell the other side it
can free a message because it has arrived on the receiver node; due to
random reconnects, e.g. TCP resets, this is also handled on the
application layer so that DLM does not run into a deadlock state.

The current handling has the following problems:

1. We end up in situations where we send out only a 16-byte ack
message and no other messages, even though DLM has logic to combine
as many messages as it can into one send() socket call. This
behaviour can be observed with "trace-cmd start -e dlm_recv" by
watching the ret field being 16 bytes.

2. When the processing of DLM messages never ends because we receive
a lot of messages, we will not send an ack back, since that only
happens when the processing loop ends.

This patch introduces a likely and an unlikely threshold case. The
likely case will send an ack back on the transmit path if a threshold
on the amount of processed upper layer protocol messages is reached.
This solves issue 1 because the ack will be sent along with another
normal DLM message. It solves issue 2 because it is not part of the
processing loop.

The unlikely case, however, has a bigger threshold and is triggered
when we only receive messages and do not send any messages back. This
avoids the sending node keeping a lot of messages around for a long
time, as we occasionally send an ack back to tell the sender to
finally release messages.

The atomic cmpxchg() is there to atomically combine sending the ack
with resetting the upper layer protocol delivery counter.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>
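
A minimal sketch of the cmpxchg-based threshold check, with an assumed
threshold value and counter name rather than the actual midcomms
fields:

#include <linux/atomic.h>
#include <linux/types.h>

#define EXAMPLE_ACK_THRESHOLD 32        /* assumed value, not the real one */

static bool example_should_append_ack(atomic_t *ulp_delivered)
{
        int count = atomic_read(ulp_delivered);

        if (count < EXAMPLE_ACK_THRESHOLD)
                return false;

        /* Only the path that wins the cmpxchg appends the ack, and the
         * delivery counter is reset to zero atomically with that
         * decision, so the same batch cannot be acked twice. */
        return atomic_cmpxchg(ulp_delivered, count, 0) == count;
}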


# d00725ca 29-May-2023 Alexander Aring <[email protected]>

fs: dlm: handle sequence numbers as atomic

Currently seq_next is only read on the receive side, which is
processed in an ordered way. The seq_send value is protected by locks.
To be able to read the seq_next value on the send side as well, we
convert it to an atomic_t value. The atomic_cmpxchg() is probably not
necessary; however, the atomic_inc() depends on an if conditional, and
this should be handled in an atomic context.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>
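
The "increment depends on an if conditional" case can be folded into
one atomic step roughly like this; the function and parameter names
are illustrative:

#include <linux/atomic.h>
#include <linux/types.h>

/* Advance seq_next only if it still holds the expected value; the
 * comparison and the increment happen as one atomic operation, so the
 * send side can also read the value without holding a lock. */
static bool example_advance_if_expected(atomic_t *seq_next, int expected)
{
        return atomic_cmpxchg(seq_next, expected, expected + 1) == expected;
}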


# 07ee3867 29-May-2023 Alexander Aring <[email protected]>

fs: dlm: filter ourself midcomms calls

It makes no sense to call midcomms/lowcomms functionality for the
local node, as socket functionality is only required for remote nodes.
This patch filters those calls in the upper layer of lockspace
membership handling instead of doing it in the midcomms/lowcomms
layer, which should never need to be aware of the local nodeid.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>


# c6b6d6dc 29-May-2023 Alexander Aring <[email protected]>

fs: dlm: revert check required context while close

This patch reverts commit 2c3fa6ae4d52 ("dlm: check required context
while close"). The function dlm_midcomms_close(), which later calls
dlm_lowcomms_close(), is called when the cluster manager reports that
a node got fenced, which on the midcomms/lowcomms layer means
disconnecting the node from the cluster communication. The node can
rejoin the cluster later. The reverted patch ensured that no new
messages could be triggered while we are in the close() function
context, by checking whether the lockspace has been stopped. However,
a check is missing so that we only look at the specific lockspaces the
fenced node is a member of. This is currently complicated because
there is no way to easily check if a node is part of a specific
lockspace without stopping recovery. For now we just revert this
commit, as it was only a check for finding possible leaks of
lockspaces not being stopped before close() is called.

Cc: [email protected]
Fixes: 2c3fa6ae4d52 ("dlm: check required context while close")
Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>


Revision tags: v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4
# 723b197b 12-Jan-2023 Alexander Aring <[email protected]>

fs: dlm: remove unnecessary waker_up() calls

The wake_up() is already handled inside of midcomms_node_reset() when
switching the state to the CLOSED state, so there is no need to call
it again after midcomms_node_reset().

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>


# ef7ef015 12-Jan-2023 Alexander Aring <[email protected]>

fs: dlm: move state change into else branch

Currently we first switch into the DLM_CLOSE_WAIT state and then do
another state change if a condition is true. Instead of doing two
state changes we handle the other state change inside an else branch
of this condition.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>


# 31864097 12-Jan-2023 Alexander Aring <[email protected]>

fs: dlm: remove newline in log_print

There is an API difference between log_print() and other printk()s in
whether a trailing newline is needed or not. This one was introduced
by mistake, because log_print() already adds a newline.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>


# 11605353 12-Jan-2023 Alexander Aring <[email protected]>

fs: dlm: reduce the shutdown timeout to 5 secs

When a shutdown is stuck, time out after 5 seconds instead of
3 minutes. After this timeout we try a forced shutdown.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>


# b8b750e0 12-Jan-2023 Alexander Aring <[email protected]>

fs: dlm: wait until all midcomms nodes detect version

The current dlm version detection is very complex due to backwards
compatibility with earlier dlm protocol versions. It takes some time
to detect if a peer node has a specific DLM version. If it's not
detected, we just cut the socket connection. There could be cases
where the local node has not detected the version yet, but the peer
node has. In these cases, we try to shut down the dlm connection with
a FIN/ACK message exchange to be sure the other peer is ready to shut
down the connection on the dlm application level. However, this
mechanism is only available in DLM protocol version 3.2, so we need to
be sure the DLM version has been detected beforehand.

To make it more robust, we introduce a "best effort" wait for version
detection before shutting down the dlm connection. This needs to be
done before the recoverd kthread for recovery handling is stopped,
because recovery handling will trigger enough messages to get version
detection going.

It is a corner case that was detected by running modprobe
dlm_locktorture and rmmod dlm_locktorture directly afterwards (in a
loop). In practice, probably nobody would leave a lockspace
immediately after joining it.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>


# 89835b06 12-Jan-2023 Alexander Aring <[email protected]>

fs: dlm: ignore unexpected non dlm opts msgs

This patch ignores unexpected RCOM_NAMES/RCOM_STATUS messages. To be
backwards compatible, those messages are not part of the new reliable
DLM OPTS encapsulation header and have their own retransmit handling
using sequence number matching. When we get unexpected non DLM OPTS
messages, we should allow them and let the RCOM message handling
filter them out using sequence numbers.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>


# 54fbe0c1 12-Jan-2023 Alexander Aring <[email protected]>

fs: dlm: bring back previous shutdown handling

This patch mostly reverts commit 4f567acb0b86 ("fs: dlm: remove socket
shutdown handling"). There can be situations where the dlm midcomms
nodes hash and the lowcomms connection hash are not equal, but we need
to guarantee that all lowcomms connections are closed on the last
release of a dlm lockspace, when a shutdown is invoked. This patch
guarantees that we always close all sockets managed by the lowcomms
connection hash, and calls shutdown for the last message sent. This
ensures we don't cut the socket, which could cause the peer to get a
connection reset.

In the future we should try to merge the midcomms and lowcomms hashes
into one hash and not handle both in separate hashes.

Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>


# 00908b33 12-Jan-2023 Alexander Aring <[email protected]>

fs: dlm: send FIN ack back in right cases

This patch sends an ack back for a received FIN message only when we
are in a valid state. In other cases, if there is a sender waiting for
an ack, we just let it time out on the sender's side and hope that all
the other cleanups will remove the FIN message from its send queue. As
an example, we should never send out an ACK while in the LAST_ACK
state, and we cannot assume working socket communication when we are
in the CLOSED state.

Cc: [email protected]
Fixes: 489d8e559c65 ("fs: dlm: add reliable connection if reconnect")
Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>
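
The state filter can be pictured with an illustrative enum; midcomms.c
uses its own DLM_* state values, and the exact set of valid states may
differ:

#include <linux/types.h>

enum example_state {
        EXAMPLE_ESTABLISHED,
        EXAMPLE_FIN_WAIT1,
        EXAMPLE_FIN_WAIT2,
        EXAMPLE_LAST_ACK,
        EXAMPLE_CLOSED,
};

static bool example_ack_received_fin(enum example_state state)
{
        switch (state) {
        case EXAMPLE_ESTABLISHED:
        case EXAMPLE_FIN_WAIT1:
        case EXAMPLE_FIN_WAIT2:
                return true;    /* connection still usable: ack the FIN */
        case EXAMPLE_LAST_ACK:
        case EXAMPLE_CLOSED:
        default:
                return false;   /* let the sender time out instead */
        }
}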


# a5849636 12-Jan-2023 Alexander Aring <[email protected]>

fs: dlm: move sending fin message into state change handling

This patch moves the send-FIN handling, which should happen on a
specific state change, into the state change handling while the
per-node state_lock is held. I experienced issues with other messages
because we changed the state and a FIN message was sent out in a
different state.

Cc: [email protected]
Fixes: 489d8e559c65 ("fs: dlm: add reliable connection if reconnect")
Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>


# 15c63db8 12-Jan-2023 Alexander Aring <[email protected]>

fs: dlm: don't set stop rx flag after node reset

Similar to the stop tx flag, the rx flag should warn about a dlm
message being received after the DLM_FIN state change, when we assume
no further dlm application messages. If we receive a FIN message and
we are in the DLM_FIN_WAIT2 state, we call midcomms_node_reset(),
which puts the midcomms node into the DLM_CLOSED state. Afterwards we
should not set the DLM_NODE_FLAG_STOP_RX flag any more. This patch
sets DLM_NODE_FLAG_STOP_RX only in those state changes where we
receive a FIN message and assume that no other dlm application
messages will be received until we hit the DLM_CLOSED state.

Cc: [email protected]
Fixes: 489d8e559c65 ("fs: dlm: add reliable connection if reconnect")
Signed-off-by: Alexander Aring <[email protected]>
Signed-off-by: David Teigland <[email protected]>
