|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2 |
|
| #
f49da8c0 |
| 28-May-2024 |
Alexander Aring <[email protected]> |
dlm: remove unused parameter in dlm_midcomms_addr
This patch removes a parameter that is currently unused by dlm_midcomms_addr().
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
|
Revision tags: v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3 |
|
| #
578acf9a |
| 02-Apr-2024 |
Alexander Aring <[email protected]> |
dlm: use spin_lock_bh for message processing
Use spin_lock_bh for all spinlocks involved in message processing, in preparation for softirq message processing. DLM lock requests from user space involve dlm processing in user context, in addition to the standard kernel context, necessitating bh variants.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
98808644 |
| 02-Apr-2024 |
Alexander Aring <[email protected]> |
dlm: remove allocation parameter in msg allocation
Remove the context parameter for message allocations and always use GFP_ATOMIC. This prepares for softirq message processing.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
|
Revision tags: v6.9-rc2 |
|
| #
609ed5bd |
| 28-Mar-2024 |
Kunwu Chan <[email protected]> |
dlm: Simplify the allocation of slab caches in dlm_midcomms_cache_create
Use the new KMEM_CACHE() macro instead of calling kmem_cache_create() directly to simplify the creation of slab caches.
Signed-off-by: Kunwu Chan <[email protected]> Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
|
Revision tags: v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6 |
|
| #
6212e452 |
| 10-Oct-2023 |
Alexander Aring <[email protected]> |
dlm: fix no ack after final message
In case of a final DLM message, we should not send an ack out after the final message. This patch moves the ack message before the messages are transmitted. Otherwise, if it's the final message and the receiving node has switched to the DLM_CLOSED state, another ack message would be received and would turn the receiving node into DLM_ESTABLISHED again.
Fixes: 1696c75f1864 ("fs: dlm: add send ack threshold and append acks to msgs") Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
e759eb3e |
| 10-Oct-2023 |
Alexander Aring <[email protected]> |
dlm: be sure we reset all nodes at forced shutdown
In case we are running a forced shutdown in either the midcomms or lowcomms implementation, make sure we reset all per-node midcomms information.
Fixes: 63e711b08160 ("fs: dlm: create midcomms nodes when configure") Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
2776635e |
| 10-Oct-2023 |
Alexander Aring <[email protected]> |
dlm: fix remove member after close call
The idea of commit 63e711b08160 ("fs: dlm: create midcomms nodes when configure") is to set the midcomms node lifetime when a node joins or leaves the cluster. Currently we can hit the following warning:
[10844.611495] ------------[ cut here ]------------ [10844.615913] WARNING: CPU: 4 PID: 84304 at fs/dlm/midcomms.c:1263 dlm_midcomms_remove_member+0x13f/0x180 [dlm]
or run into a state where the midcomms node usage count goes negative:
[ 260.830782] node 2 users dec count -1
The first warning happens when a specific node does not exist; it was probably already removed by dlm_midcomms_close(), which is called when a node leaves the cluster. The second kernel log message probably occurs when dlm_midcomms_addr() is called for a node that joined the cluster but, due to fencing, left the cluster without getting removed from the lockspace. If the node rejoins the cluster after being removed because of fencing, the first call, triggered by user space, is to remove the node from the lockspaces. In both cases, if the node wasn't found or the user count is zero, we should ignore any additional midcomms handling in dlm_midcomms_remove_member().
Fixes: 63e711b08160 ("fs: dlm: create midcomms nodes when configure") Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
fe9b619e |
| 10-Oct-2023 |
Alexander Aring <[email protected]> |
dlm: fix creating multiple node structures
This patch looks up existing nodes instead of always creating them when dlm_midcomms_addr() is called. The idea here is to create midcomms nodes when user space is informed that a node joins the cluster. This is the case when dlm_midcomms_addr() is called; however, it can be called multiple times by user space to add several address configurations to one node, e.g. when using SCTP. Those repeated calls need to be filtered out, and we do that by checking whether the node already exists. Due to the configfs entry, it is safe to assume this function is only called once at a time.
Fixes: 63e711b08160 ("fs: dlm: create midcomms nodes when configure") Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
|
Revision tags: v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5 |
|
| #
63e711b0 |
| 01-Aug-2023 |
Alexander Aring <[email protected]> |
fs: dlm: create midcomms nodes when configure
This patch makes the lifetime of a midcomms node the same as a lowcomms connection. The lowcomms connection lifetime was changed by commit 6f0b0b5d7ae7 ("fs: dlm: remove dlm_node_addrs lookup list"). In the future, the midcomms node instances can be merged with the lowcomms connection structure, as the lifetime is the same and states can be controlled over values or flags.
Previously, midcomms nodes were generated during version detection. This is no longer necessary when the nodes are created as the cluster manager configures DLM via configfs. When a midcomms node is created over configfs, it will set DLM_VERSION_NOT_SET as its version. This indicates that the version of the midcomms node is still unknown and needs to be probed via certain rcom messages.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
11519351 |
| 01-Aug-2023 |
Alexander Aring <[email protected]> |
fs: dlm: constify receive buffer
The dlm receive buffer should never be manipulated, as DLM is the last parsing layer. This patch constifies the whole receive buffer so we are sure it never gets manipulated while it is being parsed.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
643f5cfa |
| 01-Aug-2023 |
Alexander Aring <[email protected]> |
fs: dlm: cleanup lock order
This patch cleans up the lock order to take the close_lock first and then the nodes_srcu read lock. It will probably never be a problem, as nodes_srcu is only a read lock preventing the node pointer from getting freed.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
|
Revision tags: v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5 |
|
| #
1696c75f |
| 29-May-2023 |
Alexander Aring <[email protected]> |
fs: dlm: add send ack threshold and append acks to msgs
This patch changes when we send an ack back to tell the other side it can free a message because it has arrived on the receiver node. Due to random reconnects, e.g. TCP resets, this is also handled on the application layer so that DLM does not run into a deadlock state.
The current handling has the following problems:
1. We end up in situations where we send out only an ack-back message of 16 bytes and no other messages, although DLM has logic to combine as many messages as it can in one send() socket call. This behaviour can be observed with "trace-cmd start -e dlm_recv" by watching the ret field being 16 bytes.
2. When processing of DLM messages never ends because we receive a lot of messages, we will not send an ack back, as that only happens when the processing loop ends.
This patch introduces a likely and an unlikely threshold case. The likely case sends an ack back on the transmit path when the threshold of processed upper-layer protocol messages is crossed. This solves issue 1 because the ack is sent along with another normal DLM message. It solves issue 2 because it is not part of the processing loop.
The unlikely case has a bigger threshold and is triggered when we only receive messages and do not send any message back. This case avoids the sending node keeping a lot of messages around for a long time, as we occasionally send an ack back to tell the sender to finally release messages.
The atomic cmpxchg() is there to provide an atomic ack send combined with the reset of the upper-layer protocol delivery counter.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
d00725ca |
| 29-May-2023 |
Alexander Aring <[email protected]> |
fs: dlm: handle sequence numbers as atomic
Currently, seq_next is only read on the receive side, which processes in an ordered way. The seq_send is protected by locks. To be able to read the seq_next value on the send side as well, we convert it to an atomic_t value. The atomic_cmpxchg() is probably not necessary; however, the atomic_inc() depends on an if conditional, and this should be handled in an atomic context.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
07ee3867 |
| 29-May-2023 |
Alexander Aring <[email protected]> |
fs: dlm: filter ourself midcomms calls
It makes no sense to call midcomms/lowcomms functionality for the local node, as socket functionality is only required for remote nodes. This patch filters those calls in the upper layer of lockspace membership handling instead of doing it in the midcomms/lowcomms layer, as those layers should never be aware of the local nodeid.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
c6b6d6dc |
| 29-May-2023 |
Alexander Aring <[email protected]> |
fs: dlm: revert check required context while close
This patch reverts commit 2c3fa6ae4d52 ("dlm: check required context while close"). The function dlm_midcomms_close(), which will later call dlm_lowcomms_close(), is called when the cluster manager reports that a node got fenced, which on the midcomms/lowcomms layer means disconnecting the node from the cluster communication. The node can rejoin the cluster later. The reverted patch ensured that no new messages could be triggered while we are in the close() function context. This was done by checking whether the lockspace had been stopped. However, a check is missing so that we only look at the specific lockspaces the fenced node is a member of. This is currently complicated because there is no easy way to check if a node is part of a specific lockspace without stopping the recovery. For now we just revert this commit, as it was only a check for finding possible leaks of stopping lockspaces before close() is called.
Cc: [email protected] Fixes: 2c3fa6ae4d52 ("dlm: check required context while close") Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
|
Revision tags: v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4 |
|
| #
723b197b |
| 12-Jan-2023 |
Alexander Aring <[email protected]> |
fs: dlm: remove unnecessary wake_up() calls
The wake_up() is already handled inside of midcomms_node_reset() when switching the state to the CLOSED state, so there is no need to call it again after midcomms_node_reset().
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
ef7ef015 |
| 12-Jan-2023 |
Alexander Aring <[email protected]> |
fs: dlm: move state change into else branch
Currently, we may switch first into the DLM_CLOSE_WAIT state and then do another state change if a condition is true. Instead of doing two state changes, we handle the other state change inside an else branch of this condition.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
31864097 |
| 12-Jan-2023 |
Alexander Aring <[email protected]> |
fs: dlm: remove newline in log_print
There is an API difference between log_print() and other printk()s as to whether a trailing newline is needed. This newline was introduced by mistake, because log_print() already adds one.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
11605353 |
| 12-Jan-2023 |
Alexander Aring <[email protected]> |
fs: dlm: reduce the shutdown timeout to 5 secs
When a shutdown is stuck, time out after 5 seconds instead of 3 minutes. After this timeout we try a forced shutdown.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
b8b750e0 |
| 12-Jan-2023 |
Alexander Aring <[email protected]> |
fs: dlm: wait until all midcomms nodes detect version
The current dlm version detection is very complex due to backwards compatibility with earlier dlm protocol versions. It takes some time to detect whether a peer node speaks a specific DLM version. If it's not detected, we just cut the socket connection. There can be cases where the local node has not detected the version yet, but the peer node has. In these cases, we try to shut down the dlm connection with a FIN/ACK message exchange to be sure the other peer is ready to shut down the connection at the dlm application level. However, this mechanism is only available in DLM protocol version 3.2, so we need to be sure the DLM version has been detected first.
To make it more robust, we introduce a "best effort" wait for the version detection before shutting down the dlm connection. This needs to be done before the recoverd kthread for recovery handling is stopped, because recovery handling will trigger enough messages to get a version detection going.
It is a corner case which was detected by loading the dlm_locktorture module with modprobe and removing it with rmmod directly afterwards, in a loop. In practice, probably nobody would leave a lockspace immediately after joining it.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
89835b06 |
| 12-Jan-2023 |
Alexander Aring <[email protected]> |
fs: dlm: ignore unexpected non dlm opts msgs
This patch ignores unexpected RCOM_NAMES/RCOM_STATUS messages. To be backwards compatible, those messages are not part of the new reliable DLM OPTS encapsulation header and have their own retransmit handling using sequence number matching. When we get unexpected non dlm opts messages, we should allow them and let the RCOM message handling filter them out using sequence numbers.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
54fbe0c1 |
| 12-Jan-2023 |
Alexander Aring <[email protected]> |
fs: dlm: bring back previous shutdown handling
This patch mostly reverts commit 4f567acb0b86 ("fs: dlm: remove socket shutdown handling"). There can be situations where the dlm midcomms node hash and the lowcomms connection hash are not equal, but we need to guarantee that all lowcomms connections are closed on the last release of a dlm lockspace, when a shutdown is invoked. This patch guarantees that we always close all sockets managed by the lowcomms connection hash, and calls shutdown for the last message sent. This ensures we don't cut the socket, which could cause the peer to get a connection reset.
In the future we should try to merge the midcomms/lowcomms hashes into one hash instead of handling both in separate hashes.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
00908b33 |
| 12-Jan-2023 |
Alexander Aring <[email protected]> |
fs: dlm: send FIN ack back in right cases
This patch sends an ack back for a received FIN message only when we are in valid states. In other cases, if there is a sender waiting for an ack, we just let it time out on the sender's side and hope all other cleanups will remove the FIN message from their sending queue. For example, we should never send out an ack while in the LAST_ACK state, and we cannot assume working socket communication while in the CLOSED state.
Cc: [email protected] Fixes: 489d8e559c65 ("fs: dlm: add reliable connection if reconnect") Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
a5849636 |
| 12-Jan-2023 |
Alexander Aring <[email protected]> |
fs: dlm: move sending fin message into state change handling
This patch moves the send-FIN handling, which should occur on a specific state change, into the state change handling while the per-node state_lock is held. I experienced issues with other messages because we changed the state and a FIN message was sent out in a different state.
Cc: [email protected] Fixes: 489d8e559c65 ("fs: dlm: add reliable connection if reconnect") Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|
| #
15c63db8 |
| 12-Jan-2023 |
Alexander Aring <[email protected]> |
fs: dlm: don't set stop rx flag after node reset
Similar to the stop tx flag, the rx flag should warn about a dlm message being received at a DLM_FIN state change, when we assume no other dlm application messages. If we receive a FIN message and we are in the state DLM_FIN_WAIT2, we call midcomms_node_reset(), which puts the midcomms node into the DLM_CLOSED state. Afterwards we should not set the DLM_NODE_FLAG_STOP_RX flag any more. This patch sets DLM_NODE_FLAG_STOP_RX only in those state changes where we receive a FIN message and assume there will be no other dlm application messages received until we hit the DLM_CLOSED state.
Cc: [email protected] Fixes: 489d8e559c65 ("fs: dlm: add reliable connection if reconnect") Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David Teigland <[email protected]>
|