Multithreading in memcached *was* originally simple:

- One listener thread
- N "event worker" threads
- Some misc background threads

Each worker thread is assigned connections and runs its own epoll loop. The
central hash table, LRU lists, and some statistics counters are covered by
global locks. Protocol parsing and data transfer happen in the worker
threads. Data lookups and modifications happen under central locks.

THIS HAS CHANGED!

- A secondary small hash table of locks is used to lock an item by its hash
  value. This prevents multiple threads from acting on the same item at the
  same time.
- This secondary hash table is mapped to the central hash table's buckets.
  This allows multiple threads to access the hash table in parallel. Only one
  thread at a time may read or write against a particular hash table bucket.
- Atomic refcounts per item are used to manage garbage collection and
  mutability.
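The lock-table idea above can be sketched roughly as follows. This is a
minimal illustration, not memcached's actual code: HASH_POWER, LOCK_POWER,
lock_table, bucket_for_hash, lock_for_hash, item_lock_hv and item_unlock_hv
are hypothetical names, and the sizes are arbitrary. It assumes both tables
have power-of-two sizes and that the lock table is no larger than the hash
table, so every bucket maps to exactly one lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical sizes: 2^16 hash buckets share 2^10 locks. */
#define HASH_POWER 16
#define LOCK_POWER 10
#define MASK(p) ((((uint32_t)1) << (p)) - 1)

static pthread_mutex_t lock_table[1u << LOCK_POWER];

/* Call once, before any worker thread starts. */
static void lock_table_init(void) {
    /* The bucket -> lock mapping below relies on this relationship. */
    assert(LOCK_POWER <= HASH_POWER);
    for (uint32_t i = 0; i < (1u << LOCK_POWER); i++)
        pthread_mutex_init(&lock_table[i], NULL);
}

/* Index of the bucket in the central hash table for a hash value. */
static uint32_t bucket_for_hash(uint32_t hv) {
    return hv & MASK(HASH_POWER);
}

/*
 * Index of the lock covering a hash value. The lock index uses a subset of
 * the low bits that pick the bucket, so two hash values that land in the
 * same bucket always share a lock.
 */
static uint32_t lock_for_hash(uint32_t hv) {
    return hv & MASK(LOCK_POWER);
}

/* Lock / unlock the item (and its bucket) identified by hash value hv. */
static void item_lock_hv(uint32_t hv) {
    pthread_mutex_lock(&lock_table[lock_for_hash(hv)]);
}

static void item_unlock_hv(uint32_t hv) {
    pthread_mutex_unlock(&lock_table[lock_for_hash(hv)]);
}
```

Threads working on items whose hash values map to different locks proceed in
parallel. Threads touching the same bucket serialize on one mutex.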

- When pulling an item off of the LRU tail for eviction or re-allocation, the
  system must attempt to lock the item's bucket, which is done with a trylock
  to avoid deadlocks. If the bucket is in use (and not by that thread), the
  thread walks up the LRU a little in an attempt to find a non-busy item.
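A rough sketch of that tail walk, under stated assumptions: struct item,
NLOCKS, MAX_TRIES, locks_init, item_unlock and this simplified lru_pull_tail
are hypothetical stand-ins, not memcached's real definitions. As a
simplification, an item whose lock is already held by the calling thread is
skipped just like any other busy item.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NLOCKS 8     /* hypothetical number of item locks */
#define MAX_TRIES 5  /* how far up the LRU we are willing to walk */

struct item {
    struct item *prev;  /* toward the head (more recently used) */
    uint32_t hv;        /* hash value of the item's key */
};

static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t item_locks[NLOCKS];

/* Call once, before any worker thread starts. */
static void locks_init(void) {
    for (int i = 0; i < NLOCKS; i++)
        pthread_mutex_init(&item_locks[i], NULL);
}

/* Release the item lock that lru_pull_tail() left held for `it`. */
static void item_unlock(struct item *it) {
    int rc = pthread_mutex_unlock(&item_locks[it->hv % NLOCKS]);
    assert(rc == 0);
    (void)rc;
}

/*
 * Walk up from the LRU tail and return the first item whose lock could be
 * taken without blocking, or NULL if none was found within MAX_TRIES.
 * On success the item's lock is still held; the caller must item_unlock().
 */
static struct item *lru_pull_tail(struct item *tail) {
    struct item *found = NULL;
    int tries = MAX_TRIES;

    pthread_mutex_lock(&lru_lock);
    for (struct item *it = tail; it != NULL && tries > 0;
         it = it->prev, tries--) {
        /*
         * Never block here: another thread may hold this item lock while
         * it waits for lru_lock, and blocking would deadlock.
         */
        if (pthread_mutex_trylock(&item_locks[it->hv % NLOCKS]) == 0) {
            found = it;
            break;
        }
    }
    pthread_mutex_unlock(&lru_lock);
    return found;
}
```

Taking the item lock with a trylock while lru_lock is held is what makes this
safe alongside code paths that take the item lock first and lru_lock second.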

- Each LRU (and sub-LRUs in newer modes) has an independent lock.

- Raw accesses to the slab class are protected by a global slabs_lock. This
  is a short lock which covers pushing and popping free memory.

- item_lock must be held while modifying an item.
- slabs_lock must be held while modifying the ITEM_SLABBED flag bit within an
  item.
- ITEM_LINKED must not be set before an item has a key copied into it.
- Items without ITEM_SLABBED set cannot have their memory zeroed out.

LOCK ORDERS:

(incomplete as of writing, sorry):

item_lock -> lru_lock -> slabs_lock

lru_lock -> item_trylock

Various stats_locks should never have other locks as dependencies.

Various locks exist for background threads. They can be used to pause the
thread execution or update settings while the threads are idle. They may take
item or LRU locks.

A low priority issue:

- If you remove the per-thread stats lock, CPU usage goes down by less than a
  point of a percent, and it does not improve scalability.
- In my testing, the remaining global STATS_LOCK calls never seem to collide.

Yes, more stats can be moved to threads, and those locks can actually be
removed entirely on x86-64 systems. However, my tests haven't shown that to
be beneficial so far, so I've prioritized other work.
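For readers unfamiliar with the per-thread stats pattern discussed above,
here is a minimal sketch. All names (struct thread_stats, NTHREADS,
stats_init, stats_record_get, stats_aggregate) and the two counters are
hypothetical; memcached's real structures differ. Each worker updates only
its own slot under its own lock, so the lock is almost never contended, and
a reader sums the slots on demand. The lock-free x86-64 variant mentioned
above is not shown.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NTHREADS 4  /* hypothetical worker count */

struct thread_stats {
    pthread_mutex_t lock;  /* taken by the owning worker and the reader */
    uint64_t get_cmds;
    uint64_t get_hits;
};

static struct thread_stats stats[NTHREADS];

/* Call once, before any worker thread starts. */
static void stats_init(void) {
    for (int i = 0; i < NTHREADS; i++) {
        int rc = pthread_mutex_init(&stats[i].lock, NULL);
        assert(rc == 0);
        (void)rc;
        stats[i].get_cmds = 0;
        stats[i].get_hits = 0;
    }
}

/* Called by worker `id` on its own slot after handling a get. */
static void stats_record_get(int id, int hit) {
    pthread_mutex_lock(&stats[id].lock);
    stats[id].get_cmds++;
    if (hit)
        stats[id].get_hits++;
    pthread_mutex_unlock(&stats[id].lock);
}

/* Called when totals are requested: sum every worker's slot. */
static void stats_aggregate(uint64_t *cmds, uint64_t *hits) {
    *cmds = 0;
    *hits = 0;
    for (int i = 0; i < NTHREADS; i++) {
        pthread_mutex_lock(&stats[i].lock);
        *cmds += stats[i].get_cmds;
        *hits += stats[i].get_hits;
        pthread_mutex_unlock(&stats[i].lock);
    }
}
```

The reader holds only one slot's lock at a time and takes no other lock while
doing so, in line with the rule that stats locks never have other locks as
dependencies.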
61