Multithreading in memcached *was* originally simple:

- One listener thread
- N "event worker" threads
- Some misc background threads

Each worker thread is assigned connections, and runs its own epoll loop. The
central hash table, LRU lists, and some statistics counters are covered by
global locks. Protocol parsing and data transfer happen in the worker threads.
Data lookups and modifications happen under central locks.

THIS HAS CHANGED!

- A secondary small hash table of locks is used to lock an item by its hash
  value. This prevents multiple threads from acting on the same item at the
  same time.
- This secondary hash table is mapped to the central hash table's buckets. This
  allows multiple threads to access the hash table in parallel. Only one
  thread may read or write against a particular hash table bucket at a time.
- Atomic refcounts per item are used to manage garbage collection and
  mutability.

- When pulling an item off of the LRU tail for eviction or re-allocation, the
  system must attempt to lock the item's bucket, which is done with a trylock
  to avoid deadlocks. If a bucket is in use (and not by that thread) it will
  walk up the LRU a little in an attempt to fetch a non-busy item.

- Each LRU (and sub-LRUs in newer modes) has an independent lock.

- Raw accesses to the slab class are protected by a global slabs_lock. This
  is a short lock which covers pushing and popping free memory.

- item_lock must be held while modifying an item.
- slabs_lock must be held while modifying the ITEM_SLABBED flag bit within an item.
- ITEM_LINKED must not be set before an item has a key copied into it.
- Items without ITEM_SLABBED set cannot have their memory zeroed out.

LOCK ORDERS:

(incomplete as of writing, sorry):

item_lock -> lru_lock -> slabs_lock

lru_lock -> item_trylock

Various stats_locks should never have other locks as dependencies.

Various locks exist for background threads.
They can be used to pause
thread execution or update settings while the threads are idle. They may take
item or lru locks.

A low priority issue:

- If you remove the per-thread stats lock, CPU usage goes down by less than a
  point of a percent, and it does not improve scalability.
- In my testing, the remaining global STATS_LOCK calls never seem to collide.

Yes, more stats can be moved to threads, and those locks can actually be
removed entirely on x86-64 systems. However my tests haven't shown that as
beneficial so far, so I've prioritized other work.
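
APPENDIX: SKETCH OF THE ITEM LOCK TABLE AND LRU TAIL TRYLOCK

The hashed lock table and the "lru_lock -> item_trylock" order described
above can be illustrated with a small pthread sketch. This is not the real
implementation: every name here (lock_table, sketch_item, sketch_item_lock,
lru_pull_tail, the table size of 16) is made up for illustration, and the
sketch simplifies one point: it treats a lock already held by the calling
thread as busy too, whereas the notes above make an exception for that case.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes and names; a sketch, not memcached's actual code. */
#define LOCK_TABLE_POWER 4
#define LOCK_TABLE_SIZE  (1u << LOCK_TABLE_POWER)   /* 16 locks */

static pthread_mutex_t lock_table[LOCK_TABLE_SIZE];
static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;

typedef struct sketch_item {
    uint32_t hv;                /* precomputed hash value of the key */
    struct sketch_item *prev;   /* neighbour one step closer to the LRU head */
} sketch_item;

void lock_table_init(void) {
    for (size_t i = 0; i < LOCK_TABLE_SIZE; i++)
        pthread_mutex_init(&lock_table[i], NULL);
}

/* Many hash values share one lock: the low bits of hv pick the slot. */
static pthread_mutex_t *lock_for(uint32_t hv) {
    return &lock_table[hv & (LOCK_TABLE_SIZE - 1)];
}

void sketch_item_lock(uint32_t hv)   { pthread_mutex_lock(lock_for(hv)); }
void sketch_item_unlock(uint32_t hv) { pthread_mutex_unlock(lock_for(hv)); }

/* Returns 1 if the lock was acquired, 0 if it is busy. Never blocks. */
int sketch_item_trylock(uint32_t hv) {
    return pthread_mutex_trylock(lock_for(hv)) == 0;
}

/*
 * Take the LRU lock, then walk from the tail toward the head for at most
 * max_tries items, returning the first item whose lock can be taken
 * without blocking. Blocking here would invert the
 * item_lock -> lru_lock order, hence the trylock.
 * The returned item is still item-locked; the caller must call
 * sketch_item_unlock(it->hv). Returns NULL if nothing non-busy was found.
 */
sketch_item *lru_pull_tail(sketch_item *tail, int max_tries) {
    assert(max_tries > 0);
    sketch_item *found = NULL;
    pthread_mutex_lock(&lru_lock);
    for (sketch_item *it = tail; it != NULL && max_tries-- > 0; it = it->prev) {
        if (sketch_item_trylock(it->hv)) {
            found = it;
            break;
        }
    }
    pthread_mutex_unlock(&lru_lock);
    return found;
}
```

A caller would run lock_table_init() once at startup, then call
lru_pull_tail(tail, 5) and, if the result is non-NULL, work on the item and
release it with sketch_item_unlock(it->hv). Bounding the walk with max_tries
keeps the time spent under the LRU lock short even when many tail items are
busy.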