.. _appendix_A:

Appendix A Costs of Time Slicing
================================

Time slicing enables there to be more logical threads than physical
threads. Each logical thread is serviced for a *time slice* by a
physical thread. If a thread runs longer than a time slice, as most do,
it relinquishes the physical thread until it gets another turn. This
appendix details the costs incurred by time slicing.

The most obvious cost is the time for *context switching* between
logical threads. Each context switch requires that the processor save
all its registers for the previous logical thread it was executing, and
load the registers for the next logical thread it runs.

A more subtle cost is *cache cooling*. Processors keep recently accessed
data in cache memory, which is very fast, but also relatively small
compared to main memory. When the processor runs out of cache memory, it
has to evict items from the cache and put them back into main memory.
Typically, it chooses the least recently used items in the cache. (The
reality of set-associative caches is a bit more complicated, but this is
not a cache primer.) When a logical thread gets its time slice, each
piece of data it references for the first time is pulled into cache,
taking hundreds of cycles. If the data is referenced frequently enough
not to be evicted, each subsequent reference finds it in cache and takes
only a few cycles. Such data is called "hot in cache". Time slicing
undoes this, because if thread A finishes its time slice, and thread B
subsequently runs on the same physical thread, B will tend to evict data
that was hot in cache for A, unless both threads need the data. When
thread A gets its next time slice, it will need to reload the evicted
data, at a cost of hundreds of cycles for each cache miss.
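The magnitude of the cache cooling cost can be estimated with
back-of-the-envelope arithmetic. Here is a minimal sketch, assuming an
illustrative 64-byte cache line and roughly 200 cycles per miss — both
are hypothetical round numbers for the sake of the estimate, not
measurements::

    # Back-of-the-envelope estimate of the cost of re-warming a cache
    # after another thread's time slice evicted the data.
    # Assumed figures (illustrative, not measured):
    LINE_SIZE = 64    # bytes per cache line (typical)
    MISS_COST = 200   # cycles per cache miss (order of magnitude)

    def reload_cost_cycles(working_set_bytes):
        """Cycles to reload a working set that went cold while the
        thread was waiting for its next time slice."""
        lines = working_set_bytes // LINE_SIZE
        return lines * MISS_COST

    # Even a modest 32 KiB working set costs on the order of 100,000
    # cycles to pull back into cache.
    print(reload_cost_cycles(32 * 1024))  # → 102400

Under these assumptions, a thread that loses even a small hot working
set pays a resumption tax of roughly a hundred thousand cycles — time
spent recomputing nothing, merely restoring state the cache already had.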
Or worse yet, the next time slice for thread A may be on a different
physical thread that has a different cache altogether.

Another cost is *lock preemption*. This happens if a thread acquires a
lock on a resource, and its time slice runs out before it releases the
lock. No matter how short a time the thread intended to hold the lock,
it is now going to hold it for at least as long as it takes for its next
turn at a time slice to come up. Any other threads waiting on the lock
either pointlessly busy-wait, or lose the rest of their time slice. The
effect is called *convoying*, because the threads end up "bumper to
bumper", waiting for the preempted thread in front to resume driving.
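The convoy can be sketched with ordinary OS threads. In this sketch,
preemption in the middle of a critical section is simulated by sleeping
while the lock is held; the thread names and durations are arbitrary
choices for illustration::

    import threading
    import time

    lock = threading.Lock()
    finish_order = []

    def worker(name, hold_seconds):
        # Acquire the lock, then simulate losing the time slice while
        # still inside the critical section.
        with lock:
            time.sleep(hold_seconds)
            finish_order.append(name)

    # One thread is "preempted" while holding the lock; the others
    # convoy behind it, unable to make progress until it runs again.
    threads = [threading.Thread(target=worker, args=("preempted", 0.2))]
    threads += [threading.Thread(target=worker, args=("waiter%d" % i, 0.0))
                for i in range(3)]
    for t in threads:
        t.start()
        time.sleep(0.01)  # stagger starts so the holder acquires first

    for t in threads:
        t.join()

    # The preempted holder always finishes first; the waiters, whose
    # critical sections take essentially no time, drain only afterward.
    print(finish_order)

Each waiter's critical section costs almost nothing, yet every waiter is
delayed by the full length of the holder's absence — which is the point
of the bumper-to-bumper analogy.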