==========================================
Explicit volatile write back cache control
==========================================

Introduction
------------

Many storage devices, especially in the consumer market, come with volatile
write back caches. That means the devices signal I/O completion to the
operating system before the data has actually reached non-volatile storage.
This behavior speeds up various workloads, but it means the operating
system needs to force data out to the non-volatile storage when it performs
a data integrity operation like fsync, sync or an unmount.

The Linux block layer provides two simple mechanisms that let filesystems
control the caching behavior of the storage device. These mechanisms are
a forced cache flush, and the Force Unit Access (FUA) flag for requests.


Explicit cache flushes
----------------------

The REQ_PREFLUSH flag can be ORed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started.
This explicitly guarantees that previously completed write requests are on
non-volatile storage before the flagged bio starts. In addition the
REQ_PREFLUSH flag can be set on an otherwise empty bio structure, which causes
only an explicit cache flush without any dependent I/O. It is recommended to
use the blkdev_issue_flush() helper for a pure cache flush.


Forced Unit Access
------------------

The REQ_FUA flag can be ORed into the r/w flags of a bio submitted from the
filesystem and will make sure that I/O completion for this request is only
signaled after the data has been committed to non-volatile storage.


Implementation details for filesystems
--------------------------------------

Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
worry whether the underlying devices need any explicit cache flushing or how
the Forced Unit Access is implemented. The REQ_PREFLUSH and REQ_FUA flags
may both be set on a single bio.
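
The guarantees a filesystem gets from the two flags can be illustrated with a
small userspace model. This is only a sketch under stated assumptions, not
kernel code: the ``model_*`` names, the ``MODEL_*`` flag values and the
fixed-size toy device are all hypothetical and exist purely to show the
ordering semantics described above.

```c
#include <stdbool.h>

/* Hypothetical flag values for this userspace model; the kernel has its own. */
enum { MODEL_PREFLUSH = 1u << 0, MODEL_FUA = 1u << 1 };

#define MODEL_NBLOCKS 8

/* Toy device: a volatile write back cache in front of non-volatile media. */
struct model_dev {
	int  cache[MODEL_NBLOCKS];	/* data held in the volatile cache */
	bool dirty[MODEL_NBLOCKS];	/* cache copy not yet written to media */
	int  media[MODEL_NBLOCKS];	/* non-volatile contents */
};

/* Cache flush: every dirty cached block is written to the media. */
static void model_flush(struct model_dev *d)
{
	for (int i = 0; i < MODEL_NBLOCKS; i++) {
		if (d->dirty[i]) {
			d->media[i] = d->cache[i];
			d->dirty[i] = false;
		}
	}
}

/* One write; returning from here stands in for I/O completion. */
static void model_write(struct model_dev *d, int blk, int val, unsigned flags)
{
	if (flags & MODEL_PREFLUSH)
		model_flush(d);		/* earlier completed writes hit media first */

	d->cache[blk] = val;
	d->dirty[blk] = true;

	if (flags & MODEL_FUA) {
		d->media[blk] = val;	/* durable before completion is signaled */
		d->dirty[blk] = false;
	}
}

/* Power loss: whatever lived only in the volatile cache is gone. */
static void model_power_loss(struct model_dev *d)
{
	for (int i = 0; i < MODEL_NBLOCKS; i++) {
		d->cache[i] = d->media[i];
		d->dirty[i] = false;
	}
}
```

In this model, two plain writes followed by a write flagged with
``MODEL_PREFLUSH | MODEL_FUA`` leave all three blocks on the media after
``model_power_loss()``, while a later plain write is lost. That mirrors the
way a journaling filesystem might order a commit record after the log blocks
it covers.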

Feature settings for block drivers
----------------------------------

For devices that do not support volatile write caches there is no driver
support required; the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload.

For devices with volatile write caches the driver needs to tell the block layer
that it supports flushing caches by setting the

   BLK_FEAT_WRITE_CACHE

flag in the features field of the queue_limits structure. For devices that also
support the FUA bit the block layer needs to be told to pass on the REQ_FUA bit
by also setting the

   BLK_FEAT_FUA

flag in the features field of the queue_limits structure.

Implementation details for bio based block drivers
--------------------------------------------------

For bio based drivers the REQ_PREFLUSH and REQ_FUA bits are simply passed on to
the driver if the driver sets the BLK_FEAT_WRITE_CACHE flag, and the driver
needs to handle them.

*NOTE*: The REQ_FUA bit also gets passed on when the BLK_FEAT_FUA flag is
_not_ set.
Any bio based driver that sets BLK_FEAT_WRITE_CACHE also needs to
handle REQ_FUA.

For remapping drivers the REQ_FUA bits need to be propagated to underlying
devices, and a global flush needs to be implemented for bios with the
REQ_PREFLUSH bit set.

Implementation details for blk-mq drivers
-----------------------------------------

When the BLK_FEAT_WRITE_CACHE flag is set, the block layer automatically turns
REQ_OP_WRITE | REQ_PREFLUSH requests with a payload into a sequence of a
REQ_OP_FLUSH request followed by the actual write.

When the BLK_FEAT_FUA flag is set, the REQ_FUA bit is simply passed on for the
REQ_OP_WRITE request. Otherwise, for bio submissions with the REQ_FUA bit set,
the block layer sends a REQ_OP_FLUSH request after the write request has
completed.
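
The request sequence a blk-mq driver ends up seeing for a flagged write with a
payload can be summarized with a small userspace model. This is a sketch of the
rules stated in this document, not the block layer's actual implementation;
``model_decompose()``, the ``MODEL_*`` constants and the ``STEP_*`` values are
hypothetical names introduced only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the request flags and queue_limits features. */
enum { MODEL_REQ_PREFLUSH = 1u << 0, MODEL_REQ_FUA = 1u << 1 };
enum { MODEL_FEAT_WRITE_CACHE = 1u << 0, MODEL_FEAT_FUA = 1u << 1 };

/* What the driver is asked to do, in order. */
enum model_step { STEP_FLUSH, STEP_WRITE, STEP_WRITE_FUA };

/*
 * Fill 'steps' (room for three entries) with the sequence the driver would
 * see for one write with a payload, and return the number of entries.
 */
static size_t model_decompose(unsigned features, unsigned flags,
			      enum model_step steps[3])
{
	size_t n = 0;

	/* No volatile write cache: both bits are stripped off. */
	if (!(features & MODEL_FEAT_WRITE_CACHE))
		flags = 0;

	/* PREFLUSH becomes a separate flush ahead of the write. */
	if (flags & MODEL_REQ_PREFLUSH)
		steps[n++] = STEP_FLUSH;

	if ((flags & MODEL_REQ_FUA) && (features & MODEL_FEAT_FUA)) {
		/* Device supports FUA: the bit is passed on with the write. */
		steps[n++] = STEP_WRITE_FUA;
	} else {
		steps[n++] = STEP_WRITE;
		/* FUA requested but unsupported: emulate with a post-flush. */
		if (flags & MODEL_REQ_FUA)
			steps[n++] = STEP_FLUSH;
	}
	return n;
}
```

With both flags set, the model yields a single plain write when no write cache
is advertised, flush, write, flush when only ``MODEL_FEAT_WRITE_CACHE`` is set,
and flush followed by a FUA write when both features are set.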