/*
** 2001 September 15
**
** The author disclaims copyright to this source code. In place of
** a legal notice, here is a blessing:
**
**    May you do good and not evil.
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.
**
*************************************************************************
** This is the implementation of the page cache subsystem or "pager".
**
** The pager is used to access a database disk file. It implements
** atomic commit and rollback through the use of a journal file that
** is separate from the database file. The pager also implements file
** locking to prevent two processes from writing the same database
** file simultaneously, or one process from reading the database while
** another is writing.
*/
#ifndef SQLITE_OMIT_DISKIO
#include "sqliteInt.h"
#include "wal.h"


/******************* NOTES ON THE DESIGN OF THE PAGER ************************
**
** This comment block describes invariants that hold when using a rollback
** journal. These invariants do not apply for journal_mode=WAL,
** journal_mode=MEMORY, or journal_mode=OFF.
**
** Within this comment block, a page is deemed to have been synced
** automatically as soon as it is written when PRAGMA synchronous=OFF.
** Otherwise, the page is not synced until the xSync method of the VFS
** is called successfully on the file containing the page.
**
** Definition: A page of the database file is said to be "overwriteable" if
** one or more of the following are true about the page:
**
**     (a) The original content of the page as it was at the beginning of
**         the transaction has been written into the rollback journal and
**         synced.
**
**     (b) The page was a freelist leaf page at the start of the transaction.
**
**     (c) The page number is greater than the largest page that existed in
**         the database file at the start of the transaction.
**
** (1) A page of the database file is never overwritten unless one of the
**     following is true:
**
**     (a) The page and all other pages on the same sector are overwriteable.
**
**     (b) The atomic page write optimization is enabled, and the entire
**         transaction other than the update of the transaction sequence
**         number consists of a single page change.
**
** (2) The content of a page written into the rollback journal exactly matches
**     both the content in the database when the rollback journal was written
**     and the content in the database at the beginning of the current
**     transaction.
**
** (3) Writes to the database file are an integer multiple of the page size
**     in length and are aligned on a page boundary.
**
** (4) Reads from the database file are either aligned on a page boundary and
**     an integer multiple of the page size in length or are taken from the
**     first 100 bytes of the database file.
**
** (5) All writes to the database file are synced prior to the rollback journal
**     being deleted, truncated, or zeroed.
**
** (6) If a master journal file is used, then all writes to the database file
**     are synced prior to the master journal being deleted.
**
** Definition: Two databases (or the same database at two points in time)
** are said to be "logically equivalent" if they give the same answer to
** all queries. Note in particular that the content of freelist leaf
** pages can be changed arbitrarily without affecting the logical equivalence
** of the database.
**
** (7) At any time, if any subset, including the empty set and the total set,
**     of the unsynced changes to a rollback journal are removed and the
**     journal is rolled back, the resulting database file will be logically
**     equivalent to the database file at the beginning of the transaction.
**
** (8) When a transaction is rolled back, the xTruncate method of the VFS
**     is called to restore the database file to the same size it was at
**     the beginning of the transaction. (In some VFSes, the xTruncate
**     method is a no-op, but that does not change the fact that SQLite will
**     invoke it.)
**
** (9) Whenever the database file is modified, at least one bit in the range
**     of bytes from 24 through 39 inclusive will be changed prior to releasing
**     the EXCLUSIVE lock, thus signaling other connections on the same
**     database to flush their caches.
**
** (10) The pattern of bits in bytes 24 through 39 shall not repeat in less
**      than one billion transactions.
**
** (11) A database file is well-formed at the beginning and at the conclusion
**      of every transaction.
**
** (12) An EXCLUSIVE lock is held on the database file when writing to
**      the database file.
**
** (13) A SHARED lock is held on the database file while reading any
**      content out of the database file.
**
******************************************************************************/

/*
** Macros for troubleshooting. Normally turned off.
*/
#if 0
int sqlite3PagerTrace=1;  /* True to enable tracing */
#define sqlite3DebugPrintf printf
#define PAGERTRACE(X)     if( sqlite3PagerTrace ){ sqlite3DebugPrintf X; }
#else
#define PAGERTRACE(X)
#endif

/*
** The following two macros are used within the PAGERTRACE() macros above
** to print out file-descriptors.
**
** PAGERID() takes a pointer to a Pager struct as its argument. The
** associated file-descriptor is returned. FILEHANDLEID() takes an sqlite3_file
** struct as its argument.
*/
#define PAGERID(p) (SQLITE_PTR_TO_INT(p->fd))
#define FILEHANDLEID(fd) (SQLITE_PTR_TO_INT(fd))

/*
** The Pager.eState variable stores the current 'state' of a pager.
** A pager may be in any one of the seven states shown in the following
** state diagram.
**
**                            OPEN <------+------+
**                              |         |      |
**                              V         |      |
**               +---------> READER-------+      |
**               |              |                |
**               |              V                |
**               |<-------WRITER_LOCKED------> ERROR
**               |              |                ^
**               |              V                |
**               |<------WRITER_CACHEMOD-------->|
**               |              |                |
**               |              V                |
**               |<-------WRITER_DBMOD---------->|
**               |              |                |
**               |              V                |
**               +<------WRITER_FINISHED-------->+
**
**
** List of state transitions and the C [function] that performs each:
**
**   OPEN              -> READER              [sqlite3PagerSharedLock]
**   READER            -> OPEN                [pager_unlock]
**
**   READER            -> WRITER_LOCKED       [sqlite3PagerBegin]
**   WRITER_LOCKED     -> WRITER_CACHEMOD     [pager_open_journal]
**   WRITER_CACHEMOD   -> WRITER_DBMOD        [syncJournal]
**   WRITER_DBMOD      -> WRITER_FINISHED     [sqlite3PagerCommitPhaseOne]
**   WRITER_***        -> READER              [pager_end_transaction]
**
**   WRITER_***        -> ERROR               [pager_error]
**   ERROR             -> OPEN                [pager_unlock]
**
**
**  OPEN:
**
**    The pager starts up in this state. Nothing is guaranteed in this
**    state - the file may or may not be locked and the database size is
**    unknown. The database may not be read or written.
**
**    * No read or write transaction is active.
**    * Any lock, or no lock at all, may be held on the database file.
**    * The dbSize, dbOrigSize and dbFileSize variables may not be trusted.
**
**  READER:
**
**    In this state all the requirements for reading the database in
**    rollback (non-WAL) mode are met. Unless the pager is (or recently
**    was) in exclusive-locking mode, a user-level read transaction is
**    open. The database size is known in this state.
**
**    A connection running with locking_mode=normal enters this state when
**    it opens a read-transaction on the database and returns to state
**    OPEN after the read-transaction is completed. However a connection
**    running in locking_mode=exclusive (including temp databases) remains in
**    this state even after the read-transaction is closed. The only way
**    a locking_mode=exclusive connection can transition from READER to OPEN
**    is via the ERROR state (see below).
**
**    * A read transaction may be active (but a write-transaction cannot).
**    * A SHARED or greater lock is held on the database file.
**    * The dbSize variable may be trusted (even if a user-level read
**      transaction is not active). The dbOrigSize and dbFileSize variables
**      may not be trusted at this point.
**    * If the database is a WAL database, then the WAL connection is open.
**    * Even if a read-transaction is not open, it is guaranteed that
**      there is no hot-journal in the file-system.
**
**  WRITER_LOCKED:
**
**    The pager moves to this state from READER when a write-transaction
**    is first opened on the database.
**    In WRITER_LOCKED state, all locks
**    required to start a write-transaction are held, but no actual
**    modifications to the cache or database have taken place.
**
**    In rollback mode, a RESERVED or (if the transaction was opened with
**    BEGIN EXCLUSIVE) EXCLUSIVE lock is obtained on the database file when
**    moving to this state, but the journal file is not opened or written
**    to in this state. If the transaction is committed or rolled back while
**    in WRITER_LOCKED state, all that is required is to unlock the database
**    file.
**
**    In WAL mode, sqlite3WalBeginWriteTransaction() is called to lock the
**    log file. If the connection is running with locking_mode=exclusive, an
**    attempt is made to obtain an EXCLUSIVE lock on the database file.
**
**    * A write transaction is active.
**    * If the connection is open in rollback-mode, a RESERVED or greater
**      lock is held on the database file.
**    * If the connection is open in WAL-mode, a WAL write transaction
**      is open (i.e. sqlite3WalBeginWriteTransaction() has been successfully
**      called).
**    * The dbSize, dbOrigSize and dbFileSize variables are all valid.
**    * The contents of the pager cache have not been modified.
**    * The journal file may or may not be open.
**    * Nothing (not even the first header) has been written to the journal.
**
**  WRITER_CACHEMOD:
**
**    A pager moves from WRITER_LOCKED state to this state when a page is
**    first modified by the upper layer.
**    In rollback mode the journal file
**    is opened (if it is not already open) and a header written to the
**    start of it. The database file on disk has not been modified.
**
**    * A write transaction is active.
**    * A RESERVED or greater lock is held on the database file.
**    * The journal file is open and the first header has been written
**      to it, but the header has not been synced to disk.
**    * The contents of the page cache have been modified.
**
**  WRITER_DBMOD:
**
**    The pager transitions from WRITER_CACHEMOD into WRITER_DBMOD state
**    when it modifies the contents of the database file. WAL connections
**    never enter this state (since they do not modify the database file,
**    just the log file).
**
**    * A write transaction is active.
**    * An EXCLUSIVE or greater lock is held on the database file.
**    * The journal file is open and the first header has been written
**      and synced to disk.
**    * The contents of the page cache have been modified (and possibly
**      written to disk).
**
**  WRITER_FINISHED:
**
**    It is not possible for a WAL connection to enter this state.
**
**    A rollback-mode pager changes to WRITER_FINISHED state from WRITER_DBMOD
**    state after the entire transaction has been successfully written into the
**    database file. In this state the transaction may be committed simply
**    by finalizing the journal file. Once in WRITER_FINISHED state, it is
**    not possible to modify the database further.
**    At this point, the upper
**    layer must either commit or roll back the transaction.
**
**    * A write transaction is active.
**    * An EXCLUSIVE or greater lock is held on the database file.
**    * All writing and syncing of journal and database data has finished.
**      If no error occurred, all that remains is to finalize the journal to
**      commit the transaction. If an error did occur, the caller will need
**      to roll back the transaction.
**
**  ERROR:
**
**    The ERROR state is entered when an IO or disk-full error (including
**    SQLITE_IOERR_NOMEM) occurs at a point in the code that makes it
**    difficult to be sure that the in-memory pager state (cache contents,
**    db size etc.) is consistent with the contents of the file-system.
**
**    Temporary pager files may enter the ERROR state, but in-memory pagers
**    cannot.
**
**    For example, if an IO error occurs while performing a rollback,
**    the contents of the page-cache may be left in an inconsistent state.
**    At this point it would be dangerous to change back to READER state
**    (as usually happens after a rollback). Any subsequent readers might
**    report database corruption (due to the inconsistent cache), and if
**    they upgrade to writers, they may inadvertently corrupt the database
**    file. To avoid this hazard, the pager switches into the ERROR state
**    instead of READER following such an error.
**
**    Once it has entered the ERROR state, any attempt to use the pager
**    to read or write data returns an error.
**    Eventually, once all
**    outstanding transactions have been abandoned, the pager is able to
**    transition back to OPEN state, discarding the contents of the
**    page-cache and any other in-memory state at the same time. Everything
**    is reloaded from disk (and, if necessary, hot-journal rollback performed)
**    when a read-transaction is next opened on the pager (transitioning
**    the pager into READER state). At that point the system has recovered
**    from the error.
**
**    Specifically, the pager jumps into the ERROR state if:
**
**      1. An error occurs while attempting a rollback. This happens in
**         function sqlite3PagerRollback().
**
**      2. An error occurs while attempting to finalize a journal file
**         following a commit in function sqlite3PagerCommitPhaseTwo().
**
**      3. An error occurs while attempting to write to the journal or
**         database file in function pagerStress() in order to free up
**         memory.
**
**    In other cases, the error is returned to the b-tree layer. The b-tree
**    layer then attempts a rollback operation. If the error condition
**    persists, the pager enters the ERROR state via condition (1) above.
**
**    Condition (3) is necessary because it can be triggered by a read-only
**    statement executed within a transaction. In this case, if the error
**    code were simply returned to the user, the b-tree layer would not
**    automatically attempt a rollback, as it assumes that an error in a
**    read-only statement cannot leave the pager in an internally inconsistent
**    state.
**
**    * The Pager.errCode variable is set to something other than SQLITE_OK.
**    * There are one or more outstanding references to pages (after the
**      last reference is dropped the pager should move back to OPEN state).
**    * The pager is not an in-memory pager.
**
**
** Notes:
**
**   * A pager is never in WRITER_DBMOD or WRITER_FINISHED state if the
**     connection is open in WAL mode. A WAL connection is always in one
**     of the first four states.
**
**   * Normally, a connection open in exclusive mode is never in PAGER_OPEN
**     state. There are two exceptions: immediately after exclusive-mode has
**     been turned on (and before any read or write transactions are
**     executed), and when the pager is leaving the "error state".
**
**   * See also: assert_pager_state().
*/
#define PAGER_OPEN                  0
#define PAGER_READER                1
#define PAGER_WRITER_LOCKED         2
#define PAGER_WRITER_CACHEMOD       3
#define PAGER_WRITER_DBMOD          4
#define PAGER_WRITER_FINISHED       5
#define PAGER_ERROR                 6

/*
** The Pager.eLock variable is almost always set to one of the
** following locking-states, according to the lock currently held on
** the database file: NO_LOCK, SHARED_LOCK, RESERVED_LOCK or EXCLUSIVE_LOCK.
** This variable is kept up to date as locks are taken and released by
** the pagerLockDb() and pagerUnlockDb() wrappers.
**
** If the VFS xLock() or xUnlock() returns an error other than SQLITE_BUSY
** (i.e.
** one of the SQLITE_IOERR subtypes), it is not clear whether or not
** the operation was successful. In these circumstances pagerLockDb() and
** pagerUnlockDb() take a conservative approach - eLock is always updated
** when unlocking the file, and only updated when locking the file if the
** VFS call is successful. This way, the Pager.eLock variable may be set
** to a less exclusive (lower) value than the lock that is actually held
** at the system level, but it is never set to a more exclusive value.
**
** This is usually safe. If an xUnlock fails or appears to fail, there may
** be a few redundant xLock() calls or a lock may be held for longer than
** required, but nothing really goes wrong.
**
** The exception is when the database file is unlocked as the pager moves
** from ERROR to OPEN state. At this point there may be a hot-journal file
** in the file-system that needs to be rolled back (as part of an OPEN->SHARED
** transition, by the same pager or any other). If the call to xUnlock()
** fails at this point and the pager is left holding an EXCLUSIVE lock, this
** can confuse the xCheckReservedLock() call made later as part
** of hot-journal detection.
**
** xCheckReservedLock() is defined as returning true "if there is a RESERVED
** lock held by this process or any others". So xCheckReservedLock may
** return true because the caller itself is holding an EXCLUSIVE lock (but
** doesn't know it because of a previous error in xUnlock).
** If this happens
** a hot-journal may be mistaken for a journal being created by an active
** transaction in another process, causing SQLite to read from the database
** without rolling it back.
**
** To work around this, if a call to xUnlock() fails when unlocking the
** database in the ERROR state, Pager.eLock is set to UNKNOWN_LOCK. It
** is only changed back to a real locking state after a successful call
** to xLock(EXCLUSIVE). Also, the code to do the OPEN->SHARED state transition
** omits the check for a hot-journal if Pager.eLock is set to UNKNOWN_LOCK.
** Instead, it assumes a hot-journal exists and obtains an EXCLUSIVE
** lock on the database file before attempting to roll it back. See function
** sqlite3PagerSharedLock() for more detail.
**
** Pager.eLock may only be set to UNKNOWN_LOCK when the pager is in
** PAGER_OPEN state.
*/
#define UNKNOWN_LOCK                (EXCLUSIVE_LOCK+1)

/*
** A macro used for invoking the codec if there is one
*/
#ifdef SQLITE_HAS_CODEC
# define CODEC1(P,D,N,X,E) \
    if( P->xCodec && P->xCodec(P->pCodec,D,N,X)==0 ){ E; }
# define CODEC2(P,D,N,X,E,O) \
    if( P->xCodec==0 ){ O=(char*)D; }else \
    if( (O=(char*)(P->xCodec(P->pCodec,D,N,X)))==0 ){ E; }
#else
# define CODEC1(P,D,N,X,E)   /* NO-OP */
# define CODEC2(P,D,N,X,E,O) O=(char*)D
#endif

/*
** The maximum allowed sector size. 64KiB.
** If the xSectorSize() method
** returns a value larger than this, then MAX_SECTOR_SIZE is used instead.
** This could conceivably cause corruption following a power failure on
** such a system. This is currently an undocumented limit.
*/
#define MAX_SECTOR_SIZE 0x10000


/*
** An instance of the following structure is allocated for each active
** savepoint and statement transaction in the system. All such structures
** are stored in the Pager.aSavepoint[] array, which is allocated and
** resized using sqlite3Realloc().
**
** When a savepoint is created, the PagerSavepoint.iHdrOffset field is
** set to 0. If a journal-header is written into the main journal while
** the savepoint is active, then iHdrOffset is set to the byte offset
** immediately following the last journal record written into the main
** journal before the journal-header. This is required during savepoint
** rollback (see pagerPlaybackSavepoint()).
*/
typedef struct PagerSavepoint PagerSavepoint;
struct PagerSavepoint {
  i64 iOffset;                 /* Starting offset in main journal */
  i64 iHdrOffset;              /* See above */
  Bitvec *pInSavepoint;        /* Set of pages in this savepoint */
  Pgno nOrig;                  /* Original number of pages in file */
  Pgno iSubRec;                /* Index of first record in sub-journal */
#ifndef SQLITE_OMIT_WAL
  u32 aWalData[WAL_SAVEPOINT_NDATA];        /* WAL savepoint context */
#endif
};

/*
** Bits of the Pager.doNotSpill flag. See further description below.
*/
#define SPILLFLAG_OFF         0x01 /* Never spill cache. Set via pragma */
#define SPILLFLAG_ROLLBACK    0x02 /* Currently rolling back, so do not spill */
#define SPILLFLAG_NOSYNC      0x04 /* Spill is ok, but do not sync */

/*
** An open page cache is an instance of struct Pager. A description of
** some of the more important member variables follows:
**
** eState
**
**   The current 'state' of the pager object. See the comment and state
**   diagram above for a description of the pager state.
**
** eLock
**
**   For a real on-disk database, the current lock held on the database file -
**   NO_LOCK, SHARED_LOCK, RESERVED_LOCK or EXCLUSIVE_LOCK.
**
**   For a temporary or in-memory database (neither of which require any
**   locks), this variable is always set to EXCLUSIVE_LOCK.
**   Since such
**   databases always have Pager.exclusiveMode==1, this tricks the pager
**   logic into thinking that it already has all the locks it will ever
**   need (and no reason to release them).
**
**   In some (obscure) circumstances, this variable may also be set to
**   UNKNOWN_LOCK. See the comment above the #define of UNKNOWN_LOCK for
**   details.
**
** changeCountDone
**
**   This boolean variable is used to make sure that the change-counter
**   (the 4-byte header field at byte offset 24 of the database file) is
**   not updated more often than necessary.
**
**   It is set to true when the change-counter field is updated, which
**   can only happen if an exclusive lock is held on the database file.
**   It is cleared (set to false) whenever an exclusive lock is
**   relinquished on the database file. Each time a transaction is committed,
**   the changeCountDone flag is inspected. If it is true, the work of
**   updating the change-counter is omitted for the current transaction.
**
**   This mechanism means that when running in exclusive mode, a connection
**   need only update the change-counter once, for the first transaction
**   committed.
**
** setMaster
**
**   When PagerCommitPhaseOne() is called to commit a transaction, it may
**   (or may not) specify a master-journal name to be written into the
**   journal file before it is synced to disk.
**
**   Whether or not a journal file contains a master-journal pointer affects
**   the way in which the journal file is finalized after the transaction is
**   committed or rolled back when running in "journal_mode=PERSIST" mode.
**   If a journal file does not contain a master-journal pointer, it is
**   finalized by overwriting the first journal header with zeroes. If
**   it does contain a master-journal pointer the journal file is finalized
**   by truncating it to zero bytes, just as if the connection were
**   running in "journal_mode=truncate" mode.
**
**   Journal files that contain master journal pointers cannot be finalized
**   simply by overwriting the first journal-header with zeroes, as the
**   master journal pointer could interfere with hot-journal rollback of any
**   subsequently interrupted transaction that reuses the journal file.
**
**   The flag is cleared as soon as the journal file is finalized (either
**   by PagerCommitPhaseTwo or PagerRollback). If an IO error prevents the
**   journal file from being successfully finalized, the setMaster flag
**   is cleared anyway (and the pager will move to ERROR state).
**
** doNotSpill
**
**   This variable controls the behavior of cache-spills (calls made by
**   the pcache module to the pagerStress() routine to write cached data
**   to the file-system in order to free up memory).
**
**   When bits SPILLFLAG_OFF or SPILLFLAG_ROLLBACK of doNotSpill are set,
**   writing to the database from pagerStress() is disabled altogether.
**   The SPILLFLAG_ROLLBACK bit is set in a very obscure case that comes up
**   during savepoint rollback, when the pcache module must allocate a new
**   page. Disabling spills there prevents the journal file from being
**   written while it is being traversed by code in pager_playback(). The
**   SPILLFLAG_OFF case is a user preference.
**
**   If the SPILLFLAG_NOSYNC bit is set, writing to the database from
**   pagerStress() is permitted, but syncing the journal file is not.
**   This flag is set by sqlite3PagerWrite() when the file-system sector-size
**   is larger than the database page-size in order to prevent a journal sync
**   from happening in between the journalling of two pages on the same sector.
**
** subjInMemory
**
**   This is a boolean variable. If true, then any required sub-journal
**   is opened as an in-memory journal file. If false, then in-memory
**   sub-journals are only used for in-memory pager files.
**
**   This variable is updated by the upper layer each time a new
**   write-transaction is opened.
**
** dbSize, dbOrigSize, dbFileSize
**
**   Variable dbSize is set to the number of pages in the database file.
**   It is valid in PAGER_READER and higher states (all states except for
**   OPEN and ERROR).
**
**   dbSize is set based on the size of the database file, which may be
**   larger than the size of the database (the value stored at offset
**   28 of the database header by the btree). If the size of the file
**   is not an integer multiple of the page-size, the value stored in
**   dbSize is rounded down (e.g. a 5KB file with 2K page-size has dbSize==2).
**   Except, any file that is greater than 0 bytes in size is considered
**   to have at least one page (e.g. a 1KB file with 2K page-size leads
**   to dbSize==1).
**
**   During a write-transaction, if pages with page-numbers greater than
**   dbSize are modified in the cache, dbSize is updated accordingly.
**   Similarly, if the database is truncated using PagerTruncateImage(),
**   dbSize is updated.
**
**   Variables dbOrigSize and dbFileSize are valid in states
**   PAGER_WRITER_LOCKED and higher. dbOrigSize is a copy of the dbSize
**   variable at the start of the transaction. It is used during rollback,
**   and to determine whether or not pages need to be journalled before
**   being modified.
**
**   Throughout a write-transaction, dbFileSize contains the size of
**   the file on disk in pages. It is set to a copy of dbSize when the
**   write-transaction is first opened, and updated when VFS calls are made
**   to write or truncate the database file on disk.
**
**   The only reason the dbFileSize variable is required is to suppress
**   unnecessary calls to xTruncate() after committing a transaction. If,
**   when a transaction is committed, the dbFileSize variable indicates
**   that the database file is larger than the database image (Pager.dbSize),
**   pager_truncate() is called. The pager_truncate() call uses xFilesize()
**   to measure the database file on disk, and then truncates it if required.
**   dbFileSize is not used when rolling back a transaction. In this case
**   pager_truncate() is called unconditionally (which means there may be
**   a call to xFilesize() that is not strictly required). In either case,
**   pager_truncate() may cause the file to become smaller or larger.
**
** dbHintSize
**
**   The dbHintSize variable is used to limit the number of calls made to
**   the VFS xFileControl(FCNTL_SIZE_HINT) method.
**
**   dbHintSize is set to a copy of the dbSize variable when a
**   write-transaction is opened (at the same time as dbFileSize and
**   dbOrigSize). If the xFileControl(FCNTL_SIZE_HINT) method is called,
**   dbHintSize is increased to the number of pages that correspond to the
**   size-hint passed to the method call. See pager_write_pagelist() for
**   details.
**
** errCode
**
**   The Pager.errCode variable is only ever used in PAGER_ERROR state. It
**   is set to zero in all other states. In PAGER_ERROR state, Pager.errCode
**   is always set to SQLITE_FULL, SQLITE_IOERR or one of the SQLITE_IOERR_XXX
**   sub-codes.
**
** syncFlags, walSyncFlags
**
**   syncFlags is either SQLITE_SYNC_NORMAL (0x02) or SQLITE_SYNC_FULL (0x03).
**   syncFlags is used for rollback mode. walSyncFlags is used for WAL mode
**   and contains the flags used to sync the checkpoint operations in the
**   lower two bits, and sync flags used for transaction commits in the WAL
**   file in bits 0x04 and 0x08. In other words, to get the correct sync flags
**   for checkpoint operations, use (walSyncFlags&0x03) and to get the correct
**   sync flags for transaction commit, use ((walSyncFlags>>2)&0x03). Note
**   that with synchronous=NORMAL in WAL mode, transaction commit is not synced
**   meaning that the 0x04 and 0x08 bits are both zero.
*/
struct Pager {
  sqlite3_vfs *pVfs;          /* OS functions to use for IO */
  u8 exclusiveMode;           /* Boolean. True if locking_mode==EXCLUSIVE */
  u8 journalMode;             /* One of the PAGER_JOURNALMODE_* values */
  u8 useJournal;              /* Use a rollback journal on this file */
  u8 noSync;                  /* Do not sync the journal if true */
  u8 fullSync;                /* Do extra syncs of the journal for robustness */
  u8 extraSync;               /* sync directory after journal delete */
  u8 syncFlags;               /* SYNC_NORMAL or SYNC_FULL otherwise */
  u8 walSyncFlags;            /* See description above */
  u8 tempFile;                /* zFilename is a temporary or immutable file */
  u8 noLock;                  /* Do not lock (except in WAL mode) */
  u8 readOnly;                /* True for a read-only database */
  u8 memDb;                   /* True to inhibit all file I/O */

  /**************************************************************************
  ** The following block contains those class members that change during
  ** routine operation. Class members not in this block are either fixed
  ** when the pager is first created or else only change when there is a
  ** significant mode change (such as changing the page_size, locking_mode,
  ** or the journal_mode). From another view, these class members describe
  ** the "state" of the pager, while other class members describe the
  ** "configuration" of the pager.
  */
  u8 eState;                  /* Pager state (OPEN, READER, WRITER_LOCKED..) */
  u8 eLock;                   /* Current lock held on database file */
  u8 changeCountDone;         /* Set after incrementing the change-counter */
  u8 setMaster;               /* True if a m-j name has been written to jrnl */
  u8 doNotSpill;              /* Do not spill the cache when non-zero */
  u8 subjInMemory;            /* True to use in-memory sub-journals */
  u8 bUseFetch;               /* True to use xFetch() */
  u8 hasHeldSharedLock;       /* True if a shared lock has ever been held */
  Pgno dbSize;                /* Number of pages in the database */
  Pgno dbOrigSize;            /* dbSize before the current transaction */
  Pgno dbFileSize;            /* Number of pages in the database file */
  Pgno dbHintSize;            /* Value passed to FCNTL_SIZE_HINT call */
  int errCode;                /* One of several kinds of errors */
  int nRec;                   /* Pages journalled since last j-header written */
  u32 cksumInit;              /* Quasi-random value added to every checksum */
  u32 nSubRec;                /* Number of records written to sub-journal */
  Bitvec *pInJournal;         /* One bit for each page in the database file */
  sqlite3_file *fd;           /* File descriptor for database */
  sqlite3_file *jfd;          /* File descriptor for main journal */
  sqlite3_file *sjfd;         /* File descriptor for sub-journal */
  i64 journalOff;             /* Current write offset in the journal file */
  i64 journalHdr;             /* Byte offset to previous journal header */
  sqlite3_backup *pBackup;    /* Pointer to list of ongoing backup processes */
  PagerSavepoint *aSavepoint; /* Array of active savepoints */
  int nSavepoint;             /* Number of elements in aSavepoint[] */
  u32 iDataVersion;           /* Changes whenever database content changes */
  char dbFileVers[16];        /* Changes whenever database file changes */

  int nMmapOut;               /* Number of mmap pages currently outstanding */
  sqlite3_int64 szMmap;       /* Desired maximum mmap size */
  PgHdr *pMmapFreelist;       /* List of free mmap page headers (pDirty) */
  /*
  ** End of the routinely-changing class members
  ***************************************************************************/

  u16 nExtra;                 /* Add this many bytes to each in-memory page */
  i16 nReserve;               /* Number of unused bytes at end of each page */
  u32 vfsFlags;               /* Flags for sqlite3_vfs.xOpen() */
  u32 sectorSize;             /* Assumed sector size during rollback */
  int pageSize;               /* Number of bytes in a page */
  Pgno mxPgno;                /* Maximum allowed size of the database */
  i64 journalSizeLimit;       /* Size limit for persistent journal files */
  char *zFilename;            /* Name of the database file */
  char *zJournal;             /* Name of the journal file */
  int (*xBusyHandler)(void*); /* Function to call when busy */
  void *pBusyHandlerArg;      /* Context argument for xBusyHandler */
  int aStat[4];               /* Total cache hits, misses, writes, spills */
#ifdef SQLITE_TEST
  int nRead;                  /* Database pages read */
#endif
  void (*xReiniter)(DbPage*); /* Call this routine when reloading pages */
  int (*xGet)(Pager*,Pgno,DbPage**,int); /* Routine to fetch a page */
#ifdef SQLITE_HAS_CODEC
  void *(*xCodec)(void*,void*,Pgno,int); /* Routine for en/decoding data */
  void (*xCodecSizeChng)(void*,int,int); /* Notify of page size changes */
  void (*xCodecFree)(void*);             /* Destructor for the codec */
  void *pCodec;               /* First argument to xCodec... methods */
#endif
  char *pTmpSpace;            /* Pager.pageSize bytes of space for tmp use */
  PCache *pPCache;            /* Pointer to page cache object */
#ifndef SQLITE_OMIT_WAL
  Wal *pWal;                  /* Write-ahead log used by "journal_mode=wal" */
  char *zWal;                 /* File name for write-ahead log */
#endif
};

/*
** Indexes for use with Pager.aStat[]. The Pager.aStat[] array contains
** the values accessed by passing SQLITE_DBSTATUS_CACHE_HIT, CACHE_MISS
** or CACHE_WRITE to sqlite3_db_status().
*/
#define PAGER_STAT_HIT   0
#define PAGER_STAT_MISS  1
#define PAGER_STAT_WRITE 2
#define PAGER_STAT_SPILL 3

/*
** The following global variables hold counters used for
** testing purposes only. These variables do not exist in
** a non-testing build. These variables are not thread-safe.
*/
#ifdef SQLITE_TEST
int sqlite3_pager_readdb_count = 0;    /* Number of full pages read from DB */
int sqlite3_pager_writedb_count = 0;   /* Number of full pages written to DB */
int sqlite3_pager_writej_count = 0;    /* Number of pages written to journal */
# define PAGER_INCR(v)  v++
#else
# define PAGER_INCR(v)
#endif



/*
** Journal files begin with the following magic string. The data
** was obtained from /dev/random. It is used only as a sanity check.
**
** Since version 2.8.0, the journal format contains additional sanity
** checking information. If the power fails while the journal is being
** written, semi-random garbage data might appear in the journal
** file after power is restored. If an attempt is then made
** to roll the journal back, the database could be corrupted. The additional
** sanity checking data is an attempt to discover the garbage in the
** journal and ignore it.
**
** The sanity checking information for the new journal format consists
** of a 32-bit checksum on each page of data. The checksum covers both
** the page number and the pPager->pageSize bytes of data for the page.
** This cksum is initialized to a 32-bit random value that appears in the
** journal file right after the header. The random initializer is important,
** because garbage data that appears at the end of a journal is likely
** data that was once in other files that have now been deleted. If the
** garbage data came from an obsolete journal file, the checksums might
** be correct. But by initializing the checksum to a random value which
** is different for every journal, we minimize that risk.
*/
static const unsigned char aJournalMagic[] = {
  0xd9, 0xd5, 0x05, 0xf9, 0x20, 0xa1, 0x63, 0xd7,
};

/*
** The size of each page record in the journal is given by
** the following macro.
*/
#define JOURNAL_PG_SZ(pPager)  ((pPager->pageSize) + 8)

/*
** The journal header size for this pager. This is usually the same
** size as a single disk sector. See also setSectorSize().
*/
#define JOURNAL_HDR_SZ(pPager) (pPager->sectorSize)

/*
** The macro MEMDB is true if we are dealing with an in-memory database.
** We do this as a macro so that if the SQLITE_OMIT_MEMORYDB macro is set,
** the value of MEMDB will be a constant and the compiler will optimize
** out code that would never execute.
*/
#ifdef SQLITE_OMIT_MEMORYDB
# define MEMDB 0
#else
# define MEMDB pPager->memDb
#endif

/*
** The macro USEFETCH is true if we are allowed to use the xFetch and xUnfetch
** interfaces to access the database using memory-mapped I/O.
*/
#if SQLITE_MAX_MMAP_SIZE>0
# define USEFETCH(x) ((x)->bUseFetch)
#else
# define USEFETCH(x) 0
#endif

/*
** The maximum legal page number is (2^31 - 1).
*/
#define PAGER_MAX_PGNO 2147483647

/*
** The argument to this macro is a file descriptor (type sqlite3_file*).
** Return 0 if it is not open, or 1 if it is.
**
** This is so that expressions can be written as:
**
**   if( isOpen(pPager->jfd) ){ ...
**
** instead of
**
**   if( pPager->jfd->pMethods ){ ...
*/
#define isOpen(pFd) ((pFd)->pMethods!=0)

/*
** Return true if this pager uses a write-ahead log to read page pgno.
** Return false if the pager reads pgno directly from the database.
*/
#if !defined(SQLITE_OMIT_WAL) && defined(SQLITE_DIRECT_OVERFLOW_READ)
int sqlite3PagerUseWal(Pager *pPager, Pgno pgno){
  u32 iRead = 0;
  int rc;
  if( pPager->pWal==0 ) return 0;
  rc = sqlite3WalFindFrame(pPager->pWal, pgno, &iRead);
  return rc || iRead;
}
#endif
#ifndef SQLITE_OMIT_WAL
# define pagerUseWal(x) ((x)->pWal!=0)
#else
# define pagerUseWal(x) 0
# define pagerRollbackWal(x) 0
# define pagerWalFrames(v,w,x,y) 0
# define pagerOpenWalIfPresent(z) SQLITE_OK
# define pagerBeginReadTransaction(z) SQLITE_OK
#endif

#ifndef NDEBUG
/*
** Usage:
**
**   assert( assert_pager_state(pPager) );
**
** This function runs many asserts to try to find inconsistencies in
** the internal state of the Pager object.
*/
static int assert_pager_state(Pager *p){
  Pager *pPager = p;

  /* State must be valid. */
  assert( p->eState==PAGER_OPEN
       || p->eState==PAGER_READER
       || p->eState==PAGER_WRITER_LOCKED
       || p->eState==PAGER_WRITER_CACHEMOD
       || p->eState==PAGER_WRITER_DBMOD
       || p->eState==PAGER_WRITER_FINISHED
       || p->eState==PAGER_ERROR
  );

  /* Regardless of the current state, a temp-file connection always behaves
  ** as if it has an exclusive lock on the database file. It never updates
  ** the change-counter field, so the changeCountDone flag is always set.
  */
  assert( p->tempFile==0 || p->eLock==EXCLUSIVE_LOCK );
  assert( p->tempFile==0 || pPager->changeCountDone );

  /* If the useJournal flag is clear, the journal-mode must be "OFF".
  ** And if the journal-mode is "OFF", the journal file must not be open.
  */
  assert( p->journalMode==PAGER_JOURNALMODE_OFF || p->useJournal );
  assert( p->journalMode!=PAGER_JOURNALMODE_OFF || !isOpen(p->jfd) );

  /* Check that MEMDB implies noSync, and an in-memory journal. Since
  ** this means an in-memory pager performs no IO at all, it cannot encounter
  ** either SQLITE_IOERR or SQLITE_FULL during rollback or while finalizing
  ** a journal file (although the in-memory journal implementation may
  ** return SQLITE_IOERR_NOMEM while the journal file is being written). It
  ** is therefore not possible for an in-memory pager to enter the ERROR
  ** state.
  */
  if( MEMDB ){
    assert( !isOpen(p->fd) );
    assert( p->noSync );
    assert( p->journalMode==PAGER_JOURNALMODE_OFF
         || p->journalMode==PAGER_JOURNALMODE_MEMORY
    );
    assert( p->eState!=PAGER_ERROR && p->eState!=PAGER_OPEN );
    assert( pagerUseWal(p)==0 );
  }

  /* If changeCountDone is set, a RESERVED lock or greater must be held
  ** on the file.
  */
  assert( pPager->changeCountDone==0 || pPager->eLock>=RESERVED_LOCK );
  assert( p->eLock!=PENDING_LOCK );

  switch( p->eState ){
    case PAGER_OPEN:
      assert( !MEMDB );
      assert( pPager->errCode==SQLITE_OK );
      assert( sqlite3PcacheRefCount(pPager->pPCache)==0 || pPager->tempFile );
      break;

    case PAGER_READER:
      assert( pPager->errCode==SQLITE_OK );
      assert( p->eLock!=UNKNOWN_LOCK );
      assert( p->eLock>=SHARED_LOCK );
      break;

    case PAGER_WRITER_LOCKED:
      assert( p->eLock!=UNKNOWN_LOCK );
      assert( pPager->errCode==SQLITE_OK );
      if( !pagerUseWal(pPager) ){
        assert( p->eLock>=RESERVED_LOCK );
      }
      assert( pPager->dbSize==pPager->dbOrigSize );
      assert( pPager->dbOrigSize==pPager->dbFileSize );
      assert( pPager->dbOrigSize==pPager->dbHintSize );
      assert( pPager->setMaster==0 );
      break;

    case PAGER_WRITER_CACHEMOD:
      assert( p->eLock!=UNKNOWN_LOCK );
      assert( pPager->errCode==SQLITE_OK );
      if( !pagerUseWal(pPager) ){
        /* It is possible that if journal_mode=wal here that neither the
        ** journal file nor the WAL file are open. This happens during
        ** a rollback transaction that switches from journal_mode=off
        ** to journal_mode=wal.
        */
        assert( p->eLock>=RESERVED_LOCK );
        assert( isOpen(p->jfd)
             || p->journalMode==PAGER_JOURNALMODE_OFF
             || p->journalMode==PAGER_JOURNALMODE_WAL
        );
      }
      assert( pPager->dbOrigSize==pPager->dbFileSize );
      assert( pPager->dbOrigSize==pPager->dbHintSize );
      break;

    case PAGER_WRITER_DBMOD:
      assert( p->eLock==EXCLUSIVE_LOCK );
      assert( pPager->errCode==SQLITE_OK );
      assert( !pagerUseWal(pPager) );
      assert( p->eLock>=EXCLUSIVE_LOCK );
      assert( isOpen(p->jfd)
           || p->journalMode==PAGER_JOURNALMODE_OFF
           || p->journalMode==PAGER_JOURNALMODE_WAL
           || (sqlite3OsDeviceCharacteristics(p->fd)&SQLITE_IOCAP_BATCH_ATOMIC)
      );
      assert( pPager->dbOrigSize<=pPager->dbHintSize );
      break;

    case PAGER_WRITER_FINISHED:
      assert( p->eLock==EXCLUSIVE_LOCK );
      assert( pPager->errCode==SQLITE_OK );
      assert( !pagerUseWal(pPager) );
      assert( isOpen(p->jfd)
           || p->journalMode==PAGER_JOURNALMODE_OFF
           || p->journalMode==PAGER_JOURNALMODE_WAL
           || (sqlite3OsDeviceCharacteristics(p->fd)&SQLITE_IOCAP_BATCH_ATOMIC)
      );
      break;

    case PAGER_ERROR:
      /* There must be at least one outstanding reference to the pager if
      ** in ERROR state. Otherwise the pager should have already dropped
      ** back to OPEN state.
      */
      assert( pPager->errCode!=SQLITE_OK );
      assert( sqlite3PcacheRefCount(pPager->pPCache)>0 || pPager->tempFile );
      break;
  }

  return 1;
}
#endif /* ifndef NDEBUG */

#ifdef SQLITE_DEBUG
/*
** Return a pointer to a human readable string in a static buffer
** containing the state of the Pager object passed as an argument. This
** is intended to be used within debuggers. For example, as an alternative
** to "print *pPager" in gdb:
**
** (gdb) printf "%s", print_pager_state(pPager)
*/
static char *print_pager_state(Pager *p){
  static char zRet[1024];

  sqlite3_snprintf(1024, zRet,
      "Filename: %s\n"
      "State: %s errCode=%d\n"
      "Lock: %s\n"
      "Locking mode: locking_mode=%s\n"
      "Journal mode: journal_mode=%s\n"
      "Backing store: tempFile=%d memDb=%d useJournal=%d\n"
      "Journal: journalOff=%lld journalHdr=%lld\n"
      "Size: dbsize=%d dbOrigSize=%d dbFileSize=%d\n"
      , p->zFilename
      , p->eState==PAGER_OPEN            ? "OPEN" :
        p->eState==PAGER_READER          ? "READER" :
        p->eState==PAGER_WRITER_LOCKED   ? "WRITER_LOCKED" :
        p->eState==PAGER_WRITER_CACHEMOD ? "WRITER_CACHEMOD" :
        p->eState==PAGER_WRITER_DBMOD    ? "WRITER_DBMOD" :
        p->eState==PAGER_WRITER_FINISHED ? "WRITER_FINISHED" :
        p->eState==PAGER_ERROR           ? "ERROR" : "?error?"
      , (int)p->errCode
      , p->eLock==NO_LOCK         ? "NO_LOCK" :
        p->eLock==RESERVED_LOCK   ? "RESERVED" :
        p->eLock==EXCLUSIVE_LOCK  ? "EXCLUSIVE" :
        p->eLock==SHARED_LOCK     ? "SHARED" :
        p->eLock==UNKNOWN_LOCK    ? "UNKNOWN" : "?error?"
      , p->exclusiveMode ? "exclusive" : "normal"
      , p->journalMode==PAGER_JOURNALMODE_MEMORY   ? "memory" :
        p->journalMode==PAGER_JOURNALMODE_OFF      ? "off" :
        p->journalMode==PAGER_JOURNALMODE_DELETE   ? "delete" :
        p->journalMode==PAGER_JOURNALMODE_PERSIST  ? "persist" :
        p->journalMode==PAGER_JOURNALMODE_TRUNCATE ? "truncate" :
        p->journalMode==PAGER_JOURNALMODE_WAL      ? "wal" : "?error?"
      , (int)p->tempFile, (int)p->memDb, (int)p->useJournal
      , p->journalOff, p->journalHdr
      , (int)p->dbSize, (int)p->dbOrigSize, (int)p->dbFileSize
  );

  return zRet;
}
#endif

/* Forward references to the various page getters */
static int getPageNormal(Pager*,Pgno,DbPage**,int);
static int getPageError(Pager*,Pgno,DbPage**,int);
#if SQLITE_MAX_MMAP_SIZE>0
static int getPageMMap(Pager*,Pgno,DbPage**,int);
#endif

/*
** Set the Pager.xGet method for the appropriate routine used to fetch
** content from the pager.
*/
static void setGetterMethod(Pager *pPager){
  if( pPager->errCode ){
    pPager->xGet = getPageError;
#if SQLITE_MAX_MMAP_SIZE>0
  }else if( USEFETCH(pPager)
#ifdef SQLITE_HAS_CODEC
   && pPager->xCodec==0
#endif
  ){
    pPager->xGet = getPageMMap;
#endif /* SQLITE_MAX_MMAP_SIZE>0 */
  }else{
    pPager->xGet = getPageNormal;
  }
}

/*
** Return true if it is necessary to write page *pPg into the sub-journal.
** A page needs to be written into the sub-journal if there exists one
** or more open savepoints for which:
**
**   * The page-number is less than or equal to PagerSavepoint.nOrig, and
**   * The bit corresponding to the page-number is not set in
**     PagerSavepoint.pInSavepoint.
*/
static int subjRequiresPage(PgHdr *pPg){
  Pager *pPager = pPg->pPager;
  PagerSavepoint *p;
  Pgno pgno = pPg->pgno;
  int i;
  for(i=0; i<pPager->nSavepoint; i++){
    p = &pPager->aSavepoint[i];
    if( p->nOrig>=pgno && 0==sqlite3BitvecTestNotNull(p->pInSavepoint, pgno) ){
      return 1;
    }
  }
  return 0;
}

#ifdef SQLITE_DEBUG
/*
** Return true if the page is already in the journal file.
*/
static int pageInJournal(Pager *pPager, PgHdr *pPg){
  return sqlite3BitvecTest(pPager->pInJournal, pPg->pgno);
}
#endif

/*
** Read a 32-bit integer from the given file descriptor. Store the integer
** that is read in *pRes. Return SQLITE_OK if everything worked, or an
** error code if something goes wrong.
**
** All values are stored on disk as big-endian.
*/
static int read32bits(sqlite3_file *fd, i64 offset, u32 *pRes){
  unsigned char ac[4];
  int rc = sqlite3OsRead(fd, ac, sizeof(ac), offset);
  if( rc==SQLITE_OK ){
    *pRes = sqlite3Get4byte(ac);
  }
  return rc;
}

/*
** Write a 32-bit integer into a string buffer in big-endian byte order.
*/
#define put32bits(A,B)  sqlite3Put4byte((u8*)A,B)


/*
** Write a 32-bit integer into the given file descriptor. Return SQLITE_OK
** on success or an error code if something goes wrong.
*/
static int write32bits(sqlite3_file *fd, i64 offset, u32 val){
  char ac[4];
  put32bits(ac, val);
  return sqlite3OsWrite(fd, ac, 4, offset);
}

/*
** Unlock the database file to level eLock, which must be either NO_LOCK
** or SHARED_LOCK. Regardless of whether or not the call to xUnlock()
** succeeds, set the Pager.eLock variable to match the (attempted) new lock.
**
** Except, if Pager.eLock is set to UNKNOWN_LOCK when this function is
** called, do not modify it. See the comment above the #define of
** UNKNOWN_LOCK for an explanation of this.
*/
static int pagerUnlockDb(Pager *pPager, int eLock){
  int rc = SQLITE_OK;

  assert( !pPager->exclusiveMode || pPager->eLock==eLock );
  assert( eLock==NO_LOCK || eLock==SHARED_LOCK );
  assert( eLock!=NO_LOCK || pagerUseWal(pPager)==0 );
  if( isOpen(pPager->fd) ){
    assert( pPager->eLock>=eLock );
    rc = pPager->noLock ? SQLITE_OK : sqlite3OsUnlock(pPager->fd, eLock);
    if( pPager->eLock!=UNKNOWN_LOCK ){
      pPager->eLock = (u8)eLock;
    }
    IOTRACE(("UNLOCK %p %d\n", pPager, eLock))
  }
  return rc;
}

/*
** Lock the database file to level eLock, which must be either SHARED_LOCK,
** RESERVED_LOCK or EXCLUSIVE_LOCK. If the call is successful, set the
** Pager.eLock variable to the new locking state.
**
** Except, if Pager.eLock is set to UNKNOWN_LOCK when this function is
** called, do not modify it unless the new locking state is EXCLUSIVE_LOCK.
** See the comment above the #define of UNKNOWN_LOCK for an explanation
** of this.
*/
static int pagerLockDb(Pager *pPager, int eLock){
  int rc = SQLITE_OK;

  assert( eLock==SHARED_LOCK || eLock==RESERVED_LOCK || eLock==EXCLUSIVE_LOCK );
  if( pPager->eLock<eLock || pPager->eLock==UNKNOWN_LOCK ){
    rc = pPager->noLock ? SQLITE_OK : sqlite3OsLock(pPager->fd, eLock);
    if( rc==SQLITE_OK && (pPager->eLock!=UNKNOWN_LOCK||eLock==EXCLUSIVE_LOCK) ){
      pPager->eLock = (u8)eLock;
      IOTRACE(("LOCK %p %d\n", pPager, eLock))
    }
  }
  return rc;
}

/*
** This function determines whether or not the atomic-write or
** atomic-batch-write optimizations can be used with this pager. The
** atomic-write optimization can be used if:
**
**  (a) the value returned by OsDeviceCharacteristics() indicates that
**      a database page may be written atomically, and
**  (b) the value returned by OsSectorSize() is less than or equal
**      to the page size.
**
** If it can be used, then the value returned is the size of the journal
** file when it contains rollback data for exactly one page.
**
** The atomic-batch-write optimization can be used if OsDeviceCharacteristics()
** returns a value with the SQLITE_IOCAP_BATCH_ATOMIC bit set. -1 is
** returned in this case.
**
** If neither optimization can be used, 0 is returned.
*/
static int jrnlBufferSize(Pager *pPager){
  assert( !MEMDB );

#if defined(SQLITE_ENABLE_ATOMIC_WRITE) \
 || defined(SQLITE_ENABLE_BATCH_ATOMIC_WRITE)
  int dc;                           /* Device characteristics */

  assert( isOpen(pPager->fd) );
  dc = sqlite3OsDeviceCharacteristics(pPager->fd);
#else
  UNUSED_PARAMETER(pPager);
#endif

#ifdef SQLITE_ENABLE_BATCH_ATOMIC_WRITE
  if( pPager->dbSize>0 && (dc&SQLITE_IOCAP_BATCH_ATOMIC) ){
    return -1;
  }
#endif

#ifdef SQLITE_ENABLE_ATOMIC_WRITE
  {
    int nSector = pPager->sectorSize;
    int szPage = pPager->pageSize;

    assert(SQLITE_IOCAP_ATOMIC512==(512>>8));
    assert(SQLITE_IOCAP_ATOMIC64K==(65536>>8));
    if( 0==(dc&(SQLITE_IOCAP_ATOMIC|(szPage>>8)) || nSector>szPage) ){
      return 0;
    }
  }

  return JOURNAL_HDR_SZ(pPager) + JOURNAL_PG_SZ(pPager);
#endif

  return 0;
}

/*
** If SQLITE_CHECK_PAGES is defined then we do some sanity checking
** on the cache using a hash function.  This is used for testing
** and debugging only.
*/
#ifdef SQLITE_CHECK_PAGES
/*
** Return a 32-bit hash of the page data for pPage.
*/
static u32 pager_datahash(int nByte, unsigned char *pData){
  u32 hash = 0;
  int i;
  for(i=0; i<nByte; i++){
    hash = (hash*1039) + pData[i];
  }
  return hash;
}
static u32 pager_pagehash(PgHdr *pPage){
  return pager_datahash(pPage->pPager->pageSize, (unsigned char *)pPage->pData);
}
static void pager_set_pagehash(PgHdr *pPage){
  pPage->pageHash = pager_pagehash(pPage);
}

/*
** The CHECK_PAGE macro takes a PgHdr* as an argument. If SQLITE_CHECK_PAGES
** is defined, and NDEBUG is not defined, an assert() statement checks
** that the page is either dirty or still matches the calculated page-hash.
*/
#define CHECK_PAGE(x) checkPage(x)
static void checkPage(PgHdr *pPg){
  Pager *pPager = pPg->pPager;
  assert( pPager->eState!=PAGER_ERROR );
  assert( (pPg->flags&PGHDR_DIRTY) || pPg->pageHash==pager_pagehash(pPg) );
}

#else
#define pager_datahash(X,Y)  0
#define pager_pagehash(X)  0
#define pager_set_pagehash(X)
#define CHECK_PAGE(x)
#endif  /* SQLITE_CHECK_PAGES */

/*
** When this is called the journal file for pager pPager must be open.
** This function attempts to read a master journal file name from the
** end of the file and, if successful, copies it into memory supplied
** by the caller. See comments above writeMasterJournal() for the format
** used to store a master journal file name at the end of a journal file.
**
** zMaster must point to a buffer of at least nMaster bytes allocated by
** the caller. This should be sqlite3_vfs.mxPathname+1 (to ensure there is
** enough space to write the master journal name). If the master journal
** name in the journal is longer than nMaster bytes (including a
** nul-terminator), then this is handled as if no master journal name
** were present in the journal.
**
** If a master journal file name is present at the end of the journal
** file, then it is copied into the buffer pointed to by zMaster. A
** nul-terminator byte is appended to the buffer following the master
** journal file name.
**
** If it is determined that no master journal file name is present,
** zMaster[0] is set to 0 and SQLITE_OK is returned.
**
** If an error occurs while reading from the journal file, an SQLite
** error code is returned.
*/
static int readMasterJournal(sqlite3_file *pJrnl, char *zMaster, u32 nMaster){
  int rc;                    /* Return code */
  u32 len;                   /* Length in bytes of master journal name */
  i64 szJ;                   /* Total size in bytes of journal file pJrnl */
  u32 cksum;                 /* MJ checksum value read from journal */
  u32 u;                     /* Unsigned loop counter */
  unsigned char aMagic[8];   /* A buffer to hold the magic header */
  zMaster[0] = '\0';

  if( SQLITE_OK!=(rc = sqlite3OsFileSize(pJrnl, &szJ))
   || szJ<16
   || SQLITE_OK!=(rc = read32bits(pJrnl, szJ-16, &len))
   || len>=nMaster
   || len>szJ-16
   || len==0
   || SQLITE_OK!=(rc = read32bits(pJrnl, szJ-12, &cksum))
   || SQLITE_OK!=(rc = sqlite3OsRead(pJrnl, aMagic, 8, szJ-8))
   || memcmp(aMagic, aJournalMagic, 8)
   || SQLITE_OK!=(rc = sqlite3OsRead(pJrnl, zMaster, len, szJ-16-len))
  ){
    return rc;
  }

  /* See if the checksum matches the master journal name */
  for(u=0; u<len; u++){
    cksum -= zMaster[u];
  }
  if( cksum ){
    /* If the checksum doesn't add up, then one or more of the disk sectors
    ** containing the master journal filename is corrupted. This means
    ** definitely roll back, so just return SQLITE_OK and report a (nul)
    ** master-journal filename.
    */
    len = 0;
  }
  zMaster[len] = '\0';

  return SQLITE_OK;
}

/*
** Return the offset of the sector boundary at or immediately
** following the value in pPager->journalOff, assuming a sector
** size of pPager->sectorSize bytes.
**
** i.e. for a sector size of 512:
**
**   Pager.journalOff          Return value
**   ---------------------------------------
**   0                         0
**   512                       512
**   100                       512
**   2000                      2048
**
*/
static i64 journalHdrOffset(Pager *pPager){
  i64 offset = 0;
  i64 c = pPager->journalOff;
  if( c ){
    offset = ((c-1)/JOURNAL_HDR_SZ(pPager) + 1) * JOURNAL_HDR_SZ(pPager);
  }
  assert( offset%JOURNAL_HDR_SZ(pPager)==0 );
  assert( offset>=c );
  assert( (offset-c)<JOURNAL_HDR_SZ(pPager) );
  return offset;
}

/*
** The journal file must be open when this function is called.
**
** This function is a no-op if the journal file has not been written to
** within the current transaction (i.e. if Pager.journalOff==0).
**
** If doTruncate is non-zero or the Pager.journalSizeLimit variable is
** set to 0, then truncate the journal file to zero bytes in size. Otherwise,
** zero the 28-byte header at the start of the journal file. In either case,
** if the pager is not in no-sync mode, sync the journal file immediately
** after writing or truncating it.
**
** If Pager.journalSizeLimit is set to a positive, non-zero value, and
** following the truncation or zeroing described above the size of the
** journal file in bytes is larger than this value, then truncate the
** journal file to Pager.journalSizeLimit bytes. The journal file does
** not need to be synced following this operation.
**
** If an IO error occurs, abandon processing and return the IO error code.
** Otherwise, return SQLITE_OK.
*/
static int zeroJournalHdr(Pager *pPager, int doTruncate){
  int rc = SQLITE_OK;                               /* Return code */
  assert( isOpen(pPager->jfd) );
  assert( !sqlite3JournalIsInMemory(pPager->jfd) );
  if( pPager->journalOff ){
    const i64 iLimit = pPager->journalSizeLimit;    /* Local cache of jsl */

    IOTRACE(("JZEROHDR %p\n", pPager))
    if( doTruncate || iLimit==0 ){
      rc = sqlite3OsTruncate(pPager->jfd, 0);
    }else{
      static const char zeroHdr[28] = {0};
      rc = sqlite3OsWrite(pPager->jfd, zeroHdr, sizeof(zeroHdr), 0);
    }
    if( rc==SQLITE_OK && !pPager->noSync ){
      rc = sqlite3OsSync(pPager->jfd, SQLITE_SYNC_DATAONLY|pPager->syncFlags);
    }

    /* At this point the transaction is committed but the write lock
    ** is still held on the file. If there is a size limit configured for
    ** the persistent journal and the journal file currently consumes more
    ** space than that limit allows for, truncate it now. There is no need
    ** to sync the file following this operation.
    */
    if( rc==SQLITE_OK && iLimit>0 ){
      i64 sz;
      rc = sqlite3OsFileSize(pPager->jfd, &sz);
      if( rc==SQLITE_OK && sz>iLimit ){
        rc = sqlite3OsTruncate(pPager->jfd, iLimit);
      }
    }
  }
  return rc;
}

/*
** The journal file must be open when this routine is called. A journal
** header (JOURNAL_HDR_SZ bytes) is written into the journal file at the
** current location.
**
** The format for the journal header is as follows:
** - 8 bytes: Magic identifying journal format.
** - 4 bytes: Number of records in journal, or -1 if no-sync mode is on.
** - 4 bytes: Random number used for page hash.
** - 4 bytes: Initial database page count.
** - 4 bytes: Sector size used by the process that wrote this journal.
** - 4 bytes: Database page size.
**
** Followed by (JOURNAL_HDR_SZ - 28) bytes of unused space.
*/
static int writeJournalHdr(Pager *pPager){
  int rc = SQLITE_OK;                 /* Return code */
  char *zHeader = pPager->pTmpSpace;  /* Temporary space used to build header */
  u32 nHeader = (u32)pPager->pageSize;/* Size of buffer pointed to by zHeader */
  u32 nWrite;                         /* Bytes of header sector written */
  int ii;                             /* Loop counter */

  assert( isOpen(pPager->jfd) );      /* Journal file must be open. */

  if( nHeader>JOURNAL_HDR_SZ(pPager) ){
    nHeader = JOURNAL_HDR_SZ(pPager);
  }

  /* If there are active savepoints and any of them were created
  ** since the most recent journal header was written, update the
  ** PagerSavepoint.iHdrOffset fields now.
  */
  for(ii=0; ii<pPager->nSavepoint; ii++){
    if( pPager->aSavepoint[ii].iHdrOffset==0 ){
      pPager->aSavepoint[ii].iHdrOffset = pPager->journalOff;
    }
  }

  pPager->journalHdr = pPager->journalOff = journalHdrOffset(pPager);

  /*
  ** Write the nRec Field - the number of page records that follow this
  ** journal header. Normally, zero is written to this value at this time.
  ** After the records are added to the journal (and the journal synced,
  ** if in full-sync mode), the zero is overwritten with the true number
  ** of records (see syncJournal()).
  **
  ** A faster alternative is to write 0xFFFFFFFF to the nRec field. When
  ** reading the journal this value tells SQLite to assume that the
  ** rest of the journal file contains valid page records. This assumption
  ** is dangerous, as if a failure occurred whilst writing to the journal
  ** file it may contain some garbage data.
  ** There are two scenarios where this risk can be ignored:
  **
  **   * When the pager is in no-sync mode. Corruption can follow a
  **     power failure in this case anyway.
  **
  **   * When the SQLITE_IOCAP_SAFE_APPEND flag is set. This guarantees
  **     that garbage data is never appended to the journal file.
  */
  assert( isOpen(pPager->fd) || pPager->noSync );
  if( pPager->noSync || (pPager->journalMode==PAGER_JOURNALMODE_MEMORY)
   || (sqlite3OsDeviceCharacteristics(pPager->fd)&SQLITE_IOCAP_SAFE_APPEND)
  ){
    memcpy(zHeader, aJournalMagic, sizeof(aJournalMagic));
    put32bits(&zHeader[sizeof(aJournalMagic)], 0xffffffff);
  }else{
    memset(zHeader, 0, sizeof(aJournalMagic)+4);
  }

  /* The random check-hash initializer */
  sqlite3_randomness(sizeof(pPager->cksumInit), &pPager->cksumInit);
  put32bits(&zHeader[sizeof(aJournalMagic)+4], pPager->cksumInit);
  /* The initial database size */
  put32bits(&zHeader[sizeof(aJournalMagic)+8], pPager->dbOrigSize);
  /* The assumed sector size for this process */
  put32bits(&zHeader[sizeof(aJournalMagic)+12], pPager->sectorSize);

  /* The page size */
  put32bits(&zHeader[sizeof(aJournalMagic)+16], pPager->pageSize);

  /* Initializing the tail of the buffer is not necessary.  Everything
  ** works fine if the following memset() is omitted.
  ** But initializing the memory prevents valgrind from complaining, so we
  ** are willing to take the performance hit.
  */
  memset(&zHeader[sizeof(aJournalMagic)+20], 0,
         nHeader-(sizeof(aJournalMagic)+20));

  /* In theory, it is only necessary to write the 28 bytes that the
  ** journal header consumes to the journal file here. Then increment the
  ** Pager.journalOff variable by JOURNAL_HDR_SZ so that the next
  ** record is written to the following sector (leaving a gap in the file
  ** that will be implicitly filled in by the OS).
  **
  ** However it has been discovered that on some systems this pattern can
  ** be significantly slower than contiguously writing data to the file,
  ** even if that means explicitly writing data to the block of
  ** (JOURNAL_HDR_SZ - 28) bytes that will not be used. So that is what
  ** is done.
  **
  ** The loop is required here in case the sector-size is larger than the
  ** database page size. Since the zHeader buffer is only Pager.pageSize
  ** bytes in size, more than one call to sqlite3OsWrite() may be required
  ** to populate the entire journal header sector.
  */
  for(nWrite=0; rc==SQLITE_OK&&nWrite<JOURNAL_HDR_SZ(pPager); nWrite+=nHeader){
    IOTRACE(("JHDR %p %lld %d\n", pPager, pPager->journalHdr, nHeader))
    rc = sqlite3OsWrite(pPager->jfd, zHeader, nHeader, pPager->journalOff);
    assert( pPager->journalHdr <= pPager->journalOff );
    pPager->journalOff += nHeader;
  }

  return rc;
}

/*
** The journal file must be open when this is called. A journal header
** (JOURNAL_HDR_SZ bytes) is read from the current location in the journal
** file. The current location in the journal file is given by
** pPager->journalOff. See comments above function writeJournalHdr() for
** a description of the journal header format.
**
** If the header is read successfully, *pNRec is set to the number of
** page records following this header and *pDbSize is set to the size of the
** database before the transaction began, in pages. Also, pPager->cksumInit
** is set to the value read from the journal header. SQLITE_OK is returned
** in this case.
**
** If the journal header appears to be corrupted, SQLITE_DONE is
** returned and *pNRec and *pDbSize are undefined.  If JOURNAL_HDR_SZ bytes
** cannot be read from the journal file an error code is returned.
*/
static int readJournalHdr(
  Pager *pPager,               /* Pager object */
  int isHot,
  i64 journalSize,             /* Size of the open journal file in bytes */
  u32 *pNRec,                  /* OUT: Value read from the nRec field */
  u32 *pDbSize                 /* OUT: Value of original database size field */
){
  int rc;                      /* Return code */
  unsigned char aMagic[8];     /* A buffer to hold the magic header */
  i64 iHdrOff;                 /* Offset of journal header being read */

  assert( isOpen(pPager->jfd) );      /* Journal file must be open. */

  /* Advance Pager.journalOff to the start of the next sector. If the
  ** journal file is too small for there to be a header stored at this
  ** point, return SQLITE_DONE.
  */
  pPager->journalOff = journalHdrOffset(pPager);
  if( pPager->journalOff+JOURNAL_HDR_SZ(pPager) > journalSize ){
    return SQLITE_DONE;
  }
  iHdrOff = pPager->journalOff;

  /* Read in the first 8 bytes of the journal header. If they do not match
  ** the magic string found at the start of each journal header, return
  ** SQLITE_DONE. If an IO error occurs, return an error code. Otherwise,
  ** proceed.
  */
  if( isHot || iHdrOff!=pPager->journalHdr ){
    rc = sqlite3OsRead(pPager->jfd, aMagic, sizeof(aMagic), iHdrOff);
    if( rc ){
      return rc;
    }
    if( memcmp(aMagic, aJournalMagic, sizeof(aMagic))!=0 ){
      return SQLITE_DONE;
    }
  }

  /* Read the first three 32-bit fields of the journal header: The nRec
  ** field, the checksum-initializer and the database size at the start
  ** of the transaction. Return an error code if anything goes wrong.
  */
  if( SQLITE_OK!=(rc = read32bits(pPager->jfd, iHdrOff+8, pNRec))
   || SQLITE_OK!=(rc = read32bits(pPager->jfd, iHdrOff+12, &pPager->cksumInit))
   || SQLITE_OK!=(rc = read32bits(pPager->jfd, iHdrOff+16, pDbSize))
  ){
    return rc;
  }

  if( pPager->journalOff==0 ){
    u32 iPageSize;               /* Page-size field of journal header */
    u32 iSectorSize;             /* Sector-size field of journal header */

    /* Read the page-size and sector-size journal header fields. */
    if( SQLITE_OK!=(rc = read32bits(pPager->jfd, iHdrOff+20, &iSectorSize))
     || SQLITE_OK!=(rc = read32bits(pPager->jfd, iHdrOff+24, &iPageSize))
    ){
      return rc;
    }

    /* Versions of SQLite prior to 3.5.8 set the page-size field of the
    ** journal header to zero.
    ** In this case, assume that the Pager.pageSize variable is already set
    ** to the correct page size.
    */
    if( iPageSize==0 ){
      iPageSize = pPager->pageSize;
    }

    /* Check that the values read from the page-size and sector-size fields
    ** are within range. To be 'in range', both values need to be a power
    ** of two, at least 512 for the page size and at least 32 for the
    ** sector size, and not greater than their respective compile time
    ** maximum limits.
    */
    if( iPageSize<512                  || iSectorSize<32
     || iPageSize>SQLITE_MAX_PAGE_SIZE || iSectorSize>MAX_SECTOR_SIZE
     || ((iPageSize-1)&iPageSize)!=0   || ((iSectorSize-1)&iSectorSize)!=0
    ){
      /* If either the page-size or the sector-size in the journal-header is
      ** invalid, then the process that wrote the journal-header must have
      ** crashed before the header was synced. In this case stop reading
      ** the journal file here.
      */
      return SQLITE_DONE;
    }

    /* Update the page-size to match the value read from the journal.
    ** Use a testcase() macro to make sure that malloc failure within
    ** PagerSetPagesize() is tested.
    */
    rc = sqlite3PagerSetPagesize(pPager, &iPageSize, -1);
    testcase( rc!=SQLITE_OK );

    /* Update the assumed sector-size to match the value used by
    ** the process that created this journal.
    ** If this journal was created by a process other than this one, then
    ** this routine is being called from within pager_playback(). The local
    ** value of Pager.sectorSize is restored at the end of that routine.
    */
    pPager->sectorSize = iSectorSize;
  }

  pPager->journalOff += JOURNAL_HDR_SZ(pPager);
  return rc;
}


/*
** Write the supplied master journal name into the journal file for pager
** pPager at the current location. The master journal name must be the last
** thing written to a journal file. If the pager is in full-sync mode, the
** journal file descriptor is advanced to the next sector boundary before
** anything is written. The format is:
**
**   + 4 bytes: PAGER_MJ_PGNO.
**   + N bytes: Master journal filename in utf-8.
**   + 4 bytes: N (length of master journal name in bytes, no nul-terminator).
**   + 4 bytes: Master journal name checksum.
**   + 8 bytes: aJournalMagic[].
**
** The master journal name checksum is the sum of the bytes in the master
** journal name, where each byte is interpreted as a signed 8-bit integer.
**
** If zMaster is a NULL pointer (occurs for a single database transaction),
** this call is a no-op.
16887657240aSdanielk1977 */ 16897657240aSdanielk1977 static int writeMasterJournal(Pager *pPager, const char *zMaster){ 1690bea2a948Sdanielk1977 int rc; /* Return code */ 1691bea2a948Sdanielk1977 int nMaster; /* Length of string zMaster */ 1692bea2a948Sdanielk1977 i64 iHdrOff; /* Offset of header in journal file */ 1693bea2a948Sdanielk1977 i64 jrnlSize; /* Size of journal file on disk */ 1694bea2a948Sdanielk1977 u32 cksum = 0; /* Checksum of string zMaster */ 16957657240aSdanielk1977 16961e01cf1bSdan assert( pPager->setMaster==0 ); 1697d0864087Sdan assert( !pagerUseWal(pPager) ); 16981e01cf1bSdan 16991e01cf1bSdan if( !zMaster 1700bea2a948Sdanielk1977 || pPager->journalMode==PAGER_JOURNALMODE_MEMORY 17011fb6a110Sdrh || !isOpen(pPager->jfd) 1702bea2a948Sdanielk1977 ){ 1703bea2a948Sdanielk1977 return SQLITE_OK; 1704bea2a948Sdanielk1977 } 17057657240aSdanielk1977 pPager->setMaster = 1; 170691781bd7Sdrh assert( pPager->journalHdr <= pPager->journalOff ); 17077657240aSdanielk1977 1708bea2a948Sdanielk1977 /* Calculate the length in bytes and the checksum of zMaster */ 1709bea2a948Sdanielk1977 for(nMaster=0; zMaster[nMaster]; nMaster++){ 1710bea2a948Sdanielk1977 cksum += zMaster[nMaster]; 1711cafadbacSdanielk1977 } 17127657240aSdanielk1977 17137657240aSdanielk1977 /* If in full-sync mode, advance to the next disk sector before writing 17147657240aSdanielk1977 ** the master journal name. This is in case the previous page written to 17157657240aSdanielk1977 ** the journal has already been synced. 17167657240aSdanielk1977 */ 17177657240aSdanielk1977 if( pPager->fullSync ){ 1718bea2a948Sdanielk1977 pPager->journalOff = journalHdrOffset(pPager); 17197657240aSdanielk1977 } 1720bea2a948Sdanielk1977 iHdrOff = pPager->journalOff; 17217657240aSdanielk1977 1722bea2a948Sdanielk1977 /* Write the master journal data to the end of the journal file. If 1723bea2a948Sdanielk1977 ** an error occurs, return the error code to the caller. 
1724bea2a948Sdanielk1977 */ 172563207ab2Sshane if( (0 != (rc = write32bits(pPager->jfd, iHdrOff, PAGER_MJ_PGNO(pPager)))) 172663207ab2Sshane || (0 != (rc = sqlite3OsWrite(pPager->jfd, zMaster, nMaster, iHdrOff+4))) 172763207ab2Sshane || (0 != (rc = write32bits(pPager->jfd, iHdrOff+4+nMaster, nMaster))) 172863207ab2Sshane || (0 != (rc = write32bits(pPager->jfd, iHdrOff+4+nMaster+4, cksum))) 1729e399ac2eSdrh || (0 != (rc = sqlite3OsWrite(pPager->jfd, aJournalMagic, 8, 1730e399ac2eSdrh iHdrOff+4+nMaster+8))) 1731bea2a948Sdanielk1977 ){ 1732bea2a948Sdanielk1977 return rc; 1733bea2a948Sdanielk1977 } 1734bea2a948Sdanielk1977 pPager->journalOff += (nMaster+20); 1735df2566a3Sdanielk1977 1736df2566a3Sdanielk1977 /* If the pager is in persistent-journal mode, then the physical 1737df2566a3Sdanielk1977 ** journal-file may extend past the end of the master-journal name 1738df2566a3Sdanielk1977 ** and 8 bytes of magic data just written to the file. This is 1739df2566a3Sdanielk1977 ** dangerous because the code to roll back a hot-journal file 1740df2566a3Sdanielk1977 ** will not be able to find the master-journal name to determine 1741df2566a3Sdanielk1977 ** whether or not the journal is hot. 1742df2566a3Sdanielk1977 ** 1743df2566a3Sdanielk1977 ** The easiest thing to do in this scenario is to truncate the journal 1744df2566a3Sdanielk1977 ** file to the required size. 1745df2566a3Sdanielk1977 */ 1746bea2a948Sdanielk1977 if( SQLITE_OK==(rc = sqlite3OsFileSize(pPager->jfd, &jrnlSize)) 1747bea2a948Sdanielk1977 && jrnlSize>pPager->journalOff 1748df2566a3Sdanielk1977 ){ 1749bea2a948Sdanielk1977 rc = sqlite3OsTruncate(pPager->jfd, pPager->journalOff); 1750df2566a3Sdanielk1977 } 17517657240aSdanielk1977 return rc; 17527657240aSdanielk1977 } 17537657240aSdanielk1977 17547657240aSdanielk1977 /* 1755a42c66bdSdan ** Discard the entire contents of the in-memory page-cache.
1756ed7c855cSdrh */ 1757d9b0257aSdrh static void pager_reset(Pager *pPager){ 1758d7107b38Sdrh pPager->iDataVersion++; 17590410302eSdanielk1977 sqlite3BackupRestart(pPager->pBackup); 17608c0a791aSdanielk1977 sqlite3PcacheClear(pPager->pPCache); 1761e277be05Sdanielk1977 } 1762e277be05Sdanielk1977 176334cf35daSdanielk1977 /* 1764d7107b38Sdrh ** Return the pPager->iDataVersion value 176591618564Sdrh */ 176691618564Sdrh u32 sqlite3PagerDataVersion(Pager *pPager){ 1767d7107b38Sdrh return pPager->iDataVersion; 176891618564Sdrh } 176991618564Sdrh 177091618564Sdrh /* 177134cf35daSdanielk1977 ** Free all structures in the Pager.aSavepoint[] array and set both 177234cf35daSdanielk1977 ** Pager.aSavepoint and Pager.nSavepoint to zero. Close the sub-journal 177334cf35daSdanielk1977 ** if it is open and the pager is not in exclusive mode. 177434cf35daSdanielk1977 */ 1775bea2a948Sdanielk1977 static void releaseAllSavepoints(Pager *pPager){ 1776bea2a948Sdanielk1977 int ii; /* Iterator for looping through Pager.aSavepoint */ 1777fd7f0452Sdanielk1977 for(ii=0; ii<pPager->nSavepoint; ii++){ 1778fd7f0452Sdanielk1977 sqlite3BitvecDestroy(pPager->aSavepoint[ii].pInSavepoint); 1779fd7f0452Sdanielk1977 } 17802491de28Sdan if( !pPager->exclusiveMode || sqlite3JournalIsInMemory(pPager->sjfd) ){ 1781fd7f0452Sdanielk1977 sqlite3OsClose(pPager->sjfd); 1782fd7f0452Sdanielk1977 } 1783fd7f0452Sdanielk1977 sqlite3_free(pPager->aSavepoint); 1784fd7f0452Sdanielk1977 pPager->aSavepoint = 0; 1785fd7f0452Sdanielk1977 pPager->nSavepoint = 0; 1786bea2a948Sdanielk1977 pPager->nSubRec = 0; 1787fd7f0452Sdanielk1977 } 1788fd7f0452Sdanielk1977 178934cf35daSdanielk1977 /* 1790bea2a948Sdanielk1977 ** Set the bit number pgno in the PagerSavepoint.pInSavepoint 1791bea2a948Sdanielk1977 ** bitvecs of all open savepoints. Return SQLITE_OK if successful 1792bea2a948Sdanielk1977 ** or SQLITE_NOMEM if a malloc failure occurs. 
179334cf35daSdanielk1977 */ 1794fd7f0452Sdanielk1977 static int addToSavepointBitvecs(Pager *pPager, Pgno pgno){ 17957539b6b8Sdrh int ii; /* Loop counter */ 17967539b6b8Sdrh int rc = SQLITE_OK; /* Result code */ 17977539b6b8Sdrh 1798fd7f0452Sdanielk1977 for(ii=0; ii<pPager->nSavepoint; ii++){ 1799fd7f0452Sdanielk1977 PagerSavepoint *p = &pPager->aSavepoint[ii]; 1800fd7f0452Sdanielk1977 if( pgno<=p->nOrig ){ 18017539b6b8Sdrh rc |= sqlite3BitvecSet(p->pInSavepoint, pgno); 1802bea2a948Sdanielk1977 testcase( rc==SQLITE_NOMEM ); 18037539b6b8Sdrh assert( rc==SQLITE_OK || rc==SQLITE_NOMEM ); 1804fd7f0452Sdanielk1977 } 1805fd7f0452Sdanielk1977 } 18067539b6b8Sdrh return rc; 1807fd7f0452Sdanielk1977 } 1808fd7f0452Sdanielk1977 1809e277be05Sdanielk1977 /* 1810de5fd22fSdan ** This function is a no-op if the pager is in exclusive mode and not 1811de5fd22fSdan ** in the ERROR state. Otherwise, it switches the pager to PAGER_OPEN 1812de5fd22fSdan ** state. 1813ae72d982Sdanielk1977 ** 1814de5fd22fSdan ** If the pager is not in exclusive-access mode, the database file is 1815de5fd22fSdan ** completely unlocked. If the file is unlocked and the file-system does 1816de5fd22fSdan ** not exhibit the UNDELETABLE_WHEN_OPEN property, the journal file is 1817de5fd22fSdan ** closed (if it is open). 1818de5fd22fSdan ** 1819de5fd22fSdan ** If the pager is in ERROR state when this function is called, the 1820de5fd22fSdan ** contents of the pager cache are discarded before switching back to 1821de5fd22fSdan ** the OPEN state. Regardless of whether the pager is in exclusive-mode 1822de5fd22fSdan ** or not, any journal file left in the file-system will be treated 1823de5fd22fSdan ** as a hot-journal and rolled back the next time a read-transaction 1824de5fd22fSdan ** is opened (by this or by any other connection). 
1825ae72d982Sdanielk1977 */ 1826ae72d982Sdanielk1977 static void pager_unlock(Pager *pPager){ 1827a42c66bdSdan 1828de5fd22fSdan assert( pPager->eState==PAGER_READER 1829de5fd22fSdan || pPager->eState==PAGER_OPEN 1830de5fd22fSdan || pPager->eState==PAGER_ERROR 1831de5fd22fSdan ); 1832de5fd22fSdan 1833a42c66bdSdan sqlite3BitvecDestroy(pPager->pInJournal); 1834a42c66bdSdan pPager->pInJournal = 0; 1835a42c66bdSdan releaseAllSavepoints(pPager); 1836a42c66bdSdan 1837a42c66bdSdan if( pagerUseWal(pPager) ){ 1838a42c66bdSdan assert( !isOpen(pPager->jfd) ); 1839a42c66bdSdan sqlite3WalEndReadTransaction(pPager->pWal); 1840de1ae34eSdan pPager->eState = PAGER_OPEN; 1841a42c66bdSdan }else if( !pPager->exclusiveMode ){ 18424e004aa6Sdan int rc; /* Error code returned by pagerUnlockDb() */ 1843e08341c6Sdan int iDc = isOpen(pPager->fd)?sqlite3OsDeviceCharacteristics(pPager->fd):0; 1844ae72d982Sdanielk1977 1845de3c301dSdrh /* If the operating system supports deletion of open files, then 1846de3c301dSdrh ** close the journal file when dropping the database lock. Otherwise 1847de3c301dSdrh ** another connection with journal_mode=delete might delete the file 1848de3c301dSdrh ** out from under us. 184916e45a43Sdrh */ 1850e08341c6Sdan assert( (PAGER_JOURNALMODE_MEMORY & 5)!=1 ); 1851e08341c6Sdan assert( (PAGER_JOURNALMODE_OFF & 5)!=1 ); 1852e08341c6Sdan assert( (PAGER_JOURNALMODE_WAL & 5)!=1 ); 1853e08341c6Sdan assert( (PAGER_JOURNALMODE_DELETE & 5)!=1 ); 1854e08341c6Sdan assert( (PAGER_JOURNALMODE_TRUNCATE & 5)==1 ); 1855e08341c6Sdan assert( (PAGER_JOURNALMODE_PERSIST & 5)==1 ); 1856e08341c6Sdan if( 0==(iDc & SQLITE_IOCAP_UNDELETABLE_WHEN_OPEN) 1857e08341c6Sdan || 1!=(pPager->journalMode & 5) 18582a321c75Sdan ){ 185916e45a43Sdrh sqlite3OsClose(pPager->jfd); 18602a321c75Sdan } 18614e004aa6Sdan 186254919f82Sdan /* If the pager is in the ERROR state and the call to unlock the database 186354919f82Sdan ** file fails, set the current lock to UNKNOWN_LOCK.
See the comment 186454919f82Sdan ** above the #define for UNKNOWN_LOCK for an explanation of why this 186554919f82Sdan ** is necessary. 186654919f82Sdan */ 18674e004aa6Sdan rc = pagerUnlockDb(pPager, NO_LOCK); 18684e004aa6Sdan if( rc!=SQLITE_OK && pPager->eState==PAGER_ERROR ){ 18694e004aa6Sdan pPager->eLock = UNKNOWN_LOCK; 18704e004aa6Sdan } 18712a321c75Sdan 1872de1ae34eSdan /* The pager state may be changed from PAGER_ERROR to PAGER_OPEN here 1873a42c66bdSdan ** without clearing the error code. This is intentional - the error 1874a42c66bdSdan ** code is cleared and the cache reset in the block below. 1875ae72d982Sdanielk1977 */ 1876a42c66bdSdan assert( pPager->errCode || pPager->eState!=PAGER_ERROR ); 187745d6882fSdanielk1977 pPager->changeCountDone = 0; 1878de1ae34eSdan pPager->eState = PAGER_OPEN; 1879a42c66bdSdan } 1880a42c66bdSdan 1881a42c66bdSdan /* If Pager.errCode is set, the contents of the pager cache cannot be 1882a42c66bdSdan ** trusted. Now that there are no outstanding references to the pager, 1883de1ae34eSdan ** it can safely move back to PAGER_OPEN state. This happens in both 1884a42c66bdSdan ** normal and exclusive-locking mode. 18856c963586Sdrh */ 188667330a12Sdan assert( pPager->errCode==SQLITE_OK || !MEMDB ); 18876572c16aSdan if( pPager->errCode ){ 18886572c16aSdan if( pPager->tempFile==0 ){ 1889a42c66bdSdan pager_reset(pPager); 189067330a12Sdan pPager->changeCountDone = 0; 1891de1ae34eSdan pPager->eState = PAGER_OPEN; 18926572c16aSdan }else{ 18936572c16aSdan pPager->eState = (isOpen(pPager->jfd) ? 
PAGER_OPEN : PAGER_READER); 18946572c16aSdan } 1895789efdb9Sdan if( USEFETCH(pPager) ) sqlite3OsUnfetch(pPager->fd, 0, 0); 18966572c16aSdan pPager->errCode = SQLITE_OK; 189712e6f682Sdrh setGetterMethod(pPager); 1898ae72d982Sdanielk1977 } 18994e004aa6Sdan 19004e004aa6Sdan pPager->journalOff = 0; 19014e004aa6Sdan pPager->journalHdr = 0; 19024e004aa6Sdan pPager->setMaster = 0; 1903ae72d982Sdanielk1977 } 1904ae72d982Sdanielk1977 1905ae72d982Sdanielk1977 /* 1906de5fd22fSdan ** This function is called whenever an IOERR or FULL error that requires 1907de5fd22fSdan ** the pager to transition into the ERROR state may have occurred. 1908de5fd22fSdan ** The first argument is a pointer to the pager structure, the second 1909de5fd22fSdan ** the error-code about to be returned by a pager API function. The 1910de5fd22fSdan ** value returned is a copy of the second argument to this function. 1911bea2a948Sdanielk1977 ** 1912de5fd22fSdan ** If the second argument is SQLITE_FULL, SQLITE_IOERR or one of the 1913de5fd22fSdan ** IOERR sub-codes, the pager enters the ERROR state and the error code 1914de5fd22fSdan ** is stored in Pager.errCode. While the pager remains in the ERROR state, 1915de5fd22fSdan ** all major API calls on the Pager will immediately return Pager.errCode. 1916bea2a948Sdanielk1977 ** 1917de5fd22fSdan ** The ERROR state indicates that the contents of the pager-cache 1918bea2a948Sdanielk1977 ** cannot be trusted. This state can be cleared by completely discarding 1919bea2a948Sdanielk1977 ** the contents of the pager-cache. If a transaction was active when 1920be217793Sshane ** the persistent error occurred, then the rollback journal may need 1921bea2a948Sdanielk1977 ** to be replayed to restore the contents of the database file (as if 1922bea2a948Sdanielk1977 ** it were a hot-journal).
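**
** Illustration (a reading of the code below, not an additional rule):
** an extended code such as SQLITE_IOERR_WRITE carries SQLITE_IOERR in
** its low 8 bits, so (rc & 0xff)==SQLITE_IOERR and the pager enters the
** ERROR state with the full extended code saved in Pager.errCode. A code
** such as SQLITE_BUSY matches neither test and is returned unchanged
** without altering the pager state.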
1923bea2a948Sdanielk1977 */ 1924bea2a948Sdanielk1977 static int pager_error(Pager *pPager, int rc){ 1925bea2a948Sdanielk1977 int rc2 = rc & 0xff; 1926c7ca875eSdanielk1977 assert( rc==SQLITE_OK || !MEMDB ); 1927bea2a948Sdanielk1977 assert( 1928bea2a948Sdanielk1977 pPager->errCode==SQLITE_FULL || 1929bea2a948Sdanielk1977 pPager->errCode==SQLITE_OK || 1930bea2a948Sdanielk1977 (pPager->errCode & 0xff)==SQLITE_IOERR 1931bea2a948Sdanielk1977 ); 1932b75d570eSdrh if( rc2==SQLITE_FULL || rc2==SQLITE_IOERR ){ 1933bea2a948Sdanielk1977 pPager->errCode = rc; 1934a42c66bdSdan pPager->eState = PAGER_ERROR; 193512e6f682Sdrh setGetterMethod(pPager); 1936bea2a948Sdanielk1977 } 1937bea2a948Sdanielk1977 return rc; 1938bea2a948Sdanielk1977 } 1939bea2a948Sdanielk1977 1940bc1a3c6cSdan static int pager_truncate(Pager *pPager, Pgno nPage); 1941bc1a3c6cSdan 1942bea2a948Sdanielk1977 /* 19434bf7d21fSdrh ** The write transaction open on pPager is being committed (bCommit==1) 19444bf7d21fSdrh ** or rolled back (bCommit==0). 19450f52455aSdan ** 19464bf7d21fSdrh ** Return TRUE if and only if all dirty pages should be flushed to disk. 19470f52455aSdan ** 19484bf7d21fSdrh ** Rules: 19490f52455aSdan ** 19504bf7d21fSdrh ** * For non-TEMP databases, always sync to disk. This is necessary 19514bf7d21fSdrh ** for transactions to be durable. 19524bf7d21fSdrh ** 19534bf7d21fSdrh ** * Sync TEMP database only on a COMMIT (not a ROLLBACK) when the backing 19544bf7d21fSdrh ** file has been created already (via a spill on pagerStress()) and 19554bf7d21fSdrh ** when the number of dirty pages in memory exceeds 25% of the total 19564bf7d21fSdrh ** cache size. 
19570f52455aSdan */ 19584bf7d21fSdrh static int pagerFlushOnCommit(Pager *pPager, int bCommit){ 19590f52455aSdan if( pPager->tempFile==0 ) return 1; 19604bf7d21fSdrh if( !bCommit ) return 0; 19610f52455aSdan if( !isOpen(pPager->fd) ) return 0; 19620f52455aSdan return (sqlite3PCachePercentDirty(pPager->pPCache)>=25); 19630f52455aSdan } 19640f52455aSdan 19650f52455aSdan /* 1966bea2a948Sdanielk1977 ** This routine ends a transaction. A transaction is usually ended by 1967bea2a948Sdanielk1977 ** either a COMMIT or a ROLLBACK operation. This routine may be called 1968bea2a948Sdanielk1977 ** after rollback of a hot-journal, or if an error occurs while opening 1969bea2a948Sdanielk1977 ** the journal file or writing the very first journal-header of a 1970bea2a948Sdanielk1977 ** database transaction. 197180e35f46Sdrh ** 197285d14ed2Sdan ** This routine is never called in PAGER_ERROR state. If it is called 197385d14ed2Sdan ** in PAGER_NONE or PAGER_SHARED state and the lock held is less 197485d14ed2Sdan ** exclusive than a RESERVED lock, it is a no-op. 197580e35f46Sdrh ** 1976bea2a948Sdanielk1977 ** Otherwise, any active savepoints are released. 197750457896Sdrh ** 1978bea2a948Sdanielk1977 ** If the journal file is open, then it is "finalized". Once a journal 1979bea2a948Sdanielk1977 ** file has been finalized it is not possible to use it to roll back a 1980bea2a948Sdanielk1977 ** transaction. Nor will it be considered to be a hot-journal by this 1981bea2a948Sdanielk1977 ** or any other database connection. Exactly how a journal is finalized 1982bea2a948Sdanielk1977 ** depends on whether or not the pager is running in exclusive mode and 1983bea2a948Sdanielk1977 ** the current journal-mode (Pager.journalMode value), as follows: 1984bea2a948Sdanielk1977 ** 1985bea2a948Sdanielk1977 ** journalMode==MEMORY 1986bea2a948Sdanielk1977 ** Journal file descriptor is simply closed. This destroys an 1987bea2a948Sdanielk1977 ** in-memory journal. 
1988bea2a948Sdanielk1977 ** 1989bea2a948Sdanielk1977 ** journalMode==TRUNCATE 1990bea2a948Sdanielk1977 ** Journal file is truncated to zero bytes in size. 1991bea2a948Sdanielk1977 ** 1992bea2a948Sdanielk1977 ** journalMode==PERSIST 1993bea2a948Sdanielk1977 ** The first 28 bytes of the journal file are zeroed. This invalidates 1994bea2a948Sdanielk1977 ** the first journal header in the file, and hence the entire journal 1995bea2a948Sdanielk1977 ** file. An invalid journal file cannot be rolled back. 1996bea2a948Sdanielk1977 ** 1997bea2a948Sdanielk1977 ** journalMode==DELETE 1998bea2a948Sdanielk1977 ** The journal file is closed and deleted using sqlite3OsDelete(). 1999bea2a948Sdanielk1977 ** 2000bea2a948Sdanielk1977 ** If the pager is running in exclusive mode, this method of finalizing 2001bea2a948Sdanielk1977 ** the journal file is never used. Instead, if the journalMode is 2002bea2a948Sdanielk1977 ** DELETE and the pager is in exclusive mode, the method described under 2003bea2a948Sdanielk1977 ** journalMode==PERSIST is used instead. 2004bea2a948Sdanielk1977 ** 200585d14ed2Sdan ** After the journal is finalized, the pager moves to PAGER_READER state. 200685d14ed2Sdan ** If running in non-exclusive rollback mode, the lock on the file is 200785d14ed2Sdan ** downgraded to a SHARED_LOCK. 2008bea2a948Sdanielk1977 ** 2009bea2a948Sdanielk1977 ** SQLITE_OK is returned if no error occurs. If an error occurs during 2010bea2a948Sdanielk1977 ** any of the IO operations to finalize the journal file or unlock the 2011bea2a948Sdanielk1977 ** database then the IO error code is returned to the user. If the 2012bea2a948Sdanielk1977 ** operation to finalize the journal file fails, then the code still 2013bea2a948Sdanielk1977 ** tries to unlock the database file if not in exclusive mode. 
If the 2014bea2a948Sdanielk1977 ** unlock operation fails as well, then the first error code related 2015bea2a948Sdanielk1977 ** to the first error encountered (the journal finalization one) is 2016bea2a948Sdanielk1977 ** returned. 2017ed7c855cSdrh */ 2018bc1a3c6cSdan static int pager_end_transaction(Pager *pPager, int hasMaster, int bCommit){ 2019bea2a948Sdanielk1977 int rc = SQLITE_OK; /* Error code from journal finalization operation */ 2020bea2a948Sdanielk1977 int rc2 = SQLITE_OK; /* Error code from db file unlock operation */ 2021bea2a948Sdanielk1977 202285d14ed2Sdan /* Do nothing if the pager does not have an open write transaction 202385d14ed2Sdan ** or at least a RESERVED lock. This function may be called when there 202485d14ed2Sdan ** is no write-transaction active but a RESERVED or greater lock is 202585d14ed2Sdan ** held under two circumstances: 202685d14ed2Sdan ** 202785d14ed2Sdan ** 1. After a successful hot-journal rollback, it is called with 202885d14ed2Sdan ** eState==PAGER_NONE and eLock==EXCLUSIVE_LOCK. 202985d14ed2Sdan ** 203085d14ed2Sdan ** 2. If a connection with locking_mode=exclusive holding an EXCLUSIVE 203185d14ed2Sdan ** lock switches back to locking_mode=normal and then executes a 203285d14ed2Sdan ** read-transaction, this function is called with eState==PAGER_READER 203385d14ed2Sdan ** and eLock==EXCLUSIVE_LOCK when the read-transaction is closed. 
203485d14ed2Sdan */ 2035d0864087Sdan assert( assert_pager_state(pPager) ); 2036a42c66bdSdan assert( pPager->eState!=PAGER_ERROR ); 2037de1ae34eSdan if( pPager->eState<PAGER_WRITER_LOCKED && pPager->eLock<RESERVED_LOCK ){ 2038a6abd041Sdrh return SQLITE_OK; 2039a6abd041Sdrh } 2040bea2a948Sdanielk1977 2041d0864087Sdan releaseAllSavepoints(pPager); 2042efe16971Sdan assert( isOpen(pPager->jfd) || pPager->pInJournal==0 2043efe16971Sdan || (sqlite3OsDeviceCharacteristics(pPager->fd)&SQLITE_IOCAP_BATCH_ATOMIC) 2044efe16971Sdan ); 2045bea2a948Sdanielk1977 if( isOpen(pPager->jfd) ){ 20467ed91f23Sdrh assert( !pagerUseWal(pPager) ); 2047bea2a948Sdanielk1977 2048bea2a948Sdanielk1977 /* Finalize the journal file. */ 20492491de28Sdan if( sqlite3JournalIsInMemory(pPager->jfd) ){ 20502491de28Sdan /* assert( pPager->journalMode==PAGER_JOURNALMODE_MEMORY ); */ 2051b3175389Sdanielk1977 sqlite3OsClose(pPager->jfd); 20529e7ba7c6Sdrh }else if( pPager->journalMode==PAGER_JOURNALMODE_TRUNCATE ){ 205359813953Sdrh if( pPager->journalOff==0 ){ 205459813953Sdrh rc = SQLITE_OK; 205559813953Sdrh }else{ 20569e7ba7c6Sdrh rc = sqlite3OsTruncate(pPager->jfd, 0); 2057442c5cd3Sdrh if( rc==SQLITE_OK && pPager->fullSync ){ 2058442c5cd3Sdrh /* Make sure the new file size is written into the inode right away. 2059442c5cd3Sdrh ** Otherwise the journal might resurrect following a power loss and 2060442c5cd3Sdrh ** cause the last transaction to roll back. 
See 2061442c5cd3Sdrh ** https://bugzilla.mozilla.org/show_bug.cgi?id=1072773 2062442c5cd3Sdrh */ 2063442c5cd3Sdrh rc = sqlite3OsSync(pPager->jfd, pPager->syncFlags); 2064442c5cd3Sdrh } 206559813953Sdrh } 206604335886Sdrh pPager->journalOff = 0; 20675543759bSdan }else if( pPager->journalMode==PAGER_JOURNALMODE_PERSIST 20685543759bSdan || (pPager->exclusiveMode && pPager->journalMode!=PAGER_JOURNALMODE_WAL) 206993f7af97Sdanielk1977 ){ 207065c64203Sdrh rc = zeroJournalHdr(pPager, hasMaster||pPager->tempFile); 207141483468Sdanielk1977 pPager->journalOff = 0; 207241483468Sdanielk1977 }else{ 2073ded6d0f1Sdanielk1977 /* This branch may be executed with Pager.journalMode==MEMORY if 2074ded6d0f1Sdanielk1977 ** a hot-journal was just rolled back. In this case the journal 2075ded6d0f1Sdanielk1977 ** file should be closed and deleted. If this connection writes to 2076e04dc88bSdan ** the database file, it will do so using an in-memory journal. 2077e04dc88bSdan */ 20785f37ed51Sdan int bDelete = !pPager->tempFile; 20795f37ed51Sdan assert( sqlite3JournalIsInMemory(pPager->jfd)==0 ); 2080ded6d0f1Sdanielk1977 assert( pPager->journalMode==PAGER_JOURNALMODE_DELETE 2081ded6d0f1Sdanielk1977 || pPager->journalMode==PAGER_JOURNALMODE_MEMORY 2082e04dc88bSdan || pPager->journalMode==PAGER_JOURNALMODE_WAL 2083ded6d0f1Sdanielk1977 ); 2084b4b47411Sdanielk1977 sqlite3OsClose(pPager->jfd); 20853de0f184Sdan if( bDelete ){ 20866841b1cbSdrh rc = sqlite3OsDelete(pPager->pVfs, pPager->zJournal, pPager->extraSync); 208741483468Sdanielk1977 } 20887152de8dSdanielk1977 } 20895f848c3aSdan } 2090bea2a948Sdanielk1977 20913c407374Sdanielk1977 #ifdef SQLITE_CHECK_PAGES 2092bc2ca9ebSdanielk1977 sqlite3PcacheIterateDirty(pPager->pPCache, pager_set_pagehash); 20935f848c3aSdan if( pPager->dbSize==0 && sqlite3PcacheRefCount(pPager->pPCache)>0 ){ 2094c137807aSdrh PgHdr *p = sqlite3PagerLookup(pPager, 1); 20955f848c3aSdan if( p ){ 20965f848c3aSdan p->pageHash = 0; 2097da8a330aSdrh sqlite3PagerUnrefNotNull(p); 
2098e9c2d34cSdrh } 20995f848c3aSdan } 21005f848c3aSdan #endif 21015f848c3aSdan 2102bea2a948Sdanielk1977 sqlite3BitvecDestroy(pPager->pInJournal); 2103bea2a948Sdanielk1977 pPager->pInJournal = 0; 2104ef317ab5Sdanielk1977 pPager->nRec = 0; 2105a37e0cfbSdrh if( rc==SQLITE_OK ){ 210665e1ba3fSdrh if( MEMDB || pagerFlushOnCommit(pPager, bCommit) ){ 2107ba726f49Sdrh sqlite3PcacheCleanAll(pPager->pPCache); 210841113b64Sdan }else{ 210941113b64Sdan sqlite3PcacheClearWritable(pPager->pPCache); 211041113b64Sdan } 2111d0864087Sdan sqlite3PcacheTruncate(pPager->pPCache, pPager->dbSize); 2112a37e0cfbSdrh } 2113979f38e5Sdanielk1977 21147ed91f23Sdrh if( pagerUseWal(pPager) ){ 2115d0864087Sdan /* Drop the WAL write-lock, if any. Also, if the connection was in 2116d0864087Sdan ** locking_mode=exclusive mode but is no longer, drop the EXCLUSIVE 2117d0864087Sdan ** lock held on the database file. 2118d0864087Sdan */ 211973b64e4dSdrh rc2 = sqlite3WalEndWriteTransaction(pPager->pWal); 21200350c7faSdrh assert( rc2==SQLITE_OK ); 2121bc1a3c6cSdan }else if( rc==SQLITE_OK && bCommit && pPager->dbFileSize>pPager->dbSize ){ 2122bc1a3c6cSdan /* This branch is taken when committing a transaction in rollback-journal 2123bc1a3c6cSdan ** mode if the database file on disk is larger than the database image. 2124bc1a3c6cSdan ** At this point the journal has been finalized and the transaction 2125bc1a3c6cSdan ** successfully committed, but the EXCLUSIVE lock is still held on the 2126bc1a3c6cSdan ** file. So it is safe to truncate the database file to its minimum 2127bc1a3c6cSdan ** required size. 
*/ 2128bc1a3c6cSdan assert( pPager->eLock==EXCLUSIVE_LOCK ); 2129bc1a3c6cSdan rc = pager_truncate(pPager, pPager->dbSize); 21305543759bSdan } 2131bc1a3c6cSdan 2132afb39a4cSdrh if( rc==SQLITE_OK && bCommit ){ 2133999cd08aSdan rc = sqlite3OsFileControl(pPager->fd, SQLITE_FCNTL_COMMIT_PHASETWO, 0); 2134999cd08aSdan if( rc==SQLITE_NOTFOUND ) rc = SQLITE_OK; 2135999cd08aSdan } 2136999cd08aSdan 2137431b0b42Sdan if( !pPager->exclusiveMode 2138431b0b42Sdan && (!pagerUseWal(pPager) || sqlite3WalExclusiveMode(pPager->pWal, 0)) 2139431b0b42Sdan ){ 21404e004aa6Sdan rc2 = pagerUnlockDb(pPager, SHARED_LOCK); 2141104f1fefSdanielk1977 pPager->changeCountDone = 0; 2142334cdb63Sdanielk1977 } 2143d0864087Sdan pPager->eState = PAGER_READER; 21447657240aSdanielk1977 pPager->setMaster = 0; 2145979f38e5Sdanielk1977 2146979f38e5Sdanielk1977 return (rc==SQLITE_OK?rc2:rc); 2147ed7c855cSdrh } 2148ed7c855cSdrh 2149ed7c855cSdrh /* 2150d0864087Sdan ** Execute a rollback if a transaction is active and unlock the 2151d0864087Sdan ** database file. 2152d0864087Sdan ** 215385d14ed2Sdan ** If the pager has already entered the ERROR state, do not attempt 2154d0864087Sdan ** the rollback at this time. Instead, pager_unlock() is called. The 2155d0864087Sdan ** call to pager_unlock() will discard all in-memory pages, unlock 215685d14ed2Sdan ** the database file and move the pager back to OPEN state. If this 215785d14ed2Sdan ** means that there is a hot-journal left in the file-system, the next 215885d14ed2Sdan ** connection to obtain a shared lock on the pager (which may be this one) 215985d14ed2Sdan ** will roll it back. 2160d0864087Sdan ** 216185d14ed2Sdan ** If the pager has not already entered the ERROR state, but an IO or 2162d0864087Sdan ** malloc error occurs during a rollback, then this will itself cause 216385d14ed2Sdan ** the pager to enter the ERROR state. Which will be cleared by the 2164d0864087Sdan ** call to pager_unlock(), as described above. 
2165d0864087Sdan */ 2166d0864087Sdan static void pagerUnlockAndRollback(Pager *pPager){ 2167de1ae34eSdan if( pPager->eState!=PAGER_ERROR && pPager->eState!=PAGER_OPEN ){ 2168a42c66bdSdan assert( assert_pager_state(pPager) ); 2169de1ae34eSdan if( pPager->eState>=PAGER_WRITER_LOCKED ){ 2170d0864087Sdan sqlite3BeginBenignMalloc(); 2171d0864087Sdan sqlite3PagerRollback(pPager); 2172d0864087Sdan sqlite3EndBenignMalloc(); 217385d14ed2Sdan }else if( !pPager->exclusiveMode ){ 217411f47a9bSdan assert( pPager->eState==PAGER_READER ); 2175bc1a3c6cSdan pager_end_transaction(pPager, 0, 0); 2176d0864087Sdan } 2177d0864087Sdan } 2178d0864087Sdan pager_unlock(pPager); 2179d0864087Sdan } 2180d0864087Sdan 2181d0864087Sdan /* 2182bea2a948Sdanielk1977 ** Parameter aData must point to a buffer of pPager->pageSize bytes 2183bea2a948Sdanielk1977 ** of data. Compute and return a checksum based on the contents of the 2184bea2a948Sdanielk1977 ** page of data and the current value of pPager->cksumInit. 218534e79ceeSdrh ** 218634e79ceeSdrh ** This is not a real checksum. It is really just the sum of the 2187bea2a948Sdanielk1977 ** random initial value (pPager->cksumInit) and every 200th byte 2188bea2a948Sdanielk1977 ** of the page data, starting with byte offset (pPager->pageSize%200). 2189bea2a948Sdanielk1977 ** Each byte is interpreted as an 8-bit unsigned integer. 2190726de599Sdrh ** 2191bea2a948Sdanielk1977 ** Changing the formula used to compute this checksum results in an 2192bea2a948Sdanielk1977 ** incompatible journal file format. 2193bea2a948Sdanielk1977 ** 2194bea2a948Sdanielk1977 ** If journal corruption occurs due to a power failure, the most likely 2195bea2a948Sdanielk1977 ** scenario is that one end or the other of the record will be changed. 2196bea2a948Sdanielk1977 ** It is much less likely that the two ends of the journal record will be 2197726de599Sdrh ** correct and the middle be corrupt.
Thus, this "checksum" scheme, 2198726de599Sdrh ** though fast and simple, catches the most likely kind of corruption. 2199968af52aSdrh */ 220074161705Sdrh static u32 pager_cksum(Pager *pPager, const u8 *aData){ 2201bea2a948Sdanielk1977 u32 cksum = pPager->cksumInit; /* Checksum value to return */ 2202bea2a948Sdanielk1977 int i = pPager->pageSize-200; /* Loop counter */ 2203ef317ab5Sdanielk1977 while( i>0 ){ 2204ef317ab5Sdanielk1977 cksum += aData[i]; 2205ef317ab5Sdanielk1977 i -= 200; 2206ef317ab5Sdanielk1977 } 2207968af52aSdrh return cksum; 2208968af52aSdrh } 2209968af52aSdrh 2210968af52aSdrh /* 22118220da7bSdrh ** Report the current page size and number of reserved bytes back 22128220da7bSdrh ** to the codec. 22138220da7bSdrh */ 22148220da7bSdrh #ifdef SQLITE_HAS_CODEC 22158220da7bSdrh static void pagerReportSize(Pager *pPager){ 22168220da7bSdrh if( pPager->xCodecSizeChng ){ 22178220da7bSdrh pPager->xCodecSizeChng(pPager->pCodec, pPager->pageSize, 22188220da7bSdrh (int)pPager->nReserve); 22198220da7bSdrh } 22208220da7bSdrh } 22218220da7bSdrh #else 22228220da7bSdrh # define pagerReportSize(X) /* No-op if we do not support a codec */ 22238220da7bSdrh #endif 22248220da7bSdrh 222558cb6dbeSdrh #ifdef SQLITE_HAS_CODEC 222658cb6dbeSdrh /* 222758cb6dbeSdrh ** Make sure the number of reserved bits is the same in the destination 222858cb6dbeSdrh ** pager as it is in the source. This comes up when a VACUUM changes the 222958cb6dbeSdrh ** number of reserved bits to the "optimal" amount.
223058cb6dbeSdrh */ 223158cb6dbeSdrh void sqlite3PagerAlignReserve(Pager *pDest, Pager *pSrc){ 223258cb6dbeSdrh if( pDest->nReserve!=pSrc->nReserve ){ 223358cb6dbeSdrh pDest->nReserve = pSrc->nReserve; 223458cb6dbeSdrh pagerReportSize(pDest); 223558cb6dbeSdrh } 223658cb6dbeSdrh } 223758cb6dbeSdrh #endif 223858cb6dbeSdrh 22398220da7bSdrh /* 2240d6e5e098Sdrh ** Read a single page from either the journal file (if isMainJrnl==1) or 2241d6e5e098Sdrh ** from the sub-journal (if isMainJrnl==0) and playback that page. 2242d6e5e098Sdrh ** The page begins at offset *pOffset into the file. The *pOffset 2243d6e5e098Sdrh ** value is increased to the start of the next page in the journal. 2244968af52aSdrh ** 224585d14ed2Sdan ** The main rollback journal uses checksums - the statement journal does 224685d14ed2Sdan ** not. 2247d6e5e098Sdrh ** 2248bea2a948Sdanielk1977 ** If the page number of the page record read from the (sub-)journal file 2249bea2a948Sdanielk1977 ** is greater than the current value of Pager.dbSize, then playback is 2250bea2a948Sdanielk1977 ** skipped and SQLITE_OK is returned. 2251bea2a948Sdanielk1977 ** 2252d6e5e098Sdrh ** If pDone is not NULL, then it is a record of pages that have already 2253d6e5e098Sdrh ** been played back. If the page at *pOffset has already been played back 2254d6e5e098Sdrh ** (if the corresponding pDone bit is set) then skip the playback. 2255d6e5e098Sdrh ** Make sure the pDone bit corresponding to the *pOffset page is set 2256d6e5e098Sdrh ** prior to returning. 2257bea2a948Sdanielk1977 ** 2258bea2a948Sdanielk1977 ** If the page record is successfully read from the (sub-)journal file 2259bea2a948Sdanielk1977 ** and played back, then SQLITE_OK is returned. If an IO error occurs 2260bea2a948Sdanielk1977 ** while reading the record from the (sub-)journal file or while writing 2261bea2a948Sdanielk1977 ** to the database file, then the IO error code is returned. 
** If data is successfully read from the (sub-)journal file but appears to
** be corrupted, SQLITE_DONE is returned. Data is considered corrupted in
** two circumstances:
**
**   * If the record page-number is illegal (0 or PAGER_MJ_PGNO), or
**   * If the record is being rolled back from the main journal file
**     and the checksum field does not match the record content.
**
** Neither of these two scenarios is possible during a savepoint rollback.
**
** If this is a savepoint rollback, then memory may have to be dynamically
** allocated by this function. If this is the case and an allocation fails,
** SQLITE_NOMEM is returned.
*/
static int pager_playback_one_page(
  Pager *pPager,                /* The pager being played back */
  i64 *pOffset,                 /* Offset of record to playback */
  Bitvec *pDone,                /* Bitvec of pages already played back */
  int isMainJrnl,               /* 1 -> main journal. 0 -> sub-journal. */
  int isSavepnt                 /* True for a savepoint rollback */
){
  int rc;
  PgHdr *pPg;                   /* An existing page in the cache */
  Pgno pgno;                    /* The page number of a page in journal */
  u32 cksum;                    /* Checksum used for sanity checking */
  char *aData;                  /* Temporary storage for the page */
  sqlite3_file *jfd;            /* The file descriptor for the journal file */
  int isSynced;                 /* True if journal page is synced */
#ifdef SQLITE_HAS_CODEC
  /* The jrnlEnc flag is true if Journal pages should be passed through
  ** the codec.  It is false for pure in-memory journals. */
  const int jrnlEnc = (isMainJrnl || pPager->subjInMemory==0);
#endif

  assert( (isMainJrnl&~1)==0 );      /* isMainJrnl is 0 or 1 */
  assert( (isSavepnt&~1)==0 );       /* isSavepnt is 0 or 1 */
  assert( isMainJrnl || pDone );     /* pDone always used on sub-journals */
  assert( isSavepnt || pDone==0 );   /* pDone never used on non-savepoint */

  aData = pPager->pTmpSpace;
  assert( aData );         /* Temp storage must have already been allocated */
  assert( pagerUseWal(pPager)==0 || (!isMainJrnl && isSavepnt) );

  /* Either the state is greater than PAGER_WRITER_CACHEMOD (a transaction
  ** or savepoint rollback done at the request of the caller) or this is
  ** a hot-journal rollback. If it is a hot-journal rollback, the pager
  ** is in state OPEN and holds an EXCLUSIVE lock. Hot-journal rollback
  ** only reads from the main journal, not the sub-journal.
  */
  assert( pPager->eState>=PAGER_WRITER_CACHEMOD
       || (pPager->eState==PAGER_OPEN && pPager->eLock==EXCLUSIVE_LOCK)
  );
  assert( pPager->eState>=PAGER_WRITER_CACHEMOD || isMainJrnl );

  /* Read the page number and page data from the journal or sub-journal
  ** file. Return an error code to the caller if an IO error occurs.
  */
  jfd = isMainJrnl ? pPager->jfd : pPager->sjfd;
  rc = read32bits(jfd, *pOffset, &pgno);
  if( rc!=SQLITE_OK ) return rc;
  rc = sqlite3OsRead(jfd, (u8*)aData, pPager->pageSize, (*pOffset)+4);
  if( rc!=SQLITE_OK ) return rc;
  *pOffset += pPager->pageSize + 4 + isMainJrnl*4;

  /* Sanity checking on the page.  This is more important than I originally
  ** thought.  If a power failure occurs while the journal is being written,
  ** it could cause invalid data to be written into the journal.  We need to
  ** detect this invalid data (with high probability) and ignore it.
  */
  if( pgno==0 || pgno==PAGER_MJ_PGNO(pPager) ){
    assert( !isSavepnt );
    return SQLITE_DONE;
  }
  if( pgno>(Pgno)pPager->dbSize || sqlite3BitvecTest(pDone, pgno) ){
    return SQLITE_OK;
  }
  if( isMainJrnl ){
    rc = read32bits(jfd, (*pOffset)-4, &cksum);
    if( rc ) return rc;
    if( !isSavepnt && pager_cksum(pPager, (u8*)aData)!=cksum ){
      return SQLITE_DONE;
    }
  }

  /* If this page has already been played back before during the current
  ** rollback, then don't bother to play it back again.
  */
  if( pDone && (rc = sqlite3BitvecSet(pDone, pgno))!=SQLITE_OK ){
    return rc;
  }

  /* When playing back page 1, restore the nReserve setting
  */
  if( pgno==1 && pPager->nReserve!=((u8*)aData)[20] ){
    pPager->nReserve = ((u8*)aData)[20];
    pagerReportSize(pPager);
  }

  /* If the pager is in CACHEMOD state, then there must be a copy of this
  ** page in the pager cache. In this case just update the pager cache,
  ** not the database file. The page is left marked dirty in this case.
  **
  ** An exception to the above rule: If the database is in no-sync mode
  ** and a page is moved during an incremental vacuum then the page may
  ** not be in the pager cache.
  ** Later: if a malloc() or IO error occurs
  ** during a Movepage() call, then the page may not be in the cache
  ** either. So the condition described in the above paragraph is not
  ** assert()able.
  **
  ** If in WRITER_DBMOD, WRITER_FINISHED or OPEN state, then we update the
  ** pager cache if it exists and the main file. The page is then marked
  ** not dirty. Since this code is only executed in PAGER_OPEN state for
  ** a hot-journal rollback, it is guaranteed that the page-cache is empty
  ** if the pager is in OPEN state.
  **
  ** Ticket #1171:  The statement journal might contain page content that is
  ** different from the page content at the start of the transaction.
  ** This occurs when a page is changed prior to the start of a statement
  ** then changed again within the statement.  When rolling back such a
  ** statement we must not write to the original database unless we know
  ** for certain that original page contents are synced into the main rollback
  ** journal.  Otherwise, a power loss might leave modified data in the
  ** database file without an entry in the rollback journal that can
  ** restore the database to its original form.  Two conditions must be
  ** met before writing to the database files. (1) the database must be
  ** locked.  (2) we know that the original page content is fully synced
  ** in the main journal either because the page is not in cache or else
  ** the page is marked as needSync==0.
  **
  ** 2008-04-14:  When attempting to vacuum a corrupt database file, it
  ** is possible to fail a statement on a database that does not yet exist.
  ** Do not attempt to write if database file has never been opened.
  */
  if( pagerUseWal(pPager) ){
    pPg = 0;
  }else{
    pPg = sqlite3PagerLookup(pPager, pgno);
  }
  assert( pPg || !MEMDB );
  assert( pPager->eState!=PAGER_OPEN || pPg==0 || pPager->tempFile );
  PAGERTRACE(("PLAYBACK %d page %d hash(%08x) %s\n",
           PAGERID(pPager), pgno, pager_datahash(pPager->pageSize, (u8*)aData),
           (isMainJrnl?"main-journal":"sub-journal")
  ));
  if( isMainJrnl ){
    isSynced = pPager->noSync || (*pOffset <= pPager->journalHdr);
  }else{
    isSynced = (pPg==0 || 0==(pPg->flags & PGHDR_NEED_SYNC));
  }
  if( isOpen(pPager->fd)
   && (pPager->eState>=PAGER_WRITER_DBMOD || pPager->eState==PAGER_OPEN)
   && isSynced
  ){
    i64 ofst = (pgno-1)*(i64)pPager->pageSize;
    testcase( !isSavepnt && pPg!=0 && (pPg->flags&PGHDR_NEED_SYNC)!=0 );
    assert( !pagerUseWal(pPager) );

    /* Write the data read from the journal back into the database file.
    ** This is usually safe even for an encrypted database - as the data
    ** was encrypted before it was written to the journal file. The exception
    ** is if the data was just read from an in-memory sub-journal. In that
    ** case it must be encrypted here before it is copied into the database
    ** file.
    */
#ifdef SQLITE_HAS_CODEC
    if( !jrnlEnc ){
      CODEC2(pPager, aData, pgno, 7, rc=SQLITE_NOMEM_BKPT, aData);
      rc = sqlite3OsWrite(pPager->fd, (u8 *)aData, pPager->pageSize, ofst);
      CODEC1(pPager, aData, pgno, 3, rc=SQLITE_NOMEM_BKPT);
    }else
#endif
    rc = sqlite3OsWrite(pPager->fd, (u8 *)aData, pPager->pageSize, ofst);

    if( pgno>pPager->dbFileSize ){
      pPager->dbFileSize = pgno;
    }
    if( pPager->pBackup ){
#ifdef SQLITE_HAS_CODEC
      if( jrnlEnc ){
        CODEC1(pPager, aData, pgno, 3, rc=SQLITE_NOMEM_BKPT);
        sqlite3BackupUpdate(pPager->pBackup, pgno, (u8*)aData);
        CODEC2(pPager, aData, pgno, 7, rc=SQLITE_NOMEM_BKPT,aData);
      }else
#endif
      sqlite3BackupUpdate(pPager->pBackup, pgno, (u8*)aData);
    }
  }else if( !isMainJrnl && pPg==0 ){
    /* If this is a rollback of a savepoint and data was not written to
    ** the database and the page is not in-memory, there is a potential
    ** problem. When the page is next fetched by the b-tree layer, it
    ** will be read from the database file, which may or may not be
    ** current.
    **
    ** There are a couple of different ways this can happen. All are quite
    ** obscure. When running in synchronous mode, this can only happen
    ** if the page is on the free-list at the start of the transaction, then
    ** populated, then moved using sqlite3PagerMovepage().
    **
    ** The solution is to add an in-memory page to the cache containing
    ** the data just read from the sub-journal. Mark the page as dirty
    ** and if the pager requires a journal-sync, then mark the page as
    ** requiring a journal-sync before it is written.
    */
    assert( isSavepnt );
    assert( (pPager->doNotSpill & SPILLFLAG_ROLLBACK)==0 );
    pPager->doNotSpill |= SPILLFLAG_ROLLBACK;
    rc = sqlite3PagerGet(pPager, pgno, &pPg, 1);
    assert( (pPager->doNotSpill & SPILLFLAG_ROLLBACK)!=0 );
    pPager->doNotSpill &= ~SPILLFLAG_ROLLBACK;
    if( rc!=SQLITE_OK ) return rc;
    sqlite3PcacheMakeDirty(pPg);
  }
  if( pPg ){
    /* No page should ever be explicitly rolled back that is in use, except
    ** for page 1 which is held in use in order to keep the lock on the
    ** database active. However such a page may be rolled back as a result
    ** of an internal error resulting in an automatic call to
    ** sqlite3PagerRollback().
    */
    void *pData;
    pData = pPg->pData;
    memcpy(pData, (u8*)aData, pPager->pageSize);
    pPager->xReiniter(pPg);
    /* It used to be that sqlite3PcacheMakeClean(pPg) was called here.  But
    ** that call was dangerous and had no detectable benefit since the cache
    ** is normally cleaned by sqlite3PcacheCleanAll() after rollback and so
    ** has been removed.
    */
    pager_set_pagehash(pPg);

    /* If this was page 1, then restore the value of Pager.dbFileVers.
    ** Do this before any decoding. */
    if( pgno==1 ){
      memcpy(&pPager->dbFileVers, &((u8*)pData)[24],sizeof(pPager->dbFileVers));
    }

    /* Decode the page just read from disk */
#if SQLITE_HAS_CODEC
    if( jrnlEnc ){ CODEC1(pPager, pData, pPg->pgno, 3, rc=SQLITE_NOMEM_BKPT); }
#endif
    sqlite3PcacheRelease(pPg);
  }
  return rc;
}

/*
** Parameter zMaster is the name of a master journal file. A single journal
** file that referred to the master journal file has just been rolled back.
** This routine checks if it is possible to delete the master journal file,
** and does so if it is.
**
** Argument zMaster may point to Pager.pTmpSpace. So that buffer is not
** available for use within this function.
**
** When a master journal file is created, it is populated with the names
** of all of its child journals, one after another, formatted as utf-8
** encoded text. The end of each child journal file name is marked with a
** nul-terminator byte (0x00). i.e.
** the entire contents of a master journal
** file for a transaction involving two databases might be:
**
**   "/home/bill/a.db-journal\x00/home/bill/b.db-journal\x00"
**
** A master journal file may only be deleted once all of its child
** journals have been rolled back.
**
** This function reads the contents of the master-journal file into
** memory and loops through each of the child journal names. For
** each child journal, it checks:
**
**   * if the child journal exists, and if so
**   * if the child journal contains a reference to master journal
**     file zMaster
**
** If a child journal can be found that matches both of the criteria
** above, this function returns without doing anything. Otherwise, if
** no such child journal can be found, file zMaster is deleted from
** the file-system using sqlite3OsDelete().
**
** If an IO error occurs within this function, an error code is returned.
** This function allocates memory by calling sqlite3Malloc(). If an
** allocation fails, SQLITE_NOMEM is returned. Otherwise, if no IO or
** malloc errors occur, SQLITE_OK is returned.
**
** TODO: This function allocates a single block of memory to load
** the entire contents of the master journal file.
** This could be
** a couple of kilobytes or so - potentially larger than the page
** size.
*/
static int pager_delmaster(Pager *pPager, const char *zMaster){
  sqlite3_vfs *pVfs = pPager->pVfs;
  int rc;                   /* Return code */
  sqlite3_file *pMaster;    /* Malloc'd master-journal file descriptor */
  sqlite3_file *pJournal;   /* Malloc'd child-journal file descriptor */
  char *zMasterJournal = 0; /* Contents of master journal file */
  i64 nMasterJournal;       /* Size of master journal file */
  char *zJournal;           /* Pointer to one journal within MJ file */
  char *zMasterPtr;         /* Space to hold MJ filename from a journal file */
  int nMasterPtr;           /* Amount of space allocated to zMasterPtr[] */

  /* Allocate space for both the pJournal and pMaster file descriptors.
  ** If successful, open the master journal file for reading.
  */
  pMaster = (sqlite3_file *)sqlite3MallocZero(pVfs->szOsFile * 2);
  pJournal = (sqlite3_file *)(((u8 *)pMaster) + pVfs->szOsFile);
  if( !pMaster ){
    rc = SQLITE_NOMEM_BKPT;
  }else{
    const int flags = (SQLITE_OPEN_READONLY|SQLITE_OPEN_MASTER_JOURNAL);
    rc = sqlite3OsOpen(pVfs, zMaster, pMaster, flags, 0);
  }
  if( rc!=SQLITE_OK ) goto delmaster_out;

  /* Load the entire master journal file into space obtained from
  ** sqlite3_malloc() and pointed to by zMasterJournal.
  ** Also obtain
  ** sufficient space (in zMasterPtr) to hold the names of master
  ** journal files extracted from regular rollback-journals.
  */
  rc = sqlite3OsFileSize(pMaster, &nMasterJournal);
  if( rc!=SQLITE_OK ) goto delmaster_out;
  nMasterPtr = pVfs->mxPathname+1;
  zMasterJournal = sqlite3Malloc(nMasterJournal + nMasterPtr + 1);
  if( !zMasterJournal ){
    rc = SQLITE_NOMEM_BKPT;
    goto delmaster_out;
  }
  zMasterPtr = &zMasterJournal[nMasterJournal+1];
  rc = sqlite3OsRead(pMaster, zMasterJournal, (int)nMasterJournal, 0);
  if( rc!=SQLITE_OK ) goto delmaster_out;
  zMasterJournal[nMasterJournal] = 0;

  zJournal = zMasterJournal;
  while( (zJournal-zMasterJournal)<nMasterJournal ){
    int exists;
    rc = sqlite3OsAccess(pVfs, zJournal, SQLITE_ACCESS_EXISTS, &exists);
    if( rc!=SQLITE_OK ){
      goto delmaster_out;
    }
    if( exists ){
      /* One of the journals pointed to by the master journal exists.
      ** Open it and check if it points at the master journal. If
      ** so, return without deleting the master journal file.
      */
      int c;
      int flags = (SQLITE_OPEN_READONLY|SQLITE_OPEN_MAIN_JOURNAL);
      rc = sqlite3OsOpen(pVfs, zJournal, pJournal, flags, 0);
      if( rc!=SQLITE_OK ){
        goto delmaster_out;
      }

      rc = readMasterJournal(pJournal, zMasterPtr, nMasterPtr);
      sqlite3OsClose(pJournal);
      if( rc!=SQLITE_OK ){
        goto delmaster_out;
      }

      c = zMasterPtr[0]!=0 && strcmp(zMasterPtr, zMaster)==0;
      if( c ){
        /* We have a match. Do not delete the master journal file. */
        goto delmaster_out;
      }
    }
    zJournal += (sqlite3Strlen30(zJournal)+1);
  }

  sqlite3OsClose(pMaster);
  rc = sqlite3OsDelete(pVfs, zMaster, 0);

delmaster_out:
  sqlite3_free(zMasterJournal);
  if( pMaster ){
    sqlite3OsClose(pMaster);
    assert( !isOpen(pJournal) );
    sqlite3_free(pMaster);
  }
  return rc;
}


/*
** This function is used to change the actual size of the database
** file in the file-system. This only happens when committing a transaction,
** or rolling back a transaction (including rolling back a hot-journal).
**
** If the main database file is not open, or the pager is not in either
** DBMOD or OPEN state, this function is a no-op. Otherwise, the size
** of the file is changed to nPage pages (nPage*pPager->pageSize bytes).
** If the file on disk is currently larger than nPage pages, then use the VFS
** xTruncate() method to truncate it.
**
** Or, it might be the case that the file on disk is smaller than
** nPage pages. Some operating system implementations can get confused if
** you try to truncate a file to some size that is larger than it
** currently is, so detect this case and write a page of zeros at
** the end of the new file instead.
**
** If successful, return SQLITE_OK. If an IO error occurs while modifying
** the database file, return the error code to the caller.
*/
static int pager_truncate(Pager *pPager, Pgno nPage){
  int rc = SQLITE_OK;
  assert( pPager->eState!=PAGER_ERROR );
  assert( pPager->eState!=PAGER_READER );

  if( isOpen(pPager->fd)
   && (pPager->eState>=PAGER_WRITER_DBMOD || pPager->eState==PAGER_OPEN)
  ){
    i64 currentSize, newSize;
    int szPage = pPager->pageSize;
    assert( pPager->eLock==EXCLUSIVE_LOCK );
    /* TODO: Is it safe to use Pager.dbFileSize here? */
    rc = sqlite3OsFileSize(pPager->fd, &currentSize);
    newSize = szPage*(i64)nPage;
    if( rc==SQLITE_OK && currentSize!=newSize ){
      if( currentSize>newSize ){
        rc = sqlite3OsTruncate(pPager->fd, newSize);
      }else if( (currentSize+szPage)<=newSize ){
        char *pTmp = pPager->pTmpSpace;
        memset(pTmp, 0, szPage);
        testcase( (newSize-szPage) == currentSize );
        testcase( (newSize-szPage) >  currentSize );
        rc = sqlite3OsWrite(pPager->fd, pTmp, szPage, newSize-szPage);
      }
      if( rc==SQLITE_OK ){
        pPager->dbFileSize = nPage;
      }
    }
  }
  return rc;
}

/*
** Return a sanitized version of the sector-size of OS file pFile. The
** return value is guaranteed to lie between 32 and MAX_SECTOR_SIZE.
*/
int sqlite3SectorSize(sqlite3_file *pFile){
  int iRet = sqlite3OsSectorSize(pFile);
  if( iRet<32 ){
    iRet = 512;
  }else if( iRet>MAX_SECTOR_SIZE ){
    assert( MAX_SECTOR_SIZE>=512 );
    iRet = MAX_SECTOR_SIZE;
  }
  return iRet;
}

/*
** Set the value of the Pager.sectorSize variable for the given
** pager based on the value returned by the xSectorSize method
** of the open database file. The sector size will be used
** to determine the size and alignment of journal header and
** master journal pointers within created journal files.
**
** For temporary files the effective sector size is always 512 bytes.
**
** Otherwise, for non-temporary files, the effective sector size is
** the value returned by the xSectorSize() method, replaced by 512 if
** it is less than 32, or rounded down to MAX_SECTOR_SIZE if it
** is greater than MAX_SECTOR_SIZE.
**
** If the file has the SQLITE_IOCAP_POWERSAFE_OVERWRITE property, then set
** the effective sector size to its minimum value (512).  The purpose of
** pPager->sectorSize is to define the "blast radius" of bytes that
** might change if a crash occurs while writing to a single byte in
** that range.  But with POWERSAFE_OVERWRITE, the blast radius is zero
** (that is what POWERSAFE_OVERWRITE means), so we minimize the sector
** size.  For backwards compatibility of the rollback journal file format,
** we cannot reduce the effective sector size below 512.
*/
static void setSectorSize(Pager *pPager){
  assert( isOpen(pPager->fd) || pPager->tempFile );

  if( pPager->tempFile
   || (sqlite3OsDeviceCharacteristics(pPager->fd) &
              SQLITE_IOCAP_POWERSAFE_OVERWRITE)!=0
  ){
    /* Sector size doesn't matter for temporary files. Also, the file
    ** may not have been opened yet, in which case the OsSectorSize()
    ** call will segfault.
    */
    pPager->sectorSize = 512;
  }else{
    pPager->sectorSize = sqlite3SectorSize(pPager->fd);
  }
}

/*
** Playback the journal and thus restore the database file to
** the state it was in before we started making changes.
**
** The journal file format is as follows:
**
**  (1)  8 byte prefix.  A copy of aJournalMagic[].
**  (2)  4 byte big-endian integer which is the number of valid page records
**       in the journal.  If this value is 0xffffffff, then compute the
**       number of page records from the journal size.
**  (3)  4 byte big-endian integer which is the initial value for the
**       sanity checksum.
**  (4)  4 byte integer which is the number of pages to truncate the
**       database to during a rollback.
**  (5)  4 byte big-endian integer which is the sector size.  The header
**       is this many bytes in size.
**  (6)  4 byte big-endian integer which is the page size.
**  (7)  zero padding out to the next sector size.
**  (8)  Zero or more page instances, each as follows:
**        +  4 byte page number.
**        +  pPager->pageSize bytes of data.
**        +  4 byte checksum
**
** When we speak of the journal header, we mean the first 7 items above.
** Each entry in the journal is an instance of the 8th item.
**
** Call the value from the second bullet "nRec".  nRec is the number of
** valid page entries in the journal.  In most cases, you can compute the
** value of nRec from the size of the journal file.
** But if a power
** failure occurred while the journal was being written, it could be the
** case that the size of the journal file had already been increased but
** the extra entries had not yet made it safely to disk.  In such a case,
** the value of nRec computed from the file size would be too large.  For
** that reason, we always use the nRec value in the header.
**
** If the nRec value is 0xffffffff it means that nRec should be computed
** from the file size.  This value is used when the user selects the
** no-sync option for the journal.  A power failure could lead to corruption
** in this case.  But for things like temporary tables (which will be
** deleted when the power is restored) we don't care.
**
** If the file opened as the journal file is not a well-formed
** journal file then all pages up to the first corrupted page are rolled
** back (or no pages if the journal header is corrupted). The journal file
** is then deleted and SQLITE_OK returned, just as if no corruption had
** been encountered.
**
** If an I/O or malloc() error occurs, the journal-file is not deleted
** and an error code is returned.
**
** The isHot parameter indicates that we are trying to rollback a journal
** that might be a hot journal.  Or, it could be that the journal is
** preserved because of JOURNALMODE_PERSIST or JOURNALMODE_TRUNCATE.
** If the journal really is hot, reset the pager cache prior to rolling
** back any content.
** If the journal is merely persistent, no reset is
** needed.
*/
static int pager_playback(Pager *pPager, int isHot){
  sqlite3_vfs *pVfs = pPager->pVfs;
  i64 szJ;                 /* Size of the journal file in bytes */
  u32 nRec;                /* Number of Records in the journal */
  u32 u;                   /* Unsigned loop counter */
  Pgno mxPg = 0;           /* Size of the original file in pages */
  int rc;                  /* Result code of a subroutine */
  int res = 1;             /* Value returned by sqlite3OsAccess() */
  char *zMaster = 0;       /* Name of master journal file if any */
  int needPagerReset;      /* True to reset page prior to first page rollback */
  int nPlayback = 0;       /* Total number of pages restored from journal */
  u32 savedPageSize = pPager->pageSize;

  /* Figure out how many records are in the journal.  Abort early if
  ** the journal is empty.
  */
  assert( isOpen(pPager->jfd) );
  rc = sqlite3OsFileSize(pPager->jfd, &szJ);
  if( rc!=SQLITE_OK ){
    goto end_playback;
  }

  /* Read the master journal name from the journal, if it is present.
  ** If a master journal file name is specified, but the file is not
  ** present on disk, then the journal is not hot and does not need to be
  ** played back.
  **
  ** TODO: Technically the following is an error because it assumes that
  ** buffer Pager.pTmpSpace is (mxPathname+1) bytes or larger. i.e. that
  ** (pPager->pageSize >= pPager->pVfs->mxPathname+1).
Using os_unix.c, 2831bea2a948Sdanielk1977 ** mxPathname is 512, which is the same as the minimum allowable value 2832bea2a948Sdanielk1977 ** for pageSize. 2833240c5795Sdrh */ 283465839c6aSdanielk1977 zMaster = pPager->pTmpSpace; 283565839c6aSdanielk1977 rc = readMasterJournal(pPager->jfd, zMaster, pPager->pVfs->mxPathname+1); 2836861f7456Sdanielk1977 if( rc==SQLITE_OK && zMaster[0] ){ 2837861f7456Sdanielk1977 rc = sqlite3OsAccess(pVfs, zMaster, SQLITE_ACCESS_EXISTS, &res); 28387657240aSdanielk1977 } 283965839c6aSdanielk1977 zMaster = 0; 2840861f7456Sdanielk1977 if( rc!=SQLITE_OK || !res ){ 2841ce98bba2Sdanielk1977 goto end_playback; 2842ce98bba2Sdanielk1977 } 2843ce98bba2Sdanielk1977 pPager->journalOff = 0; 2844d3a5c50eSdrh needPagerReset = isHot; 28457657240aSdanielk1977 2846bea2a948Sdanielk1977 /* This loop terminates either when a readJournalHdr() or 2847bea2a948Sdanielk1977 ** pager_playback_one_page() call returns SQLITE_DONE or an IO error 2848bea2a948Sdanielk1977 ** occurs. 2849bea2a948Sdanielk1977 */ 2850edea4a7cSdrh while( 1 ){ 28517657240aSdanielk1977 /* Read the next journal header from the journal file. If there are 28527657240aSdanielk1977 ** not enough bytes left in the journal file for a complete header, or 2853719e3a7aSdrh ** it is corrupted, then a process must have failed while writing it. 28547657240aSdanielk1977 ** This indicates nothing more needs to be rolled back. 28557657240aSdanielk1977 */ 28566f4c73eeSdanielk1977 rc = readJournalHdr(pPager, isHot, szJ, &nRec, &mxPg); 28577657240aSdanielk1977 if( rc!=SQLITE_OK ){ 28587657240aSdanielk1977 if( rc==SQLITE_DONE ){ 28597657240aSdanielk1977 rc = SQLITE_OK; 28607657240aSdanielk1977 } 2861c3a64ba0Sdrh goto end_playback; 2862c3a64ba0Sdrh } 2863c3a64ba0Sdrh 28647657240aSdanielk1977 /* If nRec is 0xffffffff, then this journal was created by a process 28657657240aSdanielk1977 ** working in no-sync mode. 
This means that the rest of the journal 28667657240aSdanielk1977 ** file consists of pages, there are no more journal headers. Compute 28677657240aSdanielk1977 ** the value of nRec based on this assumption. 28687657240aSdanielk1977 */ 28697657240aSdanielk1977 if( nRec==0xffffffff ){ 28707657240aSdanielk1977 assert( pPager->journalOff==JOURNAL_HDR_SZ(pPager) ); 28714f21c4afSdrh nRec = (int)((szJ - JOURNAL_HDR_SZ(pPager))/JOURNAL_PG_SZ(pPager)); 287213adf8a0Sdanielk1977 } 287313adf8a0Sdanielk1977 2874e277be05Sdanielk1977 /* If nRec is 0 and this rollback is of a transaction created by this 28758940f4eeSdrh ** process and if this is the final header in the journal, then it means 28768940f4eeSdrh ** that this part of the journal was being filled but has not yet been 28778940f4eeSdrh ** synced to disk. Compute the number of pages based on the remaining 28788940f4eeSdrh ** size of the file. 28798940f4eeSdrh ** 28808940f4eeSdrh ** The third term of the test was added to fix ticket #2565. 2881d6e5e098Sdrh ** When rolling back a hot journal, nRec==0 always means that the next 2882d6e5e098Sdrh ** chunk of the journal contains zero pages to be rolled back. But 2883d6e5e098Sdrh ** when doing a ROLLBACK and the nRec==0 chunk is the last chunk in 2884d6e5e098Sdrh ** the journal, it means that the journal might contain additional 2885d6e5e098Sdrh ** pages that need to be rolled back and that the number of pages 2886d6e5e098Sdrh ** should be computed based on the journal file size. 2887e277be05Sdanielk1977 */ 28888940f4eeSdrh if( nRec==0 && !isHot && 28898940f4eeSdrh pPager->journalHdr+JOURNAL_HDR_SZ(pPager)==pPager->journalOff ){ 28904f21c4afSdrh nRec = (int)((szJ - pPager->journalOff) / JOURNAL_PG_SZ(pPager)); 2891e277be05Sdanielk1977 } 2892e277be05Sdanielk1977 28937657240aSdanielk1977 /* If this is the first header read from the journal, truncate the 289485b623f2Sdrh ** database file back to its original size. 
28957657240aSdanielk1977 */ 2896e180dd93Sdanielk1977 if( pPager->journalOff==JOURNAL_HDR_SZ(pPager) ){ 2897cb4c40baSdrh rc = pager_truncate(pPager, mxPg); 289881a20f21Sdrh if( rc!=SQLITE_OK ){ 289981a20f21Sdrh goto end_playback; 290081a20f21Sdrh } 2901f90b7260Sdanielk1977 pPager->dbSize = mxPg; 29027657240aSdanielk1977 } 29037657240aSdanielk1977 2904bea2a948Sdanielk1977 /* Copy original pages out of the journal and back into the 2905bea2a948Sdanielk1977 ** database file and/or page cache. 2906ed7c855cSdrh */ 29070b8d2766Sshane for(u=0; u<nRec; u++){ 2908d3a5c50eSdrh if( needPagerReset ){ 2909d3a5c50eSdrh pager_reset(pPager); 2910d3a5c50eSdrh needPagerReset = 0; 2911d3a5c50eSdrh } 291291781bd7Sdrh rc = pager_playback_one_page(pPager,&pPager->journalOff,0,1,0); 2913ab755ac8Sdrh if( rc==SQLITE_OK ){ 2914ab755ac8Sdrh nPlayback++; 2915ab755ac8Sdrh }else{ 2916968af52aSdrh if( rc==SQLITE_DONE ){ 29177657240aSdanielk1977 pPager->journalOff = szJ; 2918968af52aSdrh break; 29198d83c0fdSdrh }else if( rc==SQLITE_IOERR_SHORT_READ ){ 29208d83c0fdSdrh /* If the journal has been truncated, simply stop reading and 29218d83c0fdSdrh ** processing the journal. This might happen if the journal was 29228d83c0fdSdrh ** not completely written and synced prior to a crash. In that 29238d83c0fdSdrh ** case, the database should have never been written in the 29248d83c0fdSdrh ** first place so it is OK to simply abandon the rollback. */ 29258d83c0fdSdrh rc = SQLITE_OK; 29268d83c0fdSdrh goto end_playback; 29277657240aSdanielk1977 }else{ 292866fd2160Sdrh /* If we are unable to rollback, quit and return the error 292966fd2160Sdrh ** code. This will cause the pager to enter the error state 293066fd2160Sdrh ** so that no further harm will be done. Perhaps the next 293166fd2160Sdrh ** process to come along will be able to rollback the database. 
2932a9625eaeSdrh */ 29337657240aSdanielk1977 goto end_playback; 29347657240aSdanielk1977 } 29357657240aSdanielk1977 } 2936968af52aSdrh } 2937edea4a7cSdrh } 2938edea4a7cSdrh /*NOTREACHED*/ 2939edea4a7cSdrh assert( 0 ); 29404a0681efSdrh 29414a0681efSdrh end_playback: 2942edea4a7cSdrh if( rc==SQLITE_OK ){ 2943edea4a7cSdrh rc = sqlite3PagerSetPagesize(pPager, &savedPageSize, -1); 2944edea4a7cSdrh } 29458f941bc7Sdrh /* Following a rollback, the database file should be back in its original 29468f941bc7Sdrh ** state prior to the start of the transaction, so invoke the 29478f941bc7Sdrh ** SQLITE_FCNTL_DB_UNCHANGED file-control method to disable the 29488f941bc7Sdrh ** assertion that the transaction counter was modified. 29498f941bc7Sdrh */ 2950c02372ceSdrh #ifdef SQLITE_DEBUG 2951c02372ceSdrh sqlite3OsFileControlHint(pPager->fd,SQLITE_FCNTL_DB_UNCHANGED,0); 2952c02372ceSdrh #endif 29538f941bc7Sdrh 2954db340397Sdanielk1977 /* If this playback is happening automatically as a result of an IO or 2955be217793Sshane ** malloc error that occurred after the change-counter was updated but 2956db340397Sdanielk1977 ** before the transaction was committed, then the change-counter 2957db340397Sdanielk1977 ** modification may just have been reverted. If this happens in exclusive 2958db340397Sdanielk1977 ** mode, then subsequent transactions performed by the connection will not 2959db340397Sdanielk1977 ** update the change-counter at all. This may lead to cache inconsistency 2960db340397Sdanielk1977 ** problems for other processes at some point in the future. So, just 2961db340397Sdanielk1977 ** in case this has happened, clear the changeCountDone flag now. 
2962db340397Sdanielk1977 */ 2963bea2a948Sdanielk1977 pPager->changeCountDone = pPager->tempFile; 2964db340397Sdanielk1977 29658191bff0Sdanielk1977 if( rc==SQLITE_OK ){ 296665839c6aSdanielk1977 zMaster = pPager->pTmpSpace; 296765839c6aSdanielk1977 rc = readMasterJournal(pPager->jfd, zMaster, pPager->pVfs->mxPathname+1); 2968bea2a948Sdanielk1977 testcase( rc!=SQLITE_OK ); 296965839c6aSdanielk1977 } 2970354bfe03Sdan if( rc==SQLITE_OK 29717e684238Sdan && (pPager->eState>=PAGER_WRITER_DBMOD || pPager->eState==PAGER_OPEN) 29727e684238Sdan ){ 2973999cd08aSdan rc = sqlite3PagerSync(pPager, 0); 29747c24610eSdan } 297565839c6aSdanielk1977 if( rc==SQLITE_OK ){ 2976bc1a3c6cSdan rc = pager_end_transaction(pPager, zMaster[0]!='\0', 0); 2977bea2a948Sdanielk1977 testcase( rc!=SQLITE_OK ); 29788191bff0Sdanielk1977 } 2979c56774e2Sdanielk1977 if( rc==SQLITE_OK && zMaster[0] && res ){ 2980979f38e5Sdanielk1977 /* If there was a master journal and this routine will return success, 298132554c10Sdanielk1977 ** see if it is possible to delete the master journal. 298213adf8a0Sdanielk1977 */ 2983b4b47411Sdanielk1977 rc = pager_delmaster(pPager, zMaster); 2984bea2a948Sdanielk1977 testcase( rc!=SQLITE_OK ); 298513adf8a0Sdanielk1977 } 2986ab755ac8Sdrh if( isHot && nPlayback ){ 2987d040e764Sdrh sqlite3_log(SQLITE_NOTICE_RECOVER_ROLLBACK, "recovered %d pages from %s", 2988ab755ac8Sdrh nPlayback, pPager->zJournal); 2989ab755ac8Sdrh } 29907657240aSdanielk1977 29917657240aSdanielk1977 /* The Pager.sectorSize variable may have been updated while rolling 29923ceeb756Sdrh ** back a journal created by a process with a different sector size 29937657240aSdanielk1977 ** value. Reset it to the correct value for this process. 
29947657240aSdanielk1977 */ 2995c80f058dSdrh setSectorSize(pPager); 2996d9b0257aSdrh return rc; 2997ed7c855cSdrh } 2998ed7c855cSdrh 29997c24610eSdan 30007c24610eSdan /* 300156520ab8Sdrh ** Read the content for page pPg out of the database file (or out of 300256520ab8Sdrh ** the WAL if that is where the most recent copy is found) into 30037c24610eSdan ** pPg->pData. A shared lock or greater must be held on the database 30047c24610eSdan ** file before this function is called. 30057c24610eSdan ** 30067c24610eSdan ** If page 1 is read, then the value of Pager.dbFileVers[] is set to 30077c24610eSdan ** the value read from the database file. 30087c24610eSdan ** 30097c24610eSdan ** If an IO error occurs, then the IO error is returned to the caller. 30107c24610eSdan ** Otherwise, SQLITE_OK is returned. 30117c24610eSdan */ 301256520ab8Sdrh static int readDbPage(PgHdr *pPg){ 30137c24610eSdan Pager *pPager = pPg->pPager; /* Pager object associated with page pPg */ 3014622194c0Sdrh int rc = SQLITE_OK; /* Return code */ 30155f54e2b5Sdan 30165f54e2b5Sdan #ifndef SQLITE_OMIT_WAL 301756520ab8Sdrh u32 iFrame = 0; /* Frame of WAL containing pgno */ 30187c24610eSdan 3019d0864087Sdan assert( pPager->eState>=PAGER_READER && !MEMDB ); 30207c24610eSdan assert( isOpen(pPager->fd) ); 30217c24610eSdan 302256520ab8Sdrh if( pagerUseWal(pPager) ){ 3023251866d0Sdrh rc = sqlite3WalFindFrame(pPager->pWal, pPg->pgno, &iFrame); 302456520ab8Sdrh if( rc ) return rc; 302556520ab8Sdrh } 302699bd1097Sdan if( iFrame ){ 3027251866d0Sdrh rc = sqlite3WalReadFrame(pPager->pWal, iFrame,pPager->pageSize,pPg->pData); 30285f54e2b5Sdan }else 30295f54e2b5Sdan #endif 30305f54e2b5Sdan { 3031251866d0Sdrh i64 iOffset = (pPg->pgno-1)*(i64)pPager->pageSize; 3032251866d0Sdrh rc = sqlite3OsRead(pPager->fd, pPg->pData, pPager->pageSize, iOffset); 30337c24610eSdan if( rc==SQLITE_IOERR_SHORT_READ ){ 30347c24610eSdan rc = SQLITE_OK; 30357c24610eSdan } 30367c24610eSdan } 30377c24610eSdan 3038251866d0Sdrh if( pPg->pgno==1 ){
30397c24610eSdan if( rc ){ 30407c24610eSdan /* If the read is unsuccessful, set the dbFileVers[] to something 30417c24610eSdan ** that will never be a valid file version. dbFileVers[] is a copy 30427c24610eSdan ** of bytes 24..39 of the database. Bytes 28..31 should always be 3043b28e59bbSdrh ** zero or the size of the database in pages. Bytes 32..35 and 36..39 3044b28e59bbSdrh ** should be page numbers which are never 0xffffffff. So filling 3045b28e59bbSdrh ** pPager->dbFileVers[] with all 0xff bytes should suffice. 30467c24610eSdan ** 30477c24610eSdan ** For an encrypted database, the situation is more complex: bytes 30487c24610eSdan ** 24..39 of the database are white noise. But the probability of 3049113762a2Sdrh ** white noise equaling 16 bytes of 0xff is vanishingly small so 30507c24610eSdan ** we should still be ok. 30517c24610eSdan */ 30527c24610eSdan memset(pPager->dbFileVers, 0xff, sizeof(pPager->dbFileVers)); 30537c24610eSdan }else{ 30547c24610eSdan u8 *dbFileVers = &((u8*)pPg->pData)[24]; 30557c24610eSdan memcpy(&pPager->dbFileVers, dbFileVers, sizeof(pPager->dbFileVers)); 30567c24610eSdan } 30577c24610eSdan } 3058251866d0Sdrh CODEC1(pPager, pPg->pData, pPg->pgno, 3, rc = SQLITE_NOMEM_BKPT); 30597c24610eSdan 30607c24610eSdan PAGER_INCR(sqlite3_pager_readdb_count); 30617c24610eSdan PAGER_INCR(pPager->nRead); 3062251866d0Sdrh IOTRACE(("PGIN %p %d\n", pPager, pPg->pgno)); 30637c24610eSdan PAGERTRACE(("FETCH %d page %d hash(%08x)\n", 3064251866d0Sdrh PAGERID(pPager), pPg->pgno, pager_pagehash(pPg))); 30657c24610eSdan 30667c24610eSdan return rc; 30677c24610eSdan } 30687c24610eSdan 30696d311fb0Sdan /* 30706d311fb0Sdan ** Update the value of the change-counter at offsets 24 and 92 in 30716d311fb0Sdan ** the header and the sqlite version number at offset 96. 30726d311fb0Sdan ** 30736d311fb0Sdan ** This is an unconditional update.
See also the pager_incr_changecounter() 30746d311fb0Sdan ** routine which only updates the change-counter if the update is actually 30756d311fb0Sdan ** needed, as determined by the pPager->changeCountDone state variable. 30766d311fb0Sdan */ 30776d311fb0Sdan static void pager_write_changecounter(PgHdr *pPg){ 30786d311fb0Sdan u32 change_counter; 30796d311fb0Sdan 30806d311fb0Sdan /* Increment the value just read and write it back to byte 24. */ 30816d311fb0Sdan change_counter = sqlite3Get4byte((u8*)pPg->pPager->dbFileVers)+1; 30826d311fb0Sdan put32bits(((char*)pPg->pData)+24, change_counter); 30836d311fb0Sdan 30846d311fb0Sdan /* Also store the SQLite version number in bytes 96..99 and in 30856d311fb0Sdan ** bytes 92..95 store the change counter for which the version number 30866d311fb0Sdan ** is valid. */ 30876d311fb0Sdan put32bits(((char*)pPg->pData)+92, change_counter); 30886d311fb0Sdan put32bits(((char*)pPg->pData)+96, SQLITE_VERSION_NUMBER); 30896d311fb0Sdan } 30906d311fb0Sdan 30915cf53537Sdan #ifndef SQLITE_OMIT_WAL 30923306c4a9Sdan /* 309374d6cd88Sdan ** This function is invoked once for each page that has already been 309474d6cd88Sdan ** written into the log file when a WAL transaction is rolled back. 309574d6cd88Sdan ** Parameter iPg is the page number of said page. The pCtx argument 309674d6cd88Sdan ** is actually a pointer to the Pager structure. 30973306c4a9Sdan ** 309874d6cd88Sdan ** If page iPg is present in the cache, and has no outstanding references, 309974d6cd88Sdan ** it is discarded. Otherwise, if there are one or more outstanding 310074d6cd88Sdan ** references, the page content is reloaded from the database. If the 310174d6cd88Sdan ** attempt to reload content from the database is required and fails, 310274d6cd88Sdan ** return an SQLite error code. Otherwise, SQLITE_OK. 
310374d6cd88Sdan */ 310474d6cd88Sdan static int pagerUndoCallback(void *pCtx, Pgno iPg){ 310574d6cd88Sdan int rc = SQLITE_OK; 310674d6cd88Sdan Pager *pPager = (Pager *)pCtx; 310774d6cd88Sdan PgHdr *pPg; 310874d6cd88Sdan 3109092d993cSdrh assert( pagerUseWal(pPager) ); 311074d6cd88Sdan pPg = sqlite3PagerLookup(pPager, iPg); 311174d6cd88Sdan if( pPg ){ 311274d6cd88Sdan if( sqlite3PcachePageRefcount(pPg)==1 ){ 311374d6cd88Sdan sqlite3PcacheDrop(pPg); 311474d6cd88Sdan }else{ 311556520ab8Sdrh rc = readDbPage(pPg); 311674d6cd88Sdan if( rc==SQLITE_OK ){ 311774d6cd88Sdan pPager->xReiniter(pPg); 311874d6cd88Sdan } 3119da8a330aSdrh sqlite3PagerUnrefNotNull(pPg); 312074d6cd88Sdan } 312174d6cd88Sdan } 312274d6cd88Sdan 31234c97b534Sdan /* Normally, if a transaction is rolled back, any backup processes are 31244c97b534Sdan ** updated as data is copied out of the rollback journal and into the 31254c97b534Sdan ** database. This is not generally possible with a WAL database, as 31264c97b534Sdan ** rollback involves simply truncating the log file. Therefore, if one 31274c97b534Sdan ** or more frames have already been written to the log (and therefore 31284c97b534Sdan ** also copied into the backup databases) as part of this transaction, 31294c97b534Sdan ** the backups must be restarted. 31304c97b534Sdan */ 31314c97b534Sdan sqlite3BackupRestart(pPager->pBackup); 31324c97b534Sdan 313374d6cd88Sdan return rc; 313474d6cd88Sdan } 313574d6cd88Sdan 313674d6cd88Sdan /* 313774d6cd88Sdan ** This function is called to rollback a transaction on a WAL database. 
31383306c4a9Sdan */ 31397ed91f23Sdrh static int pagerRollbackWal(Pager *pPager){ 314074d6cd88Sdan int rc; /* Return Code */ 314174d6cd88Sdan PgHdr *pList; /* List of dirty pages to revert */ 31423306c4a9Sdan 314374d6cd88Sdan /* For all pages in the cache that are currently dirty or have already 314474d6cd88Sdan ** been written (but not committed) to the log file, do one of the 314574d6cd88Sdan ** following: 314674d6cd88Sdan ** 314774d6cd88Sdan ** + Discard the cached page (if refcount==0), or 314874d6cd88Sdan ** + Reload page content from the database (if refcount>0). 314974d6cd88Sdan */ 31507c24610eSdan pPager->dbSize = pPager->dbOrigSize; 31517ed91f23Sdrh rc = sqlite3WalUndo(pPager->pWal, pagerUndoCallback, (void *)pPager); 315274d6cd88Sdan pList = sqlite3PcacheDirtyList(pPager->pPCache); 31537c24610eSdan while( pList && rc==SQLITE_OK ){ 31547c24610eSdan PgHdr *pNext = pList->pDirty; 315574d6cd88Sdan rc = pagerUndoCallback((void *)pPager, pList->pgno); 31567c24610eSdan pList = pNext; 31577c24610eSdan } 315874d6cd88Sdan 31597c24610eSdan return rc; 31607c24610eSdan } 31617c24610eSdan 3162ed7c855cSdrh /* 31635cf53537Sdan ** This function is a wrapper around sqlite3WalFrames(). As well as logging 31645cf53537Sdan ** the contents of the list of pages headed by pList (connected by pDirty), 31655cf53537Sdan ** this function notifies any active backup processes that the pages have 31665cf53537Sdan ** changed. 3167104a7bbaSdrh ** 3168104a7bbaSdrh ** The list of pages passed into this routine is always sorted by page number. 3169104a7bbaSdrh ** Hence, if page 1 appears anywhere on the list, it will be the first page. 
31705cf53537Sdan */ 31715cf53537Sdan static int pagerWalFrames( 31725cf53537Sdan Pager *pPager, /* Pager object */ 31735cf53537Sdan PgHdr *pList, /* List of frames to log */ 31745cf53537Sdan Pgno nTruncate, /* Database size after this commit */ 31754eb02a45Sdrh int isCommit /* True if this is a commit */ 31765cf53537Sdan ){ 31775cf53537Sdan int rc; /* Return code */ 31789ad3ee40Sdrh int nList; /* Number of pages in pList */ 3179104a7bbaSdrh PgHdr *p; /* For looping over pages */ 31805cf53537Sdan 31815cf53537Sdan assert( pPager->pWal ); 3182b07028f7Sdrh assert( pList ); 3183104a7bbaSdrh #ifdef SQLITE_DEBUG 3184104a7bbaSdrh /* Verify that the page list is in ascending order */ 3185104a7bbaSdrh for(p=pList; p && p->pDirty; p=p->pDirty){ 3186104a7bbaSdrh assert( p->pgno < p->pDirty->pgno ); 3187104a7bbaSdrh } 3188104a7bbaSdrh #endif 3189104a7bbaSdrh 31909ad3ee40Sdrh assert( pList->pDirty==0 || isCommit ); 3191ce8e5ffeSdan if( isCommit ){ 3192ce8e5ffeSdan /* If a WAL transaction is being committed, there is no point in writing 3193ce8e5ffeSdan ** any pages with page numbers greater than nTruncate into the WAL file. 3194ce8e5ffeSdan ** They will never be read by any client. So remove them from the pDirty 3195ce8e5ffeSdan ** list here.
*/ 3196ce8e5ffeSdan PgHdr **ppNext = &pList; 31979ad3ee40Sdrh nList = 0; 3198a4c5860eSdrh for(p=pList; (*ppNext = p)!=0; p=p->pDirty){ 31999ad3ee40Sdrh if( p->pgno<=nTruncate ){ 32009ad3ee40Sdrh ppNext = &p->pDirty; 32019ad3ee40Sdrh nList++; 32029ad3ee40Sdrh } 3203ce8e5ffeSdan } 3204ce8e5ffeSdan assert( pList ); 32059ad3ee40Sdrh }else{ 32069ad3ee40Sdrh nList = 1; 3207ce8e5ffeSdan } 32089ad3ee40Sdrh pPager->aStat[PAGER_STAT_WRITE] += nList; 3209ce8e5ffeSdan 321054a7347aSdrh if( pList->pgno==1 ) pager_write_changecounter(pList); 32115cf53537Sdan rc = sqlite3WalFrames(pPager->pWal, 32124eb02a45Sdrh pPager->pageSize, pList, nTruncate, isCommit, pPager->walSyncFlags 32135cf53537Sdan ); 32145cf53537Sdan if( rc==SQLITE_OK && pPager->pBackup ){ 32155cf53537Sdan for(p=pList; p; p=p->pDirty){ 32165cf53537Sdan sqlite3BackupUpdate(pPager->pBackup, p->pgno, (u8 *)p->pData); 32175cf53537Sdan } 32185cf53537Sdan } 32195f848c3aSdan 32205f848c3aSdan #ifdef SQLITE_CHECK_PAGES 3221ce8e5ffeSdan pList = sqlite3PcacheDirtyList(pPager->pPCache); 3222104a7bbaSdrh for(p=pList; p; p=p->pDirty){ 3223104a7bbaSdrh pager_set_pagehash(p); 32245f848c3aSdan } 32255f848c3aSdan #endif 32265f848c3aSdan 32275cf53537Sdan return rc; 32285cf53537Sdan } 32295cf53537Sdan 32305cf53537Sdan /* 323173b64e4dSdrh ** Begin a read transaction on the WAL. 323273b64e4dSdrh ** 323373b64e4dSdrh ** This routine used to be called "pagerOpenSnapshot()" because it essentially 323473b64e4dSdrh ** makes a snapshot of the database at the current point in time and preserves 323573b64e4dSdrh ** that snapshot for use by the reader in spite of concurrent changes by 323673b64e4dSdrh ** other writers or checkpointers.
32375cf53537Sdan */ 323873b64e4dSdrh static int pagerBeginReadTransaction(Pager *pPager){ 32395cf53537Sdan int rc; /* Return code */ 32405cf53537Sdan int changed = 0; /* True if cache must be reset */ 32415cf53537Sdan 32425cf53537Sdan assert( pagerUseWal(pPager) ); 3243de1ae34eSdan assert( pPager->eState==PAGER_OPEN || pPager->eState==PAGER_READER ); 32445cf53537Sdan 324561e4acecSdrh /* sqlite3WalEndReadTransaction() was not called for the previous 324661e4acecSdrh ** transaction in locking_mode=EXCLUSIVE. So call it now. If we 324761e4acecSdrh ** are in locking_mode=NORMAL and EndRead() was previously called, 324861e4acecSdrh ** the duplicate call is harmless. 324961e4acecSdrh */ 325061e4acecSdrh sqlite3WalEndReadTransaction(pPager->pWal); 325161e4acecSdrh 325273b64e4dSdrh rc = sqlite3WalBeginReadTransaction(pPager->pWal, &changed); 325392683f54Sdrh if( rc!=SQLITE_OK || changed ){ 32545cf53537Sdan pager_reset(pPager); 3255188d4884Sdrh if( USEFETCH(pPager) ) sqlite3OsUnfetch(pPager->fd, 0, 0); 32565cf53537Sdan } 32575cf53537Sdan 32585cf53537Sdan return rc; 32595cf53537Sdan } 32609091f775Sshaneh #endif 32615cf53537Sdan 3262763afe62Sdan /* 326385d14ed2Sdan ** This function is called as part of the transition from PAGER_OPEN 326485d14ed2Sdan ** to PAGER_READER state to determine the size of the database file 326585d14ed2Sdan ** in pages (assuming the page size currently stored in Pager.pageSize). 326685d14ed2Sdan ** 326785d14ed2Sdan ** If no error occurs, SQLITE_OK is returned and the size of the database 326885d14ed2Sdan ** in pages is stored in *pnPage. Otherwise, an error code (perhaps 326985d14ed2Sdan ** SQLITE_IOERR_FSTAT) is returned and *pnPage is left unmodified. 3270763afe62Sdan */ 3271763afe62Sdan static int pagerPagecount(Pager *pPager, Pgno *pnPage){ 3272763afe62Sdan Pgno nPage; /* Value to return via *pnPage */ 3273763afe62Sdan 327485d14ed2Sdan /* Query the WAL sub-system for the database size. 
The WalDbsize() 327585d14ed2Sdan ** function returns zero if the WAL is not open (i.e. Pager.pWal==0), or 327685d14ed2Sdan ** if the database size is not available. The database size is not 327785d14ed2Sdan ** available from the WAL sub-system if the log file is empty or 327885d14ed2Sdan ** contains no valid committed transactions. 327985d14ed2Sdan */ 3280de1ae34eSdan assert( pPager->eState==PAGER_OPEN ); 328133f111dcSdrh assert( pPager->eLock>=SHARED_LOCK ); 3282835f22deSdrh assert( isOpen(pPager->fd) ); 3283835f22deSdrh assert( pPager->tempFile==0 ); 3284763afe62Sdan nPage = sqlite3WalDbsize(pPager->pWal); 328585d14ed2Sdan 3286af80a1c8Sdrh /* If the number of pages in the database is not available from the 32878abc54e2Sdrh ** WAL sub-system, determine the page count based on the size of 3288af80a1c8Sdrh ** the database file. If the size of the database file is not an 3289af80a1c8Sdrh ** integer multiple of the page-size, round up the result. 329085d14ed2Sdan */ 3291835f22deSdrh if( nPage==0 && ALWAYS(isOpen(pPager->fd)) ){ 3292763afe62Sdan i64 n = 0; /* Size of db file in bytes */ 3293763afe62Sdan int rc = sqlite3OsFileSize(pPager->fd, &n); 3294763afe62Sdan if( rc!=SQLITE_OK ){ 3295763afe62Sdan return rc; 3296763afe62Sdan } 3297935de7e8Sdrh nPage = (Pgno)((n+pPager->pageSize-1) / pPager->pageSize); 3298763afe62Sdan } 3299937ac9daSdan 3300937ac9daSdan /* If the current number of pages in the file is greater than the 3301937ac9daSdan ** configured maximum page number, increase the allowed limit so 3302937ac9daSdan ** that the file can be read.
3303937ac9daSdan */ 3304937ac9daSdan if( nPage>pPager->mxPgno ){ 3305937ac9daSdan pPager->mxPgno = (Pgno)nPage; 3306937ac9daSdan } 3307937ac9daSdan 3308763afe62Sdan *pnPage = nPage; 3309763afe62Sdan return SQLITE_OK; 3310763afe62Sdan } 3311763afe62Sdan 33129091f775Sshaneh #ifndef SQLITE_OMIT_WAL 33135cf53537Sdan /* 33145cf53537Sdan ** Check if the *-wal file that corresponds to the database opened by pPager 331532f29643Sdrh ** exists if the database is not empty, or verify that the *-wal file does 331632f29643Sdrh ** not exist (by deleting it) if the database file is empty. 331732f29643Sdrh ** 331832f29643Sdrh ** If the database is not empty and the *-wal file exists, open the pager 331932f29643Sdrh ** in WAL mode. If the database is empty or if no *-wal file exists and 332032f29643Sdrh ** if no error occurs, make sure Pager.journalMode is not set to 332132f29643Sdrh ** PAGER_JOURNALMODE_WAL. 332232f29643Sdrh ** 332332f29643Sdrh ** Return SQLITE_OK or an error code. 33245cf53537Sdan ** 33255cf53537Sdan ** The caller must hold a SHARED lock on the database file to call this 33265cf53537Sdan ** function. Because an EXCLUSIVE lock on the db file is required to delete 332732f29643Sdrh ** a WAL on a non-empty database, this ensures there is no race condition 332832f29643Sdrh ** between the xAccess() below and an xDelete() being executed by some 332932f29643Sdrh ** other connection.
33305cf53537Sdan */ 33315cf53537Sdan static int pagerOpenWalIfPresent(Pager *pPager){ 33325cf53537Sdan int rc = SQLITE_OK; 333385d14ed2Sdan assert( pPager->eState==PAGER_OPEN ); 333433f111dcSdrh assert( pPager->eLock>=SHARED_LOCK ); 333585d14ed2Sdan 33365cf53537Sdan if( !pPager->tempFile ){ 33375cf53537Sdan int isWal; /* True if WAL file exists */ 333877f6af2bSdrh rc = sqlite3OsAccess( 333977f6af2bSdrh pPager->pVfs, pPager->zWal, SQLITE_ACCESS_EXISTS, &isWal 334077f6af2bSdrh ); 334177f6af2bSdrh if( rc==SQLITE_OK ){ 334277f6af2bSdrh if( isWal ){ 3343763afe62Sdan Pgno nPage; /* Size of the database file */ 3344d0864087Sdan 3345763afe62Sdan rc = pagerPagecount(pPager, &nPage); 334632f29643Sdrh if( rc ) return rc; 334732f29643Sdrh if( nPage==0 ){ 3348db10f082Sdan rc = sqlite3OsDelete(pPager->pVfs, pPager->zWal, 0); 334932f29643Sdrh }else{ 33504e004aa6Sdan testcase( sqlite3PcachePagecount(pPager->pPCache)==0 ); 33515cf53537Sdan rc = sqlite3PagerOpenWal(pPager, 0); 335277f6af2bSdrh } 33535cf53537Sdan }else if( pPager->journalMode==PAGER_JOURNALMODE_WAL ){ 33545cf53537Sdan pPager->journalMode = PAGER_JOURNALMODE_DELETE; 33555cf53537Sdan } 33565cf53537Sdan } 33575cf53537Sdan } 33585cf53537Sdan return rc; 33595cf53537Sdan } 33605cf53537Sdan #endif 33615cf53537Sdan 33625cf53537Sdan /* 3363d6e5e098Sdrh ** Playback savepoint pSavepoint. Or, if pSavepoint==NULL, then playback 3364bea2a948Sdanielk1977 ** the entire master journal file. The case pSavepoint==NULL occurs when 3365bea2a948Sdanielk1977 ** a ROLLBACK TO command is invoked on a SAVEPOINT that is a transaction 3366bea2a948Sdanielk1977 ** savepoint. 
3367d6e5e098Sdrh ** 3368bea2a948Sdanielk1977 ** When pSavepoint is not NULL (meaning a non-transaction savepoint is 3369bea2a948Sdanielk1977 ** being rolled back), then the rollback consists of up to three stages, 3370bea2a948Sdanielk1977 ** performed in the order specified: 3371bea2a948Sdanielk1977 ** 3372bea2a948Sdanielk1977 ** * Pages are played back from the main journal starting at byte 3373bea2a948Sdanielk1977 ** offset PagerSavepoint.iOffset and continuing to 3374bea2a948Sdanielk1977 ** PagerSavepoint.iHdrOffset, or to the end of the main journal 3375bea2a948Sdanielk1977 ** file if PagerSavepoint.iHdrOffset is zero. 3376bea2a948Sdanielk1977 ** 3377bea2a948Sdanielk1977 ** * If PagerSavepoint.iHdrOffset is not zero, then pages are played 3378bea2a948Sdanielk1977 ** back starting from the journal header immediately following 3379bea2a948Sdanielk1977 ** PagerSavepoint.iHdrOffset to the end of the main journal file. 3380bea2a948Sdanielk1977 ** 3381bea2a948Sdanielk1977 ** * Pages are then played back from the sub-journal file, starting 3382bea2a948Sdanielk1977 ** with the PagerSavepoint.iSubRec and continuing to the end of 3383bea2a948Sdanielk1977 ** the journal file. 3384bea2a948Sdanielk1977 ** 3385bea2a948Sdanielk1977 ** Throughout the rollback process, each time a page is rolled back, the 3386bea2a948Sdanielk1977 ** corresponding bit is set in a bitvec structure (variable pDone in the 3387bea2a948Sdanielk1977 ** implementation below). This is used to ensure that a page is only 3388bea2a948Sdanielk1977 ** rolled back the first time it is encountered in either journal. 3389bea2a948Sdanielk1977 ** 3390bea2a948Sdanielk1977 ** If pSavepoint is NULL, then pages are only played back from the main 3391bea2a948Sdanielk1977 ** journal file. There is no need for a bitvec in this case. 
3392bea2a948Sdanielk1977 ** 3393bea2a948Sdanielk1977 ** In either case, before playback commences the Pager.dbSize variable 3394bea2a948Sdanielk1977 ** is reset to the value that it held at the start of the savepoint 3395bea2a948Sdanielk1977 ** (or transaction). No page with a page-number greater than this value 3396bea2a948Sdanielk1977 ** is played back. If one is encountered it is simply skipped. 3397fa86c412Sdrh */ 3398fd7f0452Sdanielk1977 static int pagerPlaybackSavepoint(Pager *pPager, PagerSavepoint *pSavepoint){ 3399d6e5e098Sdrh i64 szJ; /* Effective size of the main journal */ 3400fd7f0452Sdanielk1977 i64 iHdrOff; /* End of first segment of main-journal records */ 3401f2c31ad8Sdanielk1977 int rc = SQLITE_OK; /* Return code */ 3402fd7f0452Sdanielk1977 Bitvec *pDone = 0; /* Bitvec to ensure pages played back only once */ 3403fa86c412Sdrh 3404a42c66bdSdan assert( pPager->eState!=PAGER_ERROR ); 3405de1ae34eSdan assert( pPager->eState>=PAGER_WRITER_LOCKED ); 3406bea2a948Sdanielk1977 3407fd7f0452Sdanielk1977 /* Allocate a bitvec to use to store the set of pages rolled back */ 3408fd7f0452Sdanielk1977 if( pSavepoint ){ 3409fd7f0452Sdanielk1977 pDone = sqlite3BitvecCreate(pSavepoint->nOrig); 3410fd7f0452Sdanielk1977 if( !pDone ){ 3411fad3039cSmistachkin return SQLITE_NOMEM_BKPT; 3412fd7f0452Sdanielk1977 } 34137657240aSdanielk1977 } 34147657240aSdanielk1977 3415bea2a948Sdanielk1977 /* Set the database size back to the value it was before the savepoint 3416bea2a948Sdanielk1977 ** being reverted was opened. 3417fa86c412Sdrh */ 3418f2c31ad8Sdanielk1977 pPager->dbSize = pSavepoint ? pSavepoint->nOrig : pPager->dbOrigSize; 3419ab7e8d85Sdan pPager->changeCountDone = pPager->tempFile; 3420fa86c412Sdrh 34217ed91f23Sdrh if( !pSavepoint && pagerUseWal(pPager) ){ 34227ed91f23Sdrh return pagerRollbackWal(pPager); 34237c24610eSdan } 34247c24610eSdan 3425d6e5e098Sdrh /* Use pPager->journalOff as the effective size of the main rollback 3426d6e5e098Sdrh ** journal. 
The actual file might be larger than this in 3427d6e5e098Sdrh ** PAGER_JOURNALMODE_TRUNCATE or PAGER_JOURNALMODE_PERSIST. But anything 3428d6e5e098Sdrh ** past pPager->journalOff is off-limits to us. 3429fa86c412Sdrh */ 3430fd7f0452Sdanielk1977 szJ = pPager->journalOff; 34317ed91f23Sdrh assert( pagerUseWal(pPager)==0 || szJ==0 ); 3432d6e5e098Sdrh 3433d6e5e098Sdrh /* Begin by rolling back records from the main journal starting at 3434d6e5e098Sdrh ** PagerSavepoint.iOffset and continuing to the next journal header. 3435d6e5e098Sdrh ** There might be records in the main journal that have a page number 3436d6e5e098Sdrh ** greater than the current database size (pPager->dbSize) but those 3437d6e5e098Sdrh ** will be skipped automatically. Pages are added to pDone as they 3438d6e5e098Sdrh ** are played back. 3439d6e5e098Sdrh */ 34407ed91f23Sdrh if( pSavepoint && !pagerUseWal(pPager) ){ 3441fd7f0452Sdanielk1977 iHdrOff = pSavepoint->iHdrOffset ? pSavepoint->iHdrOffset : szJ; 3442fd7f0452Sdanielk1977 pPager->journalOff = pSavepoint->iOffset; 3443fd7f0452Sdanielk1977 while( rc==SQLITE_OK && pPager->journalOff<iHdrOff ){ 344491781bd7Sdrh rc = pager_playback_one_page(pPager, &pPager->journalOff, pDone, 1, 1); 344545d6882fSdanielk1977 } 3446bea2a948Sdanielk1977 assert( rc!=SQLITE_DONE ); 3447fd7f0452Sdanielk1977 }else{ 3448fd7f0452Sdanielk1977 pPager->journalOff = 0; 34497657240aSdanielk1977 } 3450d6e5e098Sdrh 3451d6e5e098Sdrh /* Continue rolling back records out of the main journal starting at 3452d6e5e098Sdrh ** the first journal header seen and continuing until the effective end 3453d6e5e098Sdrh ** of the main journal file. Continue to skip out-of-range pages and 3454d6e5e098Sdrh ** continue adding pages rolled back to pDone. 
  */
  while( rc==SQLITE_OK && pPager->journalOff<szJ ){
    u32 ii;            /* Loop counter */
    u32 nJRec = 0;     /* Number of Journal Records */
    u32 dummy;
    rc = readJournalHdr(pPager, 0, szJ, &nJRec, &dummy);
    assert( rc!=SQLITE_DONE );

    /*
    ** The "pPager->journalHdr+JOURNAL_HDR_SZ(pPager)==pPager->journalOff"
    ** test is related to ticket #2565. See the discussion in the
    ** pager_playback() function for additional information.
    */
    if( nJRec==0
     && pPager->journalHdr+JOURNAL_HDR_SZ(pPager)==pPager->journalOff
    ){
      nJRec = (u32)((szJ - pPager->journalOff)/JOURNAL_PG_SZ(pPager));
    }
    for(ii=0; rc==SQLITE_OK && ii<nJRec && pPager->journalOff<szJ; ii++){
      rc = pager_playback_one_page(pPager, &pPager->journalOff, pDone, 1, 1);
    }
    assert( rc!=SQLITE_DONE );
  }
  assert( rc!=SQLITE_OK || pPager->journalOff>=szJ );

  /* Finally, roll back pages from the sub-journal. Pages that were
  ** previously rolled back out of the main journal (and are hence in pDone)
  ** will be skipped. Out-of-range pages are also skipped.
  */
  if( pSavepoint ){
    u32 ii;            /* Loop counter */
    i64 offset = (i64)pSavepoint->iSubRec*(4+pPager->pageSize);

    if( pagerUseWal(pPager) ){
      rc = sqlite3WalSavepointUndo(pPager->pWal, pSavepoint->aWalData);
    }
    for(ii=pSavepoint->iSubRec; rc==SQLITE_OK && ii<pPager->nSubRec; ii++){
      assert( offset==(i64)ii*(4+pPager->pageSize) );
      rc = pager_playback_one_page(pPager, &offset, pDone, 0, 1);
    }
    assert( rc!=SQLITE_DONE );
  }

  sqlite3BitvecDestroy(pDone);
  if( rc==SQLITE_OK ){
    pPager->journalOff = szJ;
  }

  return rc;
}

/*
** Change the maximum number of in-memory pages that are allowed
** before attempting to recycle clean and unused pages.
*/
void sqlite3PagerSetCachesize(Pager *pPager, int mxPage){
  sqlite3PcacheSetCachesize(pPager->pPCache, mxPage);
}

/*
** Change the maximum number of in-memory pages that are allowed
** before attempting to spill pages to journal.
*/
int sqlite3PagerSetSpillsize(Pager *pPager, int mxPage){
  return sqlite3PcacheSetSpillsize(pPager->pPCache, mxPage);
}

/*
** Invoke SQLITE_FCNTL_MMAP_SIZE based on the current value of szMmap.
*/
static void pagerFixMaplimit(Pager *pPager){
#if SQLITE_MAX_MMAP_SIZE>0
  sqlite3_file *fd = pPager->fd;
  if( isOpen(fd) && fd->pMethods->iVersion>=3 ){
    sqlite3_int64 sz;
    sz = pPager->szMmap;
    pPager->bUseFetch = (sz>0);
    setGetterMethod(pPager);
    sqlite3OsFileControlHint(pPager->fd, SQLITE_FCNTL_MMAP_SIZE, &sz);
  }
#endif
}

/*
** Change the maximum size of any memory mapping made of the database file.
*/
void sqlite3PagerSetMmapLimit(Pager *pPager, sqlite3_int64 szMmap){
  pPager->szMmap = szMmap;
  pagerFixMaplimit(pPager);
}

/*
** Free as much memory as possible from the pager.
*/
void sqlite3PagerShrink(Pager *pPager){
  sqlite3PcacheShrink(pPager->pPCache);
}

/*
** Adjust settings of the pager to those specified in the pgFlags parameter.
**
** The "level" in pgFlags & PAGER_SYNCHRONOUS_MASK sets the robustness
** of the database to damage due to OS crashes or power failures by
** changing the number of sync()s when writing the journals.
** There are four levels:
**
**    OFF       sqlite3OsSync() is never called. This is the default
**              for temporary and transient files.
**
**    NORMAL    The journal is synced once before writes begin on the
**              database.
**              This is normally adequate protection, but
**              it is theoretically possible, though very unlikely,
**              that an inopportune power failure could leave the journal
**              in a state which would cause damage to the database
**              when it is rolled back.
**
**    FULL      The journal is synced twice before writes begin on the
**              database (with some additional information - the nRec field
**              of the journal header - being written in between the two
**              syncs). If we assume that writing a
**              single disk sector is atomic, then this mode provides
**              assurance that the journal will not be corrupted to the
**              point of causing damage to the database during rollback.
**
**    EXTRA     This is like FULL except that it also syncs the directory
**              that contains the rollback journal after the rollback
**              journal is unlinked.
**
** The above is for a rollback-journal mode. For WAL mode, OFF continues
** to mean that no syncs ever occur. NORMAL means that the WAL is synced
** prior to the start of checkpoint and that the database file is synced
** at the conclusion of the checkpoint if the entire content of the WAL
** was written back into the database. But no sync operations occur for
** an ordinary commit in NORMAL mode with WAL. FULL means that the WAL
** file is synced following each commit operation, in addition to the
** syncs associated with NORMAL. There is no difference between FULL
** and EXTRA for WAL mode.
**
** Do not confuse synchronous=FULL with SQLITE_SYNC_FULL.
** The SQLITE_SYNC_FULL macro means to use the MacOSX-style full-fsync
** using fcntl(F_FULLFSYNC). SQLITE_SYNC_NORMAL means to do an
** ordinary fsync() call. There is no difference between SQLITE_SYNC_FULL
** and SQLITE_SYNC_NORMAL on platforms other than MacOSX. But the
** synchronous=FULL versus synchronous=NORMAL setting determines when
** the xSync primitive is called and is relevant to all platforms.
**
** Numeric values associated with these states are OFF==1, NORMAL==2,
** FULL==3, and EXTRA==4.
*/
#ifndef SQLITE_OMIT_PAGER_PRAGMAS
void sqlite3PagerSetFlags(
  Pager *pPager,        /* The pager to set safety level for */
  unsigned pgFlags      /* Various flags */
){
  unsigned level = pgFlags & PAGER_SYNCHRONOUS_MASK;
  if( pPager->tempFile ){
    pPager->noSync = 1;
    pPager->fullSync = 0;
    pPager->extraSync = 0;
  }else{
    pPager->noSync = level==PAGER_SYNCHRONOUS_OFF ?1:0;
    pPager->fullSync = level>=PAGER_SYNCHRONOUS_FULL ?1:0;
    pPager->extraSync = level==PAGER_SYNCHRONOUS_EXTRA ?1:0;
  }
  if( pPager->noSync ){
    pPager->syncFlags = 0;
  }else if( pgFlags & PAGER_FULLFSYNC ){
    pPager->syncFlags = SQLITE_SYNC_FULL;
  }else{
    pPager->syncFlags = SQLITE_SYNC_NORMAL;
  }
  pPager->walSyncFlags = (pPager->syncFlags<<2);
  if( pPager->fullSync ){
    pPager->walSyncFlags |= pPager->syncFlags;
  }
  if( (pgFlags & PAGER_CKPT_FULLFSYNC) && !pPager->noSync ){
    pPager->walSyncFlags |= (SQLITE_SYNC_FULL<<2);
  }
  if( pgFlags & PAGER_CACHESPILL ){
    pPager->doNotSpill &= ~SPILLFLAG_OFF;
  }else{
    pPager->doNotSpill |= SPILLFLAG_OFF;
  }
}
#endif

/*
** The following global variable is incremented whenever the library
** attempts to open a temporary file. This information is used for
** testing and analysis only.
*/
#ifdef SQLITE_TEST
int sqlite3_opentemp_count = 0;
#endif

/*
** Open a temporary file.
**
** Write the file descriptor into *pFile. Return SQLITE_OK on success
** or some other error code if we fail. The OS will automatically
** delete the temporary file when it is closed.
**
** The flags passed to the VFS layer xOpen() call are those specified
** by parameter vfsFlags ORed with the following:
**
**     SQLITE_OPEN_READWRITE
**     SQLITE_OPEN_CREATE
**     SQLITE_OPEN_EXCLUSIVE
**     SQLITE_OPEN_DELETEONCLOSE
*/
static int pagerOpentemp(
  Pager *pPager,        /* The pager object */
  sqlite3_file *pFile,  /* Write the file descriptor here */
  int vfsFlags          /* Flags passed through to the VFS */
){
  int rc;               /* Return code */

#ifdef SQLITE_TEST
  sqlite3_opentemp_count++;  /* Used for testing and analysis only */
#endif

  vfsFlags |= SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE |
              SQLITE_OPEN_EXCLUSIVE | SQLITE_OPEN_DELETEONCLOSE;
  rc = sqlite3OsOpen(pPager->pVfs, 0, pFile, vfsFlags, 0);
  assert( rc!=SQLITE_OK || isOpen(pFile) );
  return rc;
}

/*
** Set the busy handler function.
**
** The pager invokes the busy-handler if sqlite3OsLock() returns
** SQLITE_BUSY when trying to upgrade from no-lock to a SHARED lock,
** or when trying to upgrade from a RESERVED lock to an EXCLUSIVE
** lock. It does *not* invoke the busy handler when upgrading from
** SHARED to RESERVED, or when upgrading from SHARED to EXCLUSIVE
** (which occurs during hot-journal rollback).
** Summary:
**
**   Transition                        | Invokes xBusyHandler
**   --------------------------------------------------------
**   NO_LOCK       -> SHARED_LOCK      | Yes
**   SHARED_LOCK   -> RESERVED_LOCK    | No
**   SHARED_LOCK   -> EXCLUSIVE_LOCK   | No
**   RESERVED_LOCK -> EXCLUSIVE_LOCK   | Yes
**
** If the busy-handler callback returns non-zero, the lock is
** retried. If it returns zero, then the SQLITE_BUSY error is
** returned to the caller of the pager API function.
*/
void sqlite3PagerSetBusyHandler(
  Pager *pPager,                       /* Pager object */
  int (*xBusyHandler)(void *),         /* Pointer to busy-handler function */
  void *pBusyHandlerArg                /* Argument to pass to xBusyHandler */
){
  void **ap;
  pPager->xBusyHandler = xBusyHandler;
  pPager->pBusyHandlerArg = pBusyHandlerArg;
  ap = (void **)&pPager->xBusyHandler;
  assert( ((int(*)(void *))(ap[0]))==xBusyHandler );
  assert( ap[1]==pBusyHandlerArg );
  sqlite3OsFileControlHint(pPager->fd, SQLITE_FCNTL_BUSYHANDLER, (void *)ap);
}

/*
** Change the page size used by the Pager object. The new page size
** is passed in *pPageSize.
**
** If the pager is in the error state when this function is called, it
** is a no-op. The value returned is the error state error code (i.e.
** one of SQLITE_IOERR, an SQLITE_IOERR_xxx sub-code or SQLITE_FULL).
**
** Otherwise, if all of the following are true:
**
**   * the new page size (value of *pPageSize) is valid (a power
**     of two between 512 and SQLITE_MAX_PAGE_SIZE, inclusive), and
**
**   * there are no outstanding page references, and
**
**   * the database is either not an in-memory database or it is
**     an in-memory database that currently consists of zero pages.
**
** then the pager object page size is set to *pPageSize.
**
** If the page size is changed, then this function uses sqlite3PageMalloc()
** to obtain a new Pager.pTmpSpace buffer. If this allocation attempt
** fails, SQLITE_NOMEM is returned and the page size remains unchanged.
** In all other cases, SQLITE_OK is returned.
**
** If the page size is not changed, either because one of the enumerated
** conditions above is not true, the pager was in error state when this
** function was called, or because the memory allocation attempt failed,
** then *pPageSize is set to the old, retained page size before returning.
*/
int sqlite3PagerSetPagesize(Pager *pPager, u32 *pPageSize, int nReserve){
  int rc = SQLITE_OK;

  /* It is not possible to do a full assert_pager_state() here, as this
  ** function may be called from within PagerOpen(), before the state
  ** of the Pager object is internally consistent.
  **
  ** At one point this function returned an error if the pager was in
  ** PAGER_ERROR state. But since PAGER_ERROR state guarantees that
  ** there is at least one outstanding page reference, this function
  ** is a no-op for that case anyhow.
  */

  u32 pageSize = *pPageSize;
  assert( pageSize==0 || (pageSize>=512 && pageSize<=SQLITE_MAX_PAGE_SIZE) );
  if( (pPager->memDb==0 || pPager->dbSize==0)
   && sqlite3PcacheRefCount(pPager->pPCache)==0
   && pageSize && pageSize!=(u32)pPager->pageSize
  ){
    char *pNew = NULL;             /* New temp space */
    i64 nByte = 0;

    if( pPager->eState>PAGER_OPEN && isOpen(pPager->fd) ){
      rc = sqlite3OsFileSize(pPager->fd, &nByte);
    }
    if( rc==SQLITE_OK ){
      pNew = (char *)sqlite3PageMalloc(pageSize);
      if( !pNew ) rc = SQLITE_NOMEM_BKPT;
    }

    if( rc==SQLITE_OK ){
      pager_reset(pPager);
      rc = sqlite3PcacheSetPageSize(pPager->pPCache, pageSize);
    }
    if( rc==SQLITE_OK ){
      sqlite3PageFree(pPager->pTmpSpace);
      pPager->pTmpSpace = pNew;
      pPager->dbSize = (Pgno)((nByte+pageSize-1)/pageSize);
      pPager->pageSize = pageSize;
    }else{
      sqlite3PageFree(pNew);
    }
  }

  *pPageSize = pPager->pageSize;
  if( rc==SQLITE_OK ){
    if( nReserve<0 ) nReserve = pPager->nReserve;
    assert( nReserve>=0 && nReserve<1000 );
    pPager->nReserve = (i16)nReserve;
    pagerReportSize(pPager);
    pagerFixMaplimit(pPager);
  }
  return rc;
}

/*
** Return a pointer to the "temporary page" buffer held internally
** by the pager. This is a buffer that is big enough to hold the
** entire content of a database page. This buffer is used internally
** during rollback and will be overwritten whenever a rollback
** occurs. But other modules are free to use it too, as long as
** no rollbacks are happening.
*/
void *sqlite3PagerTempSpace(Pager *pPager){
  return pPager->pTmpSpace;
}

/*
** Attempt to set the maximum database page count if mxPage is positive.
** Make no changes if mxPage is zero or negative. And never reduce the
** maximum page count below the current size of the database.
**
** Regardless of mxPage, return the current maximum page count.
*/
int sqlite3PagerMaxPageCount(Pager *pPager, int mxPage){
  if( mxPage>0 ){
    pPager->mxPgno = mxPage;
  }
  assert( pPager->eState!=PAGER_OPEN );      /* Called only by OP_MaxPgcnt */
  assert( pPager->mxPgno>=pPager->dbSize );  /* OP_MaxPgcnt enforces this */
  return pPager->mxPgno;
}

/*
** The following set of routines are used to disable the simulated
** I/O error mechanism. These routines are used to avoid simulated
** errors in places where we do not care about errors.
**
** Unless -DSQLITE_TEST=1 is used, these routines are all no-ops
** and generate no code.
*/
#ifdef SQLITE_TEST
extern int sqlite3_io_error_pending;
extern int sqlite3_io_error_hit;
static int saved_cnt;
void disable_simulated_io_errors(void){
  saved_cnt = sqlite3_io_error_pending;
  sqlite3_io_error_pending = -1;
}
void enable_simulated_io_errors(void){
  sqlite3_io_error_pending = saved_cnt;
}
#else
# define disable_simulated_io_errors()
# define enable_simulated_io_errors()
#endif

/*
** Read the first N bytes from the beginning of the file into memory
** that pDest points to.
**
** If the pager was opened on a transient file (zFilename==""), or
** opened on a file less than N bytes in size, the output buffer is
** zeroed and SQLITE_OK returned. The rationale for this is that this
** function is used to read database headers, and a new transient or
** zero sized database has a header that consists entirely of zeroes.
**
** If any IO error apart from SQLITE_IOERR_SHORT_READ is encountered,
** the error code is returned to the caller and the contents of the
** output buffer are undefined.
*/
int sqlite3PagerReadFileheader(Pager *pPager, int N, unsigned char *pDest){
  int rc = SQLITE_OK;
  memset(pDest, 0, N);
  assert( isOpen(pPager->fd) || pPager->tempFile );

  /* This routine is only called by btree immediately after creating
  ** the Pager object. There has not been an opportunity to transition
  ** to WAL mode yet.
  */
  assert( !pagerUseWal(pPager) );

  if( isOpen(pPager->fd) ){
    IOTRACE(("DBHDR %p 0 %d\n", pPager, N))
    rc = sqlite3OsRead(pPager->fd, pDest, N, 0);
    if( rc==SQLITE_IOERR_SHORT_READ ){
      rc = SQLITE_OK;
    }
  }
  return rc;
}

/*
** This function may only be called when a read-transaction is open on
** the pager. It returns the total number of pages in the database.
**
** However, if the file is between 1 and <page-size> bytes in size, then
** this is considered a 1 page file.
*/
void sqlite3PagerPagecount(Pager *pPager, int *pnPage){
  assert( pPager->eState>=PAGER_READER );
  assert( pPager->eState!=PAGER_WRITER_FINISHED );
  *pnPage = (int)pPager->dbSize;
}


/*
** Try to obtain a lock of type locktype on the database file. If
** a similar or greater lock is already held, this function is a no-op
** (returning SQLITE_OK immediately).
**
** Otherwise, attempt to obtain the lock using sqlite3OsLock(). Invoke
** the busy callback if the lock is currently not available. Repeat
** until the busy callback returns false or until the attempt to
** obtain the lock succeeds.
**
** Return SQLITE_OK on success and an error code if we cannot obtain
** the lock. If the lock is obtained successfully, set the Pager.eLock
** variable to locktype before returning.
*/
static int pager_wait_on_lock(Pager *pPager, int locktype){
  int rc;                              /* Return code */

  /* Check that this is either a no-op (because the requested lock is
  ** already held), or one of the transitions that the busy-handler
  ** may be invoked during, according to the comment above
  ** sqlite3PagerSetBusyHandler().
  */
  assert( (pPager->eLock>=locktype)
       || (pPager->eLock==NO_LOCK && locktype==SHARED_LOCK)
       || (pPager->eLock==RESERVED_LOCK && locktype==EXCLUSIVE_LOCK)
  );

  do {
    rc = pagerLockDb(pPager, locktype);
  }while( rc==SQLITE_BUSY && pPager->xBusyHandler(pPager->pBusyHandlerArg) );
  return rc;
}

/*
** Function assertTruncateConstraint(pPager) checks that one of the
** following is true for all dirty pages currently in the page-cache:
**
**   a) The page number is less than or equal to the size of the
**      current database image, in pages, OR
**
**   b) if the page content were written at this time, it would not
**      be necessary to write the current content out to the sub-journal
**      (as determined by function subjRequiresPage()).
**
** If the condition asserted by this function were not true, and the
** dirty page were to be discarded from the cache via the pagerStress()
** routine, pagerStress() would not write the current page content to
** the database file. If a savepoint transaction were rolled back after
** this happened, the correct behavior would be to restore the current
** content of the page.
** However, since this content is not present in either
** the database file or the portion of the rollback journal and
** sub-journal rolled back the content could not be restored and the
** database image would become corrupt. It is therefore fortunate that
** this circumstance cannot arise.
*/
#if defined(SQLITE_DEBUG)
static void assertTruncateConstraintCb(PgHdr *pPg){
  assert( pPg->flags&PGHDR_DIRTY );
  assert( !subjRequiresPage(pPg) || pPg->pgno<=pPg->pPager->dbSize );
}
static void assertTruncateConstraint(Pager *pPager){
  sqlite3PcacheIterateDirty(pPager->pPCache, assertTruncateConstraintCb);
}
#else
# define assertTruncateConstraint(pPager)
#endif

/*
** Truncate the in-memory database file image to nPage pages. This
** function does not actually modify the database file on disk. It
** just sets the internal state of the pager object so that the
** truncation will be done when the current transaction is committed.
**
** This function is only called right before committing a transaction.
** Once this function has been called, the transaction must either be
** rolled back or committed. It is not safe to call this function and
** then continue writing to the database.
*/
void sqlite3PagerTruncateImage(Pager *pPager, Pgno nPage){
  assert( pPager->dbSize>=nPage );
  assert( pPager->eState>=PAGER_WRITER_CACHEMOD );
  pPager->dbSize = nPage;

  /* At one point the code here called assertTruncateConstraint() to
  ** ensure that all pages being truncated away by this operation are,
  ** if one or more savepoints are open, present in the savepoint
  ** journal so that they can be restored if the savepoint is rolled
  ** back. This is no longer necessary as this function is now only
  ** called right before committing a transaction. So although the
  ** Pager object may still have open savepoints (Pager.nSavepoint!=0),
  ** they cannot be rolled back. So the assertTruncateConstraint() call
  ** is no longer correct. */
}


/*
** This function is called before attempting a hot-journal rollback. It
** syncs the journal file to disk, then sets pPager->journalHdr to the
** size of the journal file so that the pager_playback() routine knows
** that the entire journal file has been synced.
**
** Syncing a hot-journal to disk before attempting to roll it back ensures
** that if a power-failure occurs during the rollback, the process that
** attempts rollback following system recovery sees the same journal
** content as this process.
**
** If everything goes as planned, SQLITE_OK is returned. Otherwise,
** an SQLite error code is returned.
*/
static int pagerSyncHotJournal(Pager *pPager){
  int rc = SQLITE_OK;
  if( !pPager->noSync ){
    rc = sqlite3OsSync(pPager->jfd, SQLITE_SYNC_NORMAL);
  }
  if( rc==SQLITE_OK ){
    rc = sqlite3OsFileSize(pPager->jfd, &pPager->journalHdr);
  }
  return rc;
}

#if SQLITE_MAX_MMAP_SIZE>0
/*
** Obtain a reference to a memory mapped page object for page number pgno.
** The new object will use the pointer pData, obtained from xFetch().
** If successful, set *ppPage to point to the new page reference
** and return SQLITE_OK. Otherwise, return an SQLite error code and set
** *ppPage to zero.
**
** Page references obtained by calling this function should be released
** by calling pagerReleaseMapPage().
*/
static int pagerAcquireMapPage(
  Pager *pPager,                  /* Pager object */
  Pgno pgno,                      /* Page number */
  void *pData,                    /* xFetch()'d data for this page */
  PgHdr **ppPage                  /* OUT: Acquired page object */
){
  PgHdr *p;                       /* Memory mapped page to return */

  if( pPager->pMmapFreelist ){
    *ppPage = p = pPager->pMmapFreelist;
    pPager->pMmapFreelist = p->pDirty;
    p->pDirty = 0;
    assert( pPager->nExtra>=8 );
    memset(p->pExtra, 0, 8);
  }else{
    *ppPage = p = (PgHdr *)sqlite3MallocZero(sizeof(PgHdr) + pPager->nExtra);
    if( p==0 ){
      sqlite3OsUnfetch(pPager->fd, (i64)(pgno-1) * pPager->pageSize, pData);
      return SQLITE_NOMEM_BKPT;
    }
    p->pExtra = (void *)&p[1];
    p->flags = PGHDR_MMAP;
    p->nRef = 1;
    p->pPager = pPager;
  }

  assert( p->pExtra==(void *)&p[1] );
  assert( p->pPage==0 );
  assert( p->flags==PGHDR_MMAP );
  assert( p->pPager==pPager );
  assert( p->nRef==1 );

  p->pgno = pgno;
  p->pData = pData;
  pPager->nMmapOut++;

  return SQLITE_OK;
}
#endif

/*
** Release a reference to page pPg. pPg must have been returned by an
** earlier call to pagerAcquireMapPage().
*/
static void pagerReleaseMapPage(PgHdr *pPg){
  Pager *pPager = pPg->pPager;
  pPager->nMmapOut--;
  pPg->pDirty = pPager->pMmapFreelist;
  pPager->pMmapFreelist = pPg;

  assert( pPager->fd->pMethods->iVersion>=3 );
  sqlite3OsUnfetch(pPager->fd, (i64)(pPg->pgno-1)*pPager->pageSize, pPg->pData);
}

/*
** Free all PgHdr objects stored in the Pager.pMmapFreelist list.
*/
static void pagerFreeMapHdrs(Pager *pPager){
  PgHdr *p;
  PgHdr *pNext;
  for(p=pPager->pMmapFreelist; p; p=pNext){
    pNext = p->pDirty;
    sqlite3_free(p);
  }
}

/* Verify that the database file has not been deleted or renamed out from
** under the pager. Return SQLITE_OK if the database is still where it ought
** to be on disk. Return non-zero (SQLITE_READONLY_DBMOVED or some other error
** code from sqlite3OsAccess()) if the database has gone missing.
*/
static int databaseIsUnmoved(Pager *pPager){
  int bHasMoved = 0;
  int rc;

  if( pPager->tempFile ) return SQLITE_OK;
  if( pPager->dbSize==0 ) return SQLITE_OK;
  assert( pPager->zFilename && pPager->zFilename[0] );
  rc = sqlite3OsFileControl(pPager->fd, SQLITE_FCNTL_HAS_MOVED, &bHasMoved);
  if( rc==SQLITE_NOTFOUND ){
    /* If the HAS_MOVED file-control is unimplemented, assume that the file
    ** has not been moved.
    ** That is the historical behavior of SQLite: prior to
    ** version 3.8.3, it never checked */
    rc = SQLITE_OK;
  }else if( rc==SQLITE_OK && bHasMoved ){
    rc = SQLITE_READONLY_DBMOVED;
  }
  return rc;
}


/*
** Shutdown the page cache. Free all memory and close all files.
**
** If a transaction was in progress when this routine is called, that
** transaction is rolled back. All outstanding pages are invalidated
** and their memory is freed. Any attempt to use a page associated
** with this page cache after this function returns will likely
** result in a coredump.
**
** This function always succeeds. If a transaction is active an attempt
** is made to roll it back. If an error occurs during the rollback
** a hot journal may be left in the filesystem but no error is returned
** to the caller.
*/
int sqlite3PagerClose(Pager *pPager, sqlite3 *db){
  u8 *pTmp = (u8*)pPager->pTmpSpace;
  assert( db || pagerUseWal(pPager)==0 );
  assert( assert_pager_state(pPager) );
  disable_simulated_io_errors();
  sqlite3BeginBenignMalloc();
  pagerFreeMapHdrs(pPager);
  /* pPager->errCode = 0; */
  pPager->exclusiveMode = 0;
#ifndef SQLITE_OMIT_WAL
  {
    u8 *a = 0;
    assert( db || pPager->pWal==0 );
    if( db && 0==(db->flags & SQLITE_NoCkptOnClose)
     && SQLITE_OK==databaseIsUnmoved(pPager)
    ){
      a = pTmp;
    }
    sqlite3WalClose(pPager->pWal, db, pPager->walSyncFlags, pPager->pageSize,a);
    pPager->pWal = 0;
  }
#endif
  pager_reset(pPager);
  if( MEMDB ){
    pager_unlock(pPager);
  }else{
    /* If it is open, sync the journal file before calling UnlockAndRollback.
    ** If this is not done, then an unsynced portion of the open journal
    ** file may be played back into the database. If a power failure occurs
    ** while this is happening, the database could become corrupt.
    **
    ** If an error occurs while trying to sync the journal, shift the pager
    ** into the ERROR state. This causes UnlockAndRollback to unlock the
    ** database and close the journal file without attempting to roll it
    ** back or finalize it. The next database user will have to do hot-journal
    ** rollback before accessing the database file.
    */
    if( isOpen(pPager->jfd) ){
      pager_error(pPager, pagerSyncHotJournal(pPager));
    }
    pagerUnlockAndRollback(pPager);
  }
  sqlite3EndBenignMalloc();
  enable_simulated_io_errors();
  PAGERTRACE(("CLOSE %d\n", PAGERID(pPager)));
  IOTRACE(("CLOSE %p\n", pPager))
  sqlite3OsClose(pPager->jfd);
  sqlite3OsClose(pPager->fd);
  sqlite3PageFree(pTmp);
  sqlite3PcacheClose(pPager->pPCache);

#ifdef SQLITE_HAS_CODEC
  if( pPager->xCodecFree ) pPager->xCodecFree(pPager->pCodec);
#endif

  assert( !pPager->aSavepoint && !pPager->pInJournal );
  assert( !isOpen(pPager->jfd) && !isOpen(pPager->sjfd) );

  sqlite3_free(pPager);
  return SQLITE_OK;
}

#if !defined(NDEBUG) || defined(SQLITE_TEST)
/*
** Return the page number for page pPg.
*/
Pgno sqlite3PagerPagenumber(DbPage *pPg){
  return pPg->pgno;
}
#endif

/*
** Increment the reference count for page pPg.
*/
void sqlite3PagerRef(DbPage *pPg){
  sqlite3PcacheRef(pPg);
}

/*
** Sync the journal.
** In other words, make sure all the pages that have
** been written to the journal have actually reached the surface of the
** disk and can be restored in the event of a hot-journal rollback.
**
** If the Pager.noSync flag is set, then this function is a no-op.
** Otherwise, the actions required depend on the journal-mode and the
** device characteristics of the file-system, as follows:
**
**   * If the journal file is an in-memory journal file, no action need
**     be taken.
**
**   * Otherwise, if the device does not support the SAFE_APPEND property,
**     then the nRec field of the most recently written journal header
**     is updated to contain the number of journal records that have
**     been written following it. If the pager is operating in full-sync
**     mode, then the journal file is synced before this field is updated.
**
**   * If the device does not support the SEQUENTIAL property, then
**     the journal file is synced.
**
** Or, in pseudo-code:
**
**   if( NOT <in-memory journal> ){
**     if( NOT SAFE_APPEND ){
**       if( <full-sync mode> ) xSync(<journal file>);
**       <update nRec field>
**     }
**     if( NOT SEQUENTIAL ) xSync(<journal file>);
**   }
**
** If successful, this routine clears the PGHDR_NEED_SYNC flag of every
** page currently held in memory before returning SQLITE_OK.
** If an IO error is encountered, then the IO error code is returned to
** the caller.
*/
static int syncJournal(Pager *pPager, int newHdr){
  int rc;                         /* Return code */

  assert( pPager->eState==PAGER_WRITER_CACHEMOD
       || pPager->eState==PAGER_WRITER_DBMOD
  );
  assert( assert_pager_state(pPager) );
  assert( !pagerUseWal(pPager) );

  rc = sqlite3PagerExclusiveLock(pPager);
  if( rc!=SQLITE_OK ) return rc;

  if( !pPager->noSync ){
    assert( !pPager->tempFile );
    if( isOpen(pPager->jfd) && pPager->journalMode!=PAGER_JOURNALMODE_MEMORY ){
      const int iDc = sqlite3OsDeviceCharacteristics(pPager->fd);
      assert( isOpen(pPager->jfd) );

      if( 0==(iDc&SQLITE_IOCAP_SAFE_APPEND) ){
        /* This block deals with an obscure problem. If the last connection
        ** that wrote to this database was operating in persistent-journal
        ** mode, then the journal file may at this point actually be larger
        ** than Pager.journalOff bytes. If the next thing in the journal
        ** file happens to be a journal-header (written as part of the
        ** previous connection's transaction), and a crash or power-failure
        ** occurs after nRec is updated but before this connection writes
        ** anything else to the journal file (or commits/rolls back its
        ** transaction), then SQLite may become confused when doing the
        ** hot-journal rollback following recovery.
        ** It may roll back all
        ** of this connection's data, then proceed to rolling back the old,
        ** out-of-date data that follows it. The result is database
        ** corruption.
        **
        ** To work around this, if the journal file does appear to contain
        ** a valid header following Pager.journalOff, then write a 0x00
        ** byte to the start of it to prevent it from being recognized.
        **
        ** Variable iNextHdrOffset is set to the offset at which this
        ** problematic header will occur, if it exists. aMagic is used
        ** as a temporary buffer to inspect the first 8 bytes of
        ** the potential journal header.
        */
        i64 iNextHdrOffset;
        u8 aMagic[8];
        u8 zHeader[sizeof(aJournalMagic)+4];

        memcpy(zHeader, aJournalMagic, sizeof(aJournalMagic));
        put32bits(&zHeader[sizeof(aJournalMagic)], pPager->nRec);

        iNextHdrOffset = journalHdrOffset(pPager);
        rc = sqlite3OsRead(pPager->jfd, aMagic, 8, iNextHdrOffset);
        if( rc==SQLITE_OK && 0==memcmp(aMagic, aJournalMagic, 8) ){
          static const u8 zerobyte = 0;
          rc = sqlite3OsWrite(pPager->jfd, &zerobyte, 1, iNextHdrOffset);
        }
        if( rc!=SQLITE_OK && rc!=SQLITE_IOERR_SHORT_READ ){
          return rc;
        }

        /* Write the nRec value into the journal file header. If in
        ** full-synchronous mode, sync the journal first.
        ** This ensures that
        ** all data has really hit the disk before nRec is updated to mark
        ** it as a candidate for rollback.
        **
        ** This is not required if the persistent media supports the
        ** SAFE_APPEND property. Because in this case it is not possible
        ** for garbage data to be appended to the file, the nRec field
        ** is populated with 0xFFFFFFFF when the journal header is written
        ** and never needs to be updated.
        */
        if( pPager->fullSync && 0==(iDc&SQLITE_IOCAP_SEQUENTIAL) ){
          PAGERTRACE(("SYNC journal of %d\n", PAGERID(pPager)));
          IOTRACE(("JSYNC %p\n", pPager))
          rc = sqlite3OsSync(pPager->jfd, pPager->syncFlags);
          if( rc!=SQLITE_OK ) return rc;
        }
        IOTRACE(("JHDR %p %lld\n", pPager, pPager->journalHdr));
        rc = sqlite3OsWrite(
            pPager->jfd, zHeader, sizeof(zHeader), pPager->journalHdr
        );
        if( rc!=SQLITE_OK ) return rc;
      }
      if( 0==(iDc&SQLITE_IOCAP_SEQUENTIAL) ){
        PAGERTRACE(("SYNC journal of %d\n", PAGERID(pPager)));
        IOTRACE(("JSYNC %p\n", pPager))
        rc = sqlite3OsSync(pPager->jfd, pPager->syncFlags|
          (pPager->syncFlags==SQLITE_SYNC_FULL?SQLITE_SYNC_DATAONLY:0)
        );
        if( rc!=SQLITE_OK ) return rc;
      }

      pPager->journalHdr = pPager->journalOff;
      if( newHdr && 0==(iDc&SQLITE_IOCAP_SAFE_APPEND) ){
        pPager->nRec = 0;
        rc = writeJournalHdr(pPager);
        if(
            rc!=SQLITE_OK ) return rc;
      }
    }else{
      pPager->journalHdr = pPager->journalOff;
    }
  }

  /* Unless the pager is in noSync mode, the journal file was just
  ** successfully synced. Either way, clear the PGHDR_NEED_SYNC flag on
  ** all pages.
  */
  sqlite3PcacheClearSyncFlags(pPager->pPCache);
  pPager->eState = PAGER_WRITER_DBMOD;
  assert( assert_pager_state(pPager) );
  return SQLITE_OK;
}

/*
** The argument is the first in a linked list of dirty pages connected
** by the PgHdr.pDirty pointer. This function writes each one of the
** in-memory pages in the list to the database file. The argument may
** be NULL, representing an empty list. In this case this function is
** a no-op.
**
** The pager must hold at least a RESERVED lock when this function
** is called. Before writing anything to the database file, this lock
** is upgraded to an EXCLUSIVE lock. If the lock cannot be obtained,
** SQLITE_BUSY is returned and no data is written to the database file.
**
** If the pager is a temp-file pager and the actual file-system file
** is not yet open, it is created and opened before any data is
** written out.
**
** Once the lock has been upgraded and, if necessary, the file opened,
** the pages are written out to the database file in list order.
** Writing a page is skipped if it meets either of the following criteria:
**
**   * The page number is greater than Pager.dbSize, or
**   * The PGHDR_DONT_WRITE flag is set on the page.
**
** If writing out a page causes the database file to grow, Pager.dbFileSize
** is updated accordingly. If page 1 is written out, then the value cached
** in Pager.dbFileVers[] is updated to match the new value stored in
** the database file.
**
** If everything is successful, SQLITE_OK is returned. If an IO error
** occurs, an IO error code is returned. Or, if the EXCLUSIVE lock cannot
** be obtained, SQLITE_BUSY is returned.
*/
static int pager_write_pagelist(Pager *pPager, PgHdr *pList){
  int rc = SQLITE_OK;                  /* Return code */

  /* This function is only called for rollback pagers in WRITER_DBMOD state. */
  assert( !pagerUseWal(pPager) );
  assert( pPager->tempFile || pPager->eState==PAGER_WRITER_DBMOD );
  assert( pPager->eLock==EXCLUSIVE_LOCK );
  assert( isOpen(pPager->fd) || pList->pDirty==0 );

  /* If the file is a temp-file that has not yet been opened, open it now.
  ** It is not possible for rc to be other than SQLITE_OK if this branch
  ** is taken, as pager_wait_on_lock() is a no-op for temp-files.
  */
  if( !isOpen(pPager->fd) ){
    assert( pPager->tempFile && rc==SQLITE_OK );
    rc = pagerOpentemp(pPager, pPager->fd, pPager->vfsFlags);
  }

  /* Before the first write, give the VFS a hint of what the final
  ** file size will be.
  */
  assert( rc!=SQLITE_OK || isOpen(pPager->fd) );
  if( rc==SQLITE_OK
   && pPager->dbHintSize<pPager->dbSize
   && (pList->pDirty || pList->pgno>pPager->dbHintSize)
  ){
    sqlite3_int64 szFile = pPager->pageSize * (sqlite3_int64)pPager->dbSize;
    sqlite3OsFileControlHint(pPager->fd, SQLITE_FCNTL_SIZE_HINT, &szFile);
    pPager->dbHintSize = pPager->dbSize;
  }

  while( rc==SQLITE_OK && pList ){
    Pgno pgno = pList->pgno;

    /* If there are dirty pages in the page cache with page numbers greater
    ** than Pager.dbSize, this means sqlite3PagerTruncateImage() was called to
    ** make the file smaller (presumably by auto-vacuum code). Do not write
    ** any such pages to the file.
    **
    ** Also, do not write out any page that has the PGHDR_DONT_WRITE flag
    ** set (set by sqlite3PagerDontWrite()).
    */
    if( pgno<=pPager->dbSize && 0==(pList->flags&PGHDR_DONT_WRITE) ){
      i64 offset = (pgno-1)*(i64)pPager->pageSize;   /* Offset to write */
      char *pData;                                   /* Data to write */

      assert( (pList->flags&PGHDR_NEED_SYNC)==0 );
      if( pList->pgno==1 ) pager_write_changecounter(pList);

      /* Encode the database */
      CODEC2(pPager, pList->pData, pgno, 6, return SQLITE_NOMEM_BKPT, pData);

      /* Write out the page data. */
      rc = sqlite3OsWrite(pPager->fd, pData, pPager->pageSize, offset);

      /* If page 1 was just written, update Pager.dbFileVers to match
      ** the value now stored in the database file. If writing this
      ** page caused the database file to grow, update dbFileSize.
      */
      if( pgno==1 ){
        memcpy(&pPager->dbFileVers, &pData[24], sizeof(pPager->dbFileVers));
      }
      if( pgno>pPager->dbFileSize ){
        pPager->dbFileSize = pgno;
      }
      pPager->aStat[PAGER_STAT_WRITE]++;

      /* Update any backup objects copying the contents of this pager.
      */
      sqlite3BackupUpdate(pPager->pBackup, pgno, (u8*)pList->pData);

      PAGERTRACE(("STORE %d page %d hash(%08x)\n",
                   PAGERID(pPager), pgno, pager_pagehash(pList)));
      IOTRACE(("PGOUT %p %d\n", pPager, pgno));
      PAGER_INCR(sqlite3_pager_writedb_count);
    }else{
      PAGERTRACE(("NOSTORE %d page %d\n", PAGERID(pPager), pgno));
    }
    pager_set_pagehash(pList);
    pList = pList->pDirty;
  }

  return rc;
}

/*
** Ensure that the sub-journal file is open. If it is already open, this
** function is a no-op.
**
** SQLITE_OK is returned if everything goes according to plan. An
** SQLITE_IOERR_XXX error code is returned if a call to sqlite3OsOpen()
** fails.
*/
static int openSubJournal(Pager *pPager){
  int rc = SQLITE_OK;
  if( !isOpen(pPager->sjfd) ){
    const int flags = SQLITE_OPEN_SUBJOURNAL | SQLITE_OPEN_READWRITE
      | SQLITE_OPEN_CREATE | SQLITE_OPEN_EXCLUSIVE
      | SQLITE_OPEN_DELETEONCLOSE;
    int nStmtSpill = sqlite3Config.nStmtSpill;
    if( pPager->journalMode==PAGER_JOURNALMODE_MEMORY || pPager->subjInMemory ){
      nStmtSpill = -1;
    }
    rc = sqlite3JournalOpen(pPager->pVfs, 0, pPager->sjfd, flags, nStmtSpill);
  }
  return rc;
}

/*
** Append a record of the current state of page pPg to the sub-journal.
**
** If successful, set the bit corresponding to pPg->pgno in the bitvecs
** for all open savepoints before returning.
**
** This function returns SQLITE_OK if everything is successful, an IO
** error code if the attempt to write to the sub-journal fails, or
** SQLITE_NOMEM if a malloc fails while setting a bit in a savepoint
** bitvec.
*/
static int subjournalPage(PgHdr *pPg){
  int rc = SQLITE_OK;
  Pager *pPager = pPg->pPager;
  if( pPager->journalMode!=PAGER_JOURNALMODE_OFF ){

    /* Open the sub-journal, if it has not already been opened */
    assert( pPager->useJournal );
    assert( isOpen(pPager->jfd) || pagerUseWal(pPager) );
    assert( isOpen(pPager->sjfd) || pPager->nSubRec==0 );
    assert( pagerUseWal(pPager)
         || pageInJournal(pPager, pPg)
         || pPg->pgno>pPager->dbOrigSize
    );
    rc = openSubJournal(pPager);

    /* If the sub-journal was opened successfully (or was already open),
    ** write the journal record into the file.
    */
    if( rc==SQLITE_OK ){
      void *pData = pPg->pData;
      i64 offset = (i64)pPager->nSubRec*(4+pPager->pageSize);
      char *pData2;

#if SQLITE_HAS_CODEC
      if( !pPager->subjInMemory ){
        CODEC2(pPager, pData, pPg->pgno, 7, return SQLITE_NOMEM_BKPT, pData2);
      }else
#endif
      pData2 = pData;
      PAGERTRACE(("STMT-JOURNAL %d page %d\n", PAGERID(pPager), pPg->pgno));
      rc = write32bits(pPager->sjfd, offset, pPg->pgno);
      if( rc==SQLITE_OK ){
        rc = sqlite3OsWrite(pPager->sjfd, pData2, pPager->pageSize, offset+4);
      }
    }
  }
  if( rc==SQLITE_OK ){
    pPager->nSubRec++;
    assert( pPager->nSavepoint>0 );
    rc = addToSavepointBitvecs(pPager, pPg->pgno);
  }
  return rc;
}
static int subjournalPageIfRequired(PgHdr *pPg){
  if( subjRequiresPage(pPg) ){
    return subjournalPage(pPg);
  }else{
    return SQLITE_OK;
  }
}

/*
** This function is called by the pcache layer when it has reached some
** soft memory limit. The first argument is a pointer to a Pager object
** (cast as a void*). The pager is always 'purgeable' (not an in-memory
** database). The second argument is a reference to a page that is
** currently dirty but has no outstanding references.
** The page
** is always associated with the Pager object passed as the first
** argument.
**
** The job of this function is to make pPg clean by writing its contents
** out to the database file, if possible. This may involve syncing the
** journal file.
**
** If successful, sqlite3PcacheMakeClean() is called on the page and
** SQLITE_OK returned. If an IO error occurs while trying to make the
** page clean, the IO error code is returned. If the page cannot be
** made clean for some other reason, but no error occurs, then SQLITE_OK
** is returned but sqlite3PcacheMakeClean() is not called.
*/
static int pagerStress(void *p, PgHdr *pPg){
  Pager *pPager = (Pager *)p;
  int rc = SQLITE_OK;

  assert( pPg->pPager==pPager );
  assert( pPg->flags&PGHDR_DIRTY );

  /* The doNotSpill NOSYNC bit is set during times when syncing the
  ** journal (and adding a new header) is not allowed. This occurs
  ** during calls to sqlite3PagerWrite() while trying to journal multiple
  ** pages belonging to the same sector.
  **
  ** The doNotSpill ROLLBACK and OFF bits inhibit all cache spilling
  ** regardless of whether or not a sync is required. These are set during
  ** a rollback or by user request, respectively.
  **
  ** Spilling is also prohibited when in an error state since that could
  ** lead to database corruption.
  ** In the current implementation it
  ** is impossible for sqlite3PcacheFetch() to be called with createFlag==3
  ** while in the error state, hence it is impossible for this routine to
  ** be called in the error state. Nevertheless, we include a NEVER()
  ** test for the error state as a safeguard against future changes.
  */
  if( NEVER(pPager->errCode) ) return SQLITE_OK;
  testcase( pPager->doNotSpill & SPILLFLAG_ROLLBACK );
  testcase( pPager->doNotSpill & SPILLFLAG_OFF );
  testcase( pPager->doNotSpill & SPILLFLAG_NOSYNC );
  if( pPager->doNotSpill
   && ((pPager->doNotSpill & (SPILLFLAG_ROLLBACK|SPILLFLAG_OFF))!=0
      || (pPg->flags & PGHDR_NEED_SYNC)!=0)
  ){
    return SQLITE_OK;
  }

  pPager->aStat[PAGER_STAT_SPILL]++;
  pPg->pDirty = 0;
  if( pagerUseWal(pPager) ){
    /* Write a single frame for this page to the log. */
    rc = subjournalPageIfRequired(pPg);
    if( rc==SQLITE_OK ){
      rc = pagerWalFrames(pPager, pPg, 0, 0);
    }
  }else{

#ifdef SQLITE_ENABLE_BATCH_ATOMIC_WRITE
    if( pPager->tempFile==0 ){
      rc = sqlite3JournalCreate(pPager->jfd);
      if( rc!=SQLITE_OK ) return pager_error(pPager, rc);
    }
#endif

    /* Sync the journal file if required.
    */
    if( pPg->flags&PGHDR_NEED_SYNC
     || pPager->eState==PAGER_WRITER_CACHEMOD
    ){
      rc = syncJournal(pPager, 1);
    }

    /* Write the contents of the page out to the database file. */
    if( rc==SQLITE_OK ){
      assert( (pPg->flags&PGHDR_NEED_SYNC)==0 );
      rc = pager_write_pagelist(pPager, pPg);
    }
  }

  /* Mark the page as clean. */
  if( rc==SQLITE_OK ){
    PAGERTRACE(("STRESS %d page %d\n", PAGERID(pPager), pPg->pgno));
    sqlite3PcacheMakeClean(pPg);
  }

  return pager_error(pPager, rc);
}

/*
** Flush all unreferenced dirty pages to disk.
*/
int sqlite3PagerFlush(Pager *pPager){
  int rc = pPager->errCode;
  if( !MEMDB ){
    PgHdr *pList = sqlite3PcacheDirtyList(pPager->pPCache);
    assert( assert_pager_state(pPager) );
    while( rc==SQLITE_OK && pList ){
      PgHdr *pNext = pList->pDirty;
      if( pList->nRef==0 ){
        rc = pagerStress((void*)pPager, pList);
      }
      pList = pNext;
    }
  }

  return rc;
}

/*
** Allocate and initialize a new Pager object and put a pointer to it
** in *ppPager. The pager should eventually be freed by passing it
** to sqlite3PagerClose().
**
** The zFilename argument is the path to the database file to open.
** If zFilename is NULL then a randomly-named temporary file is created
** and used as the file to be cached. Temporary files are deleted
** automatically when they are closed. If zFilename is ":memory:" then
** all information is held in cache. It is never written to disk.
** This can be used to implement an in-memory database.
**
** The nExtra parameter specifies the number of bytes of space allocated
** along with each page reference. This space is available to the user
** via the sqlite3PagerGetExtra() API.  When a new page is allocated, the
** first 8 bytes of this space are zeroed but the remainder is uninitialized.
** (The extra space is used by btree as the MemPage object.)
**
** The flags argument is used to specify properties that affect the
** operation of the pager. It should be passed some bitwise combination
** of the PAGER_* flags.
**
** The vfsFlags parameter is a bitmask to pass to the flags parameter
** of the xOpen() method of the supplied VFS when opening files.
**
** If the pager object is allocated and the specified file opened
** successfully, SQLITE_OK is returned and *ppPager set to point to
** the new pager object. If an error occurs, *ppPager is set to NULL
** and an error code is returned.
** This function may return SQLITE_NOMEM
** (sqlite3Malloc() is used to allocate memory), SQLITE_CANTOPEN or
** various SQLITE_IOERR_XXX errors.
*/
int sqlite3PagerOpen(
  sqlite3_vfs *pVfs,       /* The virtual file system to use */
  Pager **ppPager,         /* OUT: Return the Pager structure here */
  const char *zFilename,   /* Name of the database file to open */
  int nExtra,              /* Extra bytes appended to each in-memory page */
  int flags,               /* flags controlling this file */
  int vfsFlags,            /* flags passed through to sqlite3_vfs.xOpen() */
  void (*xReinit)(DbPage*) /* Function to reinitialize pages */
){
  u8 *pPtr;
  Pager *pPager = 0;       /* Pager object to allocate and return */
  int rc = SQLITE_OK;      /* Return code */
  int tempFile = 0;        /* True for temp files (incl. in-memory files) */
  int memDb = 0;           /* True if this is an in-memory file */
#ifdef SQLITE_ENABLE_DESERIALIZE
  int memJM = 0;           /* Memory journal mode */
#else
# define memJM 0
#endif
  int readOnly = 0;        /* True if this is a read-only file */
  int journalFileSize;     /* Bytes to allocate for each journal fd */
  char *zPathname = 0;     /* Full path to database file */
  int nPathname = 0;       /* Number of bytes in zPathname */
  int useJournal = (flags & PAGER_OMIT_JOURNAL)==0; /* False to omit journal */
  int pcacheSize = sqlite3PcacheSize();       /* Bytes to allocate for PCache */
  u32 szPageDflt = SQLITE_DEFAULT_PAGE_SIZE;  /* Default page size */
  const char *zUri = 0;    /* URI args to copy */
  int nUri = 0;            /* Number of bytes of URI args at *zUri */

  /* Figure out how much space is required for each journal file-handle
  ** (there are two of them, the main journal and the sub-journal).  */
  journalFileSize = ROUND8(sqlite3JournalSize(pVfs));

  /* Set the output variable to NULL in case an error occurs. */
  *ppPager = 0;

#ifndef SQLITE_OMIT_MEMORYDB
  if( flags & PAGER_MEMORY ){
    memDb = 1;
    if( zFilename && zFilename[0] ){
      zPathname = sqlite3DbStrDup(0, zFilename);
      if( zPathname==0 ) return SQLITE_NOMEM_BKPT;
      nPathname = sqlite3Strlen30(zPathname);
      zFilename = 0;
    }
  }
#endif

  /* Compute and store the full pathname in an allocated buffer pointed
  ** to by zPathname, length nPathname. Or, if this is a temporary file,
  ** leave both nPathname and zPathname set to 0.
  */
  if( zFilename && zFilename[0] ){
    const char *z;
    nPathname = pVfs->mxPathname+1;
    zPathname = sqlite3DbMallocRaw(0, nPathname*2);
    if( zPathname==0 ){
      return SQLITE_NOMEM_BKPT;
    }
    zPathname[0] = 0; /* Make sure initialized even if FullPathname() fails */
    rc = sqlite3OsFullPathname(pVfs, zFilename, nPathname, zPathname);
    nPathname = sqlite3Strlen30(zPathname);
    z = zUri = &zFilename[sqlite3Strlen30(zFilename)+1];
    while( *z ){
      z += sqlite3Strlen30(z)+1;
      z += sqlite3Strlen30(z)+1;
    }
    nUri = (int)(&z[1] - zUri);
    assert( nUri>=0 );
    if( rc==SQLITE_OK && nPathname+8>pVfs->mxPathname ){
      /* This branch is taken when the journal path required by
      ** the database being opened will be more than pVfs->mxPathname
      ** bytes in length.
      ** This means the database cannot be opened,
      ** as it will not be possible to open the journal file or even
      ** check for a hot-journal before reading.
      */
      rc = SQLITE_CANTOPEN_BKPT;
    }
    if( rc!=SQLITE_OK ){
      sqlite3DbFree(0, zPathname);
      return rc;
    }
  }

  /* Allocate memory for the Pager structure, PCache object, the
  ** three file descriptors, the database file name and the journal
  ** file name. The layout in memory is as follows:
  **
  **     Pager object                    (sizeof(Pager) bytes)
  **     PCache object                   (sqlite3PcacheSize() bytes)
  **     Database file handle            (pVfs->szOsFile bytes)
  **     Sub-journal file handle         (journalFileSize bytes)
  **     Main journal file handle        (journalFileSize bytes)
  **     Database file name              (nPathname+1 bytes)
  **     Journal file name               (nPathname+8+1 bytes)
  */
  pPtr = (u8 *)sqlite3MallocZero(
    ROUND8(sizeof(*pPager)) +      /* Pager structure */
    ROUND8(pcacheSize) +           /* PCache object */
    ROUND8(pVfs->szOsFile) +       /* The main db file */
    journalFileSize * 2 +          /* The two journal files */
    nPathname + 1 + nUri +         /* zFilename */
    nPathname + 8 + 2              /* zJournal */
#ifndef SQLITE_OMIT_WAL
    + nPathname + 4 + 2            /* zWal */
#endif
  );
  assert( EIGHT_BYTE_ALIGNMENT(SQLITE_INT_TO_PTR(journalFileSize)) );
  if( !pPtr ){
    sqlite3DbFree(0, zPathname);
    return SQLITE_NOMEM_BKPT;
  }
  pPager =              (Pager*)(pPtr);
  pPager->pPCache =    (PCache*)(pPtr += ROUND8(sizeof(*pPager)));
  pPager->fd =   (sqlite3_file*)(pPtr += ROUND8(pcacheSize));
  pPager->sjfd = (sqlite3_file*)(pPtr += ROUND8(pVfs->szOsFile));
  pPager->jfd =  (sqlite3_file*)(pPtr += journalFileSize);
  pPager->zFilename =    (char*)(pPtr += journalFileSize);
  assert( EIGHT_BYTE_ALIGNMENT(pPager->jfd) );

  /* Fill in the Pager.zFilename and Pager.zJournal buffers, if required. */
  if( zPathname ){
    assert( nPathname>0 );
    pPager->zJournal =   (char*)(pPtr += nPathname + 1 + nUri);
    memcpy(pPager->zFilename, zPathname, nPathname);
    if( nUri ) memcpy(&pPager->zFilename[nPathname+1], zUri, nUri);
    memcpy(pPager->zJournal, zPathname, nPathname);
    memcpy(&pPager->zJournal[nPathname], "-journal\000", 8+2);
    sqlite3FileSuffix3(pPager->zFilename, pPager->zJournal);
#ifndef SQLITE_OMIT_WAL
    pPager->zWal = &pPager->zJournal[nPathname+8+1];
    memcpy(pPager->zWal, zPathname, nPathname);
    memcpy(&pPager->zWal[nPathname], "-wal\000", 4+1);
    sqlite3FileSuffix3(pPager->zFilename, pPager->zWal);
#endif
    sqlite3DbFree(0, zPathname);
  }
  pPager->pVfs = pVfs;
  pPager->vfsFlags = vfsFlags;

  /* Open the pager file.
  */
  if( zFilename && zFilename[0] ){
    int fout = 0;                    /* VFS flags returned by xOpen() */
    rc = sqlite3OsOpen(pVfs, pPager->zFilename, pPager->fd, vfsFlags, &fout);
    assert( !memDb );
#ifdef SQLITE_ENABLE_DESERIALIZE
    memJM = (fout&SQLITE_OPEN_MEMORY)!=0;
#endif
    readOnly = (fout&SQLITE_OPEN_READONLY)!=0;

    /* If the file was successfully opened for read/write access,
    ** choose a default page size in case we have to create the
    ** database file. The default page size is the maximum of:
    **
    **    + SQLITE_DEFAULT_PAGE_SIZE,
    **    + The value returned by sqlite3OsSectorSize()
    **    + The largest page size that can be written atomically.
    */
    if( rc==SQLITE_OK ){
      int iDc = sqlite3OsDeviceCharacteristics(pPager->fd);
      if( !readOnly ){
        setSectorSize(pPager);
        assert(SQLITE_DEFAULT_PAGE_SIZE<=SQLITE_MAX_DEFAULT_PAGE_SIZE);
        if( szPageDflt<pPager->sectorSize ){
          if( pPager->sectorSize>SQLITE_MAX_DEFAULT_PAGE_SIZE ){
            szPageDflt = SQLITE_MAX_DEFAULT_PAGE_SIZE;
          }else{
            szPageDflt = (u32)pPager->sectorSize;
          }
        }
#ifdef SQLITE_ENABLE_ATOMIC_WRITE
        {
          int ii;
          assert(SQLITE_IOCAP_ATOMIC512==(512>>8));
          assert(SQLITE_IOCAP_ATOMIC64K==(65536>>8));
          assert(SQLITE_MAX_DEFAULT_PAGE_SIZE<=65536);
          for(ii=szPageDflt; ii<=SQLITE_MAX_DEFAULT_PAGE_SIZE; ii=ii*2){
            if( iDc&(SQLITE_IOCAP_ATOMIC|(ii>>8)) ){
              szPageDflt = ii;
            }
          }
        }
#endif
      }
      pPager->noLock = sqlite3_uri_boolean(zFilename, "nolock", 0);
      if( (iDc & SQLITE_IOCAP_IMMUTABLE)!=0
       || sqlite3_uri_boolean(zFilename, "immutable", 0) ){
          vfsFlags |= SQLITE_OPEN_READONLY;
          goto act_like_temp_file;
      }
    }
  }else{
    /* If a temporary file is requested, it is not opened immediately.
    ** In this case we accept the default page size and delay actually
    ** opening the file until the first call to OsWrite().
    **
    ** This branch is also run for an in-memory database. An in-memory
    ** database is the same as a temp-file that is never written out to
    ** disk and uses an in-memory rollback journal.
    **
    ** This branch also runs for files marked as immutable.
    */
act_like_temp_file:
    tempFile = 1;
    pPager->eState = PAGER_READER;     /* Pretend we already have a lock */
    pPager->eLock = EXCLUSIVE_LOCK;    /* Pretend we are in EXCLUSIVE mode */
    pPager->noLock = 1;                /* Do no locking */
    readOnly = (vfsFlags&SQLITE_OPEN_READONLY);
  }

  /* The following call to PagerSetPagesize() serves to set the value of
  ** Pager.pageSize and to allocate the Pager.pTmpSpace buffer.
  */
  if( rc==SQLITE_OK ){
    assert( pPager->memDb==0 );
    rc = sqlite3PagerSetPagesize(pPager, &szPageDflt, -1);
    testcase( rc!=SQLITE_OK );
  }

  /* Initialize the PCache object. */
  if( rc==SQLITE_OK ){
    nExtra = ROUND8(nExtra);
    assert( nExtra>=8 && nExtra<1000 );
    rc = sqlite3PcacheOpen(szPageDflt, nExtra, !memDb,
                       !memDb?pagerStress:0, (void *)pPager, pPager->pPCache);
  }

  /* If an error occurred above, free the Pager structure and close the file.
  */
  if( rc!=SQLITE_OK ){
    sqlite3OsClose(pPager->fd);
    sqlite3PageFree(pPager->pTmpSpace);
    sqlite3_free(pPager);
    return rc;
  }

  PAGERTRACE(("OPEN %d %s\n", FILEHANDLEID(pPager->fd), pPager->zFilename));
  IOTRACE(("OPEN %p %s\n", pPager, pPager->zFilename))

  pPager->useJournal = (u8)useJournal;
  /* pPager->stmtOpen = 0; */
  /* pPager->stmtInUse = 0; */
  /* pPager->nRef = 0; */
  /* pPager->stmtSize = 0; */
  /* pPager->stmtJSize = 0; */
  /* pPager->nPage = 0; */
  pPager->mxPgno = SQLITE_MAX_PAGE_COUNT;
  /* pPager->state = PAGER_UNLOCK; */
  /* pPager->errMask = 0; */
  pPager->tempFile = (u8)tempFile;
  assert( tempFile==PAGER_LOCKINGMODE_NORMAL
          || tempFile==PAGER_LOCKINGMODE_EXCLUSIVE );
  assert( PAGER_LOCKINGMODE_EXCLUSIVE==1 );
  pPager->exclusiveMode = (u8)tempFile;
  pPager->changeCountDone = pPager->tempFile;
  pPager->memDb = (u8)memDb;
  pPager->readOnly = (u8)readOnly;
  assert( useJournal || pPager->tempFile );
  pPager->noSync = pPager->tempFile;
  if( pPager->noSync ){
    assert( pPager->fullSync==0 );
    assert( pPager->extraSync==0 );
    assert( pPager->syncFlags==0 );
    assert( pPager->walSyncFlags==0 );
  }else{
    pPager->fullSync = 1;
    pPager->extraSync = 0;
    pPager->syncFlags = SQLITE_SYNC_NORMAL;
    pPager->walSyncFlags = SQLITE_SYNC_NORMAL | (SQLITE_SYNC_NORMAL<<2);
  }
  /* pPager->pFirst = 0; */
  /* pPager->pFirstSynced = 0; */
  /* pPager->pLast = 0; */
  pPager->nExtra = (u16)nExtra;
  pPager->journalSizeLimit = SQLITE_DEFAULT_JOURNAL_SIZE_LIMIT;
  assert( isOpen(pPager->fd) || tempFile );
  setSectorSize(pPager);
  if( !useJournal ){
    pPager->journalMode = PAGER_JOURNALMODE_OFF;
  }else if( memDb || memJM ){
    pPager->journalMode = PAGER_JOURNALMODE_MEMORY;
  }
  /* pPager->xBusyHandler = 0; */
  /* pPager->pBusyHandlerArg = 0; */
  pPager->xReiniter = xReinit;
  setGetterMethod(pPager);
  /* memset(pPager->aHash, 0, sizeof(pPager->aHash)); */
  /* pPager->szMmap = SQLITE_DEFAULT_MMAP_SIZE // will be set by btree.c */

  *ppPager = pPager;
  return SQLITE_OK;
}



/*
** This function is called after transitioning from PAGER_UNLOCK to
** PAGER_SHARED state. It tests if there is a hot journal present in
** the file-system for the given pager. A hot journal is one that
** needs to be played back. According to this function, a hot-journal
** file exists if the following criteria are met:
**
**   * The journal file exists in the file system, and
**   * No process holds a RESERVED or greater lock on the database file, and
**   * The database file itself is greater than 0 bytes in size, and
**   * The first byte of the journal file exists and is not 0x00.
**
** If the current size of the database file is 0 but a journal file
** exists, that is probably an old journal left over from a prior
** database with the same name. In this case the journal file is
** just deleted using OsDelete, *pExists is set to 0 and SQLITE_OK
** is returned.
**
** This routine does not check if there is a master journal filename
** at the end of the file. If there is, and that master journal file
** does not exist, then the journal file is not really hot. In this
** case this routine will return a false-positive.
** The pager_playback() routine will discover that the journal file is
** not really hot and will not roll it back.
**
** If a hot-journal file is found to exist, *pExists is set to 1 and
** SQLITE_OK returned. If no hot-journal file is present, *pExists is
** set to 0 and SQLITE_OK returned. If an IO error occurs while trying
** to determine whether or not a hot-journal file exists, the IO error
** code is returned and the value of *pExists is undefined.
*/
static int hasHotJournal(Pager *pPager, int *pExists){
  sqlite3_vfs * const pVfs = pPager->pVfs;
  int rc = SQLITE_OK;           /* Return code */
  int exists = 1;               /* True if a journal file is present */
  int jrnlOpen = !!isOpen(pPager->jfd);

  assert( pPager->useJournal );
  assert( isOpen(pPager->fd) );
  assert( pPager->eState==PAGER_OPEN );

  assert( jrnlOpen==0 || ( sqlite3OsDeviceCharacteristics(pPager->jfd) &
    SQLITE_IOCAP_UNDELETABLE_WHEN_OPEN
  ));

  *pExists = 0;
  if( !jrnlOpen ){
    rc = sqlite3OsAccess(pVfs, pPager->zJournal, SQLITE_ACCESS_EXISTS, &exists);
  }
  if( rc==SQLITE_OK && exists ){
    int locked = 0;             /* True if some process holds a RESERVED lock */

    /* Race condition here:  Another process might have been holding
    ** the RESERVED lock and have a journal open at the sqlite3OsAccess()
    ** call above, but then delete the journal and drop the lock before
    ** we get to the following sqlite3OsCheckReservedLock() call.  If that
    ** is the case, this routine might think there is a hot journal when
    ** in fact there is none.  This results in a false-positive which will
    ** be dealt with by the playback routine.  Ticket #3883.
    */
    rc = sqlite3OsCheckReservedLock(pPager->fd, &locked);
    if( rc==SQLITE_OK && !locked ){
      Pgno nPage;                 /* Number of pages in database file */

      assert( pPager->tempFile==0 );
      rc = pagerPagecount(pPager, &nPage);
      if( rc==SQLITE_OK ){
        /* If the database is zero pages in size, that means that either (1) the
        ** journal is a remnant from a prior database with the same name where
        ** the database file but not the journal was deleted, or (2) the initial
        ** transaction that populates a new database is being rolled back.
        ** In either case, the journal file can be deleted.  However, take care
        ** not to delete the journal file if it is already open due to
        ** journal_mode=PERSIST.
        */
        if( nPage==0 && !jrnlOpen ){
          sqlite3BeginBenignMalloc();
          if( pagerLockDb(pPager, RESERVED_LOCK)==SQLITE_OK ){
            sqlite3OsDelete(pVfs, pPager->zJournal, 0);
            if( !pPager->exclusiveMode ) pagerUnlockDb(pPager, SHARED_LOCK);
          }
          sqlite3EndBenignMalloc();
        }else{
          /* The journal file exists and no other connection has a reserved
          ** or greater lock on the database file. Now check that there is
          ** at least one non-zero byte at the start of the journal file.
          ** If there is, then we consider this journal to be hot. If not,
          ** it can be ignored.
          */
          if( !jrnlOpen ){
            int f = SQLITE_OPEN_READONLY|SQLITE_OPEN_MAIN_JOURNAL;
            rc = sqlite3OsOpen(pVfs, pPager->zJournal, pPager->jfd, f, &f);
          }
          if( rc==SQLITE_OK ){
            u8 first = 0;
            rc = sqlite3OsRead(pPager->jfd, (void *)&first, 1, 0);
            if( rc==SQLITE_IOERR_SHORT_READ ){
              rc = SQLITE_OK;
            }
            if( !jrnlOpen ){
              sqlite3OsClose(pPager->jfd);
            }
            *pExists = (first!=0);
          }else if( rc==SQLITE_CANTOPEN ){
            /* If we cannot open the rollback journal file in order to see if
            ** it has a zero header, that might be due to an I/O error, or
            ** it might be due to the race condition described above and in
            ** ticket #3883.  Either way, assume that the journal is hot.
            ** This might be a false positive.  But if it is, then the
            ** automatic journal playback and recovery mechanism will deal
            ** with it under an EXCLUSIVE lock where we do not need to
            ** worry so much with race conditions.
            */
            *pExists = 1;
            rc = SQLITE_OK;
          }
        }
      }
    }
  }

  return rc;
}

/*
** This function is called to obtain a shared lock on the database file.
** It is illegal to call sqlite3PagerGet() until after this function
** has been successfully called. If a shared-lock is already held when
** this function is called, it is a no-op.
**
** The following operations are also performed by this function.
**
**   1) If the pager is currently in PAGER_OPEN state (no lock held
**      on the database file), then an attempt is made to obtain a
**      SHARED lock on the database file. Immediately after obtaining
**      the SHARED lock, the file-system is checked for a hot-journal,
**      which is played back if present. Following any hot-journal
**      rollback, the contents of the cache are validated by checking
**      the 'change-counter' field of the database file header and
**      discarded if they are found to be invalid.
**
**   2) If the pager is running in exclusive-mode, and there are currently
**      no outstanding references to any pages, and is in the error state,
**      then an attempt is made to clear the error state by discarding
**      the contents of the page cache and rolling back any open journal
**      file.
**
** If everything is successful, SQLITE_OK is returned. If an IO error
** occurs while locking the database, checking for a hot-journal file or
** rolling back a journal file, the IO error code is returned.
*/
int sqlite3PagerSharedLock(Pager *pPager){
  int rc = SQLITE_OK;                /* Return code */

  /* This routine is only called from b-tree and only when there are no
  ** outstanding pages. This implies that the pager state should either
  ** be OPEN or READER. READER is only possible if the pager is or was in
  ** exclusive access mode.  */
  assert( sqlite3PcacheRefCount(pPager->pPCache)==0 );
  assert( assert_pager_state(pPager) );
  assert( pPager->eState==PAGER_OPEN || pPager->eState==PAGER_READER );
  assert( pPager->errCode==SQLITE_OK );

  if( !pagerUseWal(pPager) && pPager->eState==PAGER_OPEN ){
    int bHotJournal = 1;          /* True if there exists a hot journal-file */

    assert( !MEMDB );
    assert( pPager->tempFile==0 || pPager->eLock==EXCLUSIVE_LOCK );

    rc = pager_wait_on_lock(pPager, SHARED_LOCK);
    if( rc!=SQLITE_OK ){
      assert( pPager->eLock==NO_LOCK || pPager->eLock==UNKNOWN_LOCK );
      goto failed;
    }

    /* If a journal file exists, and there is no RESERVED lock on the
    ** database file, then it either needs to be played back or deleted.
    */
    if( pPager->eLock<=SHARED_LOCK ){
      rc = hasHotJournal(pPager, &bHotJournal);
    }
    if( rc!=SQLITE_OK ){
      goto failed;
    }
    if( bHotJournal ){
      if( pPager->readOnly ){
        rc = SQLITE_READONLY_ROLLBACK;
        goto failed;
      }

      /* Get an EXCLUSIVE lock on the database file. At this point it is
      ** important that a RESERVED lock is not obtained on the way to the
      ** EXCLUSIVE lock. If it were, another process might open the
      ** database file, detect the RESERVED lock, and conclude that the
      ** database is safe to read while this process is still rolling the
      ** hot-journal back.
      **
      ** Because the intermediate RESERVED lock is not requested, any
      ** other process attempting to access the database file will get to
      ** this point in the code and fail to obtain its own EXCLUSIVE lock
      ** on the database file.
      **
      ** Unless the pager is in locking_mode=exclusive mode, the lock is
      ** downgraded to SHARED_LOCK before this function returns.
      */
      rc = pagerLockDb(pPager, EXCLUSIVE_LOCK);
      if( rc!=SQLITE_OK ){
        goto failed;
      }

      /* If it is not already open and the file exists on disk, open the
      ** journal for read/write access. Write access is required because
      ** in exclusive-access mode the file descriptor will be kept open
      ** and possibly used for a transaction later on.
      ** Also, write-access is usually required to finalize the journal in
      ** journal_mode=persist mode (and also for journal_mode=truncate on
      ** some systems).
      **
      ** If the journal does not exist, it usually means that some
      ** other connection managed to get in and roll it back before
      ** this connection obtained the exclusive lock above. Or, it
      ** may mean that the pager was in the error-state when this
      ** function was called and the journal file does not exist.
      */
      if( !isOpen(pPager->jfd) ){
        sqlite3_vfs * const pVfs = pPager->pVfs;
        int bExists;              /* True if journal file exists */
        rc = sqlite3OsAccess(
            pVfs, pPager->zJournal, SQLITE_ACCESS_EXISTS, &bExists);
        if( rc==SQLITE_OK && bExists ){
          int fout = 0;
          int f = SQLITE_OPEN_READWRITE|SQLITE_OPEN_MAIN_JOURNAL;
          assert( !pPager->tempFile );
          rc = sqlite3OsOpen(pVfs, pPager->zJournal, pPager->jfd, f, &fout);
          assert( rc!=SQLITE_OK || isOpen(pPager->jfd) );
          if( rc==SQLITE_OK && fout&SQLITE_OPEN_READONLY ){
            rc = SQLITE_CANTOPEN_BKPT;
            sqlite3OsClose(pPager->jfd);
          }
        }
      }

      /* Playback and delete the journal.  Drop the database write
      ** lock and reacquire the read lock. Purge the cache before
      ** playing back the hot-journal so that we don't end up with
      ** an inconsistent cache.
      ** Sync the hot journal before playing it back since the process
      ** that crashed and left the hot journal probably did not sync it
      ** and we are required to always sync the journal before playing
      ** it back.
      */
      if( isOpen(pPager->jfd) ){
        assert( rc==SQLITE_OK );
        rc = pagerSyncHotJournal(pPager);
        if( rc==SQLITE_OK ){
          rc = pager_playback(pPager, !pPager->tempFile);
          pPager->eState = PAGER_OPEN;
        }
      }else if( !pPager->exclusiveMode ){
        pagerUnlockDb(pPager, SHARED_LOCK);
      }

      if( rc!=SQLITE_OK ){
        /* This branch is taken if an error occurs while trying to open
        ** or roll back a hot-journal while holding an EXCLUSIVE lock. The
        ** pager_unlock() routine will be called before returning to unlock
        ** the file. If the unlock attempt fails, then Pager.eLock must be
        ** set to UNKNOWN_LOCK (see the comment above the #define for
        ** UNKNOWN_LOCK for an explanation).
        **
        ** In order to get pager_unlock() to do this, set Pager.eState to
        ** PAGER_ERROR now. This is not actually counted as a transition
        ** to ERROR state in the state diagram at the top of this file,
        ** since we know that the same call to pager_unlock() will very
        ** shortly transition the pager object to the OPEN state. Calling
        ** assert_pager_state() would fail now, as it should not be possible
        ** to be in ERROR state when there are zero outstanding page
        ** references.
        */
        pager_error(pPager, rc);
        goto failed;
      }

      assert( pPager->eState==PAGER_OPEN );
      assert( (pPager->eLock==SHARED_LOCK)
           || (pPager->exclusiveMode && pPager->eLock>SHARED_LOCK)
      );
    }

    if( !pPager->tempFile && pPager->hasHeldSharedLock ){
      /* The shared-lock has just been acquired, so check to see if the
      ** database has been modified. If the database has changed,
      ** flush the cache. The hasHeldSharedLock flag prevents this from
      ** occurring on the very first access to a file, in order to save a
      ** single unnecessary sqlite3OsRead() call at the start-up.
      **
      ** Database changes are detected by looking at 16 bytes beginning
      ** at offset 24 into the file. The first 4 of these 16 bytes are
      ** a 32-bit counter that is incremented with each change. The
      ** other bytes change randomly with each file change when
      ** a codec is in use.
      **
      ** There is a vanishingly small chance that a change will not be
      ** detected. The chance of an undetected change is so small that
      ** it can be neglected.
      */
      char dbFileVers[sizeof(pPager->dbFileVers)];

      IOTRACE(("CKVERS %p %d\n", pPager, sizeof(dbFileVers)));
      rc = sqlite3OsRead(pPager->fd, &dbFileVers, sizeof(dbFileVers), 24);
      if( rc!=SQLITE_OK ){
        if( rc!=SQLITE_IOERR_SHORT_READ ){
          goto failed;
        }
        memset(dbFileVers, 0, sizeof(dbFileVers));
      }

      if( memcmp(pPager->dbFileVers, dbFileVers, sizeof(dbFileVers))!=0 ){
        pager_reset(pPager);

        /* Unmap the database file. It is possible that external processes
        ** may have truncated the database file and then extended it back
        ** to its original size while this process was not holding a lock.
        ** In this case there may exist a Pager.pMap mapping that appears
        ** to be the right size but is not actually valid. Avoid this
        ** possibility by unmapping the db here. */
        if( USEFETCH(pPager) ){
          sqlite3OsUnfetch(pPager->fd, 0, 0);
        }
      }
    }

    /* If there is a WAL file in the file-system, open this database in WAL
    ** mode. Otherwise, the following function call is a no-op.
    */
    rc = pagerOpenWalIfPresent(pPager);
#ifndef SQLITE_OMIT_WAL
    assert( pPager->pWal==0 || rc==SQLITE_OK );
#endif
  }

  if( pagerUseWal(pPager) ){
    assert( rc==SQLITE_OK );
    rc = pagerBeginReadTransaction(pPager);
  }

  if( pPager->tempFile==0 && pPager->eState==PAGER_OPEN && rc==SQLITE_OK ){
    rc = pagerPagecount(pPager, &pPager->dbSize);
  }

 failed:
  if( rc!=SQLITE_OK ){
    assert( !MEMDB );
    pager_unlock(pPager);
    assert( pPager->eState==PAGER_OPEN );
  }else{
    pPager->eState = PAGER_READER;
    pPager->hasHeldSharedLock = 1;
  }
  return rc;
}

/*
** If the reference count has reached zero, roll back any active
** transaction and unlock the pager.
**
** Except, in locking_mode=EXCLUSIVE when there is nothing in
** the rollback journal, the unlock is not performed and there is
** nothing to roll back, so this routine is a no-op.
*/
static void pagerUnlockIfUnused(Pager *pPager){
  if( sqlite3PcacheRefCount(pPager->pPCache)==0 ){
    assert( pPager->nMmapOut==0 ); /* because page1 is never memory mapped */
    pagerUnlockAndRollback(pPager);
  }
}

/*
** The page getter methods each try to acquire a reference to a
** page with page number pgno. If the requested reference is
** successfully obtained, it is copied to *ppPage and SQLITE_OK returned.
**
** There are different implementations of the getter method depending
** on the current state of the pager.
**
**     getPageNormal()         --  The normal getter
**     getPageError()          --  Used if the pager is in an error state
**     getPageMMap()           --  Used if memory-mapped I/O is enabled
**
** If the requested page is already in the cache, it is returned.
** Otherwise, a new page object is allocated and populated with data
** read from the database file. In some cases, the pcache module may
** choose not to allocate a new page object and may reuse an existing
** object with no outstanding references.
**
** The extra data appended to a page is always initialized to zeros the
** first time a page is loaded into memory. If the page requested is
** already in the cache when this function is called, then the extra
** data is left as it was when the page object was last used.
**
** If the database image is smaller than the requested page or if
** the flags parameter contains the PAGER_GET_NOCONTENT bit and the
** requested page is not already stored in the cache, then no
** actual disk read occurs. In this case the memory image of the
** page is initialized to all zeros.
**
** If PAGER_GET_NOCONTENT is true, it means that we do not care about
** the contents of the page. This occurs in two scenarios:
**
**   a) When reading a free-list leaf page from the database, and
**
**   b) When a savepoint is being rolled back and we need to load
**      a new page into the cache to be filled with the data read
**      from the savepoint journal.
**
** If PAGER_GET_NOCONTENT is true, then the data returned is zeroed instead
** of being read from the database. Additionally, the bits corresponding
** to pgno in Pager.pInJournal (bitvec of pages already written to the
** journal file) and the PagerSavepoint.pInSavepoint bitvecs of any open
** savepoints are set. This means if the page is made writable at any
** point in the future, using a call to sqlite3PagerWrite(), its contents
** will not be journaled. This saves IO.
**
** The acquisition might fail for several reasons. In all cases,
** an appropriate error code is returned and *ppPage is set to NULL.
**
** See also sqlite3PagerLookup().
** Both this routine and Lookup() attempt
** to find a page in the in-memory cache first. If the page is not already
** in memory, this routine goes to disk to read it in whereas Lookup()
** just returns 0. This routine acquires a read-lock the first time it
** has to go to disk, and could also play back an old journal if necessary.
** Since Lookup() never goes to disk, it never has to deal with locks
** or journal files.
*/
static int getPageNormal(
  Pager *pPager,      /* The pager open on the database file */
  Pgno pgno,          /* Page number to fetch */
  DbPage **ppPage,    /* Write a pointer to the page here */
  int flags           /* PAGER_GET_XXX flags */
){
  int rc = SQLITE_OK;
  PgHdr *pPg;
  u8 noContent;                   /* True if PAGER_GET_NOCONTENT is set */
  sqlite3_pcache_page *pBase;

  assert( pPager->errCode==SQLITE_OK );
  assert( pPager->eState>=PAGER_READER );
  assert( assert_pager_state(pPager) );
  assert( pPager->hasHeldSharedLock==1 );

  if( pgno==0 ) return SQLITE_CORRUPT_BKPT;
  pBase = sqlite3PcacheFetch(pPager->pPCache, pgno, 3);
  if( pBase==0 ){
    pPg = 0;
    rc = sqlite3PcacheFetchStress(pPager->pPCache, pgno, &pBase);
    if( rc!=SQLITE_OK ) goto pager_acquire_err;
    if( pBase==0 ){
      rc = SQLITE_NOMEM_BKPT;
      goto pager_acquire_err;
    }
  }
  pPg = *ppPage = sqlite3PcacheFetchFinish(pPager->pPCache, pgno, pBase);
  assert( pPg==(*ppPage) );
  assert( pPg->pgno==pgno );
  assert( pPg->pPager==pPager || pPg->pPager==0 );

  noContent = (flags & PAGER_GET_NOCONTENT)!=0;
  if( pPg->pPager && !noContent ){
    /* In this case the pcache already contains an initialized copy of
    ** the page. Return without further ado. */
    assert( pgno<=PAGER_MAX_PGNO && pgno!=PAGER_MJ_PGNO(pPager) );
    pPager->aStat[PAGER_STAT_HIT]++;
    return SQLITE_OK;

  }else{
    /* The pager cache has created a new page. Its content needs to
    ** be initialized. But first some error checks:
    **
    ** (1) The maximum page number is 2^31
    ** (2) Never try to fetch the locking page
    */
    if( pgno>PAGER_MAX_PGNO || pgno==PAGER_MJ_PGNO(pPager) ){
      rc = SQLITE_CORRUPT_BKPT;
      goto pager_acquire_err;
    }

    pPg->pPager = pPager;

    assert( !isOpen(pPager->fd) || !MEMDB );
    if( !isOpen(pPager->fd) || pPager->dbSize<pgno || noContent ){
      if( pgno>pPager->mxPgno ){
        rc = SQLITE_FULL;
        goto pager_acquire_err;
      }
      if( noContent ){
        /* Failure to set the bits in the InJournal bit-vectors is benign.
        ** It merely means that we might do some extra work to journal a
        ** page that does not need to be journaled. Nevertheless, be sure
        ** to test the case where a malloc error occurs while trying to set
        ** a bit in a bit vector.
        */
        sqlite3BeginBenignMalloc();
        if( pgno<=pPager->dbOrigSize ){
          TESTONLY( rc = ) sqlite3BitvecSet(pPager->pInJournal, pgno);
          testcase( rc==SQLITE_NOMEM );
        }
        TESTONLY( rc = ) addToSavepointBitvecs(pPager, pgno);
        testcase( rc==SQLITE_NOMEM );
        sqlite3EndBenignMalloc();
      }
      memset(pPg->pData, 0, pPager->pageSize);
      IOTRACE(("ZERO %p %d\n", pPager, pgno));
    }else{
      assert( pPg->pPager==pPager );
      pPager->aStat[PAGER_STAT_MISS]++;
      rc = readDbPage(pPg);
      if( rc!=SQLITE_OK ){
        goto pager_acquire_err;
      }
    }
    pager_set_pagehash(pPg);
  }
  return SQLITE_OK;

pager_acquire_err:
  assert( rc!=SQLITE_OK );
  if( pPg ){
    sqlite3PcacheDrop(pPg);
  }
  pagerUnlockIfUnused(pPager);
  *ppPage = 0;
  return rc;
}

#if SQLITE_MAX_MMAP_SIZE>0
/* The page getter for when memory-mapped I/O is enabled */
static int getPageMMap(
  Pager *pPager,      /* The pager open on the database file */
  Pgno pgno,          /* Page number to fetch */
  DbPage **ppPage,    /* Write a pointer to the page here */
  int flags           /* PAGER_GET_XXX flags */
){
  int rc = SQLITE_OK;
  PgHdr *pPg = 0;
  u32 iFrame = 0;                 /* Frame to read from WAL file
                                  */

  /* It is acceptable to use a read-only (mmap) page for any page except
  ** page 1 if there is no write-transaction open or the PAGER_GET_READONLY
  ** flag was specified by the caller. And so long as the db is not a
  ** temporary or in-memory database. */
  const int bMmapOk = (pgno>1
   && (pPager->eState==PAGER_READER || (flags & PAGER_GET_READONLY))
  );

  assert( USEFETCH(pPager) );
#ifdef SQLITE_HAS_CODEC
  assert( pPager->xCodec==0 );
#endif

  /* Optimization note: Adding the "pgno<=1" term before "pgno==0" here
  ** allows the compiler optimizer to reuse the results of the "pgno>1"
  ** test in the previous statement, and avoid testing pgno==0 in the
  ** common case where pgno is large. */
  if( pgno<=1 && pgno==0 ){
    return SQLITE_CORRUPT_BKPT;
  }
  assert( pPager->eState>=PAGER_READER );
  assert( assert_pager_state(pPager) );
  assert( pPager->hasHeldSharedLock==1 );
  assert( pPager->errCode==SQLITE_OK );

  if( bMmapOk && pagerUseWal(pPager) ){
    rc = sqlite3WalFindFrame(pPager->pWal, pgno, &iFrame);
    if( rc!=SQLITE_OK ){
      *ppPage = 0;
      return rc;
    }
  }
  if( bMmapOk && iFrame==0 ){
    void *pData = 0;
    rc = sqlite3OsFetch(pPager->fd,
        (i64)(pgno-1) * pPager->pageSize, pPager->pageSize, &pData
    );
    if( rc==SQLITE_OK && pData ){
      if( pPager->eState>PAGER_READER || pPager->tempFile ){
        pPg = sqlite3PagerLookup(pPager, pgno);
      }
      if( pPg==0 ){
        rc = pagerAcquireMapPage(pPager, pgno, pData, &pPg);
      }else{
        sqlite3OsUnfetch(pPager->fd, (i64)(pgno-1)*pPager->pageSize, pData);
      }
      if( pPg ){
        assert( rc==SQLITE_OK );
        *ppPage = pPg;
        return SQLITE_OK;
      }
    }
    if( rc!=SQLITE_OK ){
      *ppPage = 0;
      return rc;
    }
  }
  return getPageNormal(pPager, pgno, ppPage, flags);
}
#endif /* SQLITE_MAX_MMAP_SIZE>0 */

/* The page getter method for when the pager is in an error state */
static int getPageError(
  Pager *pPager,      /* The pager open on the database file */
  Pgno pgno,          /* Page number to fetch */
  DbPage **ppPage,    /* Write a pointer to the page here */
  int flags           /* PAGER_GET_XXX flags */
){
  UNUSED_PARAMETER(pgno);
  UNUSED_PARAMETER(flags);
  assert( pPager->errCode!=SQLITE_OK );
  *ppPage = 0;
  return pPager->errCode;
}


/* Dispatch all page fetch requests to the appropriate getter method.
*/
int sqlite3PagerGet(
  Pager *pPager,      /* The pager open on the database file */
  Pgno pgno,          /* Page number to fetch */
  DbPage **ppPage,    /* Write a pointer to the page here */
  int flags           /* PAGER_GET_XXX flags */
){
  return pPager->xGet(pPager, pgno, ppPage, flags);
}

/*
** Acquire a page if it is already in the in-memory cache.
** Do not read the page from disk. Return a pointer to the page,
** or 0 if the page is not in cache.
**
** See also sqlite3PagerGet(). The difference between this routine
** and sqlite3PagerGet() is that sqlite3PagerGet() will go to the disk
** and read in the page if the page is not already in cache. This routine
** returns NULL if the page is not in cache or if a disk I/O error
** has ever happened.
*/
DbPage *sqlite3PagerLookup(Pager *pPager, Pgno pgno){
  sqlite3_pcache_page *pPage;
  assert( pPager!=0 );
  assert( pgno!=0 );
  assert( pPager->pPCache!=0 );
  pPage = sqlite3PcacheFetch(pPager->pPCache, pgno, 0);
  assert( pPage==0 || pPager->hasHeldSharedLock );
  if( pPage==0 ) return 0;
  return sqlite3PcacheFetchFinish(pPager->pPCache, pgno, pPage);
}

/*
** Release a page reference.
**
** The sqlite3PagerUnref() and sqlite3PagerUnrefNotNull() routines may only
** be used if we know that the page being released is not the last page.
** The btree layer always holds page1 open until the end, so these first
** two routines can be used to release any page other than BtShared.pPage1.
**
** Use sqlite3PagerUnrefPageOne() to release page1. This latter routine
** checks the total number of outstanding pages and if the number of
** pages reaches zero it drops the database lock.
*/
void sqlite3PagerUnrefNotNull(DbPage *pPg){
  TESTONLY( Pager *pPager = pPg->pPager; )
  assert( pPg!=0 );
  if( pPg->flags & PGHDR_MMAP ){
    assert( pPg->pgno!=1 );  /* Page1 is never memory mapped */
    pagerReleaseMapPage(pPg);
  }else{
    sqlite3PcacheRelease(pPg);
  }
  /* Do not use this routine to release the last reference to page1 */
  assert( sqlite3PcacheRefCount(pPager->pPCache)>0 );
}
void sqlite3PagerUnref(DbPage *pPg){
  if( pPg ) sqlite3PagerUnrefNotNull(pPg);
}
void sqlite3PagerUnrefPageOne(DbPage *pPg){
  Pager *pPager;
  assert( pPg!=0 );
  assert( pPg->pgno==1 );
  assert( (pPg->flags & PGHDR_MMAP)==0 ); /* Page1 is never memory mapped */
  pPager = pPg->pPager;
  sqlite3PagerResetLockTimeout(pPager);
  sqlite3PcacheRelease(pPg);
  pagerUnlockIfUnused(pPager);
}

/*
** This function is called at the start of every write transaction.
** There must already be a RESERVED or EXCLUSIVE lock on the database
** file when this routine is called.
**
** Open the journal file for pager pPager and write a journal header
** to the start of it. If there are active savepoints, open the sub-journal
** as well. This function is only used when the journal file is being
** opened to write a rollback log for a transaction. It is not used
** when opening a hot journal file to roll it back.
**
** If the journal file is already open (as it may be in exclusive mode),
** then this function just writes a journal header to the start of the
** already open file.
**
** Whether or not the journal file is opened by this function, the
** Pager.pInJournal bitvec structure is allocated.
**
** Return SQLITE_OK if everything is successful. Otherwise, return
** SQLITE_NOMEM if the attempt to allocate Pager.pInJournal fails, or
** an IO error code if opening or writing the journal file fails.
*/
static int pager_open_journal(Pager *pPager){
  int rc = SQLITE_OK;                        /* Return code */
  sqlite3_vfs * const pVfs = pPager->pVfs;   /* Local cache of vfs pointer */

  assert( pPager->eState==PAGER_WRITER_LOCKED );
  assert( assert_pager_state(pPager) );
  assert( pPager->pInJournal==0 );

  /* If already in the error state, this function is a no-op. But on
  ** the other hand, this routine is never called if we are already in
  ** an error state. */
  if( NEVER(pPager->errCode) ) return pPager->errCode;

  if( !pagerUseWal(pPager) && pPager->journalMode!=PAGER_JOURNALMODE_OFF ){
    pPager->pInJournal = sqlite3BitvecCreate(pPager->dbSize);
    if( pPager->pInJournal==0 ){
      return SQLITE_NOMEM_BKPT;
    }

    /* Open the journal file if it is not already open.
    */
    if( !isOpen(pPager->jfd) ){
      if( pPager->journalMode==PAGER_JOURNALMODE_MEMORY ){
        sqlite3MemJournalOpen(pPager->jfd);
      }else{
        int flags = SQLITE_OPEN_READWRITE|SQLITE_OPEN_CREATE;
        int nSpill;

        if( pPager->tempFile ){
          flags |= (SQLITE_OPEN_DELETEONCLOSE|SQLITE_OPEN_TEMP_JOURNAL);
          nSpill = sqlite3Config.nStmtSpill;
        }else{
          flags |= SQLITE_OPEN_MAIN_JOURNAL;
          nSpill = jrnlBufferSize(pPager);
        }

        /* Verify that the database still has the same name as it did when
        ** it was originally opened. */
        rc = databaseIsUnmoved(pPager);
        if( rc==SQLITE_OK ){
          rc = sqlite3JournalOpen (
              pVfs, pPager->zJournal, pPager->jfd, flags, nSpill
          );
        }
      }
      assert( rc!=SQLITE_OK || isOpen(pPager->jfd) );
    }


    /* Write the first journal header to the journal file and open
    ** the sub-journal if necessary.
    */
    if( rc==SQLITE_OK ){
      /* TODO: Check if all of these are really required.
      */
      pPager->nRec = 0;
      pPager->journalOff = 0;
      pPager->setMaster = 0;
      pPager->journalHdr = 0;
      rc = writeJournalHdr(pPager);
    }
  }

  if( rc!=SQLITE_OK ){
    sqlite3BitvecDestroy(pPager->pInJournal);
    pPager->pInJournal = 0;
  }else{
    assert( pPager->eState==PAGER_WRITER_LOCKED );
    pPager->eState = PAGER_WRITER_CACHEMOD;
  }

  return rc;
}

/*
** Begin a write-transaction on the specified pager object. If a
** write-transaction has already been opened, this function is a no-op.
**
** If the exFlag argument is false, then acquire at least a RESERVED
** lock on the database file. If exFlag is true, then acquire at least
** an EXCLUSIVE lock. If such a lock is already held, no locking
** functions need be called.
**
** If the subjInMemory argument is non-zero, then any sub-journal opened
** within this transaction will be opened as an in-memory file. This
** has no effect if the sub-journal is already opened (as it may be when
** running in exclusive mode) or if the transaction does not require a
** sub-journal. If the subjInMemory argument is zero, then any required
** sub-journal is implemented in-memory if pPager is an in-memory database,
** or using a temporary file otherwise.
*/
int sqlite3PagerBegin(Pager *pPager, int exFlag, int subjInMemory){
  int rc = SQLITE_OK;

  if( pPager->errCode ) return pPager->errCode;
  assert( pPager->eState>=PAGER_READER && pPager->eState<PAGER_ERROR );
  pPager->subjInMemory = (u8)subjInMemory;

  if( ALWAYS(pPager->eState==PAGER_READER) ){
    assert( pPager->pInJournal==0 );

    if( pagerUseWal(pPager) ){
      /* If the pager is configured to use locking_mode=exclusive, and an
      ** exclusive lock on the database is not already held, obtain it now.
      */
      if( pPager->exclusiveMode && sqlite3WalExclusiveMode(pPager->pWal, -1) ){
        rc = pagerLockDb(pPager, EXCLUSIVE_LOCK);
        if( rc!=SQLITE_OK ){
          return rc;
        }
        (void)sqlite3WalExclusiveMode(pPager->pWal, 1);
      }

      /* Grab the write lock on the log file. If successful, upgrade to
      ** PAGER_RESERVED state. Otherwise, return an error code to the caller.
      ** The busy-handler is not invoked if another connection already
      ** holds the write-lock. If possible, the upper layer will call it.
      */
      rc = sqlite3WalBeginWriteTransaction(pPager->pWal);
    }else{
      /* Obtain a RESERVED lock on the database file. If the exFlag parameter
      ** is true, then immediately upgrade this to an EXCLUSIVE lock. The
      ** busy-handler callback can be used when upgrading to the EXCLUSIVE
      ** lock, but not when obtaining the RESERVED lock.
      */
      rc = pagerLockDb(pPager, RESERVED_LOCK);
      if( rc==SQLITE_OK && exFlag ){
        rc = pager_wait_on_lock(pPager, EXCLUSIVE_LOCK);
      }
    }

    if( rc==SQLITE_OK ){
      /* Change to WRITER_LOCKED state.
      **
      ** WAL mode sets Pager.eState to PAGER_WRITER_LOCKED or CACHEMOD
      ** when it has an open transaction, but never to DBMOD or FINISHED.
      ** This is because in those states the code to roll back savepoint
      ** transactions may copy data from the sub-journal into the database
      ** file as well as into the page cache. Which would be incorrect in
      ** WAL mode.
      */
      pPager->eState = PAGER_WRITER_LOCKED;
      pPager->dbHintSize = pPager->dbSize;
      pPager->dbFileSize = pPager->dbSize;
      pPager->dbOrigSize = pPager->dbSize;
      pPager->journalOff = 0;
    }

    assert( rc==SQLITE_OK || pPager->eState==PAGER_READER );
    assert( rc!=SQLITE_OK || pPager->eState==PAGER_WRITER_LOCKED );
    assert( assert_pager_state(pPager) );
  }

  PAGERTRACE(("TRANSACTION %d\n", PAGERID(pPager)));
  return rc;
}

/*
** Write page pPg onto the end of the rollback journal.
*/
static SQLITE_NOINLINE int pagerAddPageToRollbackJournal(PgHdr *pPg){
  Pager *pPager = pPg->pPager;
  int rc;
  u32 cksum;
  char *pData2;
  i64 iOff = pPager->journalOff;

  /* We should never write to the journal file the page that
  ** contains the database locks. The following assert verifies
  ** that we do not. */
  assert( pPg->pgno!=PAGER_MJ_PGNO(pPager) );

  assert( pPager->journalHdr<=pPager->journalOff );
  CODEC2(pPager, pPg->pData, pPg->pgno, 7, return SQLITE_NOMEM_BKPT, pData2);
  cksum = pager_cksum(pPager, (u8*)pData2);

  /* Even if an IO or diskfull error occurs while journalling the
  ** page in the block below, set the need-sync flag for the page.
  ** Otherwise, when the transaction is rolled back, the logic in
  ** playback_one_page() will think that the page needs to be restored
  ** in the database file. And if an IO error occurs while doing so,
  ** then corruption may follow.
  */
  pPg->flags |= PGHDR_NEED_SYNC;

  rc = write32bits(pPager->jfd, iOff, pPg->pgno);
  if( rc!=SQLITE_OK ) return rc;
  rc = sqlite3OsWrite(pPager->jfd, pData2, pPager->pageSize, iOff+4);
  if( rc!=SQLITE_OK ) return rc;
  rc = write32bits(pPager->jfd, iOff+pPager->pageSize+4, cksum);
  if( rc!=SQLITE_OK ) return rc;

  IOTRACE(("JOUT %p %d %lld %d\n", pPager, pPg->pgno,
           pPager->journalOff, pPager->pageSize));
  PAGER_INCR(sqlite3_pager_writej_count);
  PAGERTRACE(("JOURNAL %d page %d needSync=%d hash(%08x)\n",
       PAGERID(pPager), pPg->pgno,
       ((pPg->flags&PGHDR_NEED_SYNC)?1:0), pager_pagehash(pPg)));

  pPager->journalOff += 8 + pPager->pageSize;
  pPager->nRec++;
  assert( pPager->pInJournal!=0 );
  rc = sqlite3BitvecSet(pPager->pInJournal, pPg->pgno);
  testcase( rc==SQLITE_NOMEM );
  assert( rc==SQLITE_OK || rc==SQLITE_NOMEM );
  rc |= addToSavepointBitvecs(pPager, pPg->pgno);
  assert( rc==SQLITE_OK || rc==SQLITE_NOMEM );
  return rc;
}

/*
** Mark a single data page as writeable. The page is written into the
** main journal or sub-journal as required. If the page is written into
** one of the journals, the corresponding bit is set in the
** Pager.pInJournal bitvec and the PagerSavepoint.pInSavepoint bitvecs
** of any open savepoints as appropriate.
*/
static int pager_write(PgHdr *pPg){
  Pager *pPager = pPg->pPager;
  int rc = SQLITE_OK;

  /* This routine is not called unless a write-transaction has already
  ** been started. The journal file may or may not be open at this point.
  ** It is never called in the ERROR state.
  */
  assert( pPager->eState==PAGER_WRITER_LOCKED
       || pPager->eState==PAGER_WRITER_CACHEMOD
       || pPager->eState==PAGER_WRITER_DBMOD
  );
  assert( assert_pager_state(pPager) );
  assert( pPager->errCode==0 );
  assert( pPager->readOnly==0 );
  CHECK_PAGE(pPg);

  /* The journal file needs to be opened. Higher level routines have already
  ** obtained the necessary locks to begin the write-transaction, but the
  ** rollback journal might not yet be open. Open it now if this is the case.
  **
  ** This is done before calling sqlite3PcacheMakeDirty() on the page.
  ** Otherwise, if it were done after calling sqlite3PcacheMakeDirty(), then
  ** an error might occur and the pager would end up in WRITER_LOCKED state
  ** with pages marked as dirty in the cache.
  */
  if( pPager->eState==PAGER_WRITER_LOCKED ){
    rc = pager_open_journal(pPager);
    if( rc!=SQLITE_OK ) return rc;
  }
  assert( pPager->eState>=PAGER_WRITER_CACHEMOD );
  assert( assert_pager_state(pPager) );

  /* Mark the page that is about to be modified as dirty.
  */
  sqlite3PcacheMakeDirty(pPg);

  /* If a rollback journal is in use, then make sure the page that is about
  ** to change is in the rollback journal, or if the page is a new page off
  ** the end of the file, make sure it is marked as PGHDR_NEED_SYNC.
  */
  assert( (pPager->pInJournal!=0) == isOpen(pPager->jfd) );
  if( pPager->pInJournal!=0
   && sqlite3BitvecTestNotNull(pPager->pInJournal, pPg->pgno)==0
  ){
    assert( pagerUseWal(pPager)==0 );
    if( pPg->pgno<=pPager->dbOrigSize ){
      rc = pagerAddPageToRollbackJournal(pPg);
      if( rc!=SQLITE_OK ){
        return rc;
      }
    }else{
      if( pPager->eState!=PAGER_WRITER_DBMOD ){
        pPg->flags |= PGHDR_NEED_SYNC;
      }
      PAGERTRACE(("APPEND %d page %d needSync=%d\n",
              PAGERID(pPager), pPg->pgno,
             ((pPg->flags&PGHDR_NEED_SYNC)?1:0)));
    }
  }

  /* The PGHDR_DIRTY bit is set above when the page was added to the dirty-list
  ** and before writing the page into the rollback journal. Wait until now,
  ** after the page has been successfully journalled, before setting the
  ** PGHDR_WRITEABLE bit that indicates that the page can be safely modified.
  */
  pPg->flags |= PGHDR_WRITEABLE;

  /* If the statement journal is open and the page is not in it,
  ** then write the page into the statement journal.
  */
  if( pPager->nSavepoint>0 ){
    rc = subjournalPageIfRequired(pPg);
  }

  /* Update the database size and return. */
  if( pPager->dbSize<pPg->pgno ){
    pPager->dbSize = pPg->pgno;
  }
  return rc;
}

/*
** This is a variant of sqlite3PagerWrite() that runs when the sector size
** is larger than the page size. SQLite makes the (reasonable) assumption that
** all bytes of a sector are written together by hardware. Hence, all bytes of
** a sector need to be journalled in case of a power loss in the middle of
** a write.
**
** Usually, the sector size is less than or equal to the page size, in which
** case pages can be individually written. This routine only runs in the
** exceptional case where the page size is smaller than the sector size.
*/
static SQLITE_NOINLINE int pagerWriteLargeSector(PgHdr *pPg){
  int rc = SQLITE_OK;          /* Return code */
  Pgno nPageCount;             /* Total number of pages in database file */
  Pgno pg1;                    /* First page of the sector pPg is located on. */
  int nPage = 0;               /* Number of pages starting at pg1 to journal */
  int ii;                      /* Loop counter */
  int needSync = 0;            /* True if any page has PGHDR_NEED_SYNC */
  Pager *pPager = pPg->pPager; /* The pager that owns pPg */
  Pgno nPagePerSector = (pPager->sectorSize/pPager->pageSize);

  /* Set the doNotSpill NOSYNC bit to 1.
  ** This is because we cannot allow
  ** a journal header to be written between the pages journaled by
  ** this function.
  */
  assert( !MEMDB );
  assert( (pPager->doNotSpill & SPILLFLAG_NOSYNC)==0 );
  pPager->doNotSpill |= SPILLFLAG_NOSYNC;

  /* This trick assumes that both the page-size and sector-size are
  ** an integer power of 2. It sets variable pg1 to the identifier
  ** of the first page of the sector pPg is located on.
  */
  pg1 = ((pPg->pgno-1) & ~(nPagePerSector-1)) + 1;

  nPageCount = pPager->dbSize;
  if( pPg->pgno>nPageCount ){
    nPage = (pPg->pgno - pg1)+1;
  }else if( (pg1+nPagePerSector-1)>nPageCount ){
    nPage = nPageCount+1-pg1;
  }else{
    nPage = nPagePerSector;
  }
  assert(nPage>0);
  assert(pg1<=pPg->pgno);
  assert((pg1+nPage)>pPg->pgno);

  for(ii=0; ii<nPage && rc==SQLITE_OK; ii++){
    Pgno pg = pg1+ii;
    PgHdr *pPage;
    if( pg==pPg->pgno || !sqlite3BitvecTest(pPager->pInJournal, pg) ){
      if( pg!=PAGER_MJ_PGNO(pPager) ){
        rc = sqlite3PagerGet(pPager, pg, &pPage, 0);
        if( rc==SQLITE_OK ){
          rc = pager_write(pPage);
          if( pPage->flags&PGHDR_NEED_SYNC ){
            needSync = 1;
          }
          sqlite3PagerUnrefNotNull(pPage);
        }
      }
    }else if( (pPage = sqlite3PagerLookup(pPager, pg))!=0 ){
      if( pPage->flags&PGHDR_NEED_SYNC ){
        needSync = 1;
      }
      sqlite3PagerUnrefNotNull(pPage);
    }
  }

  /* If the PGHDR_NEED_SYNC flag is set for any of the nPage pages
  ** starting at pg1, then it needs to be set for all of them. Because
  ** writing to any of these nPage pages may damage the others, the
  ** journal file must contain sync()ed copies of all of them
  ** before any of them can be written out to the database file.
  */
  if( rc==SQLITE_OK && needSync ){
    assert( !MEMDB );
    for(ii=0; ii<nPage; ii++){
      PgHdr *pPage = sqlite3PagerLookup(pPager, pg1+ii);
      if( pPage ){
        pPage->flags |= PGHDR_NEED_SYNC;
        sqlite3PagerUnrefNotNull(pPage);
      }
    }
  }

  assert( (pPager->doNotSpill & SPILLFLAG_NOSYNC)!=0 );
  pPager->doNotSpill &= ~SPILLFLAG_NOSYNC;
  return rc;
}

/*
** Mark a data page as writeable. This routine must be called before
** making changes to a page. The caller must check the return value
** of this function and be careful not to change any page data unless
** this routine returns SQLITE_OK.
**
** The difference between this function and pager_write() is that this
** function also deals with the special case where 2 or more pages
** fit on a single disk sector. In this case all co-resident pages
** must have been written to the journal file before returning.
**
** If an error occurs, SQLITE_NOMEM or an IO error code is returned
** as appropriate. Otherwise, SQLITE_OK.
*/
int sqlite3PagerWrite(PgHdr *pPg){
  Pager *pPager = pPg->pPager;
  assert( (pPg->flags & PGHDR_MMAP)==0 );
  assert( pPager->eState>=PAGER_WRITER_LOCKED );
  assert( assert_pager_state(pPager) );
  if( (pPg->flags & PGHDR_WRITEABLE)!=0 && pPager->dbSize>=pPg->pgno ){
    if( pPager->nSavepoint ) return subjournalPageIfRequired(pPg);
    return SQLITE_OK;
  }else if( pPager->errCode ){
    return pPager->errCode;
  }else if( pPager->sectorSize > (u32)pPager->pageSize ){
    assert( pPager->tempFile==0 );
    return pagerWriteLargeSector(pPg);
  }else{
    return pager_write(pPg);
  }
}

/*
** Return TRUE if the page given in the argument was previously passed
** to sqlite3PagerWrite(). In other words, return TRUE if it is ok
** to change the content of the page.
*/
#ifndef NDEBUG
int sqlite3PagerIswriteable(DbPage *pPg){
  return pPg->flags & PGHDR_WRITEABLE;
}
#endif

/*
** A call to this routine tells the pager that it is not necessary to
** write the information on page pPg back to the disk, even though
** that page might be marked as dirty. This happens, for example, when
** the page has been added as a leaf of the freelist and so its
** content no longer matters.
**
** The overlying software layer calls this routine when all of the data
** on the given page is unused. The pager marks the page as clean so
** that it does not get written to disk.
**
** Tests show that this optimization can quadruple the speed of large
** DELETE operations.
**
** This optimization cannot be used with a temp-file, as the page may
** have been dirty at the start of the transaction. In that case, if
** memory pressure forces page pPg out of the cache, the data does need
** to be written out to disk so that it may be read back in if the
** current transaction is rolled back.
*/
void sqlite3PagerDontWrite(PgHdr *pPg){
  Pager *pPager = pPg->pPager;
  if( !pPager->tempFile && (pPg->flags&PGHDR_DIRTY) && pPager->nSavepoint==0 ){
    PAGERTRACE(("DONT_WRITE page %d of %d\n", pPg->pgno, PAGERID(pPager)));
    IOTRACE(("CLEAN %p %d\n", pPager, pPg->pgno))
    pPg->flags |= PGHDR_DONT_WRITE;
    pPg->flags &= ~PGHDR_WRITEABLE;
    testcase( pPg->flags & PGHDR_NEED_SYNC );
    pager_set_pagehash(pPg);
  }
}

/*
** This routine is called to increment the value of the database file
** change-counter, stored as a 4-byte big-endian integer starting at
** byte offset 24 of the pager file. The secondary change counter at
** 92 is also updated, as is the SQLite version number at offset 96.
**
** But this only happens if the pPager->changeCountDone flag is false.
** To avoid excess churning of page 1, the update only happens once.
** See also the pager_write_changecounter() routine that does an
** unconditional update of the change counters.
**
** If the isDirectMode flag is zero, then this is done by calling
** sqlite3PagerWrite() on page 1, then modifying the contents of the
** page data. In this case the file will be updated when the current
** transaction is committed.
**
** The isDirectMode flag may only be non-zero if the library was compiled
** with the SQLITE_ENABLE_ATOMIC_WRITE macro defined.
** In this case, if isDirectMode is non-zero, then the database file is
** updated directly by writing an updated version of page 1 using a call
** to the sqlite3OsWrite() function.
*/
static int pager_incr_changecounter(Pager *pPager, int isDirectMode){
  int rc = SQLITE_OK;

  assert( pPager->eState==PAGER_WRITER_CACHEMOD
       || pPager->eState==PAGER_WRITER_DBMOD
  );
  assert( assert_pager_state(pPager) );

  /* Define the DIRECT_MODE macro. If the atomic-write optimization is
  ** enabled in this build, then DIRECT_MODE evaluates to the value passed
  ** as the isDirectMode parameter to this function. Otherwise, it is
  ** always zero.
  **
  ** The idea is that if the atomic-write optimization is not
  ** enabled at compile time, the compiler can omit the tests of
  ** DIRECT_MODE below, as well as the block enclosed in the
  ** "if( DIRECT_MODE )" condition.
  */
#ifndef SQLITE_ENABLE_ATOMIC_WRITE
# define DIRECT_MODE 0
  assert( isDirectMode==0 );
  UNUSED_PARAMETER(isDirectMode);
#else
# define DIRECT_MODE isDirectMode
#endif

  if( !pPager->changeCountDone && ALWAYS(pPager->dbSize>0) ){
    PgHdr *pPgHdr;                /* Reference to page 1 */

    assert( !pPager->tempFile && isOpen(pPager->fd) );

    /* Open page 1 of the file for writing. */
    rc = sqlite3PagerGet(pPager, 1, &pPgHdr, 0);
    assert( pPgHdr==0 || rc==SQLITE_OK );

    /* If page one was fetched successfully, and this function is not
    ** operating in direct-mode, make page 1 writable. When not in
    ** direct mode, page 1 is always held in cache and hence the PagerGet()
    ** above is always successful - hence the ALWAYS on rc==SQLITE_OK.
    */
    if( !DIRECT_MODE && ALWAYS(rc==SQLITE_OK) ){
      rc = sqlite3PagerWrite(pPgHdr);
    }

    if( rc==SQLITE_OK ){
      /* Actually do the update of the change counter */
      pager_write_changecounter(pPgHdr);

      /* If running in direct mode, write the contents of page 1 to the file.
      */
      if( DIRECT_MODE ){
        const void *zBuf;
        assert( pPager->dbFileSize>0 );
        CODEC2(pPager, pPgHdr->pData, 1, 6, rc=SQLITE_NOMEM_BKPT, zBuf);
        if( rc==SQLITE_OK ){
          rc = sqlite3OsWrite(pPager->fd, zBuf, pPager->pageSize, 0);
          pPager->aStat[PAGER_STAT_WRITE]++;
        }
        if( rc==SQLITE_OK ){
          /* Update the pager's copy of the change-counter. Otherwise, the
          ** next time a read transaction is opened the cache will be
          ** flushed (as the change-counter values will not match). */
          const void *pCopy = (const void *)&((const char *)zBuf)[24];
          memcpy(&pPager->dbFileVers, pCopy, sizeof(pPager->dbFileVers));
          pPager->changeCountDone = 1;
        }
      }else{
        pPager->changeCountDone = 1;
      }
    }

    /* Release the page reference. */
    sqlite3PagerUnref(pPgHdr);
  }
  return rc;
}

/*
** Sync the database file to disk. This is a no-op for in-memory databases
** or pagers with the Pager.noSync flag set.
**
** If successful, or if called on a pager for which it is a no-op, this
** function returns SQLITE_OK. Otherwise, an IO error code is returned.
*/
int sqlite3PagerSync(Pager *pPager, const char *zMaster){
  int rc = SQLITE_OK;
  void *pArg = (void*)zMaster;
  rc = sqlite3OsFileControl(pPager->fd, SQLITE_FCNTL_SYNC, pArg);
  if( rc==SQLITE_NOTFOUND ) rc = SQLITE_OK;
  if( rc==SQLITE_OK && !pPager->noSync ){
    assert( !MEMDB );
    rc = sqlite3OsSync(pPager->fd, pPager->syncFlags);
  }
  return rc;
}

/*
** This function may only be called while a write-transaction is active in
** rollback. If the connection is in WAL mode, this call is a no-op.
** Otherwise, if the connection does not already have an EXCLUSIVE lock on
** the database file, an attempt is made to obtain one.
**
** If the EXCLUSIVE lock is already held or the attempt to obtain it is
** successful, or the connection is in WAL mode, SQLITE_OK is returned.
** Otherwise, either SQLITE_BUSY or an SQLITE_IOERR_XXX error code is
** returned.
*/
int sqlite3PagerExclusiveLock(Pager *pPager){
  int rc = pPager->errCode;
  assert( assert_pager_state(pPager) );
  if( rc==SQLITE_OK ){
    assert( pPager->eState==PAGER_WRITER_CACHEMOD
         || pPager->eState==PAGER_WRITER_DBMOD
         || pPager->eState==PAGER_WRITER_LOCKED
    );
    assert( assert_pager_state(pPager) );
    if( 0==pagerUseWal(pPager) ){
      rc = pager_wait_on_lock(pPager, EXCLUSIVE_LOCK);
    }
  }
  return rc;
}

/*
** Sync the database file for the pager pPager. zMaster points to the name
** of a master journal file that should be written into the individual
** journal file. zMaster may be NULL, which is interpreted as no master
** journal (a single database transaction).
**
** This routine ensures that:
**
**   * the database file change-counter is updated,
**   * the journal is synced (unless the atomic-write optimization is used),
**   * all dirty pages are written to the database file,
**   * the database file is truncated (if required), and
**   * the database file is synced.
**
** The only thing that remains to commit the transaction is to finalize
** (delete, truncate or zero the first part of) the journal file (or
** delete the master journal file if specified).
**
** Note that if zMaster==NULL, this does not overwrite a previous value
** passed to an sqlite3PagerCommitPhaseOne() call.
**
** If the final parameter - noSync - is true, then the database file itself
** is not synced. The caller must call sqlite3PagerSync() directly to
** sync the database file before calling CommitPhaseTwo() to delete the
** journal file in this case.
*/
int sqlite3PagerCommitPhaseOne(
  Pager *pPager,                  /* Pager object */
  const char *zMaster,            /* If not NULL, the master journal name */
  int noSync                      /* True to omit the xSync on the db file */
){
  int rc = SQLITE_OK;             /* Return code */

  assert( pPager->eState==PAGER_WRITER_LOCKED
       || pPager->eState==PAGER_WRITER_CACHEMOD
       || pPager->eState==PAGER_WRITER_DBMOD
       || pPager->eState==PAGER_ERROR
  );
  assert( assert_pager_state(pPager) );

  /* If a prior error occurred, report that error again. */
  if( NEVER(pPager->errCode) ) return pPager->errCode;

  /* Provide the ability to easily simulate an I/O error during testing */
  if( sqlite3FaultSim(400) ) return SQLITE_IOERR;

  PAGERTRACE(("DATABASE SYNC: File=%s zMaster=%s nSize=%d\n",
      pPager->zFilename, zMaster, pPager->dbSize));

  /* If no database changes have been made, return early.
  */
  if( pPager->eState<PAGER_WRITER_CACHEMOD ) return SQLITE_OK;

  assert( MEMDB==0 || pPager->tempFile );
  assert( isOpen(pPager->fd) || pPager->tempFile );
  if( 0==pagerFlushOnCommit(pPager, 1) ){
    /* If this is an in-memory db, or no pages have been written to, or this
    ** function has already been called, it is mostly a no-op. However, any
    ** backup in progress needs to be restarted. */
    sqlite3BackupRestart(pPager->pBackup);
  }else{
    PgHdr *pList;
    if( pagerUseWal(pPager) ){
      PgHdr *pPageOne = 0;
      pList = sqlite3PcacheDirtyList(pPager->pPCache);
      if( pList==0 ){
        /* Must have at least one page for the WAL commit flag.
        ** Ticket [2d1a5c67dfc2363e44f29d9bbd57f] 2011-05-18 */
        rc = sqlite3PagerGet(pPager, 1, &pPageOne, 0);
        pList = pPageOne;
        pList->pDirty = 0;
      }
      assert( rc==SQLITE_OK );
      if( ALWAYS(pList) ){
        rc = pagerWalFrames(pPager, pList, pPager->dbSize, 1);
      }
      sqlite3PagerUnref(pPageOne);
      if( rc==SQLITE_OK ){
        sqlite3PcacheCleanAll(pPager->pPCache);
      }
    }else{
      /* The bBatch boolean is true if the batch-atomic-write commit method
      ** should be used. No rollback journal is created if batch-atomic-write
      ** is enabled.
      */
#ifdef SQLITE_ENABLE_BATCH_ATOMIC_WRITE
      sqlite3_file *fd = pPager->fd;
      int bBatch = zMaster==0    /* An SQLITE_IOCAP_BATCH_ATOMIC commit */
        && (sqlite3OsDeviceCharacteristics(fd) & SQLITE_IOCAP_BATCH_ATOMIC)
        && !pPager->noSync
        && sqlite3JournalIsInMemory(pPager->jfd);
#else
# define bBatch 0
#endif

#ifdef SQLITE_ENABLE_ATOMIC_WRITE
      /* The following block updates the change-counter. Exactly how it
      ** does this depends on whether or not the atomic-update optimization
      ** was enabled at compile time, and if this transaction meets the
      ** runtime criteria to use the operation:
      **
      **    * The file-system supports the atomic-write property for
      **      blocks of size page-size, and
      **    * This commit is not part of a multi-file transaction, and
      **    * Exactly one page has been modified and stored in the journal file.
      **
      ** If the optimization was not enabled at compile time, then the
      ** pager_incr_changecounter() function is called to update the change
      ** counter in 'indirect-mode'. If the optimization is compiled in but
      ** is not applicable to this transaction, call sqlite3JournalCreate()
      ** to make sure the journal file has actually been created, then call
      ** pager_incr_changecounter() to update the change-counter in indirect
      ** mode.
      **
      ** Otherwise, if the optimization is both enabled and applicable,
      ** then call pager_incr_changecounter() to update the change-counter
      ** in 'direct' mode. In this case the journal file will never be
      ** created for this transaction.
      */
      if( bBatch==0 ){
        PgHdr *pPg;
        assert( isOpen(pPager->jfd)
             || pPager->journalMode==PAGER_JOURNALMODE_OFF
             || pPager->journalMode==PAGER_JOURNALMODE_WAL
        );
        if( !zMaster && isOpen(pPager->jfd)
         && pPager->journalOff==jrnlBufferSize(pPager)
         && pPager->dbSize>=pPager->dbOrigSize
         && (!(pPg = sqlite3PcacheDirtyList(pPager->pPCache)) || 0==pPg->pDirty)
        ){
          /* Update the db file change counter via the direct-write method. The
          ** following call will modify the in-memory representation of page 1
          ** to include the updated change counter and then write page 1
          ** directly to the database file. Because of the atomic-write
          ** property of the host file-system, this is safe.
          */
          rc = pager_incr_changecounter(pPager, 1);
        }else{
          rc = sqlite3JournalCreate(pPager->jfd);
          if( rc==SQLITE_OK ){
            rc = pager_incr_changecounter(pPager, 0);
          }
        }
      }
#else /* SQLITE_ENABLE_ATOMIC_WRITE */
#ifdef SQLITE_ENABLE_BATCH_ATOMIC_WRITE
      if( zMaster ){
        rc = sqlite3JournalCreate(pPager->jfd);
        if( rc!=SQLITE_OK ) goto commit_phase_one_exit;
        assert( bBatch==0 );
      }
#endif
      rc = pager_incr_changecounter(pPager, 0);
#endif /* !SQLITE_ENABLE_ATOMIC_WRITE */
      if( rc!=SQLITE_OK ) goto commit_phase_one_exit;

      /* Write the master journal name into the journal file. If a master
      ** journal file name has already been written to the journal file,
      ** or if zMaster is NULL (no master journal), then this call is a no-op.
      */
      rc = writeMasterJournal(pPager, zMaster);
      if( rc!=SQLITE_OK ) goto commit_phase_one_exit;

      /* Sync the journal file and write all dirty pages to the database.
      ** If the atomic-update optimization is being used, this sync will not
      ** create the journal file or perform any real IO.
      **
      ** Because the change-counter page was just modified, unless the
      ** atomic-update optimization is used it is almost certain that the
      ** journal requires a sync here.
      ** However, in locking_mode=exclusive
      ** on a system under memory pressure it is just possible that this is
      ** not the case. In this case it is likely enough that the redundant
      ** xSync() call will be changed to a no-op by the OS anyhow.
      */
      rc = syncJournal(pPager, 0);
      if( rc!=SQLITE_OK ) goto commit_phase_one_exit;

      pList = sqlite3PcacheDirtyList(pPager->pPCache);
#ifdef SQLITE_ENABLE_BATCH_ATOMIC_WRITE
      if( bBatch ){
        rc = sqlite3OsFileControl(fd, SQLITE_FCNTL_BEGIN_ATOMIC_WRITE, 0);
        if( rc==SQLITE_OK ){
          rc = pager_write_pagelist(pPager, pList);
          if( rc==SQLITE_OK ){
            rc = sqlite3OsFileControl(fd, SQLITE_FCNTL_COMMIT_ATOMIC_WRITE, 0);
          }
          if( rc!=SQLITE_OK ){
            sqlite3OsFileControlHint(fd, SQLITE_FCNTL_ROLLBACK_ATOMIC_WRITE, 0);
          }
        }

        if( (rc&0xFF)==SQLITE_IOERR && rc!=SQLITE_IOERR_NOMEM ){
          rc = sqlite3JournalCreate(pPager->jfd);
          if( rc!=SQLITE_OK ){
            sqlite3OsClose(pPager->jfd);
            goto commit_phase_one_exit;
          }
          bBatch = 0;
        }else{
          sqlite3OsClose(pPager->jfd);
        }
      }
#endif /* SQLITE_ENABLE_BATCH_ATOMIC_WRITE */

      if( bBatch==0 ){
        rc = pager_write_pagelist(pPager, pList);
      }
      if( rc!=SQLITE_OK ){
        assert( rc!=SQLITE_IOERR_BLOCKED );
        goto commit_phase_one_exit;
      }
      sqlite3PcacheCleanAll(pPager->pPCache);

      /* If the file
      ** on disk is smaller than the database image, use
      ** pager_truncate to grow the file here. This can happen if the database
      ** image was extended as part of the current transaction and then the
      ** last page in the db image moved to the free-list. In this case the
      ** last page is never written out to disk, leaving the database file
      ** undersized. Fix this now if it is the case. */
      if( pPager->dbSize>pPager->dbFileSize ){
        Pgno nNew = pPager->dbSize - (pPager->dbSize==PAGER_MJ_PGNO(pPager));
        assert( pPager->eState==PAGER_WRITER_DBMOD );
        rc = pager_truncate(pPager, nNew);
        if( rc!=SQLITE_OK ) goto commit_phase_one_exit;
      }

      /* Finally, sync the database file. */
      if( !noSync ){
        rc = sqlite3PagerSync(pPager, zMaster);
      }
      IOTRACE(("DBSYNC %p\n", pPager))
    }
  }

commit_phase_one_exit:
  if( rc==SQLITE_OK && !pagerUseWal(pPager) ){
    pPager->eState = PAGER_WRITER_FINISHED;
  }
  return rc;
}


/*
** When this function is called, the database file has been completely
** updated to reflect the changes made by the current transaction and
** synced to disk. The journal file still exists in the file-system
** though, and if a failure occurs at this point it will eventually
** be used as a hot-journal and the current transaction rolled back.
**
** This function finalizes the journal file, either by deleting,
** truncating or partially zeroing it, so that it cannot be used
** for hot-journal rollback. Once this is done the transaction is
** irrevocably committed.
**
** If an error occurs, an IO error code is returned and the pager
** moves into the error state. Otherwise, SQLITE_OK is returned.
*/
int sqlite3PagerCommitPhaseTwo(Pager *pPager){
  int rc = SQLITE_OK;                  /* Return code */

  /* This routine should not be called if a prior error has occurred.
  ** But if (due to a coding error elsewhere in the system) it does get
  ** called, just return the same error code without doing anything. */
  if( NEVER(pPager->errCode) ) return pPager->errCode;

  assert( pPager->eState==PAGER_WRITER_LOCKED
       || pPager->eState==PAGER_WRITER_FINISHED
       || (pagerUseWal(pPager) && pPager->eState==PAGER_WRITER_CACHEMOD)
  );
  assert( assert_pager_state(pPager) );

  /* An optimization. If the database was not actually modified during
  ** this transaction, the pager is running in exclusive-mode and is
  ** using persistent journals, then this function is a no-op.
  **
  ** The start of the journal file currently contains a single journal
  ** header with the nRec field set to 0. If such a journal is used as
  ** a hot-journal during hot-journal rollback, 0 changes will be made
  ** to the database file.
  ** So there is no need to zero the journal
  ** header. Since the pager is in exclusive mode, there is no need
  ** to drop any locks either.
  */
  if( pPager->eState==PAGER_WRITER_LOCKED
   && pPager->exclusiveMode
   && pPager->journalMode==PAGER_JOURNALMODE_PERSIST
  ){
    assert( pPager->journalOff==JOURNAL_HDR_SZ(pPager) || !pPager->journalOff );
    pPager->eState = PAGER_READER;
    return SQLITE_OK;
  }

  PAGERTRACE(("COMMIT %d\n", PAGERID(pPager)));
  pPager->iDataVersion++;
  rc = pager_end_transaction(pPager, pPager->setMaster, 1);
  return pager_error(pPager, rc);
}

/*
** If a write transaction is open, then all changes made within the
** transaction are reverted and the current write-transaction is closed.
** The pager falls back to PAGER_READER state if successful, or PAGER_ERROR
** state if an error occurs.
**
** If the pager is already in PAGER_ERROR state when this function is called,
** it returns Pager.errCode immediately. No work is performed in this case.
**
** Otherwise, in rollback mode, this function performs two tasks:
**
**   1) It rolls back the journal file, restoring all database file and
**      in-memory cache pages to the state they were in when the transaction
**      was opened, and
**
**   2) It finalizes the journal file, so that it is not used for hot
**      rollback at any point in the future.
**
** Finalization of the journal file (task 2) is only performed if the
** rollback is successful.
**
** In WAL mode, all cache-entries containing data modified within the
** current transaction are either expelled from the cache or reverted to
** their pre-transaction state by re-reading data from the database or
** WAL files. The WAL transaction is then closed.
*/
int sqlite3PagerRollback(Pager *pPager){
  int rc = SQLITE_OK;                  /* Return code */
  PAGERTRACE(("ROLLBACK %d\n", PAGERID(pPager)));

  /* PagerRollback() is a no-op if called in READER or OPEN state. If
  ** the pager is already in the ERROR state, the rollback is not
  ** attempted here. Instead, the error code is returned to the caller.
  */
  assert( assert_pager_state(pPager) );
  if( pPager->eState==PAGER_ERROR ) return pPager->errCode;
  if( pPager->eState<=PAGER_READER ) return SQLITE_OK;

  if( pagerUseWal(pPager) ){
    int rc2;
    rc = sqlite3PagerSavepoint(pPager, SAVEPOINT_ROLLBACK, -1);
    rc2 = pager_end_transaction(pPager, pPager->setMaster, 0);
    if( rc==SQLITE_OK ) rc = rc2;
  }else if( !isOpen(pPager->jfd) || pPager->eState==PAGER_WRITER_LOCKED ){
    int eState = pPager->eState;
    rc = pager_end_transaction(pPager, 0, 0);
    if( !MEMDB && eState>PAGER_WRITER_LOCKED ){
      /* This can happen using journal_mode=off. Move the pager to the error
      ** state to indicate that the contents of the cache may not be trusted.
      ** Any active readers will get SQLITE_ABORT.
      */
      pPager->errCode = SQLITE_ABORT;
      pPager->eState = PAGER_ERROR;
      setGetterMethod(pPager);
      return rc;
    }
  }else{
    rc = pager_playback(pPager, 0);
  }

  assert( pPager->eState==PAGER_READER || rc!=SQLITE_OK );
  assert( rc==SQLITE_OK || rc==SQLITE_FULL || rc==SQLITE_CORRUPT
          || rc==SQLITE_NOMEM || (rc&0xFF)==SQLITE_IOERR
          || rc==SQLITE_CANTOPEN
  );

  /* If an error occurs during a ROLLBACK, we can no longer trust the pager
  ** cache. So call pager_error() on the way out to make any error persistent.
  */
  return pager_error(pPager, rc);
}

/*
** Return TRUE if the database file is opened read-only. Return FALSE
** if the database is (in theory) writable.
*/
u8 sqlite3PagerIsreadonly(Pager *pPager){
  return pPager->readOnly;
}

#ifdef SQLITE_DEBUG
/*
** Return the sum of the reference counts for all pages held by pPager.
*/
int sqlite3PagerRefcount(Pager *pPager){
  return sqlite3PcacheRefCount(pPager->pPCache);
}
#endif

/*
** Return the approximate number of bytes of memory currently
** used by the pager and its associated cache.
*/
int sqlite3PagerMemUsed(Pager *pPager){
  int perPageSize = pPager->pageSize + pPager->nExtra + sizeof(PgHdr)
                                     + 5*sizeof(void*);
  return perPageSize*sqlite3PcachePagecount(pPager->pPCache)
           + sqlite3MallocSize(pPager)
           + pPager->pageSize;
}

/*
** Return the number of references to the specified page.
*/
int sqlite3PagerPageRefcount(DbPage *pPage){
  return sqlite3PcachePageRefcount(pPage);
}

#ifdef SQLITE_TEST
/*
** This routine is used for testing and analysis only.
*/
int *sqlite3PagerStats(Pager *pPager){
  static int a[11];
  a[0] = sqlite3PcacheRefCount(pPager->pPCache);
  a[1] = sqlite3PcachePagecount(pPager->pPCache);
  a[2] = sqlite3PcacheGetCachesize(pPager->pPCache);
  a[3] = pPager->eState==PAGER_OPEN ? -1 : (int) pPager->dbSize;
  a[4] = pPager->eState;
  a[5] = pPager->errCode;
  a[6] = pPager->aStat[PAGER_STAT_HIT];
  a[7] = pPager->aStat[PAGER_STAT_MISS];
  a[8] = 0;  /* Used to be pPager->nOvfl */
  a[9] = pPager->nRead;
  a[10] = pPager->aStat[PAGER_STAT_WRITE];
  return a;
}
#endif

/*
** Parameter eStat must be one of SQLITE_DBSTATUS_CACHE_HIT, _MISS, _WRITE,
** or _WRITE+1. The SQLITE_DBSTATUS_CACHE_WRITE+1 case is a translation
** of SQLITE_DBSTATUS_CACHE_SPILL.
** The _SPILL case is not contiguous because
** it was added later.
**
** Before returning, *pnVal is incremented by the
** current cache hit or miss count, according to the value of eStat. If the
** reset parameter is non-zero, the cache hit or miss count is zeroed before
** returning.
*/
void sqlite3PagerCacheStat(Pager *pPager, int eStat, int reset, int *pnVal){

  assert( eStat==SQLITE_DBSTATUS_CACHE_HIT
       || eStat==SQLITE_DBSTATUS_CACHE_MISS
       || eStat==SQLITE_DBSTATUS_CACHE_WRITE
       || eStat==SQLITE_DBSTATUS_CACHE_WRITE+1
  );

  assert( SQLITE_DBSTATUS_CACHE_HIT+1==SQLITE_DBSTATUS_CACHE_MISS );
  assert( SQLITE_DBSTATUS_CACHE_HIT+2==SQLITE_DBSTATUS_CACHE_WRITE );
  assert( PAGER_STAT_HIT==0 && PAGER_STAT_MISS==1
           && PAGER_STAT_WRITE==2 && PAGER_STAT_SPILL==3 );

  eStat -= SQLITE_DBSTATUS_CACHE_HIT;
  *pnVal += pPager->aStat[eStat];
  if( reset ){
    pPager->aStat[eStat] = 0;
  }
}

/*
** Return true if this is an in-memory or temp-file backed pager.
*/
int sqlite3PagerIsMemdb(Pager *pPager){
  return pPager->tempFile;
}

/*
** Check that there are at least nSavepoint savepoints open. If there are
** currently fewer than nSavepoint savepoints open, then open one or more
** savepoints to make up the difference. If the number of savepoints is
** already equal to nSavepoint, then this function is a no-op.
**
** If a memory allocation fails, SQLITE_NOMEM is returned. If an error
** occurs while opening the sub-journal file, then an IO error code is
** returned. Otherwise, SQLITE_OK.
*/
static SQLITE_NOINLINE int pagerOpenSavepoint(Pager *pPager, int nSavepoint){
  int rc = SQLITE_OK;                       /* Return code */
  int nCurrent = pPager->nSavepoint;        /* Current number of savepoints */
  int ii;                                   /* Iterator variable */
  PagerSavepoint *aNew;                     /* New Pager.aSavepoint array */

  assert( pPager->eState>=PAGER_WRITER_LOCKED );
  assert( assert_pager_state(pPager) );
  assert( nSavepoint>nCurrent && pPager->useJournal );

  /* Grow the Pager.aSavepoint array using realloc(). Return SQLITE_NOMEM
  ** if the allocation fails. Otherwise, zero the new portion in case a
  ** malloc failure occurs while populating it in the for(...) loop below.
  */
  aNew = (PagerSavepoint *)sqlite3Realloc(
      pPager->aSavepoint, sizeof(PagerSavepoint)*nSavepoint
  );
  if( !aNew ){
    return SQLITE_NOMEM_BKPT;
  }
  memset(&aNew[nCurrent], 0, (nSavepoint-nCurrent) * sizeof(PagerSavepoint));
  pPager->aSavepoint = aNew;

  /* Populate the PagerSavepoint structures just allocated. */
  for(ii=nCurrent; ii<nSavepoint; ii++){
    aNew[ii].nOrig = pPager->dbSize;
    if( isOpen(pPager->jfd) && pPager->journalOff>0 ){
      aNew[ii].iOffset = pPager->journalOff;
    }else{
      aNew[ii].iOffset = JOURNAL_HDR_SZ(pPager);
    }
    aNew[ii].iSubRec = pPager->nSubRec;
    aNew[ii].pInSavepoint = sqlite3BitvecCreate(pPager->dbSize);
    if( !aNew[ii].pInSavepoint ){
      return SQLITE_NOMEM_BKPT;
    }
    if( pagerUseWal(pPager) ){
      sqlite3WalSavepoint(pPager->pWal, aNew[ii].aWalData);
    }
    pPager->nSavepoint = ii+1;
  }
  assert( pPager->nSavepoint==nSavepoint );
  assertTruncateConstraint(pPager);
  return rc;
}
int sqlite3PagerOpenSavepoint(Pager *pPager, int nSavepoint){
  assert( pPager->eState>=PAGER_WRITER_LOCKED );
  assert( assert_pager_state(pPager) );

  if( nSavepoint>pPager->nSavepoint && pPager->useJournal ){
    return pagerOpenSavepoint(pPager, nSavepoint);
  }else{
    return SQLITE_OK;
  }
}


/*
** This function is called to rollback or release (commit) a savepoint.
** The savepoint to release or rollback need not be the most recently
** created savepoint.
**
** Parameter op is always either SAVEPOINT_ROLLBACK or SAVEPOINT_RELEASE.
** If it is SAVEPOINT_RELEASE, then release and destroy the savepoint with
** index iSavepoint. If it is SAVEPOINT_ROLLBACK, then rollback all changes
** that have occurred since the specified savepoint was created.
**
** The savepoint to rollback or release is identified by parameter
** iSavepoint. A value of 0 means to operate on the outermost savepoint
** (the first created). A value of (Pager.nSavepoint-1) means operate
** on the most recently created savepoint. If iSavepoint is greater than
** (Pager.nSavepoint-1), then this function is a no-op.
**
** If a negative value is passed to this function, then the current
** transaction is rolled back. This is different to calling
** sqlite3PagerRollback() because this function does not terminate
** the transaction or unlock the database, it just restores the
** contents of the database to its original state.
**
** In any case, all savepoints with an index greater than iSavepoint
** are destroyed. If this is a release operation (op==SAVEPOINT_RELEASE),
** then savepoint iSavepoint is also destroyed.
**
** This function may return SQLITE_NOMEM if a memory allocation fails,
** or an IO error code if an IO error occurs while rolling back a
** savepoint. If no errors occur, SQLITE_OK is returned.
*/
int sqlite3PagerSavepoint(Pager *pPager, int op, int iSavepoint){
  int rc = pPager->errCode;

#ifdef SQLITE_ENABLE_ZIPVFS
  if( op==SAVEPOINT_RELEASE ) rc = SQLITE_OK;
#endif

  assert( op==SAVEPOINT_RELEASE || op==SAVEPOINT_ROLLBACK );
  assert( iSavepoint>=0 || op==SAVEPOINT_ROLLBACK );

  if( rc==SQLITE_OK && iSavepoint<pPager->nSavepoint ){
    int ii;            /* Iterator variable */
    int nNew;          /* Number of remaining savepoints after this op. */

    /* Figure out how many savepoints will still be active after this
    ** operation. Store this value in nNew. Then free resources associated
    ** with any savepoints that are destroyed by this operation.
    */
    nNew = iSavepoint + (( op==SAVEPOINT_RELEASE ) ? 0 : 1);
    for(ii=nNew; ii<pPager->nSavepoint; ii++){
      sqlite3BitvecDestroy(pPager->aSavepoint[ii].pInSavepoint);
    }
    pPager->nSavepoint = nNew;

    /* If this is a release of the outermost savepoint, truncate
    ** the sub-journal to zero bytes in size. */
    if( op==SAVEPOINT_RELEASE ){
      if( nNew==0 && isOpen(pPager->sjfd) ){
        /* Only truncate if it is an in-memory sub-journal. */
        if( sqlite3JournalIsInMemory(pPager->sjfd) ){
          rc = sqlite3OsTruncate(pPager->sjfd, 0);
          assert( rc==SQLITE_OK );
        }
        pPager->nSubRec = 0;
      }
    }
    /* Else this is a rollback operation, playback the specified savepoint.
    ** If this is a temp-file, it is possible that the journal file has
    ** not yet been opened. In this case there have been no changes to
    ** the database file, so the playback operation can be skipped.
    */
    else if( pagerUseWal(pPager) || isOpen(pPager->jfd) ){
      PagerSavepoint *pSavepoint = (nNew==0)?0:&pPager->aSavepoint[nNew-1];
      rc = pagerPlaybackSavepoint(pPager, pSavepoint);
      assert(rc!=SQLITE_DONE);
    }

#ifdef SQLITE_ENABLE_ZIPVFS
    /* If the cache has been modified but the savepoint cannot be rolled
    ** back under journal_mode=off, put the pager in the error state. This
    ** way, if the VFS used by this pager includes ZipVFS, the entire
    ** transaction can be rolled back at the ZipVFS level. */
    else if(
        pPager->journalMode==PAGER_JOURNALMODE_OFF
     && pPager->eState>=PAGER_WRITER_CACHEMOD
    ){
      pPager->errCode = SQLITE_ABORT;
      pPager->eState = PAGER_ERROR;
      setGetterMethod(pPager);
    }
#endif
  }

  return rc;
}

/*
** Return the full pathname of the database file.
**
** Except, if the pager is in-memory only, then return an empty string if
** nullIfMemDb is true. This routine is called with nullIfMemDb==1 when
** used to report the filename to the user, for compatibility with legacy
** behavior. But when the Btree needs to know the filename for matching to
** shared cache, it uses nullIfMemDb==0 so that in-memory databases can
** participate in shared-cache.
*/
const char *sqlite3PagerFilename(Pager *pPager, int nullIfMemDb){
  return (nullIfMemDb && pPager->memDb) ? "" : pPager->zFilename;
}

/*
** Return the VFS structure for the pager.
*/
sqlite3_vfs *sqlite3PagerVfs(Pager *pPager){
  return pPager->pVfs;
}

/*
** Return the file handle for the database file associated
** with the pager. This might return NULL if the file has
** not yet been opened.
*/
sqlite3_file *sqlite3PagerFile(Pager *pPager){
  return pPager->fd;
}

#ifdef SQLITE_ENABLE_SETLK_TIMEOUT
/*
** Reset the lock timeout for pager.
*/
void sqlite3PagerResetLockTimeout(Pager *pPager){
  int x = 0;
  sqlite3OsFileControl(pPager->fd, SQLITE_FCNTL_LOCK_TIMEOUT, &x);
}
#endif

/*
** Return the file handle for the journal file (if it exists).
** This will be either the rollback journal or the WAL file.
*/
sqlite3_file *sqlite3PagerJrnlFile(Pager *pPager){
#if SQLITE_OMIT_WAL
  return pPager->jfd;
#else
  return pPager->pWal ? sqlite3WalFile(pPager->pWal) : pPager->jfd;
#endif
}

/*
** Return the full pathname of the journal file.
*/
const char *sqlite3PagerJournalname(Pager *pPager){
  return pPager->zJournal;
}

#ifdef SQLITE_HAS_CODEC
/*
** Set or retrieve the codec for this pager
*/
void sqlite3PagerSetCodec(
  Pager *pPager,
  void *(*xCodec)(void*,void*,Pgno,int),
  void (*xCodecSizeChng)(void*,int,int),
  void (*xCodecFree)(void*),
  void *pCodec
){
  if( pPager->xCodecFree ) pPager->xCodecFree(pPager->pCodec);
  pPager->xCodec = pPager->memDb ? 0 : xCodec;
  pPager->xCodecSizeChng = xCodecSizeChng;
  pPager->xCodecFree = xCodecFree;
  pPager->pCodec = pCodec;
  setGetterMethod(pPager);
  pagerReportSize(pPager);
}
void *sqlite3PagerGetCodec(Pager *pPager){
  return pPager->pCodec;
}

/*
** This function is called by the wal module when writing page content
** into the log file.
**
** This function returns a pointer to a buffer containing the encrypted
** page content. If a malloc fails, this function may return NULL.
*/
void *sqlite3PagerCodec(PgHdr *pPg){
  void *aData = 0;
  CODEC2(pPg->pPager, pPg->pData, pPg->pgno, 6, return 0, aData);
  return aData;
}

/*
** Return the current pager state
*/
int sqlite3PagerState(Pager *pPager){
  return pPager->eState;
}
#endif /* SQLITE_HAS_CODEC */

#ifndef SQLITE_OMIT_AUTOVACUUM
/*
** Move the page pPg to location pgno in the file.
**
** There must be no references to the page previously located at
** pgno (which we call pPgOld) though that page is allowed to be
** in cache. If the page previously located at pgno is not already
** in the rollback journal, it is not put there by this routine.
**
** References to the page pPg remain valid. Updating any
** meta-data associated with pPg (i.e. data stored in the nExtra bytes
** allocated along with the page) is the responsibility of the caller.
**
** A transaction must be active when this routine is called. It used to be
** required that a statement transaction was not active, but this restriction
** has been removed (CREATE INDEX needs to move a page when a statement
** transaction is active).
**
** If the fourth argument, isCommit, is non-zero, then this page is being
** moved as part of a database reorganization just before the transaction
** is being committed.
** In this case, it is guaranteed that the database page
** pPg refers to will not be written to again within this transaction.
**
** This function may return SQLITE_NOMEM or an IO error code if an error
** occurs. Otherwise, it returns SQLITE_OK.
*/
int sqlite3PagerMovepage(Pager *pPager, DbPage *pPg, Pgno pgno, int isCommit){
  PgHdr *pPgOld;               /* The page being overwritten. */
  Pgno needSyncPgno = 0;       /* Old value of pPg->pgno, if sync is required */
  int rc;                      /* Return code */
  Pgno origPgno;               /* The original page number */

  assert( pPg->nRef>0 );
  assert( pPager->eState==PAGER_WRITER_CACHEMOD
       || pPager->eState==PAGER_WRITER_DBMOD
  );
  assert( assert_pager_state(pPager) );

  /* In order to be able to rollback, an in-memory database must journal
  ** the page we are moving from.
  */
  assert( pPager->tempFile || !MEMDB );
  if( pPager->tempFile ){
    rc = sqlite3PagerWrite(pPg);
    if( rc ) return rc;
  }

  /* If the page being moved is dirty and has not been saved by the latest
  ** savepoint, then save the current contents of the page into the
  ** sub-journal now.
  ** This is required to handle the following scenario:
  **
  **   BEGIN;
  **     <journal page X, then modify it in memory>
  **     SAVEPOINT one;
  **       <Move page X to location Y>
  **     ROLLBACK TO one;
  **
  ** If page X were not written to the sub-journal here, it would not
  ** be possible to restore its contents when the "ROLLBACK TO one"
  ** statement is processed.
  **
  ** subjournalPage() may need to allocate space to store pPg->pgno into
  ** one or more savepoint bitvecs. This is the reason this function
  ** may return SQLITE_NOMEM.
  */
  if( (pPg->flags & PGHDR_DIRTY)!=0
   && SQLITE_OK!=(rc = subjournalPageIfRequired(pPg))
  ){
    return rc;
  }

  PAGERTRACE(("MOVE %d page %d (needSync=%d) moves to %d\n",
      PAGERID(pPager), pPg->pgno, (pPg->flags&PGHDR_NEED_SYNC)?1:0, pgno));
  IOTRACE(("MOVE %p %d %d\n", pPager, pPg->pgno, pgno))

  /* If the journal needs to be sync()ed before page pPg->pgno can
  ** be written to, store pPg->pgno in local variable needSyncPgno.
  **
  ** If the isCommit flag is set, there is no need to remember that
  ** the journal needs to be sync()ed before database page pPg->pgno
  ** can be written to. The caller has already promised not to write to it.
  */
  if( (pPg->flags&PGHDR_NEED_SYNC) && !isCommit ){
    needSyncPgno = pPg->pgno;
    assert( pPager->journalMode==PAGER_JOURNALMODE_OFF ||
            pageInJournal(pPager, pPg) || pPg->pgno>pPager->dbOrigSize );
    assert( pPg->flags&PGHDR_DIRTY );
  }

  /* If the cache contains a page with page-number pgno, remove it
  ** from its hash chain. Also, if the PGHDR_NEED_SYNC flag was set for
  ** page pgno before the 'move' operation, it needs to be retained
  ** for the page moved there.
  */
  pPg->flags &= ~PGHDR_NEED_SYNC;
  pPgOld = sqlite3PagerLookup(pPager, pgno);
  assert( !pPgOld || pPgOld->nRef==1 );
  if( pPgOld ){
    pPg->flags |= (pPgOld->flags&PGHDR_NEED_SYNC);
    if( pPager->tempFile ){
      /* Do not discard pages from an in-memory database since we might
      ** need to rollback later.  Just move the page out of the way. */
      sqlite3PcacheMove(pPgOld, pPager->dbSize+1);
    }else{
      sqlite3PcacheDrop(pPgOld);
    }
  }

  origPgno = pPg->pgno;
  sqlite3PcacheMove(pPg, pgno);
  sqlite3PcacheMakeDirty(pPg);

  /* For an in-memory database, make sure the original page continues
  ** to exist, in case the transaction needs to roll back.  Use pPgOld
  ** as the original page since it has already been allocated.
  */
  if( pPager->tempFile && pPgOld ){
    sqlite3PcacheMove(pPgOld, origPgno);
    sqlite3PagerUnrefNotNull(pPgOld);
  }

  if( needSyncPgno ){
    /* If needSyncPgno is non-zero, then the journal file needs to be
    ** sync()ed before any data is written to database file page needSyncPgno.
    ** Currently, no such page exists in the page-cache and the
    ** "is journaled" bitvec flag has been set. This needs to be remedied by
    ** loading the page into the pager-cache and setting the PGHDR_NEED_SYNC
    ** flag.
    **
    ** If the attempt to load the page into the page-cache fails, (due
    ** to a malloc() or IO failure), clear the bit in the pInJournal[]
    ** array. Otherwise, if the page is loaded and written again in
    ** this transaction, it may be written to the database file before
    ** it is synced into the journal file. This way, it may end up in
    ** the journal file twice, but that is not a problem.
    */
    PgHdr *pPgHdr;
    rc = sqlite3PagerGet(pPager, needSyncPgno, &pPgHdr, 0);
    if( rc!=SQLITE_OK ){
      if( needSyncPgno<=pPager->dbOrigSize ){
        assert( pPager->pTmpSpace!=0 );
        sqlite3BitvecClear(pPager->pInJournal, needSyncPgno, pPager->pTmpSpace);
      }
      return rc;
    }
    pPgHdr->flags |= PGHDR_NEED_SYNC;
    sqlite3PcacheMakeDirty(pPgHdr);
    sqlite3PagerUnrefNotNull(pPgHdr);
  }

  return SQLITE_OK;
}
#endif

/*
** The page handle passed as the first argument refers to a dirty page
** with a page number other than iNew. This function changes the page's
** page number to iNew and sets the value of the PgHdr.flags field to
** the value passed as the third parameter.
*/
void sqlite3PagerRekey(DbPage *pPg, Pgno iNew, u16 flags){
  assert( pPg->pgno!=iNew );
  pPg->flags = flags;
  sqlite3PcacheMove(pPg, iNew);
}

/*
** Return a pointer to the data for the specified page.
*/
void *sqlite3PagerGetData(DbPage *pPg){
  assert( pPg->nRef>0 || pPg->pPager->memDb );
  return pPg->pData;
}

/*
** Return a pointer to the Pager.nExtra bytes of "extra" space
** allocated along with the specified page.
*/
void *sqlite3PagerGetExtra(DbPage *pPg){
  return pPg->pExtra;
}

/*
** Get/set the locking-mode for this pager. Parameter eMode must be one
** of PAGER_LOCKINGMODE_QUERY, PAGER_LOCKINGMODE_NORMAL or
** PAGER_LOCKINGMODE_EXCLUSIVE. If the parameter is not _QUERY, then
** the locking-mode is set to the value specified.
**
** The returned value is either PAGER_LOCKINGMODE_NORMAL or
** PAGER_LOCKINGMODE_EXCLUSIVE, indicating the current (possibly updated)
** locking-mode.
*/
int sqlite3PagerLockingMode(Pager *pPager, int eMode){
  assert( eMode==PAGER_LOCKINGMODE_QUERY
            || eMode==PAGER_LOCKINGMODE_NORMAL
            || eMode==PAGER_LOCKINGMODE_EXCLUSIVE );
  assert( PAGER_LOCKINGMODE_QUERY<0 );
  assert( PAGER_LOCKINGMODE_NORMAL>=0 && PAGER_LOCKINGMODE_EXCLUSIVE>=0 );
  assert( pPager->exclusiveMode || 0==sqlite3WalHeapMemory(pPager->pWal) );
  if( eMode>=0 && !pPager->tempFile && !sqlite3WalHeapMemory(pPager->pWal) ){
    pPager->exclusiveMode = (u8)eMode;
  }
  return (int)pPager->exclusiveMode;
}

/*
** Set the journal-mode for this pager.
** Parameter eMode must be one of:
**
**    PAGER_JOURNALMODE_DELETE
**    PAGER_JOURNALMODE_TRUNCATE
**    PAGER_JOURNALMODE_PERSIST
**    PAGER_JOURNALMODE_OFF
**    PAGER_JOURNALMODE_MEMORY
**    PAGER_JOURNALMODE_WAL
**
** The journalmode is set to the value specified if the change is allowed.
** The change may be disallowed for the following reasons:
**
**   *  An in-memory database can only have its journal_mode set to _OFF
**      or _MEMORY.
**
**   *  Temporary databases cannot have _WAL journalmode.
**
** The returned value indicates the current (possibly updated) journal-mode.
*/
int sqlite3PagerSetJournalMode(Pager *pPager, int eMode){
  u8 eOld = pPager->journalMode;    /* Prior journalmode */

#ifdef SQLITE_DEBUG
  /* The print_pager_state() routine is intended to be used by the debugger
  ** only.  We invoke it once here to suppress a compiler warning. */
  print_pager_state(pPager);
#endif


  /* The eMode parameter is always valid */
  assert(      eMode==PAGER_JOURNALMODE_DELETE
            || eMode==PAGER_JOURNALMODE_TRUNCATE
            || eMode==PAGER_JOURNALMODE_PERSIST
            || eMode==PAGER_JOURNALMODE_OFF
            || eMode==PAGER_JOURNALMODE_WAL
            || eMode==PAGER_JOURNALMODE_MEMORY );

  /* This routine is only called from the OP_JournalMode opcode, and
  ** the logic there will never allow a temporary file to be changed
  ** to WAL mode.
  */
  assert( pPager->tempFile==0 || eMode!=PAGER_JOURNALMODE_WAL );

  /* Do not allow the journalmode of an in-memory database to be set to
  ** anything other than MEMORY or OFF
  */
  if( MEMDB ){
    assert( eOld==PAGER_JOURNALMODE_MEMORY || eOld==PAGER_JOURNALMODE_OFF );
    if( eMode!=PAGER_JOURNALMODE_MEMORY && eMode!=PAGER_JOURNALMODE_OFF ){
      eMode = eOld;
    }
  }

  if( eMode!=eOld ){

    /* Change the journal mode. */
    assert( pPager->eState!=PAGER_ERROR );
    pPager->journalMode = (u8)eMode;

    /* When transitioning from TRUNCATE or PERSIST to any other journal
    ** mode except WAL, unless the pager is in locking_mode=exclusive mode,
    ** delete the journal file.
    */
    assert( (PAGER_JOURNALMODE_TRUNCATE & 5)==1 );
    assert( (PAGER_JOURNALMODE_PERSIST & 5)==1 );
    assert( (PAGER_JOURNALMODE_DELETE & 5)==0 );
    assert( (PAGER_JOURNALMODE_MEMORY & 5)==4 );
    assert( (PAGER_JOURNALMODE_OFF & 5)==0 );
    assert( (PAGER_JOURNALMODE_WAL & 5)==5 );

    assert( isOpen(pPager->fd) || pPager->exclusiveMode );
    if( !pPager->exclusiveMode && (eOld & 5)==1 && (eMode & 1)==0 ){

      /* In this case we would like to delete the journal file. If it is
      ** not possible, then that is not a problem. Deleting the journal file
      ** here is an optimization only.
      **
      ** Before deleting the journal file, obtain a RESERVED lock on the
      ** database file.
      ** This ensures that the journal file is not deleted
      ** while it is in use by some other client.
      */
      sqlite3OsClose(pPager->jfd);
      if( pPager->eLock>=RESERVED_LOCK ){
        sqlite3OsDelete(pPager->pVfs, pPager->zJournal, 0);
      }else{
        int rc = SQLITE_OK;
        int state = pPager->eState;
        assert( state==PAGER_OPEN || state==PAGER_READER );
        if( state==PAGER_OPEN ){
          rc = sqlite3PagerSharedLock(pPager);
        }
        if( pPager->eState==PAGER_READER ){
          assert( rc==SQLITE_OK );
          rc = pagerLockDb(pPager, RESERVED_LOCK);
        }
        if( rc==SQLITE_OK ){
          sqlite3OsDelete(pPager->pVfs, pPager->zJournal, 0);
        }
        if( rc==SQLITE_OK && state==PAGER_READER ){
          pagerUnlockDb(pPager, SHARED_LOCK);
        }else if( state==PAGER_OPEN ){
          pager_unlock(pPager);
        }
        assert( state==pPager->eState );
      }
    }else if( eMode==PAGER_JOURNALMODE_OFF ){
      sqlite3OsClose(pPager->jfd);
    }
  }

  /* Return the new journal mode */
  return (int)pPager->journalMode;
}

/*
** Return the current journal mode.
*/
int sqlite3PagerGetJournalMode(Pager *pPager){
  return (int)pPager->journalMode;
}

/*
** Return TRUE if the pager is in a state where it is OK to change the
** journalmode.  Journalmode changes can only happen when the database
** is unmodified.
*/
int sqlite3PagerOkToChangeJournalMode(Pager *pPager){
  assert( assert_pager_state(pPager) );
  if( pPager->eState>=PAGER_WRITER_CACHEMOD ) return 0;
  if( NEVER(isOpen(pPager->jfd) && pPager->journalOff>0) ) return 0;
  return 1;
}

/*
** Get/set the size-limit used for persistent journal files.
**
** Setting the size limit to -1 means no limit is enforced.
** An attempt to set a limit smaller than -1 is a no-op.
*/
i64 sqlite3PagerJournalSizeLimit(Pager *pPager, i64 iLimit){
  if( iLimit>=-1 ){
    pPager->journalSizeLimit = iLimit;
    sqlite3WalLimit(pPager->pWal, iLimit);
  }
  return pPager->journalSizeLimit;
}

/*
** Return a pointer to the pPager->pBackup variable. The backup module
** in backup.c maintains the content of this variable. This module
** uses it opaquely as an argument to sqlite3BackupRestart() and
** sqlite3BackupUpdate() only.
*/
sqlite3_backup **sqlite3PagerBackupPtr(Pager *pPager){
  return &pPager->pBackup;
}

#ifndef SQLITE_OMIT_VACUUM
/*
** Unless this is an in-memory or temporary database, clear the pager cache.
*/
void sqlite3PagerClearCache(Pager *pPager){
  assert( MEMDB==0 || pPager->tempFile );
  if( pPager->tempFile==0 ) pager_reset(pPager);
}
#endif


#ifndef SQLITE_OMIT_WAL
/*
** This function is called when the user invokes "PRAGMA wal_checkpoint",
** "PRAGMA wal_blocking_checkpoint" or calls the sqlite3_wal_checkpoint()
** or wal_blocking_checkpoint() API functions.
**
** Parameter eMode is one of SQLITE_CHECKPOINT_PASSIVE, FULL or RESTART.
*/
int sqlite3PagerCheckpoint(
  Pager *pPager,                  /* Checkpoint on this pager */
  sqlite3 *db,                    /* Db handle used to check for interrupts */
  int eMode,                      /* Type of checkpoint */
  int *pnLog,                     /* OUT: Final number of frames in log */
  int *pnCkpt                     /* OUT: Final number of checkpointed frames */
){
  int rc = SQLITE_OK;
  if( pPager->pWal ){
    rc = sqlite3WalCheckpoint(pPager->pWal, db, eMode,
        (eMode==SQLITE_CHECKPOINT_PASSIVE ? 0 : pPager->xBusyHandler),
        pPager->pBusyHandlerArg,
        pPager->walSyncFlags, pPager->pageSize, (u8 *)pPager->pTmpSpace,
        pnLog, pnCkpt
    );
    sqlite3PagerResetLockTimeout(pPager);
  }
  return rc;
}

int sqlite3PagerWalCallback(Pager *pPager){
  return sqlite3WalCallback(pPager->pWal);
}

/*
** Return true if the underlying VFS for the given pager supports the
** primitives necessary for write-ahead logging.
*/
int sqlite3PagerWalSupported(Pager *pPager){
  const sqlite3_io_methods *pMethods = pPager->fd->pMethods;
  if( pPager->noLock ) return 0;
  return pPager->exclusiveMode || (pMethods->iVersion>=2 && pMethods->xShmMap);
}

/*
** Attempt to take an exclusive lock on the database file. If a PENDING lock
** is obtained instead, immediately release it.
*/
static int pagerExclusiveLock(Pager *pPager){
  int rc;                         /* Return code */

  assert( pPager->eLock==SHARED_LOCK || pPager->eLock==EXCLUSIVE_LOCK );
  rc = pagerLockDb(pPager, EXCLUSIVE_LOCK);
  if( rc!=SQLITE_OK ){
    /* If the attempt to grab the exclusive lock failed, release the
    ** pending lock that may have been obtained instead.  */
    pagerUnlockDb(pPager, SHARED_LOCK);
  }

  return rc;
}

/*
** Call sqlite3WalOpen() to open the WAL handle. If the pager is in
** exclusive-locking mode when this function is called, take an EXCLUSIVE
** lock on the database file and use heap-memory to store the wal-index
** in. Otherwise, use the normal shared-memory.
*/
static int pagerOpenWal(Pager *pPager){
  int rc = SQLITE_OK;

  assert( pPager->pWal==0 && pPager->tempFile==0 );
  assert( pPager->eLock==SHARED_LOCK || pPager->eLock==EXCLUSIVE_LOCK );

  /* If the pager is already in exclusive-mode, the WAL module will use
  ** heap-memory for the wal-index instead of the VFS shared-memory
  ** implementation. Take the exclusive lock now, before opening the WAL
  ** file, to make sure this is safe.
  */
  if( pPager->exclusiveMode ){
    rc = pagerExclusiveLock(pPager);
  }

  /* Open the connection to the log file. If this operation fails,
  ** (e.g. due to malloc() failure), return an error code.
  */
  if( rc==SQLITE_OK ){
    rc = sqlite3WalOpen(pPager->pVfs,
        pPager->fd, pPager->zWal, pPager->exclusiveMode,
        pPager->journalSizeLimit, &pPager->pWal
    );
  }
  pagerFixMaplimit(pPager);

  return rc;
}


/*
** The caller must be holding a SHARED lock on the database file to call
** this function.
**
** If the pager passed as the first argument is open on a real database
** file (not a temp file or an in-memory database), and the WAL file
** is not already open, make an attempt to open it now. If successful,
** return SQLITE_OK. If an error occurs or the VFS used by the pager does
** not support the xShmXXX() methods, return an error code. *pbOpen is
** not modified in either case.
**
** If the pager is open on a temp-file (or in-memory database), or if
** the WAL file is already open, set *pbOpen to 1 and return SQLITE_OK
** without doing anything.
*/
int sqlite3PagerOpenWal(
  Pager *pPager,                  /* Pager object */
  int *pbOpen                     /* OUT: Set to true if call is a no-op */
){
  int rc = SQLITE_OK;             /* Return code */

  assert( assert_pager_state(pPager) );
  assert( pPager->eState==PAGER_OPEN   || pbOpen );
  assert( pPager->eState==PAGER_READER || !pbOpen );
  assert( pbOpen==0 || *pbOpen==0 );
  assert( pbOpen!=0 || (!pPager->tempFile && !pPager->pWal) );

  if( !pPager->tempFile && !pPager->pWal ){
    if( !sqlite3PagerWalSupported(pPager) ) return SQLITE_CANTOPEN;

    /* Close any rollback journal previously open */
    sqlite3OsClose(pPager->jfd);

    rc = pagerOpenWal(pPager);
    if( rc==SQLITE_OK ){
      pPager->journalMode = PAGER_JOURNALMODE_WAL;
      pPager->eState = PAGER_OPEN;
    }
  }else{
    *pbOpen = 1;
  }

  return rc;
}

/*
** This function is called to close the connection to the log file prior
** to switching from WAL to rollback mode.
**
** Before closing the log file, this function attempts to take an
** EXCLUSIVE lock on the database file. If this cannot be obtained, an
** error (SQLITE_BUSY) is returned and the log connection is not closed.
** If successful, the EXCLUSIVE lock is not released before returning.
*/
int sqlite3PagerCloseWal(Pager *pPager, sqlite3 *db){
  int rc = SQLITE_OK;

  assert( pPager->journalMode==PAGER_JOURNALMODE_WAL );

  /* If the log file is not already open, but does exist in the file-system,
  ** it may need to be checkpointed before the connection can switch to
  ** rollback mode. Open it now so this can happen.
  */
  if( !pPager->pWal ){
    int logexists = 0;
    rc = pagerLockDb(pPager, SHARED_LOCK);
    if( rc==SQLITE_OK ){
      rc = sqlite3OsAccess(
          pPager->pVfs, pPager->zWal, SQLITE_ACCESS_EXISTS, &logexists
      );
    }
    if( rc==SQLITE_OK && logexists ){
      rc = pagerOpenWal(pPager);
    }
  }

  /* Checkpoint and close the log. Because an EXCLUSIVE lock is held on
  ** the database file, the log and log-summary files will be deleted.
  */
  if( rc==SQLITE_OK && pPager->pWal ){
    rc = pagerExclusiveLock(pPager);
    if( rc==SQLITE_OK ){
      rc = sqlite3WalClose(pPager->pWal, db, pPager->walSyncFlags,
                           pPager->pageSize, (u8*)pPager->pTmpSpace);
      pPager->pWal = 0;
      pagerFixMaplimit(pPager);
      if( rc && !pPager->exclusiveMode ) pagerUnlockDb(pPager, SHARED_LOCK);
    }
  }
  return rc;
}

#ifdef SQLITE_ENABLE_SNAPSHOT
/*
** If this is a WAL database, obtain a snapshot handle for the snapshot
** currently open.
** Otherwise, return an error.
*/
int sqlite3PagerSnapshotGet(Pager *pPager, sqlite3_snapshot **ppSnapshot){
  int rc = SQLITE_ERROR;
  if( pPager->pWal ){
    rc = sqlite3WalSnapshotGet(pPager->pWal, ppSnapshot);
  }
  return rc;
}

/*
** If this is a WAL database, store a pointer to pSnapshot. Next time a
** read transaction is opened, attempt to read from the snapshot it
** identifies. If this is not a WAL database, return an error.
*/
int sqlite3PagerSnapshotOpen(Pager *pPager, sqlite3_snapshot *pSnapshot){
  int rc = SQLITE_OK;
  if( pPager->pWal ){
    sqlite3WalSnapshotOpen(pPager->pWal, pSnapshot);
  }else{
    rc = SQLITE_ERROR;
  }
  return rc;
}

/*
** If this is a WAL database, call sqlite3WalSnapshotRecover(). If this
** is not a WAL database, return an error.
*/
int sqlite3PagerSnapshotRecover(Pager *pPager){
  int rc;
  if( pPager->pWal ){
    rc = sqlite3WalSnapshotRecover(pPager->pWal);
  }else{
    rc = SQLITE_ERROR;
  }
  return rc;
}

/*
** The caller currently has a read transaction open on the database.
** If this is not a WAL database, SQLITE_ERROR is returned. Otherwise,
** this function takes a SHARED lock on the CHECKPOINTER slot and then
** checks if the snapshot passed as the second argument is still
** available. If so, SQLITE_OK is returned.
**
** If the snapshot is not available, SQLITE_ERROR is returned. Or, if
** the CHECKPOINTER lock cannot be obtained, SQLITE_BUSY. If any error
** occurs (any value other than SQLITE_OK is returned), the CHECKPOINTER
** lock is released before returning.
*/
int sqlite3PagerSnapshotCheck(Pager *pPager, sqlite3_snapshot *pSnapshot){
  int rc;
  if( pPager->pWal ){
    rc = sqlite3WalSnapshotCheck(pPager->pWal, pSnapshot);
  }else{
    rc = SQLITE_ERROR;
  }
  return rc;
}

/*
** Release a lock obtained by an earlier successful call to
** sqlite3PagerSnapshotCheck().
*/
void sqlite3PagerSnapshotUnlock(Pager *pPager){
  assert( pPager->pWal );
  sqlite3WalSnapshotUnlock(pPager->pWal);
}

#endif /* SQLITE_ENABLE_SNAPSHOT */
#endif /* !SQLITE_OMIT_WAL */

#ifdef SQLITE_ENABLE_ZIPVFS
/*
** A read-lock must be held on the pager when this function is called. If
** the pager is in WAL mode and the WAL file currently contains one or more
** frames, return the size in bytes of the page images stored within the
** WAL frames. Otherwise, if this is not a WAL database or the WAL file
** is empty, return 0.
*/
int sqlite3PagerWalFramesize(Pager *pPager){
  assert( pPager->eState>=PAGER_READER );
  return sqlite3WalFramesize(pPager->pWal);
}
#endif

#endif /* SQLITE_OMIT_DISKIO */
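
The journal-deletion test in sqlite3PagerSetJournalMode(), `(eOld & 5)==1 && (eMode & 1)==0`, packs two set-membership checks into bit masks. The following is a minimal sketch of that test in isolation. It assumes the journal-mode codes have the numeric values shown; these are reproduced from memory of pager.h and agree with the assert()s in the function, but should be verified against the header. The JM_* macros and both function names are hypothetical and are not part of SQLite.

```c
#include <assert.h>

/* ASSUMPTION: numeric journal-mode codes as believed to be defined in
** pager.h. They satisfy every "& 5" assert() in
** sqlite3PagerSetJournalMode(), but are not taken from this file. */
#define JM_DELETE    0
#define JM_PERSIST   1
#define JM_OFF       2
#define JM_TRUNCATE  3
#define JM_MEMORY    4
#define JM_WAL       5

/* Hypothetical helper mirroring the assert()s that guard the mask trick.
** Bit 0 set with bit 2 clear identifies PERSIST and TRUNCATE, the two
** modes that leave a journal file on disk after commit. */
void jm_check_encoding(void){
  assert( (JM_TRUNCATE & 5)==1 );
  assert( (JM_PERSIST  & 5)==1 );
  assert( (JM_DELETE   & 5)==0 );
  assert( (JM_MEMORY   & 5)==4 );
  assert( (JM_OFF      & 5)==0 );
  assert( (JM_WAL      & 5)==5 );
}

/* Hypothetical helper: return 1 if a change from eOld to eMode should
** attempt to delete the leftover journal file, 0 otherwise. The
** expression is the same one used in sqlite3PagerSetJournalMode(). */
int jm_delete_wanted(int exclusiveMode, int eOld, int eMode){
  return !exclusiveMode && (eOld & 5)==1 && (eMode & 1)==0;
}
```

Under these assumed values, `(eOld & 5)==1` holds only for PERSIST and TRUNCATE, and `(eMode & 1)==0` holds for DELETE, OFF and MEMORY. A change from TRUNCATE to PERSIST, or from either of them to WAL, therefore leaves the journal file in place.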