1d34c03a9Sdanielk1977 2d34c03a9Sdanielk19771. OVERVIEW 3d34c03a9Sdanielk1977 4d34c03a9Sdanielk1977 This README file describes the syntax of the arguments that may be passed to 5d34c03a9Sdanielk1977 the FTS3 MATCH operator used for full-text queries. For example, if table 6d34c03a9Sdanielk1977 "t1" is an Fts3 virtual table, the following SQL query: 7d34c03a9Sdanielk1977 8d34c03a9Sdanielk1977 SELECT * FROM t1 WHERE <col> MATCH <full-text query> 9d34c03a9Sdanielk1977 10d34c03a9Sdanielk1977 may be used to retrieve all rows that match a specified for full-text query. 11d34c03a9Sdanielk1977 The text "<col>" should be replaced by either the name of the fts3 table 12d34c03a9Sdanielk1977 (in this case "t1"), or by the name of one of the columns of the fts3 13d34c03a9Sdanielk1977 table. <full-text-query> should be replaced by an SQL expression that 14d34c03a9Sdanielk1977 computes to a string containing an Fts3 query. 15d34c03a9Sdanielk1977 16d34c03a9Sdanielk1977 If the left-hand-side of the MATCH operator is set to the name of the 17d34c03a9Sdanielk1977 fts3 table, then by default the query may be matched against any column 18d34c03a9Sdanielk1977 of the table. If it is set to a column name, then by default the query 19d34c03a9Sdanielk1977 may only match the specified column. In both cases this may be overriden 20d34c03a9Sdanielk1977 as part of the query text (see sections 2 and 3 below). 21d34c03a9Sdanielk1977 22d34c03a9Sdanielk1977 As of SQLite version 3.6.8, Fts3 supports two slightly different query 23d34c03a9Sdanielk1977 formats; the standard syntax, which is used by default, and the enhanced 24d34c03a9Sdanielk1977 query syntax which can be selected by compiling with the pre-processor 25*42128b9eSdrh symbol SQLITE_ENABLE_FTS3_PARENTHESIS defined. 26d34c03a9Sdanielk1977 27*42128b9eSdrh -DSQLITE_ENABLE_FTS3_PARENTHESIS 28d34c03a9Sdanielk1977 29d34c03a9Sdanielk19772. STANDARD QUERY SYNTAX 30d34c03a9Sdanielk1977 31d34c03a9Sdanielk1977 When using the standard Fts3 query syntax, a query usually consists of a 32d34c03a9Sdanielk1977 list of terms (words) separated by white-space characters. To match a 33d34c03a9Sdanielk1977 query, a row (or column) of an Fts3 table must contain each of the specified 34d34c03a9Sdanielk1977 terms. For example, the following query: 35d34c03a9Sdanielk1977 36d34c03a9Sdanielk1977 <col> MATCH 'hello world' 37d34c03a9Sdanielk1977 38d34c03a9Sdanielk1977 matches rows (or columns, if <col> is the name of a column name) that 39d34c03a9Sdanielk1977 contain at least one instance of the token "hello", and at least one 40d34c03a9Sdanielk1977 instance of the token "world". Tokens may be grouped into phrases using 41d34c03a9Sdanielk1977 quotation marks. In this case, a matching row or column must contain each 42d34c03a9Sdanielk1977 of the tokens in the phrase in the order specified, with no intervening 43d34c03a9Sdanielk1977 tokens. For example, the query: 44d34c03a9Sdanielk1977 45d34c03a9Sdanielk1977 <col> MATCH '"hello world" joe" 46d34c03a9Sdanielk1977 47d34c03a9Sdanielk1977 matches the first of the following two documents, but not the second or 48d34c03a9Sdanielk1977 third: 49d34c03a9Sdanielk1977 50d34c03a9Sdanielk1977 "'Hello world', said Joe." 51d34c03a9Sdanielk1977 "One should always greet the world with a cheery hello, thought Joe." 52d34c03a9Sdanielk1977 "How many hello world programs could their be?" 53d34c03a9Sdanielk1977 54d34c03a9Sdanielk1977 As well as grouping tokens together by phrase, the binary NEAR operator 55d34c03a9Sdanielk1977 may be used to search for rows that contain two or more specified tokens 56d34c03a9Sdanielk1977 or phrases within a specified proximity of each other. The NEAR operator 57d34c03a9Sdanielk1977 must always be specified in upper case. The word "near" in lower or mixed 58d34c03a9Sdanielk1977 case is treated as an ordinary token. For example, the following query: 59d34c03a9Sdanielk1977 60d34c03a9Sdanielk1977 <col> MATCH 'engineering NEAR consultancy' 61d34c03a9Sdanielk1977 62d34c03a9Sdanielk1977 matches rows that contain both the "engineering" and "consultancy" tokens 63d34c03a9Sdanielk1977 in the same column with not more than 10 other words between them. It does 64d34c03a9Sdanielk1977 not matter which of the two terms occurs first in the document, only that 65d34c03a9Sdanielk1977 they be seperated by only 10 tokens or less. The user may also specify 66d34c03a9Sdanielk1977 a different required proximity by adding "/N" immediately after the NEAR 67d34c03a9Sdanielk1977 operator, where N is an integer. For example: 68d34c03a9Sdanielk1977 69d34c03a9Sdanielk1977 <col> MATCH 'engineering NEAR/5 consultancy' 70d34c03a9Sdanielk1977 71d34c03a9Sdanielk1977 searches for a row containing an instance of each specified token seperated 72d34c03a9Sdanielk1977 by not more than 5 other tokens. More than one NEAR operator can be used 73d34c03a9Sdanielk1977 in as sequence. For example this query: 74d34c03a9Sdanielk1977 75d34c03a9Sdanielk1977 <col> MATCH 'reliable NEAR/2 engineering NEAR/5 consultancy' 76d34c03a9Sdanielk1977 77d34c03a9Sdanielk1977 searches for a row that contains an instance of the token "reliable" 78d34c03a9Sdanielk1977 seperated by not more than two tokens from an instance of "engineering", 79d34c03a9Sdanielk1977 which is in turn separated by not more than 5 other tokens from an 80d34c03a9Sdanielk1977 instance of the term "consultancy". Phrases enclosed in quotes may 81d34c03a9Sdanielk1977 also be used as arguments to the NEAR operator. 82d34c03a9Sdanielk1977 83d34c03a9Sdanielk1977 Similar to the NEAR operator, one or more tokens or phrases may be 84d34c03a9Sdanielk1977 separated by OR operators. In this case, only one of the specified tokens 85d34c03a9Sdanielk1977 or phrases must appear in the document. For example, the query: 86d34c03a9Sdanielk1977 87d34c03a9Sdanielk1977 <col> MATCH 'hello OR world' 88d34c03a9Sdanielk1977 89d34c03a9Sdanielk1977 matches rows that contain either the term "hello", or the term "world", 90d34c03a9Sdanielk1977 or both. Note that unlike in many programming languages, the OR operator 91d34c03a9Sdanielk1977 has a higher precedence than the AND operators implied between white-space 92d34c03a9Sdanielk1977 separated tokens. The following query matches documents that contain the 93d34c03a9Sdanielk1977 term 'sqlite' and at least one of the terms 'fantastic' or 'impressive', 94d34c03a9Sdanielk1977 not those that contain both 'sqlite' and 'fantastic' or 'impressive': 95d34c03a9Sdanielk1977 96d34c03a9Sdanielk1977 <col> MATCH 'sqlite fantastic OR impressive' 97d34c03a9Sdanielk1977 98d34c03a9Sdanielk1977 Any token that is part of an Fts3 query expression, whether or not it is 99d34c03a9Sdanielk1977 part of a phrase enclosed in quotes, may have a '*' character appended to 100d34c03a9Sdanielk1977 it. In this case, the token matches all terms that begin with the characters 101d34c03a9Sdanielk1977 of the token, not just those that exactly match it. For example, the 102d34c03a9Sdanielk1977 following query: 103d34c03a9Sdanielk1977 104d34c03a9Sdanielk1977 <col> MATCH 'sql*' 105d34c03a9Sdanielk1977 106d34c03a9Sdanielk1977 matches all rows that contain the term "SQLite", as well as those that 107d34c03a9Sdanielk1977 contain "SQL". 108d34c03a9Sdanielk1977 109d34c03a9Sdanielk1977 A token that is not part of a quoted phrase may be preceded by a '-' 110d34c03a9Sdanielk1977 character, which indicates that matching rows must not contain the 111d34c03a9Sdanielk1977 specified term. For example, the following: 112d34c03a9Sdanielk1977 113d34c03a9Sdanielk1977 <col> MATCH '"database engine" -sqlite' 114d34c03a9Sdanielk1977 115d34c03a9Sdanielk1977 matches rows that contain the phrase "database engine" but do not contain 116d34c03a9Sdanielk1977 the term "sqlite". If the '-' character occurs inside a quoted phrase, 117d34c03a9Sdanielk1977 it is ignored. It is possible to use both the '-' prefix and the '*' postfix 118d34c03a9Sdanielk1977 on a single term. At this time, all Fts3 queries must contain at least 119d34c03a9Sdanielk1977 one term or phrase that is not preceded by the '-' prefix. 120d34c03a9Sdanielk1977 121d34c03a9Sdanielk1977 Regardless of whether or not a table name or column name is used on the 122d34c03a9Sdanielk1977 left hand side of the MATCH operator, a specific column of the fts3 table 123d34c03a9Sdanielk1977 may be associated with each token in a query by preceding a token with 124d34c03a9Sdanielk1977 a column name followed by a ':' character. For example, regardless of what 125d34c03a9Sdanielk1977 is specified for <col>, the following query requires that column "col1" 126d34c03a9Sdanielk1977 of the table contains the term "hello", and that column "col2" of the 127d34c03a9Sdanielk1977 table contains the term "world". If the table does not contain columns 128d34c03a9Sdanielk1977 named "col1" and "col2", then an error is returned and the query is 129d34c03a9Sdanielk1977 not run. 130d34c03a9Sdanielk1977 131d34c03a9Sdanielk1977 <col> MATCH 'col1:hello col2:world' 132d34c03a9Sdanielk1977 133d34c03a9Sdanielk1977 It is not possible to associate a specific table column with a quoted 134d34c03a9Sdanielk1977 phrase or a term preceded by a '-' operator. A '*' character may be 135d34c03a9Sdanielk1977 appended to a term associated with a specific column for prefix matching. 136d34c03a9Sdanielk1977 137d34c03a9Sdanielk19773. ENHANCED QUERY SYNTAX 138d34c03a9Sdanielk1977 139d34c03a9Sdanielk1977 The enhanced query syntax is quite similar to the standard query syntax, 140d34c03a9Sdanielk1977 with the following four differences: 141d34c03a9Sdanielk1977 142d34c03a9Sdanielk1977 1) Parenthesis are supported. When using the enhanced query syntax, 143d34c03a9Sdanielk1977 parenthesis may be used to overcome the built-in precedence of the 144d34c03a9Sdanielk1977 supplied binary operators. For example, the following query: 145d34c03a9Sdanielk1977 146d34c03a9Sdanielk1977 <col> MATCH '(hello world) OR (simple example)' 147d34c03a9Sdanielk1977 148d34c03a9Sdanielk1977 matches documents that contain both "hello" and "world", and documents 149d34c03a9Sdanielk1977 that contain both "simple" and "example". It is not possible to forumlate 150d34c03a9Sdanielk1977 such a query using the standard syntax. 151d34c03a9Sdanielk1977 152d34c03a9Sdanielk1977 2) Instead of separating tokens and phrases by whitespace, an AND operator 153d34c03a9Sdanielk1977 may be explicitly specified. This does not change query processing at 154d34c03a9Sdanielk1977 all, but may be used to improve readability. For example, the following 155d34c03a9Sdanielk1977 query is handled identically to the one above: 156d34c03a9Sdanielk1977 157d34c03a9Sdanielk1977 <col> MATCH '(hello AND world) OR (simple AND example)' 158d34c03a9Sdanielk1977 159d34c03a9Sdanielk1977 As with the OR and NEAR operators, the AND operator must be specified 160d34c03a9Sdanielk1977 in upper case. The word "and" specified in lower or mixed case is 161d34c03a9Sdanielk1977 handled as a regular token. 162d34c03a9Sdanielk1977 163d34c03a9Sdanielk1977 3) The '-' token prefix is not supported. Instead, a new binary operator, 164d34c03a9Sdanielk1977 NOT, is included. The NOT operator requires that the query specified 165d34c03a9Sdanielk1977 as its left-hand operator matches, but that the query specified as the 166d34c03a9Sdanielk1977 right-hand operator does not. For example, to query for all rows that 167d34c03a9Sdanielk1977 contain the term "example" but not the term "simple", the following 168d34c03a9Sdanielk1977 query could be used: 169d34c03a9Sdanielk1977 170d34c03a9Sdanielk1977 <col> MATCH 'example NOT simple' 171d34c03a9Sdanielk1977 172d34c03a9Sdanielk1977 As for all other operators, the NOT operator must be specified in 173d34c03a9Sdanielk1977 upper case. Otherwise it will be treated as a regular token. 174d34c03a9Sdanielk1977 175d34c03a9Sdanielk1977 4) Unlike in the standard syntax, where the OR operator has a higher 176d34c03a9Sdanielk1977 precedence than the implicit AND operator, when using the enhanced 177d34c03a9Sdanielk1977 syntax implicit and explict AND operators have a higher precedence 178d34c03a9Sdanielk1977 than OR operators. Using the enhanced syntax, the following two 179d34c03a9Sdanielk1977 queries are equivalent: 180d34c03a9Sdanielk1977 181d34c03a9Sdanielk1977 <col> MATCH 'sqlite fantastic OR impressive' 182d34c03a9Sdanielk1977 <col> MATCH '(sqlite AND fantastic) OR impressive' 183d34c03a9Sdanielk1977 184d34c03a9Sdanielk1977 however, when using the standard syntax, the query: 185d34c03a9Sdanielk1977 186d34c03a9Sdanielk1977 <col> MATCH 'sqlite fantastic OR impressive' 187d34c03a9Sdanielk1977 188d34c03a9Sdanielk1977 is equivalent to the enhanced syntax query: 189d34c03a9Sdanielk1977 190d34c03a9Sdanielk1977 <col> MATCH 'sqlite AND (fantastic OR impressive)' 191d34c03a9Sdanielk1977 192d34c03a9Sdanielk1977 The precedence of all enhanced syntax operators, in order from highest 193d34c03a9Sdanielk1977 to lowest, is: 194d34c03a9Sdanielk1977 195d34c03a9Sdanielk1977 NEAR (highest precedence, tightest grouping) 196d34c03a9Sdanielk1977 NOT 197d34c03a9Sdanielk1977 AND 198d34c03a9Sdanielk1977 OR (lowest precedence, loosest grouping) 199d34c03a9Sdanielk1977 200d34c03a9Sdanielk1977 Using the advanced syntax, it is possible to specify expressions enclosed 201d34c03a9Sdanielk1977 in parenthesis as operands to the NOT, AND and OR operators. However both 202d34c03a9Sdanielk1977 the left and right hand side operands of NEAR operators must be either 203d34c03a9Sdanielk1977 tokens or phrases. Attempting the following query will return an error: 204d34c03a9Sdanielk1977 205d34c03a9Sdanielk1977 <col> MATCH 'sqlite NEAR (fantastic OR impressive)' 206d34c03a9Sdanielk1977 207d34c03a9Sdanielk1977 Queries of this form must be re-written as: 208d34c03a9Sdanielk1977 209d34c03a9Sdanielk1977 <col> MATCH 'sqlite NEAR fantastic OR sqlite NEAR impressive' 210