We use SQLite to index all the words in a TEXT column in an SQLite3 database. We get started with the FTS3 feature of SQLite.
Some results. The results of using FT3 in a small database were impressive. After removing stop words, the search was done much faster. And the index was smaller.
Note: The database contained more than 700 text documents. It was reduced by not indexing stop words.
Optimization results: FTS3 No FTS3 enabled: 7.0 MB [smallest] Unreduced FTS3: 19.0 MB [largest] Reduced FTS3: 9.0 MB
First, we make a virtual table. A FTS3 table is a virtual table with extensions enabled in SQLite. After we use FTS3, we will see some ideas of how its performance can be improved.
You can create a virtual FTS3 table using SQL. You can store other kinds of columns in SQLite virtual tables, but only the TEXT ones are searched with FTS3 and the MATCH syntax.
Create FTS3 table -- Create a fts3 table that contains two columns. -- The first is a text column and the second is a datetime column. -- The column names and table name must be customized. CREATE VIRTUAL TABLE TableName USING FTS3(ColOne TEXT, ColTwo DATETIME)
Update FTS3 table. Here we look at updating FTS3 tables. Remember that before you are able to update data into the virtual table, you need to insert rows into it.
Tip: With SQLite there is no way around checking whether you need to insert or update a row. Here's how to update.
UPDATE -- Update our virtual table. UPDATE TableName SET ColOne=@a WHERE ColTwo=@b
Description of the UPDATE statement. You need to select first. Select first to see if the row exists, and you can insert as usual if it isn't there.
Tip: FTS3 is like a regular table. The virtual table acts like a regular table, but you need to manually maintain it.
And: You must maintain the indexes the virtual table to maintain referential integrity. You need to UPDATE the FTS3 table.
Match. Here we use MATCH to search through full-text tables in SQLite. Make sure you use MATCH instead of equals or LIKE to scan a TEXT column in a virtual table.
MATCH -- Look at the line starting with MATCH. -- That's where the fts3 stuff takes effect. -- FullText is a fts3 virtual table. SELECT jt.EntryTitle, jt.DateTimeNew, jt.WordCount FROM JournalTable jt, FullText ft WHERE ft.FullContent MATCH @query AND jt.DateTimeNew = ft.FullDate ORDER BY ft.FullDate DESC LIMIT 100
We find a row that contains the query, and then we take the date from that row in the FTS3 table. Then we do a join and find the row in the other table that has more information.
Details: This "links" the virtual table with the main table through a surrogate key.
Note: This syntax is similar to LIKE or WHERE syntaxes, so it won't be a problem to understand.
Row ID. In SQLite and other databases, each row has a unique identifier called a rowid. You don't need to tell the database to create this. The rowid gives us more power.
SELECT -- Get some matches from the fts3 table, then select the rowid. SELECT rowid FROM VirtualTable WHERE ContentCol MATCH @query LIMIT @limit OFFSET @offset;
Select rowid. You have a main table, which contains content and data, and a FTS3 virtual table. Search text using MATCH on the virtual table, and then get the corresponding row content.
SELECT example 2 -- Fetch the relevant row from the other table. SELECT ColA, ColB FROM DataTable WHERE rowid=@id;
Join rowid. To optimize performance and improve code clarity, minimize the trips to the SQLite database. Use an implicit join on the two tables using rowid, an int.
SELECT example 3 -- Join virtual table with regular table on rowid. SELECT dt.ColA, dt.ColB FROM DataTable dt, VirtualTable vt WHERE vt.ContentCol MATCH @query AND dt.rowid = vt.rowid LIMIT @limit OFFSET @offset;
This query selects from both tables at once, and names them dt and vt. Next it finds all ContentCol where the query is present and then does an implicit join on the two tables.
Finally: It returns the full content that you want to give the user instead of the virtual table contents.
Optimize. My database grew to three times its previous size when I enabled FTS3. Stopwords are short, common words that are not indexed. These make searching slower.
We used FTS3 in SQLite 3 for fast searching. SQLite does this well. The results of FTS3 implementations in my programs were impressive.