|For corporate or law enforcement investigators a fully functional licence for the Forensic Toolkit for SQLite can be obtained by clicking here and providing your official email address, full name and position within your organisation|
To help with this I created a test messaging application with a few artificial messages and users to keep the demonstration simple.
The "application" I have developed has just two tables "messages" and "users" and I created a simple conversation series and added and deleted users and messages.
The Messages table when viewed with the Firefox SQLite manager plugin, i.e. the live database as the application would display it looks as follows:
The users table is even simpler:
The final step in setting the 'picture' is to show the database as it would be seen in the Forensic Browser with a simple join between the tables and the columns and the columns displayed with simple conversions so the report makes more sense:
That is how most SQLite viewers would see the database and all looks OK. So what happens when the database is opened with the Browser and the WAL journal is processed alongside the master database? The following screen shot shows just a few of the 101 messages that were recovered by the Forenisc Browser for SQLite, while there is a lot of duplication in this report (see later) there are also some records that don't appear in the live database (some of which are highlighted) it is these records we will examine in the rest of this article.
So why 101 records? If you have read my previous article you will be aware that when a database is using a WAL journal any new or changed pages are updated and appended to the end of the WAL file, the associated database file is not touched at all until a WAL checkpoint occurs (typcially when the WAL grows beyond 1000 pages). The other reason of course for the extra records is because some of them are deleted records that are not in the live database.
So how can we identifiy just the deleted records when there are exact copies of live records that are also held in the WAL?
The simplest method is to using something called a correlated subquery that first identifies the MD5 of each of the live records and then returns a query which shows everything other than a record that has the same MD5as the live records.
The query is:
SELECT * FROM messages WHERE sfcMD5 NOT IN (SELECT sfcMD5 FROM messages WHERE sfIsLive = 'True')
This query however returns every row from the table and will include duplicate copies of any rows where there are multiple copies. We can filter this query further by simply adding a group by clause to the sfcmd5 column to just display one unique copy of each record based on the MD5 of all of the columns in the original database (i.e. not including those columns that start with sf...).
This query is further explained here:
So what else can the Forensic Browser do for us as part of an investigation?
Firstly we can use one of the parsing features of the Browser to use a commit or transaction number to limit the Browser when consider records to include in the list of live records.
The commit number for the last WAL frame containing the deleted records is highlighted above is 20 and WAL frame is frame 25. When the Browser is run and the WAL is identified the investigator gets the option to process all frames, none of the frames, or select the maximum frame. Note that all records are still processed, but the value selected will determine which records are marked as live:
The following screenshot shows the result of parsing with the last commit number set to 20. The results have been filtered to just show the live records and we can see at the top left that this is 16 of 101 records. So in summary this is the database/table as it was before the messages were deleted.
When were the records deleted?
As well as knowing what records were deleted it would of course be useful to determine when a record was deleted. The success of this technique depends on the content of the database and works best if there is a timestamp that is set automatically, in this example we are assuming that the "date" is the date a message was sent/received and is set automatically by the application, i.e. not manually by the user.
The first step is to try and work out what the last possible date that the last deleted record was live in the table. We can then determine:
- The date and time of the last message in the table before the record was deleted
- the date and time of the next message after the record was deleted.
The highlighted record below is the last occurence of any of the deleted records, we can see that this is in commit number 20 (as discussed above). To summarise what this shows:
- The record of interest is on DB page 5
- At commit 20 page 5 is updated and the record is live
- Commit 21 just affects page 6 and adds a record to the database
- At commit 22 page 5 is updated and the record is no longer in the table
- The record was therefore deleted in commit 22.
As we know the record was not present in page 5 at commit 22 the numbered bullet points above become:
- The date and time of the last transaction in the table before commit 22 (LastLiveDate)
- the date and time of the next (earliest) transaction after and including commit 22 and after (LatestDeleteDate)
If we can determine the date of the last record that was sent or received before commit number 22 then we know the last time that the record was in the live DB, the following query does this:
Getting the other bracketing date (the following date) is slightly more complex because WAL frames/pages after commit 22 can also include records that are live both before and after the deletion occured. The following SQL uses the query above as a correlated sub query that returns the last live date, and finds the earliest date that follows the last live date from the following WAL pages.
Note that the highlighted portion of the SQL above (the subquery) is the same as the previous query).
By means of confirmation the last few lines of SQL insert and delete statements used to create the test database are below:
insert into messages (id, date, sent, type, user_id, message) values (9, strfTime('%s','2016-03-04 09:22:02'), 0, 2, 1, 'Paul, be my friend, Darcy'); insert into users (id, username) values (3, 'Darcy'); insert into messages (id, date, sent, type, user_id, message) values (10, strfTime('%s','2016-03-04 09:25:43'), 1, 1, 3, 'Hiya mate - didn''t know you were on this app'); insert into messages (id, date, sent, type, user_id, message) values (11, strfTime('%s','2016-03-04 09:27:43'), 0, 1, 3, 'No time for pleasantries, Ive transferred the money from the company account'); insert into messages (id, date, sent, type, user_id, message) values (12, strfTime('%s','2016-03-04 09:29:22'), 0, 1, 3, 'This is really scary - I dont do illegal - no more after this'); insert into messages (id, date, sent, type, user_id, message) values (13, strfTime('%s','2016-03-04 09:29:22'), 1, 1, 3, 'OK good - thats it we are quits now'); insert into messages (id, date, sent, type, user_id, message) values (14, strfTime('%s','2016-03-04 10:03:21'), 0, 1, 2, 'Hi honey - are you working late? what time will you be home?'); insert into messages (id, date, sent, type, user_id, message) values (15, strfTime('%s','2016-03-04 10:05:10'), 1, 1, 2, 'about 8pm hopefully'); insert into messages (id, date, sent, type, user_id, message) values (16, strfTime('%s','2016-03-04 13:13:40'), 0, 1, 3, 'Oh - I forgot to say, delete any trace of this conversation'); insert into messages (id, date, sent, type, user_id, message) values (17, strfTime('%s','2016-03-04 14:08:21'), 0, 1, 2, 'Hi honey - did you get my message form earlier?'); delete from messages where (id >= 9 and id <= 13) or (id = 16); insert into messages (id, date, sent, type, user_id, message) values (18, strfTime('%s','2016-03-04 17:05:08'), 1, 1, 2, 'been in a meeting, leaving shortly, quick pint with the boys and then ill be on the train');