SQLite Forensics Book, now available on Amazon

More information here


  • Using group_concat to amalgamate the results of queries

    Recently one of our users contacted me and asked for help creating a custom report for a Skype database after other forensic software had failed to include some very relevant data in their reportsí.

    In a nutshell the messages report he had produced using the other software only included the author of a given message and did not list the other person(s) who were part of that conversation. This information is maintained in different places in a Skype main.db SQLite database. This article discusses how to include this information in a user friendly format.

    For corporate or law enforcement investigators a fully functional licence for the Forensic Toolkit for SQLite can be obtained by clicking here and providing your official email address, full name and position within your organisation

    In order to include thre required information, the two tables that we need are the messages table and the participants table. Normally in a Skype installation this table includes most of the information we want in the dialog_partner column, however in this database (from a HTC One phone) this column was blank.

    In order to understand the problem more clearly have a look at this extract of selected columns from the participants table:

    The convo_id column is a unique number that refers to each conversation; the identity column is the identity of the participants. Note that conversation 426 has 4 participants. Also note that r3scue193 (thatís me) appears in every conversation, as you would expect.

    If we now look at the messages table we can see that there is a corresponding convo_id column and we can use this column to perform a join between the two tables so that we can include a list of all the participants.

    So how do we get a list of all of the participants in a conversation?

    If we simply do a LEFT join between the two tables on convo_id then for each row in the messages table we will get a join for each matching row in the participants table, i.e. for a conversation with two participants we would get a duplicated row for each participant (the identity column below), for three participants we would get three rows etc. - this would be very confusing:

    The answer is the SQLite aggregate function group_concat, from the SQLite web site (https://www.sqlite.org/lang_aggfunc.html)

    The group_concat() function returns a string which is the concatenation of all non-NULL values of X. If parameter Y is present then it is used as the separator between instances of X. A comma (",") is used as the separator if Y is omitted. The order of the concatenated elements is arbitrary.

    The query we use is below and utilises the group_concat function along with the GROUP BY expression:

    SELECT Participants.convo_id,
      Group_Concat(Participants.identity, ', ') AS ConversationParticipants
    FROM Participants
    GROUP BY Participants.convo_id
    In English this selects all of the convo_id and associated participants from the participants table and groups them by the convo_id, and then the identity of each participant in each group is concatenated into a single string with each identity separated by a comma.

    This might be clearer if we review the original table:

    And then look at the results of the query:

    Remember you can use the Case Manager to save the query you have just created for re-use on another case.

    We now want to use the query created above with the messages table so that we can list the participants alongside each message. But before we do that we will make a VIEW based on the above query. A VIEW can be thought of as a sort of temporary table and the Forensic Browser for SQLite allows you to create a VIEW by selecting the appropriate option from the Queries menu:

    You then need to enter a name for the view

    This view can now be used in the same way as any table by dragging it to the query designer. The screenshot below shows that:

    SELECT Participants.convo_id,
      Group_Concat(Participants.identity, ', ') AS ConversationParticipants
    FROM Participants
    GROUP BY Participants.convo_id
    Can now be replaced with:

    SELECT *
    FROM Convo_participants

    All that now remains is to create a JOIN on the messages table using our new view and select the columns we want to display: