• Contiguous and fragmented!

    On occasion I have looked at the fragmentation of a file to try to draw some conclusions as to how the file has been "built" on the disk, i.e. is the file contiguous or is the file fragmented. On one occasion this was used to prove that a file hadn't simply been part of a mass copy exercise (where all the other files were contiguous and one after the other). All pretty basic stuff.

    However, today when playing with MFT entries I saw something that I had not seen before and that may be relevant in an investigation for someone. Essentially I have a file whose allocation (the clusters) is contiguous, but whose allocation list is itself fragmented.

    The following screenshot shows the output of istat, from The Sleuth Kit, for the MFT record of the file in question. It shows that there are 3 data attributes in the MFT entry:





    This in itself is quite normal: if there are a lot of runs in a file then the allocation for the runlist can be fragmented. This is covered in the NTFS section of Brian Carrier's excellent book "File System Forensic Analysis". However, the screenshot below shows the list of file extents as shown by Encase for this file, and this shows that the file is contiguous on the disk, as it is clearly just one run of 361 clusters.





    Finally, the following three screenshots show the output from a program I am currently working on, each of which shows the data from a separate data attribute for the MFT entry:



    The interesting values here are the start and len fields, the last two in the screenshot. Start for the first run of the file is offset 16,052,187,136, which is the same as the Start Byte recorded and displayed by Encase (as we would expect). However, the length of this run is just 1,093,632 bytes, much less than the recorded file size of 1,478,656 bytes.

    If we add the len (length) to the start we get the offset of the end of this run of the file, i.e. 16,052,187,136 + 1,093,632 = 16,053,280,768, which is the start of the second fragment of the file, as seen below:



    This fragment is just 4,096 bytes in size, which when added to the start of the fragment (16,053,280,768) gives the start of the next and final fragment of the file, i.e. 16,053,284,864. This last fragment is 380,928 bytes:



    If we sum the allocation from each of the above fragments, 1,093,632 + 4,096 + 380,928, we get 1,478,656, which is the allocation recorded by Encase (and stored in the MFT as the allocated file size) and the correct allocation for the size of the file.
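    The arithmetic above can be checked with a minimal Python sketch. The three (start, len) pairs are the byte values quoted in this post; the function name coalesce_runs and the 4,096-byte cluster size are my own assumptions for illustration (the cluster size is only inferred from 361 clusters covering 1,478,656 bytes), and none of this is taken from istat, Encase or Reconnoitre.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (start byte offset on disk, length in bytes)


def coalesce_runs(runs: List[Run]) -> List[Run]:
    """Merge runs that sit back to back on disk into single extents.

    A run is merged into the previous one when it starts exactly where
    the previous one ends. Runs are taken in the order they are listed.
    """
    merged: List[Run] = []
    for start, length in runs:
        if merged and merged[-1][0] + merged[-1][1] == start:
            prev_start, prev_len = merged[-1]
            merged[-1] = (prev_start, prev_len + length)
        else:
            merged.append((start, length))
    return merged


# The three runs quoted above, one per data attribute.
runs = [
    (16_052_187_136, 1_093_632),
    (16_053_280_768, 4_096),
    (16_053_284_864, 380_928),
]

CLUSTER_SIZE = 4_096  # assumption: inferred, not read from the volume

extents = coalesce_runs(runs)
total = sum(length for _, length in runs)

print(len(runs), "runs listed in the MFT")
print(len(extents), "extent(s) on disk:", extents)
print(total, "bytes allocated =", total // CLUSTER_SIZE, "clusters")
```

    With these values the three listed runs collapse into a single extent starting at 16,052,187,136, which is the one-run view that Encase presents.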

    Subject to further testing, it would seem unlikely that Windows would write three data attributes for a file that was created in one discrete event. Given that the file in question is a log file, with a last written date that is a few minutes after the created date, this is a very strong indication that this file has been appended to.

    So in summary what we have here is a file that is technically fragmented, but the three file fragments are contiguous.

    [Edit - additional research]

    After looking some more and modifying my software, Reconnoitre, to display the number of MFT data streams (type 128) associated with a file, it seems that this phenomenon is not rare at all.

    The screenshot below shows the output of Reconnoitre when run across an old image of my local drive. The two main columns of interest are the "runs" and "mftdatastreams" columns. Runs is the number of file extents as reported by Encase, WinHex etc. i.e. this is the number of fragments actually on the disk. Mftdatastreams is the number of streams of type 128 that are actually listed in the MFT records. It can easily be seen that Reconnoitre shows quite a few files for which there are more MFT data streams than there are file fragments:



    Looking at the output of Encase in the screenshot below for one file in particular (the highlighted file reconnoitre.csl), we can see that there is just one data fragment on the disk, i.e. the file data is contiguous, but Reconnoitre (above) shows that there are 12 data streams:



    Again, if we look at the output of istat we can see that this tool also lists 12 data streams for this file:



    In conclusion, it seems that this is a relatively common occurrence, and a forensic investigator would do well to bear this in mind on investigations where fragmentation of a file might be relevant.
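    As a rough illustration of the comparison described above, the following Python sketch flags files that have more MFT data streams than on-disk runs. The column names "runs" and "mftdatastreams" are the ones mentioned in this post, but the row layout, the function name more_streams_than_runs and the second sample row are hypothetical; this is not Reconnoitre's actual output format.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int]]


def more_streams_than_runs(rows: List[Row]) -> List[Row]:
    """Return the rows where the MFT lists more data streams than
    there are fragments on disk."""
    return [r for r in rows if int(r["mftdatastreams"]) > int(r["runs"])]


# The first row uses the counts quoted for reconnoitre.csl in this post.
# The second row is made up purely so that one file is not flagged.
rows: List[Row] = [
    {"name": "reconnoitre.csl", "runs": 1, "mftdatastreams": 12},
    {"name": "example.txt", "runs": 1, "mftdatastreams": 1},
]

flagged = more_streams_than_runs(rows)
for r in flagged:
    print(r["name"], "runs:", r["runs"], "streams:", r["mftdatastreams"])
```

    A file flagged this way is not necessarily fragmented on disk; it only means the allocation list is split across more entries than there are extents.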
    This article was originally published in blog: Contiguous and fragmented! started by Paul