CHAPTER 4: FORENSIC ANALYSIS OF DATA – Computer Forensics: A Pocket Guide

CHAPTER 4: FORENSIC ANALYSIS OF DATA

The purpose of this chapter is to provide an insight into how to undertake an analysis of a forensic image. General topics will be discussed, such as dead analysis and file carving. However, the nature of an analysis is very much dependent upon the underlying file system being used by the operating system. Owing to the popularity of Windows®, this chapter will focus specifically upon its file system and operating system. How to identify forensic evidence from various aspects of the system, such as file slack, e-mail, Internet history and virtual memory, will also be discussed.

The process of forensically analysing images very much depends upon the suspected nature of the incident. For instance, malware incidents will leave very different artefacts to cases where employees have been misusing computer systems (e.g. downloading and/or distributing pornography). For those incidents involving people, it is also important to consider the technical capability of the individual involved. Those with more technical knowledge potentially have the ability to hide data within the system more effectively, therefore requiring a different approach and level of analysis.

The analysis of the drive can be achieved in two ways: live and dead analysis. Traditionally, the forensic procedure has focused upon dead analysis, in which the forensic image is analysed from your trusted forensic system. The data on the image never changes and the integrity of the data is therefore simpler to maintain. For most investigations, this form of analysis is sufficient. A live analysis is where you boot from the suspect image and utilise its operating system (OS) to collect evidence.

Within dead analysis, forensic file system analysers are able to interpret a specific file system and subsequently recreate it for you. In order to achieve this, the analysers must understand the exact nature of the file system, from its location and operation to the interpretation of the file record metadata. Prior to these tools being available, the forensic examiner would have had difficulty in establishing file paths and understanding the structure of the file system without performing a live analysis, where the host OS would interpret the file system for the examiner. File system analysers also allow the examiner to acquire all the metadata about the files and folders, such as modified, accessed and created timestamps, which is essential in understanding an investigation.

Numerous such analysers now exist: EnCase®, FTK® and Autopsy are three popular tools (see Resources section for more information). Figure 1 below provides an illustration of the output that can be seen from such a tool. The tool has a number of key areas: the file system tree view (on the upper left in Figure 1); a folder list (on the upper right in Figure 1); and a detailed file view (lower right in Figure 1).

Figure 1: Screenshot of the file system from EnCase®

In addition to recreating the file system, these tools will also identify and list deleted folders and files that are still present within the file system. The extent to which a file is actually still present depends upon the state of the system at any particular point in time. Various situations arise with regard to deleted files:

  • The file system still contains the record with all metadata and file data.
  • The file system contains the record with metadata, but the file contents themselves have been overwritten.
  • The file system no longer contains the record with metadata, but the file contents still exist on the image.

In addition to these situations, the file contents may be only partially overwritten. The ability of a forensic examiner to retrieve information in such a case depends upon which bytes of the file have been overwritten and which tools are being used. In the first two cases, the analyser will list what information is available within the file system view and, where possible, link to the file itself. In the final situation, the analyser is unable to list the file, but performing file carving and string searches on the complete drive can reveal its contents.

Before proceeding to explain forensic analysis further, it is necessary to briefly introduce file systems. Each file system operates differently and is technically complicated, but an understanding of its operation can be highly valuable to a forensic examiner; file systems frequently perform tasks that a user is unaware of and that can leave artefacts of interest. For instance, when deleting a file in Windows®, a user may consider the file to be removed from the drive, whereas the file system simply marks the entry in the file system as available. In order to undertake a forensic analysis of a system, it is therefore imperative that the examiner has sufficient knowledge and understanding of the system to know where to look for evidence. A number of specific texts have been written on the different file systems to assist the forensic examiner; information on these is located in the Resources section.

The discussion from this point will cover the New Technology File System (NTFS) and the Windows® OS. However, many of the techniques and procedures are also valid for other systems. The discussion will focus upon the primary methods used to analyse a system:

  • common techniques for investigation
  • exploring user activity and communication
  • file carving
  • virtual memory
  • registry.

If you are performing an investigation where the source of evidence is not a bit-for-bit copy (e.g. a back-up dataset), the only approaches available to the examiner are the first and second methods. The remaining approaches assume a bit-for-bit copy of the hard drive, with the final two methods only available if the drive image has an OS installed on it.

Common investigative techniques include simple searches through the file system for files and data of interest to the investigation. The ‘My Documents’ folder for an individual could be a valuable source of evidence if the person concerned has been saving information pertaining to the incident. Looking through the Recycle Bin and within the deleted folders and files would also be a useful place to start. A primary tool for the investigator is being able to search through the drive for keywords or file types. If you are looking for images, you can perform a search to find all jpeg or bitmap images, etc. A very simple hiding technique used by novice computer users is to change the file extension to something else in order to avoid such searches. However, most commercially available tools, such as EnCase® and FTK®, are able to verify the signature of each file to ensure that the file extension matches the file header. Keyword searches are able to scan through the entire disk, including unallocated clusters.
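The signature check described above can be illustrated with a minimal Python sketch. It is not how EnCase® or FTK® implement the feature; the signature table is a deliberately tiny, illustrative subset, and the function names detect_type and extension_mismatch are hypothetical.

```python
from pathlib import Path

# Illustrative magic numbers only; real tools ship far larger signature tables.
SIGNATURES = {
    b"\xff\xd8\xff": {".jpg", ".jpeg"},
    b"\x89PNG\r\n\x1a\n": {".png"},
    b"BM": {".bmp"},
    b"%PDF": {".pdf"},
}


def detect_type(header: bytes):
    """Return the extensions expected for this header, or None if unknown."""
    for magic, extensions in SIGNATURES.items():
        if header.startswith(magic):
            return extensions
    return None


def extension_mismatch(path: Path) -> bool:
    """True when the header identifies a known type that the extension does not match."""
    with path.open("rb") as handle:
        header = handle.read(16)
    expected = detect_type(header)
    return expected is not None and path.suffix.lower() not in expected
```

A JPEG image renamed to 'notes.txt' would be flagged by extension_mismatch, whereas a file with an unrecognised header is simply left unflagged. Short signatures such as 'BM' will produce false positives, so any hit is a prompt for manual review rather than a conclusion.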

In order to reduce the number of files requiring analysis, it is useful to remove all files that pertain to the OS and standard applications. Hash values of every file can be compared to a reference source. Those with matching hash values are trusted files and can therefore be removed from the analysis. NIST has developed the National Software Reference Library (NSRL),15 which is freely available and integrates into many forensic analysers. This significantly reduces the burden upon the investigator. It is also extremely useful in malware and hacking investigations as it quickly becomes evident which OS files have been infected or modified.
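The filtering step can be sketched in a few lines of Python. The sketch assumes the reference set has already been exported as a plain collection of SHA-1 hex digests; the loading of the NSRL data itself is not shown, and the function names file_hash and filter_unknown are hypothetical.

```python
import hashlib
from pathlib import Path


def file_hash(path: Path, algorithm: str = "sha1") -> str:
    """Hash a file in chunks so that large files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def filter_unknown(paths, known_hashes):
    """Return only the files whose hash is absent from the reference set."""
    known = {value.lower() for value in known_hashes}
    return [path for path in paths if file_hash(path) not in known]
```

Files returned by filter_unknown are the ones that remain for manual analysis. An OS file that appears in this list despite carrying a standard name is a candidate for having been modified.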

Once the basic level of analysis has been completed and all the obvious places of interest have been investigated, the examiner can turn to analysing application-specific data. As applications tend to create temporary information during their operation, this can be used to identify what has been happening. Which applications the examiner investigates will depend upon the nature of the investigation; if the incident is concerned with illegal access to a database system, the focus for the investigator will be upon the database application logs. Common applications that are investigated, however, include web browsers, e-mail and instant messenger clients, and office documents. In each of these cases, the files (temporary or not) created by the application tend to be stored in a proprietary format. The choices for the examiner in this situation are:

  • Obtain information from the software vendor on the structure and format of the file. View the file in hexadecimal and translate the contents.
  • Install the application on a forensics machine. Extract the file of interest and use the application to view the file contents.
  • Use an inbuilt viewer within the forensics tool to view the file.

For common applications such as web browsers, e-mail clients and image viewers, commercial forensic tools contain an inbuilt viewer for the proprietary files. For example, Figure 2 below illustrates the view from EnCase® when analysing e-mail. For other applications, the examiner will need to extract the file and use the application to view it. Understanding and translating the file structure by hand is extremely time-intensive. However, with many organisations having bespoke applications, this is sometimes necessary.

Figure 2: An illustration of EnCase®’s e-mail history view

Before discussing file carving, it is worthwhile to introduce the concept of file slack. File slack is one reason why bit-for-bit duplication of drives is useful to the examiner. File slack is an area on the drive that can contain valuable information from deleted files. In order to understand file slack, some hard drive and operating system details are required. The smallest addressable area on a hard drive is referred to as a sector. In Windows®, a sector is typically 512 bytes. Sectors are then grouped into clusters, with a cluster having 1–128 sectors. From a file system perspective, the cluster is the smallest data area that is indexed. As illustrated in Figure 3, when the OS is writing to a disk, should the file it is writing not be an exact multiple of the cluster size, then part of the final cluster will be left unused. This is referred to as file slack. Every completely unused sector within the cluster is simply left unchanged. Therefore, any contents previously stored on those sectors will remain. In addition, the hard drive itself must write in whole sectors. Should the file contents stop midway through a sector, the OS will fill the remainder of that sector with data. In older versions of Windows® (such as Windows® 98) the contents for this used to come from the RAM, which is potentially an extremely useful source of evidence; however, newer versions of the OS simply zero out that space. This type of slack is commonly referred to as RAM slack.

Figure 3: RAM and file slack
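The arithmetic behind the two kinds of slack can be made concrete with a short Python sketch. The default of eight 512-byte sectors per cluster (a 4,096-byte cluster) is an assumption chosen for illustration, and the names Slack, slack_sizes and sector_slack are hypothetical; sector_slack denotes the whole untouched sectors at the end of the last cluster.

```python
from typing import NamedTuple


class Slack(NamedTuple):
    allocated: int      # bytes reserved for the file (whole clusters)
    ram_slack: int      # padding from end of file to end of its last sector
    sector_slack: int   # whole untouched sectors left in the last cluster


def slack_sizes(file_size: int, sector_size: int = 512,
                sectors_per_cluster: int = 8) -> Slack:
    """Compute the slack left behind a file of file_size bytes."""
    cluster_size = sector_size * sectors_per_cluster
    clusters = -(-file_size // cluster_size)       # ceiling division
    allocated = clusters * cluster_size
    ram_slack = (-file_size) % sector_size
    sector_slack = allocated - file_size - ram_slack
    return Slack(allocated, ram_slack, sector_slack)
```

Under these assumptions, a 5,000-byte file is allocated two clusters (8,192 bytes). The file ends 392 bytes into its tenth sector, leaving 120 bytes of RAM slack, and the six remaining sectors (3,072 bytes) keep whatever data they previously held.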

The problem with file slack is that it simply contains file data. In many cases, all of the metadata associated with the file and stored in the file system (the Master File Table (MFT) in NTFS) is lost. Therefore, the examiner would not know the file existed. This is where a process called file carving is very useful. File carvers do not need any metadata knowledge of the file but simply trawl through the disk looking for file headers and footers. Once the start and end of a file have been identified, the file can be extracted or carved from the disk. In addition to file slack, file carving is also extremely useful when searching through unallocated areas of memory.16
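A minimal sketch of header and footer carving in Python is given below, using the JPEG start and end markers as the example signature. It is a naive illustration rather than the method used by any particular carver: it assumes the file is stored contiguously, and the function name carve_jpegs and the max_size limit are hypothetical.

```python
JPEG_HEADER = b"\xff\xd8\xff"   # start-of-image marker plus first marker byte
JPEG_FOOTER = b"\xff\xd9"       # end-of-image marker


def carve_jpegs(data: bytes, max_size: int = 10 * 1024 * 1024):
    """Return (offset, bytes) pairs for each header-to-footer span found."""
    carved = []
    start = data.find(JPEG_HEADER)
    while start != -1:
        # Look for the first footer after this header, within max_size bytes.
        end = data.find(JPEG_FOOTER, start + len(JPEG_HEADER), start + max_size)
        if end != -1:
            stop = end + len(JPEG_FOOTER)
            carved.append((start, data[start:stop]))
            search_from = stop
        else:
            search_from = start + 1
        start = data.find(JPEG_HEADER, search_from)
    return carved
```

Because the sketch stops at the first footer it meets, an image containing an embedded thumbnail will be truncated, and a fragmented file will be carved with foreign data in the middle.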

Given the dynamic nature of the file system, many files in slack space and unallocated memory no longer have their complete contents intact. Frequently, only partial file fragments exist, with bits of the header, footer or file contents missing. Moreover, many files are stored on disk in non-sequential order (i.e. fragmented), making it difficult for a file carver to simply extract all data from the beginning of the header to the end of the footer. Therefore, a variety of file carving mechanisms have been developed to assist in extracting the files, such as semantic carving, fragment recovery carving and SmartCarving (see Resources section for further information).

Within a Windows® OS, there are two further aspects of particular interest to a forensic examiner: the virtual memory and the Registry. When a system does not have sufficient RAM to operate, the OS creates a space on the hard drive and uses this to extend the RAM capacity. Referred to as virtual memory, this file can be considerable in size, in the order of gigabytes, and can contain RAM-based data from the previous session. During each new session, the memory is overwritten with the new session data, although file slack data of previous sessions can still remain. This is one reason why care should be taken when booting from the suspect drive, as previous session information will be overwritten and lost. Whilst the discussion has focused upon hard drive analysis, RAM-based data can also be extremely useful in understanding what the user and/or system was doing during the last session. This area of memory can also contain a variety of artefacts such as encryption keys and passwords. The virtual memory, therefore, is a useful source to examine further. On Windows® XP systems, this file is named ‘pagefile.sys’ and can be located under the root directory of the drive. Analysing this type of file, however, is more difficult than others because of the lack of file structure. With other files, a header, footer and file structure exist for understanding the file. With the virtual memory, this understanding remains with the active OS and therefore the file contains no structure. As a result, the examiner needs to perform a series of string searches on the file in order to try to identify relevant artefacts.
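A string search over an unstructured file of this kind can be sketched in Python as follows. The sketch looks for runs of printable characters in both single-byte ASCII and UTF-16LE form, since Windows® stores much of its text in the latter; the function name extract_strings and the minimum length of six characters are hypothetical choices.

```python
import re


def extract_strings(data: bytes, min_length: int = 6):
    """Return sorted (offset, text) pairs for printable runs in data."""
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    utf16_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_length)
    results = []
    for match in ascii_pattern.finditer(data):
        results.append((match.start(), match.group().decode("ascii")))
    for match in utf16_pattern.finditer(data):
        results.append((match.start(), match.group().decode("utf-16-le")))
    return sorted(results)
```

The returned offsets allow each hit to be located again in a hex viewer. For a file of several gigabytes, the data would need to be read in overlapping chunks or memory-mapped rather than loaded whole, which this sketch does not attempt.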

The remaining area of discussion for this chapter is the Registry. The Registry is a hierarchical database that contains the configuration settings for the OS and applications. It is the Registry that also contains the user’s authentication credentials. As a vast source of information about the system (what has been installed, when the system was last running, who the users are, what network cards are present, etc.), the Registry is an extremely useful source of evidence. The Registry is not stored on the file system as a single file, but is stored principally in five files: Sam, Security, Software, System, Default.17 The OS is responsible for creating the Registry when loading. Consequently, when forensically analysing the system, unless you are performing a live analysis, the Registry will not exist as a whole but as separate files. In this situation, as with proprietary files, you can either extract the files and use a Registry viewer to understand the contents, or use commercial software such as EnCase® or FTK®, whose built-in Registry viewer will extract the information for you.
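Before opening exported Registry files in a viewer, it can be worth confirming that each one is a genuine hive. The Python sketch below assumes the files have already been extracted from the image into a working folder and checks for the 'regf' signature with which hive files begin; the function name find_hives is hypothetical.

```python
from pathlib import Path

# The five principal hive file names listed above, compared case-insensitively.
HIVE_NAMES = {"sam", "security", "software", "system", "default"}
HIVE_MAGIC = b"regf"


def find_hives(export_dir: Path):
    """Map each hive-named file in export_dir to whether it has the hive signature."""
    found = {}
    for path in sorted(export_dir.iterdir()):
        if path.is_file() and path.name.lower() in HIVE_NAMES:
            with path.open("rb") as handle:
                found[path.name] = handle.read(4) == HIVE_MAGIC
    return found
```

A file that carries a hive name but fails the signature check may be damaged, partially overwritten or deliberately disguised, and merits closer inspection.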

The chapter has provided a preliminary insight into the forensic analysis of media, demonstrating that evidence can be located in a variety of areas. Even data thought to have been lost for some time might still reside on the drive in unallocated memory or file slack. Unfortunately, owing to the dynamic nature of the file system, it is difficult to predict exactly what will or will not be present at any point in time. It is therefore imperative that systems are acquired speedily upon identification of an incident. For links to further information on forensic analysis of computers, please refer to the Resources section.

 

15 National Software Reference Library, NIST (2010). www.nsrl.nist.gov

16 Unallocated areas of memory are clusters that the file system is not currently using. However, these clusters may have been used previously and can therefore still contain the file contents of what was stored there.

17 ‘Windows Registry Information for Advanced Users’, Microsoft (2008). http://support.microsoft.com/kb/256986