Journal of Applied Computing and Information Technology

ISSN 2230-4398, Volume 15, Issue 1, 2011

Incorporating the NACCQ publications:
Bulletin of Applied Computing and Information Technology, ISSN 1176-4120
Journal of Applied Computing and Information Technology, ISSN 1174-0175

Refereed Article F2:

Comparing the performance of three digital forensic tools

Brian Cusack
Auckland University of Technology, New Zealand
brian.cusack@aut.ac.nz

James Liang
Auckland University of Technology, New Zealand
zhuohaoliang@gmail.com

Cusack, B. & Liang, J. (2011). Comparing the performance of three digital forensic tools. Journal of Applied Computing and Information Technology, 15(1). Retrieved September 17, 2019 from http://www.citrenz.ac.nz/jacit/JACIT1501/2011Cusack_DigitalForensic.html

Abstract

Software used for digital forensic investigations must be verified against reliability and validity criteria. In this paper, three well-known tools are tested against the mandatory features of digital forensic tools published by the National Institute of Standards and Technology (NIST). It was found that performance varied between the tools, with all having measurable areas of non-performance. The findings have an impact on the professional use of the tools and illustrate the need for benchmarking and testing of the tools before use.

Keywords

Benchmarking, Software Tools, Comparison, Performance

1. Introduction

Tools which digital forensic investigators use come under scrutiny when the findings of an investigation are examined in court (Dixon, 2005; Liles, Rogers, & Hoebich, 2009; Cohen, 2010). General guidance for professional practice comes from the five Daubert criteria that prescribe expectations for expert evidence (Table 1).

Table 1. Daubert Standard: Five Relevant Factors (Daubert v. Merrell Dow Pharmaceuticals, 1993)

1. The theories or techniques utilized by the expert witness have been tested
2. The theories or techniques have been subjected to peer review and publication
3. The theories or techniques have a known or potential error rate
4. The theories or techniques have standards and controls concerning their operation
5. The theories or techniques are generally accepted by a relevant scientific community

More specifically, there are published standards for tool performance expectations (NIST, 2001; 2004; 2005; Lyle, 2003). The testing and standardization of the tools assures the consistency and the completeness of evidence (Reith, Carr, & Gunsch, 2002; Wilsdon & Slay, 2006; Lyle & Wozar, 2007). For example, a tool may be used correctly in terms of the global expectations stated in the Daubert criteria, but at a technical level the tool may perform in a particularly erroneous or unexpected way (Selamat, Yusof, & Sahib, 2008).

Standardization is an attempt to assure the end users of the information that the findings can be trusted. In most instances tools are tested against internationally recognized standards to show what a tool may do, how it performs and what trust may be put in it (Goel, 1985; Guo, Slay, & Beckett, 2009). The conditions of testing may vary with the testing environment, but standardized best practices are employed to minimize the variation and to increase the confidence of end users (DFRW, 2001).

In this research we ran three comparable and well-known digital forensic extraction tools against the National Institute of Standards and Technology (NIST) mandatory features and related standardized assertions. The test context was extracting evidence from a hard drive, in other words forensic imaging. This is core work for digital forensic investigators and represents the start of the technical phases in an investigation. The tools were chosen because they are commonly used in the industry. The key focus of our investigation was to measure the performance of each tool on the same evidence in the same hard drive against each requirement.

NIST (2004, p.8) specifies mandatory features that a tool ought to exhibit. In Table 2 the NIST requirements are summarized with respect to access, capability, accuracy, and reporting.

Table 2. Mandatory Features of a Disk Imaging Tool (NIST, 2004)

Requirements Description
DI-RM-01 The tool shall be able to acquire a digital source using each access interface visible to the tool.
DI-RM-02 The tool shall be able to create either a clone of a digital source, or an image of a digital source, or provide the capability for the user to select and then create either a clone or an image of a digital source.
DI-RM-03 The tool shall operate in at least one execution environment and shall be able to acquire digital sources in each execution environment.
DI-RM-04 & 05 The tool shall completely acquire all visible and hidden data sectors from the digital source.
DI-RM-06 All data sectors acquired by the tool from the digital source shall be accurately acquired.
DI-RM-07 If there are unresolved errors reading from a digital source then the tool shall notify the user of the error type and the error location.
DI-RM-08 If there are unresolved errors reading from a digital source then the tool shall use a benign fill in the destination object in place of the inaccessible data.
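
The error-handling requirements (DI-RM-06 to DI-RM-08) can be illustrated with a minimal acquisition sketch. The code below is not drawn from any of the tested tools; the paths, sector size and error handling are assumptions made purely to show how a tool might record the type and location of an unresolved read error and write a benign fill in its place.

```python
import hashlib

SECTOR = 512  # assumed logical sector size, for illustration only

def acquire(source_path, image_path, total_sectors):
    """Sector-by-sector acquisition sketch.

    Illustrates DI-RM-06 (accurate acquisition), DI-RM-07 (report the
    error type and location) and DI-RM-08 (benign fill in place of
    inaccessible data). Hypothetical paths; not a vendor implementation.
    """
    errors = []                      # (sector number, error description)
    md5 = hashlib.md5()
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        for sector in range(total_sectors):
            try:
                src.seek(sector * SECTOR)
                data = src.read(SECTOR)
                if len(data) < SECTOR:        # short read treated as unresolved error
                    raise OSError("short read")
            except OSError as exc:
                errors.append((sector, str(exc)))   # error type and location
                data = b"\x00" * SECTOR             # benign fill
            dst.write(data)
            md5.update(data)
    return md5.hexdigest(), errors
```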

This paper first outlines the methodology, including the testing processes, the test bed, the software environment, and the selection of the assertions; it then discusses the findings, draws conclusions and makes recommendations about using the tools in digital forensic work.

2. Literature Review

Byers and Shahmehri (2009) provide a systematic approach to testing disk imaging forensic tools. They acknowledge the challenge, the time consumption and the expense of tool testing. Their study focused on evaluating EnCase 6.8 and LinEn 6.1 (the Linux version of EnCase), both developed by Guidance Software. The purpose of their evaluation is the same as that of the NIST Computer Forensic Tool Testing (CFTT) program, which provides testing and assurance guidance for digital forensic tools (Lyle, 2003; Adams, 2008). Both seek to determine whether the disk imaging tools used during investigations perform as expected and produce accurate and complete results (Stephenson, 2003; Ciardhuain, 2004; Berghel, 2007).

The study by Byers and Shahmehri (2009) has some similarities to the CFTT program but also significant differences, which were identified during the evaluation. The principal differences related to the scope of the assertions. The methodology is shown in Figure 1 and its sequential progression is noted.

Figure 1. Methodology of Disk Imaging Tools Evaluation. Compiled from: (Byers & Shahmehri, 2009)

3. Method

The sets of assertions needed to operationalize the testing were selected from the NIST database and the Byers and Shahmehri (2009) guidelines. Twenty-six assertions were selected and clustered to assess performance against the eight fundamental requirements listed in Table 2.

The methodology is aligned with the sequential steps shown in Figure 1 (Beebe & Clark, 2005; Roussev & Richard, 2006). The clusters were termed test cases (TCs), also referred to as scenarios, and the assertions in each cluster were either passed or failed to give an overall result. Table 3 shows an example, Test Case 1 (TC-01), with a description and the definition of the clustered assertions that comprised the test runs. The assertion coding distinguished fundamental requirement assertions (AFR), hidden sector assertions (AHS), image creation assertions (AIC) and log file assertions (ALOG).

Table 3. Test Case Definition Example

Test Case  Description  Assertions for Testing
TC-01 (A11)  Acquire a hard drive using Access Interface (AI) and convert to an image file  AFR01-05, AFR07, AIC01, AIC05, ALOG01-03

Each assertion was also defined by a description and mapped onto the corresponding NIST and/or Byers and Shahmehri assertion. In this way a causal chain of relationships was created between the standards and the actions of professional practice (Baryamureeba & Tushabe, 2004). Table 4 shows an example of an assertion description and its corresponding standard.

Table 4. Assertion Definition Example

Assertion ID  Assertion Description  Corresponding NIST Assertion
TSP-AFR-01  The tool accesses the digital source with a supported access interface  DA-AM-01

The metric was a simple audit measurement of yes or no (1 or 0), and the performance of each tool in each test case was then calculated as the percentage of "yes" results in the cluster sample (Black, 2005).
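
As a minimal sketch of this scoring, the fragment below uses the TC-01 assertion cluster from Table 3 and the single mapping row from Table 4; the pass/fail values are invented purely to show the arithmetic, with the one failure chosen to mirror the ALOG02 failure and 91% pass rate reported for Helix3 Pro on TC-01 in Table 9.

```python
# Hypothetical pass/fail record (1 = yes, 0 = no) for the TC-01 cluster;
# assertion IDs follow Table 3, and the results are illustrative only.
tc01_results = {
    "AFR01": 1, "AFR02": 1, "AFR03": 1, "AFR04": 1, "AFR05": 1,
    "AFR07": 1, "AIC01": 1, "AIC05": 1,
    "ALOG01": 1, "ALOG02": 0, "ALOG03": 1,
}

# One row of the assertion-to-NIST mapping, as in Table 4.
nist_mapping = {"TSP-AFR-01": "DA-AM-01"}

def pass_rate(results):
    """Percentage of 'yes' results in a test case cluster."""
    return 100.0 * sum(results.values()) / len(results)

print(f"TC-01 pass rate: {pass_rate(tc01_results):.0f}%")  # -> 91%
```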

Two test beds were assembled (Table 5) and a software environment was created (Table 6).

Table 5. The Test Stations

Test Station 1 (Windows Environment)

Intel® Core(TM) i5 CPU 750 @ 2.67GHz
Gigabyte GA-P55A-UD4 motherboard, BIOS version F6
On-board USB 2.0, USB 3.0, Ethernet, SATA and PATA controllers
Texas Instruments 1394 OHCI host controller
4GB RAM
ASUS DVD-RW DRW-24B1ST ATA device
Samsung HD103SJ SATA drive, 1TB
Operating systems: Windows 7, Windows XP SP3 with latest system updates, or virtualised Windows XP SP3

Test Station 2 (Linux Environment)

Intel® Core2(TM) CPU 6300 @ 1.86GHz
EPoX 5P965 motherboard
On-board USB 2.0, Ethernet, SATA and PATA controllers
1.44 MB floppy drive
3GB RAM
Pioneer DVD-RW DVR-111D ATA device
Seagate ST3250823AS SATA drive, 250GB
Operating system: Ubuntu 8.04 LTS (Hardy Heron)

Table 6. The Software Environment

Software  Version  Description
HDAT2  —  Diagnostics tool for storage devices
MHDD  4.5  Low-level HDD diagnostics software
UltraEdit  16.10.0.1036  Hex editor
Darik's Boot and Nuke  2.2.6  Used to securely wipe the test drive
Hdparm  9.29  Linux hard drive tool, used to check and change parameters of the test hard drive
Gparted  0.6.2  Linux hard drive partitioning tool
Disk Management Tool  1.0.0  Windows hard disk partitioning tool (supports GUID partition table partition style)
Disk_stat  3.1.2  Used to check for the existence of a Host Protected Area
EnCase  6.5  Used to verify the hash value of the acquired images
WinHex  15.6  Computer forensics and data recovery software; hex editor and disk editor from X-Ways Software

To assure standardization between the test runs, the hard drive had to be returned to a specified start condition that was the same for each test run. The procedure was to remove any partitions and data from the hard drive. Each test case run required a drive reset to make the hard drive ready for the next assertion (Bunting, 2007; Byers & Shahmehri, 2008). In Table 7 the steps in the reset procedure are itemized; a command-line sketch of the HPA/DCO check follows the table.

Table 7. Drive Reset Example

1. Connect the test hard drive to the machine
   1.1 Boot the computer to the HDAT2 CD if HPA and/or DCO exist
   1.2 Choose the test hard drive in the device list
   1.3 Navigate to the Device Information Menu to detect HPA and/or DCO
   1.4 Navigate to the SET MAX (HPA) Menu if HPA exists or Navigate to Device Configuration Overlay (DCO) menu if DCO exists
   1.5 Choose "Auto Remove HPA Area" for HPA or "Restore" for DCO
2. Boot the computer to the Darik's Boot and Nuke CD
3. Choose the test hard drive to wipe
4. Select DoD Short method
5. Wait for the drive reset completed prompt
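
Steps 1.1 to 1.5 are carried out interactively from the HDAT2 boot CD. On the Linux test station, the presence of an HPA or DCO can also be checked from the command line with hdparm (Table 6). The sketch below simply wraps two hdparm queries; the device node /dev/sdb is a hypothetical placeholder for the test drive, the exact output wording may vary between hdparm versions, and the commands normally require root privileges.

```python
import subprocess

DEVICE = "/dev/sdb"  # hypothetical device node for the test drive

def run(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True).stdout

# hdparm -N reports the visible/native max sector counts and states
# whether a Host Protected Area (HPA) is enabled on the drive.
hpa_info = run(["hdparm", "-N", DEVICE])
print("HPA reported" if "HPA is enabled" in hpa_info else "No HPA reported")

# hdparm --dco-identify dumps the Device Configuration Overlay (DCO)
# identify data, which can be compared against the drive's full identify
# data to spot a DCO-restricted capacity.
print(run(["hdparm", "--dco-identify", DEVICE]))
```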

4. Testing

We chose three disk imaging tools to represent everyday choices and common tools with high usage in the industry. Table 8 summarizes the capability statements that are available from the suppliers; an independent hash verification sketch follows the table. FTK Imager is freeware that provides cross Operating System (OS) capability for imaging and related support. Helix3 Pro is purchased by license and provides a similar capability (including Mac OS platform support). Automated Image and Restore (AIR) is open source software developed specifically for the Linux environment. The three selected tools provided a useful triangulation in which two tools were close to each other in specification and the third was distinct in several ways. All were subjected to the same assertions and gap analysis to measure relative performance. In this way the tools' performance would be relatively independent of the test bed capabilities (attributes were measured rather than properties) and the results would better reflect an end user's expectation of a robust forensic image (SWGDE, 2009).

Table 8. Disk Imaging Tool Published Capability

Functionalities FTK Imager Version 2.9.0 Helix3 Pro Automated Image and Restore (AIR) Version 2.0.0
Software Type Freeware Commercial Open source software
Supported platforms Windows & Linux Windows, Linux and Mac Linux
Supported physical interfaces IDE, SATA, SCSI, USB, IEEE 1394 IDE, SATA, SCSI, USB, IEEE 1394 IDE, SATA, SCSI, USB, IEEE 1394
Supported partition formats NTFS, NTFS compressed, FAT 12/16/32, Linux ext2 & ext3, HFS, HFS+ NTFS, NTFS compressed, FAT 12/16/32, and Linux ext2 & ext3 Linux partitions
Supported image formats EnCase, SMART, Snapback, SafeBack (up to but not including v.3), and dd EnCase, dd dd and dc3dd
Image copy compression/decompression PKZIP, WinZip, WinRAR, Gzip, and TAR compressed files PKZIP, WinZip, WinRAR, Gzip, and TAR compressed files Gzip and bzip2
Uses MD5 Hash Yes Yes Yes
Uses SHA1 Hash Yes Yes Yes
Can verify image integrity Yes No No
Split images into segments Yes Yes Yes
Logging Yes Yes Yes
Wipe disk drives or partitions Yes No Yes
Access HPA Unknown Unknown Unknown
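
Several of the published capabilities in Table 8 (MD5 and SHA1 hashing, and image integrity verification) can also be checked independently of the tools. The sketch below assumes a hypothetical raw (dd-style) image file and the digest values reported in a tool's acquisition log; it simply recomputes both digests over the image and compares them with the logged values.

```python
import hashlib

def image_digests(image_path, chunk_size=1 << 20):
    """Recompute MD5 and SHA1 digests over a raw (dd-style) image file."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(image_path, "rb") as img:
        for chunk in iter(lambda: img.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()

# Hypothetical file name and logged hash values, for illustration only.
md5_hex, sha1_hex = image_digests("tc01_acquired.dd")
print("MD5 matches acquisition log: ", md5_hex == "<md5 value from the tool log>")
print("SHA1 matches acquisition log:", sha1_hex == "<sha1 value from the tool log>")
```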

5. Findings

So what did we find? The ranking system showed that FTK Imager and AIR had a clear advantage over Helix3 Pro (Table 9): overall pass rates of 89% and 91% respectively against 74%. However, a closer examination of the results shows that the outcome is qualified by a number of factors.

All three tools demonstrated significant non-performance. In scenario 3 (TC-03) all three tools failed to acquire hidden sectors into the image file. The capability summary in Table 8 may have signaled the problem (the ability to access the HPA is listed as unknown for all three tools), yet end users expect to get access to the hidden sectors of a hard drive while imaging. The problem recurred across the scenarios. The highest success rate in testing still leaves roughly 10% dissatisfaction for an end user who expects all eight mandatory features of a disk imaging tool to be present. The gap between expectation and delivery is large enough to cause concern. It also gives grounds for challenges in court regarding the validity of any tool, the environment in which it was used, and the implications of the margins of error.

Table 9. Test Results

Test Cases  FTK (Pass Rate / Failed Assertions)  Helix3 Pro (Pass Rate / Failed Assertions)  AIR (Pass Rate / Failed Assertions)
TC-01 100% None 91% ALOG02 100% None
TC-02 100% None 91% ALOG02 100% None
TC-03 85% AFR06
AHS01-03
75% AFR06
AHS01-03
ALOG02
85% AFR06
AHS01-03
TC-05 100% None 82% ALOG01-02 100% None
TC-06 100% None 82% AFR08
ALOG02
100% None
TC-07 100% None 88% ALOG02 88% AIC04
TC-08 100% None 83% AIC-10
ALOG02
75% AIC04
AIC10
ALOG02
TC-09 100% None N/A N/A N/A N/A
TC-10 86% AIC08 N/A N/A N/A N/A
TC-11 100% None N/A N/A N/A N/A
TC-12(1) 79% AFR06
AHS01-03
29% AFR05-06
AHS01-03
79% AFR06
AHS01-03
TC-12(2) 16% EXCEPT
AFR01-03
29% AFR05-06 79% AFR06
AHS01-03
TC-13 92% AIC11 83% AIC11
ALOG02
92% AIC11
TC-14 92% AIC11 83% AIC11
ALOG02
92% AIC11
TC-15 89% AIC11
ALOG02
89% AIC11
ALOG02
94% AIC11
TC-16(1) 100% None 33% AFR-01 & 03 100% None
TC-16(2) 100% None 93% ALOG02 100% None
TC-17 79% AFR-06
AHS01-03
74% AFR-06
AHS01-03
ALOG02
79% AFR-06
AHS01-03
TC-18 N/A N/A 93% ALOG02 100% None
Overall Pass Rate (Common Test Cases)  89%  74%  91%

Helix3 Pro failed a significant assertion (ALOG02, "The tool displays correct and necessary information to the user") in all scenarios, whereas FTK Imager and AIR had this problem in one scenario each. Overall, Helix3 Pro failed twelve different assertions, whereas FTK Imager and AIR failed seven and six respectively. FTK Imager and AIR therefore retain the compliance lead, but this is not a clean bill of health for any of the tools.

A troubling result was AIC11 ("The tools report to the user if any irregularities are found in the digital source"), failed in scenarios 13 to 15 (TC-13 to TC-15). All three tools failed an assertion most end users would take for granted. The implications for an end user are: use with care; double check all results carefully; know the tool's limitations; run repeated checks; use only after benchmark testing; and publicize results with consideration and explanation of the error margins.

FTK Imager and AIR performed similarly, but only to an extent. When the capability of each tool is evaluated it becomes evident that FTK Imager has the advantage of being cross-OS but lacks some useful features of AIR, for example TC-18, "Verification of the network image acquisition function". Significantly, FTK Imager outperformed AIR on TC-08, "Attempt to create an image file where the destination device has insufficient storage space, and see whether the tool notifies the user and offers another destination device to continue". In that scenario AIR also failed related assertions, such as communicating with the user, and its result was dragged down accordingly. A sketch of the check exercised in TC-08 is given below.
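
As a rough illustration of the behaviour exercised in TC-08, the sketch below compares the free space on a chosen destination with the size of the source before acquisition and warns the user when it is insufficient. The paths and sizes are hypothetical; the check is not taken from any of the tested tools.

```python
import shutil

def check_destination(source_size_bytes, destination_dir):
    """Warn the user when the destination lacks space for the image (TC-08 behaviour)."""
    free = shutil.disk_usage(destination_dir).free
    if free < source_size_bytes:
        shortfall = source_size_bytes - free
        print(f"Insufficient space on {destination_dir}: {shortfall} more bytes "
              f"needed. Choose another destination device to continue.")
        return False
    return True

# Hypothetical figures: a 250 GB source drive and a destination folder.
check_destination(250 * 10**9, "/mnt/evidence")
```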

6. Conclusion

The general implications that may be drawn from the research findings suggest caution to practitioners when selecting, using or changing between tools. The three tested tools are commonly used in the digital forensic industry, and yet each has limitations when tested against standardized assertions. New entrants to the profession in particular may inadvertently make claims regarding, for example, the completeness of an examination, when clearly a tool requires prior testing to substantiate the capability. An assumption made from prior use, brand promotion or collegial recommendation is insufficient. It follows that all digital forensic evidence presented to the end users in a court may be challenged, and that the expert witness ought to be prepared to explain the procedures of imaging, the tools, and the tool testing.

Acknowledgements

The use of the resources provided by the Auckland University of Technology Digital Forensic Research Laboratories in completing this research is acknowledged.

References

Adams, C. W. (2008). Legal issues pertaining to the development of digital forensic tools. Paper presented at the Proceedings - SADFE 2008 3rd International Workshop on Systematic Approaches to Digital Forensic Engineering, Berkeley, CA.

Baryamureeba, V., & Tushabe, F. (2004). The enhanced digital investigation process model. Paper presented at the Digital Forensic Research Workshop.

Beebe, N. L., & Clark, J. G. (2005). A Hierarchical, Objectives-Based Framework for the Digital Investigations Process. Digital Investigation, 2(2), 147-167.

Berghel, H. (2007). Hiding data, forensics, and anti-forensics. Communications of the ACM, 50(4), 15-20.

Black, P. E. (2005). Software assurance metrics and tool evaluation. Paper presented at the Proceedings of the 2005 International Conference on Software Engineering Research and Practice, Las Vegas, US.

Bunting, S. (2007). Acquiring Digital Evidence. In EnCase computer forensics the official EnCE: EnCase Certified Examiner. Indianapolis, Indiana: Wiley Publishing.

Byers, D., & Shahmehri, N. (2008). Contagious errors: Understanding and avoiding issues with imaging drives containing faulty sectors. Digital Investigation, 5(1-2), 29-33.

Byers, D., & Shahmehri, N. (2009). A systematic evaluation of disk imaging in EnCase 6.8 and LinEn 6.1. Digital Investigation, 6(1-2), 61-70.

Ciardhuain, S. O. (2004). An Extended Model of Cybercrime Investigations. International Journal of Digital Evidence, 3(1).

Cohen, F. B. (2010). Fundamentals of Digital Forensic Evidence. In Handbook of Information and Communication Security (pp. 789-808). Berlin Heidelberg: Springer.

Dixon, P. D. (2005). An overview of computer forensics. IEEE Potentials, 24(5), 7-10.

DFRW. (2001). A Road Map for Digital Forensic Research. New York, U.S.A.

Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993). Supreme Court of the United States.

Goel, A. L. (1985). Software Reliability Models: Assumptions, Limitations, and Applicability. IEEE Transactions on Software Engineering, SE-11 (12), 1411-1424.

Guo, Y., Slay, J., & Beckett, J. (2009). Validation and verification of computer forensic software tools - Searching Function. Digital Investigation, 6, S12-S22.

Liles, S., Rogers, M., & Hoebich, M. (2009). A survey of legal issues facing digital forensic experts. In Advances in Digital Forensics V. Boston: Springer.

Lyle, J. R. (2003). NIST CFTT: Testing disk imaging tools. International Journal of Digital Evidence, 1(4), 1-10.

Lyle, J. R., & Wozar, M. (2007). Issues with imaging drives containing faulty sectors. Digital Investigation, 4(Supplement 1), 13-15.

NIST. (2001). General test methodology for computer forensic tools. Retrieved from http://www.cftt.nist.gov/Test%20Methodology%207.doc

NIST. (2004). Digital Data Acquisition Tool Specification (Draft 1 of Version 4.0). Washington, DC: NIST. Retrieved 21st January 2010 from http://www.cftt.nist.gov/Pub-Draft-1-DDA-Require.pdf

NIST. (2005). Digital Data Acquisition Tool Test Assertions and Test Plan (Draft 1 of Version 1.0). Washington, DC: NIST. Retrieved 21st January 2010 from http://www.cftt.nist.gov/DA-ATP-pc-01.pdf

Reith, M., Carr, C., & Gunsch, G. (2002). An examination of digital forensic models. International Journal of Digital Evidence, 1(3).

Roussev, V., & Richard III, G. G. (2006). Next-generation digital forensics. Communications of the ACM, 49(2), 76-80.

Selamat, S. R., Yusof, R., & Sahib, S. (2008). Mapping Process of Digital Forensic Investigation Framework. International Journal of Computer Science and Network Security, 8(10), 163-169.

Stephenson, P. (2003). A comprehensive approach to digital incident investigation. Information Security Technical Report, 8(2), 42-54.

SWGDE (2009). SWGDE Recommended Guidelines for Validation Testing. Retrieved 3rd March 2010 from http://www.swgde.org/documents/swgde2009/SWGDE%20Validation%20Guidelines%2001-09.pdf

Wilsdon, T., & Slay, J. (2006). Validation of forensic computing software utilizing black box testing technique. Paper presented at the Proceedings of 4th Australian Digital Forensics Conference, Perth, Australia.