Template for adding support for a new disk format?

Jan 13, 2013 at 4:32 PM

Hi all,

Thank you so much for this great library.

I'm trying to add support for another virtual hard disk format, specifically the 'Expert Witness Format' (EWF) as used in the forensic world by applications such as EnCase and FTK.

I've managed to write a C# app in managed code to analyse the file (and its child files). I can dump all the metadata and, more importantly, I can map the disk sectors to their location in the EWF files.

So, I think the hard work is done. I'm now trying to port my app to the DiscUtils library. I tried by simply copying the Vhd folder in the hope that I'd be able to figure out how it actually works but I'm struggling.

I can do the first step of adding all the segment files to the _files collection, but I can't figure out how to update the Extents mapping? And when?

Any help as to how I could get this new format added would be greatly appreciated.

Thanks!

Jan 14, 2013 at 9:46 AM

Ok, so I'm thinking it might be easier to start with the 'Raw' code rather than the 'Vhd' code and build it from there.

If that fails, I might just start with a blank 'Disk.cs', implement VirtualDisk and see when it complains.

I'm assuming that, because not every disk format has non-contiguous data (RAW, for example, doesn't), the management of the segments/chunks/runs is implemented per disk format.

Is there a way I can pass my chunks (and the sector offsets they represent) to some object that the SparseStream will process?

Thanks!

Jan 14, 2013 at 10:03 PM

Ah-ha! Implementing 'MappedStream' seems to be where it's at. At least that's where I think the reading of the chunks is going to happen.

I started with the 'Raw' in the end. I think I'll need to add:

  • A collection for the segment files (e01, e02, e03...) - to the Disk class?
  • A ResolveChain method to ensure I have all the necessary segment files - to the Disk class?
  • A class for each of the Ewf "sections".

(Btw, I am aware I'm the only one replying to this thread, but I thought I'd log my findings here in case it comes in handy later. It's gonna be really embarrassing if I find out this is actually all documented somewhere...)

Coordinator
Jan 14, 2013 at 10:36 PM

Hi,

'Raw' is a good place to start - you'll want a 'DiskImageFile' class as well as Disk - DiskImageFile represents a single layer in a disk chain (a chain is a base disk, differencing disk, etc).  From a quick glance EWF looks like it's a non-sparse format, with only a single layer (same as Raw), the layer made of multiple segments (unlike Raw).

You should be able to copy the implementation of 'Raw' pretty much as-is.

Implementing a mapped stream (or even a SparseStream, initially) and returning it as the 'Content' of the DiskImageFile exposes it to the rest of DiscUtils.  Replacing this line in DiskImageFile:

_content = SparseStream.FromStream(stream, ownsStream);

with something like this would get you started:

_content = new EWFStream(baseFileName);

 

The implementation of EWFStream is where your logic should go - to open the correct file for each segment, etc.  In the Read method of that class you probably want to first determine which segment the stream position is in, and open the appropriate segment file to service the read request.
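
The "work out which segment the position is in" step can be sketched roughly as below. This is only an illustration under simplifying assumptions: each segment is treated as a plain run of bytes, so the logical disk is just the segments concatenated, whereas real EWF segment files carry their own sections and chunk tables that would need parsing first. SegmentedStream and its members are hypothetical names, not part of DiscUtils, and the class derives from System.IO.Stream rather than SparseStream to stay self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch: a read-only stream that presents several segment
// streams as one contiguous stream. Not DiscUtils code.
public sealed class SegmentedStream : Stream
{
    private readonly List<Stream> _segments; // one stream per segment, in order
    private readonly long[] _starts;         // logical start offset of each segment
    private readonly long _length;           // total logical length
    private long _position;

    public SegmentedStream(IEnumerable<Stream> segments)
    {
        _segments = new List<Stream>(segments);
        _starts = new long[_segments.Count];
        for (int i = 0; i < _segments.Count; i++)
        {
            if (_segments[i].Length == 0)
            {
                throw new ArgumentException("Empty segments are not supported");
            }
            _starts[i] = _length;
            _length += _segments[i].Length;
        }
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return true; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { return _length; } }

    public override long Position
    {
        get { return _position; }
        set { _position = value; }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (count > 0 && _position < _length)
        {
            // Find the last segment whose start offset is <= _position.
            int idx = Array.BinarySearch(_starts, _position);
            if (idx < 0)
            {
                idx = ~idx - 1;
            }

            Stream segment = _segments[idx];
            long segmentOffset = _position - _starts[idx];

            // Never read past the end of the current segment; the loop
            // moves on to the next segment on the following iteration.
            int toRead = (int)Math.Min(count, segment.Length - segmentOffset);
            segment.Position = segmentOffset;
            int read = segment.Read(buffer, offset, toRead);
            if (read == 0)
            {
                break;
            }

            _position += read;
            offset += read;
            count -= read;
            total += read;
        }
        return total;
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        long target = origin == SeekOrigin.Begin ? offset
                    : origin == SeekOrigin.Current ? _position + offset
                    : _length + offset;
        if (target < 0)
        {
            throw new IOException("Attempt to seek before start of stream");
        }
        _position = target;
        return _position;
    }

    public override void Flush() { }

    public override void SetLength(long value)
    {
        throw new NotSupportedException();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }
}
```

Keeping a sorted array of start offsets means each read locates its segment with a binary search instead of walking the list, which may matter for images split into many segment files.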

 

Ken

Jan 14, 2013 at 11:10 PM

Thanks very much for the tips Ken - they're really useful.

You're absolutely right: EWF is indeed non-sparse, a single layer, and the layer is made of multiple segments.

I've already rolled my sleeves up and started playing, but I have a question:

My Disk constructor currently looks like this:

public Disk(string path)
{
  DiskImageFile file = new DiskImageFile(path, FileAccess.Read);
  _files = new List<DiscUtils.Tuple<DiskImageFile, Ownership>>();
  _files.Add(new DiscUtils.Tuple<DiskImageFile, Ownership>(file, Ownership.Dispose));
  ResolveFileChain();
}

obviously stolen from 'Vhd'. Given that it's going to be my EWFStream that does all the work with the segment files, is there any need for the _files list in this class?

Would it be better for the list of segment files to be maintained by the EWFStream class?

It would of course be sensible to check all the necessary segment files are present at this stage, but once that's confirmed, couldn't the list just be passed to DiskImageFile and then onto EWFStream?
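
For the "collect the segment files up front" idea, a minimal sketch might look like the following. It assumes the segments sit next to the first file and are numbered .E01, .E02, ... with two-digit suffixes only, and it simply stops at the first gap; knowing how many segments are actually required would mean reading the EWF sections themselves. SegmentFinder and FindSegments are hypothetical names, not part of DiscUtils.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: not part of DiscUtils.
public static class SegmentFinder
{
    // Returns the consecutive segment files that exist on disk, starting
    // from .E01, given the path of the first segment.
    // Assumption: two-digit numeric suffixes only (.E01 to .E99).
    public static List<string> FindSegments(string firstSegmentPath)
    {
        var found = new List<string>();
        for (int n = 1; n <= 99; n++)
        {
            string candidate = Path.ChangeExtension(firstSegmentPath, ".E" + n.ToString("D2"));
            if (!File.Exists(candidate))
            {
                break; // stop at the first missing segment
            }
            found.Add(candidate);
        }
        return found;
    }
}
```

The resulting list could then be handed to DiskImageFile and on to the stream class, or the stream class could build it itself from the base file name. Note that on a case-sensitive file system ".E01" and ".e01" are different names, which this sketch does not handle.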

Coordinator
Jan 15, 2013 at 8:07 PM

Yes - you should only need a single 'DiskImageFile', so the _files list is overkill for your case.

I would tend to store the list of segment files in the EWFStream class - whether you pass the list or just the 'base' file is, I think, probably just a matter of taste.  Either way would probably work.

 

Ken

Jan 17, 2013 at 10:32 PM

Thanks again Ken - I'm definitely making progress.

Another question if I may...

My EWFStream derives from MappedStream and as such must implement the Extents collection property and the MapContent method. Why are these two essential? I can see why they might be useful, but if I handled the logic of servicing a request without using the Extents property or the MapContent method, would some other part of DiscUtils not work?

Coordinator
Jan 18, 2013 at 8:38 AM

DiscUtils itself will be OK, but any application code that uses DiscUtils and wants that info won't work.

I'd suggest you derive from SparseStream instead of MappedStream - which will avoid you needing to implement MapContent.

For extents, I'd suggest you just implement like this:

        public override IEnumerable<StreamExtent> Extents
        {
            get
            {
                return new StreamExtent[] { new StreamExtent(0, Length) };
            }
        }

This effectively says the disk is not sparse (there's a single extent that covers the entire disk), which I believe is true for EWF.

Ken

Jan 18, 2013 at 9:21 PM
Edited Jan 18, 2013 at 9:23 PM

Again, many thanks Ken.

I now have an EWFStream class that exposes 'content' as a stream. With that stream I can service read requests - woohoo!

I can do this:

DiscUtils.Ewf.Disk disk = new DiscUtils.Ewf.Disk(@"C:\test.E01");

DiscUtils.Ntfs.NtfsFileSystem ntfs = new DiscUtils.Ntfs.NtfsFileSystem(disk.Partitions[1].Open());
DiscUtils.DiscDirectoryInfo root = ntfs.Root;

foreach (DiscUtils.DiscDirectoryInfo ddi in root.GetDirectories())
{
    Log(ddi.FullName);
}

and it works!

It is currently read-only (as in, I haven't implemented the write method), but the Seek method works too.

The code definitely needs a tidy up and I'm sure I need to do more testing, but just wanted to let you know and say a big thank you.

Ok, back to the testing and tidying...

Jan 19, 2013 at 11:02 PM
Edited Jan 19, 2013 at 11:06 PM

After much testing and fixing and a couple of algorithm replacements I now have a much more robust solution.

However, I'm now getting a bizarre runtime error.

With this code:

DiscUtils.Ewf.Disk disk = new DiscUtils.Ewf.Disk(textBox1.Text);

DiscUtils.Ewf.Section.Header h = disk.HeaderSection; // [1]
foreach (KeyValuePair<string, string> headerInfo in h.Info)
    Log(string.Format("{0}: {1}", headerInfo.Key, headerInfo.Value));

When the program runs, a NullReferenceException is thrown by disk.HeaderSection at [1].

However, if I step through the program all is fine.

Even weirder, if I set a breakpoint at [1], when I hover over disk.HeaderSection it tells me that a NullReferenceException is thrown, but if I hover over 'disk' and then expand to view the HeaderSection it's fine, and the disk.HeaderSection seems to magically be ok.

It's hard to explain in words what's going on so I recorded a screen capture (it's only 40 seconds): http://dl.dropbox.com/u/36286466/ewf-nullreference.mp4

Any suggestions or tips would be fantastic!

Jan 19, 2013 at 11:13 PM

Ok. Not to worry. Typically, after posting asking for help I saw my mistake.

In my property I had:

public Section.Header HeaderSection
{
    get
    {
        EWFStream es = (EWFStream)_content;
        return es.HeaderSection;
    }
}

What I should've had was:

public Section.Header HeaderSection
{
    get
    {
        EWFStream es = (EWFStream)Content;
        return es.HeaderSection;
    }
}

Using the property (Content) rather than the private field (_content) meant that if the field was null, the property getter would set it from the DiskImageFile first.

Still not sure why it worked after hovering over the object tho..?

Jan 19, 2013 at 11:34 PM

Oh, hang on... because hovering over the disk object forced a 'Get' on all properties which meant the 'Content' property updated _content. With _content now updated, the HeaderSection property was of course fine.
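
For anyone who hits the same thing, the pattern boils down to something like this stripped-down sketch. FakeStream and FakeDisk are made-up stand-ins rather than the real DiscUtils classes, and the lazy Content getter is my assumption about how the original behaved, based on what the debugger showed.

```csharp
using System;

// Hypothetical stand-in for EWFStream; not DiscUtils code.
public class FakeStream
{
    public string HeaderSection { get { return "header"; } }
}

// Hypothetical stand-in for Disk; not DiscUtils code.
public class FakeDisk
{
    private FakeStream _content; // stays null until Content is first read

    // Lazily creates the backing object on first access.
    public FakeStream Content
    {
        get
        {
            if (_content == null)
            {
                _content = new FakeStream();
            }
            return _content;
        }
    }

    // Buggy: reads the field directly, so it throws a
    // NullReferenceException if Content has never been read.
    public string HeaderViaField
    {
        get { return _content.HeaderSection; }
    }

    // Fixed: goes through the property, which initialises the field.
    public string HeaderViaProperty
    {
        get { return Content.HeaderSection; }
    }
}
```

On a fresh FakeDisk, HeaderViaField throws, HeaderViaProperty works, and after that HeaderViaField works too because the field has been populated. That is the same side effect the debugger caused when it evaluated every property while expanding the object.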