Tag Archives: virtual machines

How to convert a physical hard drive to a virtual machine

This post is based on various notes from a project I did a while back. It all began when my laptop’s hard drive started making funny clicking noises. I took this as a sign it would stop working sooner or later, backed up the relevant things and replaced it. Since then, the old drive had been lying on a shelf, so I figured a fun project would be to turn it into a virtual machine. After all, a disk can be moved from one physical machine to another without reinstalling anything or configuring the new machine with identical settings. So it should be easy enough to take whatever is on the disk, convert it to a virtual hard drive and boot it inside VirtualBox, right? That way, I could continue to use the existing installation even if the hardware failed. Turns out this wasn’t as straightforward as I first expected.

Let’s start with the easy part: first we need to write the disk content to an image file. Not knowing the exact steps involved, it seemed like a good idea to have an exact copy. This makes it easier to experiment and to go back if something fails or additional steps are needed. It also takes away the risk of the disk suddenly dying, since I would already have grabbed everything I need from it.

I plugged the old hard drive into my Ubuntu machine to make a copy of the content while the drive was still working. To create an image, I used dd, which creates a byte-for-byte identical copy. As an example: dd if=/dev/sdb2 of=image.iso status=progress will create an exact copy of the sdb2 partition and store it in image.iso. Make sure that a) you read from the correct disk (check gparted or similar tools to be sure) and b) the target where you store the result has sufficient space. As we’ll get into later, you want a copy of the whole disk, not just individual partitions. Unfortunately this didn’t work in my case: when I attempted to copy the entire disk, it would stop halfway through. Repeated attempts failed at exactly the same number of bytes, presumably because of the disk problems. I therefore grabbed only the main Windows partition, which I was able to copy without running into errors. That should be all I needed. Now that I had an exact copy of the disk (well, the parts I could get at least), I unplugged the physical one.
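As a rough sketch of the imaging step (not the exact commands used back then), a small shell helper could look like the following. The function names image_device and verify_image are made up for illustration, and conv=noerror,sync is a swapped-in option that was not part of the original attempt: it asks dd to continue past read errors and pad whatever it could not read with zeros.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not from the original notes.

# image_device SRC DST
# Copy SRC (a whole disk such as /dev/sdb, or a partition such as /dev/sdb2) to DST.
# conv=noerror,sync: continue after a read error and pad the failed block with zeros.
# bs=512: a failed read then costs at most one 512-byte block.
# Because of "sync", SRC should be a multiple of 512 bytes (true for disks),
# otherwise the last block of the copy gets padded with zeros.
image_device() {
    dd if="$1" of="$2" bs=512 conv=noerror,sync
}

# verify_image SRC DST
# Succeeds only if both are byte-for-byte identical. Only meaningful when
# SRC could be read without errors.
verify_image() {
    cmp -s "$1" "$2"
}
```

Usage would be along the lines of image_device /dev/sdb disk.img for the whole disk (the device name is an example, so double-check it first). Keep in mind that bs=512 is slow on large disks, and that for a drive which is actually failing a dedicated tool such as GNU ddrescue is usually the better choice.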

Next step was to convert the raw disk image to something the virtual machine can understand. In my case, I used VirtualBox’s tool for this conversion: vboxmanage convertfromraw image.iso virtualHd.vdi. Note that the raw image is a byte-for-byte copy and takes up the full size of whatever was copied, empty space included, whereas the virtual hard drive only needs the space actually in use. My tip now would be to create a backup of the virtual hard drive, to ensure you can start over if (when) something goes wrong. You can possibly do this with snapshots in the VM, but I found it easier to know that I could always return to the original state and start over without any earlier changes spilling over.
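A minimal sketch of the convert-then-back-up routine, assuming VirtualBox’s VBoxManage is on the PATH. The function names and the .orig suffix are invented for this example. The backup is a plain file copy, which is enough as long as the copy is only kept on the side and not registered in VirtualBox next to the original, since both files carry the same disk UUID.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative.

# convert_raw RAW VDI
# Convert a raw disk image into a VDI virtual hard drive.
convert_raw() {
    VBoxManage convertfromraw "$1" "$2" --format VDI
}

# backup_disk FILE
# Keep a pristine copy as FILE.orig and check that it matches the original.
backup_disk() {
    cp "$1" "$1.orig" && cmp -s "$1" "$1.orig"
}
```

With the file names from above, that would be convert_raw image.iso virtualHd.vdi followed by backup_disk virtualHd.vdi. To start over, copy virtualHd.vdi.orig back over virtualHd.vdi while the VM is powered off.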

Create a virtual machine in VirtualBox as normal and attach virtualHd.vdi to the new VM. This is where the problems started: it refused to boot. The disk was there, it was connected, and if I booted with a live CD I could see all the files. So why didn’t it work?

I tried multiple things here and eventually took a look at it with a boot repair tool. The report told me the boot sector believed it should start reading from sector 200000 or so, while the disk in fact started at sector 0. This is where I should probably tell you that the original disk layout was a bit strange. The first partition was a rescue partition (for some reason), the second was Windows and the third an Ubuntu install for dual booting. Since I had failed to copy the complete disk, I had settled for the Windows partition. However, it seemed that the partition had retained the offset caused by the first partition, so a disk containing only the second partition left the boot process really confused.
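To see the kind of value the repair tool was complaining about, one can read the boot sector of the partition image directly. The sketch below assumes the number in question is the "hidden sectors" field of the FAT/NTFS boot sector (a 4-byte little-endian value at byte offset 28, i.e. 0x1C), which records how many sectors precede the partition on the disk. That is an interpretation of the report, not something confirmed by the original notes, and the function name is made up.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative.

# hidden_sectors IMAGE
# Print the "hidden sectors" field from the boot sector at the start of IMAGE.
hidden_sectors() {
    # od prints the four bytes at offset 28 as unsigned decimals;
    # "set --" turns them into $1..$4 so they can be combined little-endian.
    set -- $(od -A n -t u1 -j 28 -N 4 "$1")
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}
```

Running hidden_sectors image.iso would print the start sector the file system expects. With 512-byte sectors, a value around 200000 corresponds to a byte offset of 200000 × 512 = 102,400,000 bytes, about the size of a small rescue partition sitting in front of the Windows one.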

Disk management overview of the partitions

Note that the boot repair tool was able to pinpoint the issue, but neither the tool nor the examples and documentation I was looking at offered a solution. I tried a couple of ways of re-creating the MBR by overwriting it, but no matter what I tried it always messed up the disk so that no program could tell what partitions or file systems it contained anymore.

After banging my head against that wall for a while, it struck me that if it needed that partition layout, why not set it up that way? I had a recovery CD, created when the laptop was new. (It seemed more like a clean install than a recovery, but that suited me even better.) So the plan was: do a recovery install in the VM, get the same partition layout and then simply replace the second partition. This actually worked as expected. Replacing the content of the second partition was easy. I just booted the virtual machine with the newly installed hard drive, the copied hard drive and an Ubuntu live CD to move things from one to the other. As it turned out, it makes a difference whether you copy all the files or all the bytes of the partition with dd. The former worked and booted, but strange dialogs popped up when logging in. Copying the files should have replaced and included everything, so I don’t really understand the issue here. However, going back and overwriting everything with the raw bytes worked much better.

So I now had a clean install with a fresh partition 1 and a salvaged partition 2 copied over. The VM booted, everything was loading and I got to the login screen. A little detour here, before we get to the end: I was unable to remember my original password. I tried the most likely variations and then some rather unlikely ones without any luck. While I had successfully moved the content of the disk, I was unable to access it.

I considered password cracking for a while, but that would mean spending time on brute-forcing it, which I’d rather avoid. While looking around for how to extract the username and password hash, I found that you don’t need to crack it: you can simply blank it out. This guide (while written for an older Ubuntu version) went through the details. In short: boot from an Ubuntu live CD, install chntpw, locate the SAM file and blank out the password for the account in question. After doing this and rebooting, I was automatically logged in and shown my glorious desktop with all previously installed programs.

This is also when I discovered that if you convert a physical hard drive to a virtual one in a VM, Windows will count that as a hardware change and require a license re-activation. This would be no different if I had done a clean install, but I had hoped it could be avoided by converting the existing install.

In conclusion:
Yes, this can be done: duplicate the disk content, convert it to a format the virtual machine program reads and you can plug it into a VM. Apart from being a fun hobby project, though, it seems easier and less time-consuming to create a fresh install and then set up the same programs and configuration. If you do decide to convert a physical disk, make sure you create an image of the whole disk instead of a single partition, as this will save you lots of hassle in the long run. You can always clean up or wipe superfluous partitions in the VM afterwards.

My list of virtual machines

Thought I’d share the setup I have for virtual machines, how I use them to triage bugs and experiment with various software.

First a small digression, since the observant reader will notice I am using VirtualBox. When I first discovered and started playing around with virtual machines, I had a computer incapable of hardware-assisted virtualization. I discovered this rather quickly, since every virtualization solution I tried failed to work because they all required specific CPU features. After testing several solutions, I settled on VirtualBox because it also supported software-based virtualization. I’ve since replaced that machine, and while my current computer supports hardware-assisted virtualization I’m still using VirtualBox as it is straightforward and I am familiar with it. I did briefly try a couple of other solutions when I got my new computer, but didn’t find any obvious advantages they had over sticking with my existing setup.

Now, the machines. I have a set of the currently supported Ubuntu releases, organized by their code names. (Yes, I’m aware 11.04 reached end of life a while back.) They come in handy when confirming bugs or trying to track down in which release something broke (or got fixed). My main use case is: load up the release a bug was reported against, verify it is reproducible there, and then check whether it is also present in the latest development release.

All are kept more or less up to date, to make sure I have the latest version of libraries and other software when attempting to reproduce bugs. When I started triaging bug reports I used to simply install the software on my main system and check if the bug was reproducible there, though I quickly changed my approach for several reasons. Mainly because my main system wouldn’t easily allow me to test with multiple releases, but also in case my setup or set of installed packages would produce a different result than a system out of the box. The latter may not always be relevant, but there are some cases where it matters. For instance, say a program fails to run without a specific library which is not installed as a dependency. If I have already installed that library for other reasons, I wouldn’t be able to reproduce the issue. In cases like that it makes more sense to check what happens on a system out of the box.

In addition to the Ubuntu releases, I also run a couple of other systems. Arch Linux is nice, and since it is a rolling-release distribution it usually includes the latest version of programs and libraries before most other distros. It’s ideal for testing whether projects still work as expected with the latest version of their dependencies, or for trying out features in newer versions of programs. When a newer version of a library or compiler is released, it’s really convenient to be able to catch any issues early, before it ends up in the stable version of other distributions. In addition, Arch has a rather different philosophy and approach compared to Ubuntu, which is interesting to explore.

The Debian machine is running Sid (unstable). I keep it for mostly the same reason as Arch, being able to test the latest version of projects, plus what is in Sid will eventually turn into the next releases of Debian, Ubuntu (and related derivatives). As Ubuntu is based on Debian, it is of course also relevant for checking whether bugs are reproducible in both places, in case they should be forwarded upstream. As Debian is currently in freeze for the upcoming Wheezy release, there aren’t many updates these days though.

Oh, and there’s a Windows 8 preview I tried out for a bit when it became available. I’m pretty sure that will expire soon.