An encrypted filesystem using a loopback device

Security sometimes dictates encrypting data. This is most often associated with data "in transit" on a public network, since it usually travels through third-party hands. Despite the greater physical security of local data, encrypting it is sometimes desirable too, especially if it is removable data stored on a floppy, zip disk, or USB storage device like a thumb drive. Though mostly local, this data is not entirely so. Once it's in your pocket and you leave the office or house, it can fall into public hands just like network data.

When you want to encrypt local data, you have some granularity choices. You might want to encrypt a single file. Or you might want to encrypt all stored data on your machine. Or you might want to encrypt all data stored in a particular subdirectory or device. In every case the data must be encrypted before it is written, and a decryption step must be added to every retrieval. This can be manual and specific to what you store, perhaps file-by-file; or dynamic and transparent, specific to the place where you store it (e.g., anything entering a particular directory gets the treatment).

Linux offers something interesting called a loopback device that can be used to implement transparent, dynamic encryption for an entire filesystem. The filesystem can reside normally on a device. Or as a special case it can be placed inside a file (that's right, an entire filesystem inside an existing file). We are going to use a file-based loopback device here. Any other loopback device could be substituted. The file-based loopback arrangement performs a double trick.

 1. make a file behave like a device
 2. encrypt/decrypt storage and retrieval on it

The first trick is to make a normal disk file appear to be a block device like a disk. Whatever you normally write onto a disk to make it work-- filenames, directories, inodes, timestamps, content-- can be written into a file. Thereafter you can pretend the file is a disk and use it as such. The second, additional trick is to put an encryption engine at the door that will encrypt everything on the way into the file and decrypt it all on the way out. With the encryption done using a password or other key, the data becomes inaccessible in the hands of anybody who, say, finds your lost thumb drive and takes it home but has no idea what your password was.

First, please become familiar with the plain-vanilla loopback device. That was quick. "Plain-vanilla" means essentially "no encryption." Now for all the encryption flavors you can add. The command that makes the file act as a device is losetup, which offers an encryption option. To use it you choose an encryption algorithm and a password and pass them to losetup. It then applies that algorithm, with your password in the role of encryption key, to all data traffic entering or leaving the file. Everything going in gets scrambled; everything coming out gets unscrambled. Here's losetup's syntax:

 losetup   -e  <encryption algorithm>   <name of loopback device>   <name of file to associate with it>

What choices of algorithm do we have? This depends on your distribution. With most distributions, the algorithms are implemented as kernel modules (units of code optionally loaded into memory). The modules come in files with a "ko" extension. You can find them in /lib/modules/<kernel version>/kernel/crypto. Here they are from a Fedora Core 4 system:
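To see the equivalent list on your own machine, a small helper along these lines could be used. It is only a sketch: the function name list_crypto_modules is made up for illustration, and it assumes the modules are stored as uncompressed ".ko" files as described above (some systems compress them, in which case the pattern would need adjusting).

```shell
#!/bin/sh
# List the crypto modules shipped for a kernel, one name per line,
# with the ".ko" suffix stripped.
# Argument: module directory (defaults to the running kernel's crypto dir).
# NOTE: list_crypto_modules is an illustrative name, not a standard tool.
list_crypto_modules() {
    module_dir=${1:-/lib/modules/$(uname -r)/kernel/crypto}
    for module_file in "$module_dir"/*.ko; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$module_file" ] || continue
        basename "$module_file" .ko
    done
}

list_crypto_modules
```

Each name printed is a candidate to load with modprobe; not every module in that directory is necessarily a cipher usable for loopback encryption.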

The first step is to make sure the modules you'll need are available by loading them into memory. The modprobe command does that. You must at least load the cryptoloop module, since it implements the loopback encryption feature, plus the specific modules for whichever encryption algorithms you want to use. You can examine which modules are loaded in general with the lsmod command. You can see which crypto algorithms are available specifically by reading the contents of the virtual file /proc/crypto.
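As a rough sketch of that check, the helper below scans /proc/crypto for an algorithm by name. The function name crypto_available is hypothetical, and it assumes the usual "name : <algorithm>" lines in that file. The modprobe calls are left as comments because they need root and a kernel that still ships the cryptoloop module.

```shell
#!/bin/sh
# Succeed if the named algorithm appears in a /proc/crypto style listing.
# Arguments: algorithm name, optional file to read (default /proc/crypto).
# NOTE: crypto_available is an illustrative helper, not a standard command.
crypto_available() {
    algorithm=$1
    crypto_file=${2:-/proc/crypto}
    # Entries look like "name         : twofish", so compare field 3.
    awk -v want="$algorithm" '
        $1 == "name" && $2 == ":" && $3 == want { found = 1 }
        END { exit !found }
    ' "$crypto_file"
}

# Loading the modules first (as root) would look like:
#   modprobe cryptoloop
#   modprobe twofish
if crypto_available twofish; then
    echo "twofish is available"
else
    echo "twofish is not loaded"
fi
```

Running lsmod afterwards shows the same information from the module side rather than the algorithm side.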

Suppose we want to set this up. First we'll load the module containing code for the twofish encryption algorithm (modprobe command). Then we'll create a file to work with (dd command). We'll associate a device with it, specifying dynamic encryption with the twofish algorithm (losetup command). We'll mount this device for use (mount command). We'll write a file to it. Then we'll unmount and un-loop it (umount and losetup).

The preparatory steps are shown here:
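One way the preparation might look, offered as a sketch rather than a definitive listing. The helper name make_backing_file and the 10 MB size are assumptions; the file names "straight" and "scrambled" are the ones used in the rest of this walkthrough. The module loading is shown as comments since it requires root.

```shell
#!/bin/sh
# Create a zero-filled file to serve as the backing store for a loop device.
# Arguments: file name, size in megabytes.
# NOTE: make_backing_file is an illustrative name; the size is arbitrary.
make_backing_file() {
    file_name=$1
    size_mb=$2
    # bs is given in bytes (1 MB) so it works with both GNU and BSD dd.
    dd if=/dev/zero of="$file_name" bs=1048576 count="$size_mb" 2>/dev/null
}

# Load the loopback-encryption code and the cipher (as root):
#   modprobe cryptoloop
#   modprobe twofish
#
# Then create one file per device:
#   make_backing_file straight 10
#   make_backing_file scrambled 10
```

The files only need to be large enough to hold a filesystem plus whatever you intend to store in it.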

Now we create, mount, and write to two loopback devices, each using a file as its backend data receptacle. At creation time each device is associated with its file, and one of them is set to be encrypted with the twofish algorithm, for which the user establishes a password. Read/write requests directed to a device are connected to its file by the loopback system. In effect, the file becomes the device and the device becomes the file. In the case of the "made-for-encryption" device there is an extra processing layer whereby all data read or written passes through an encryption stage. The first occasion for that, below, is where mkfs inscribes on each device the default ext2 filesystem structures (which actually end up in the devices' affiliated files, "straight" and "scrambled"). Those structures appear normally inside "straight" but are unrecognizable inside "scrambled".
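The sequence just described can be sketched as a dry run that prints each command instead of executing it, so it is safe to try without root. The device names /dev/loop0 and /dev/loop1, the mount points under /mnt, and the helper names run and setup_loop are all assumptions for illustration. It also assumes a losetup that still accepts the -e option, which newer systems may not.

```shell
#!/bin/sh
# Print a command, or execute it when DRY_RUN=0 (default is print only).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Attach a file to a loop device (optionally encrypted), make a
# filesystem on it, and mount it.
# Arguments: loop device, backing file, mount point, [algorithm]
setup_loop() {
    loop_device=$1
    backing_file=$2
    mount_point=$3
    algorithm=${4:-}
    if [ -n "$algorithm" ]; then
        # With -e, losetup asks for the password interactively.
        run losetup -e "$algorithm" "$loop_device" "$backing_file"
    else
        run losetup "$loop_device" "$backing_file"
    fi
    run mkfs "$loop_device"          # mkfs defaults to ext2
    run mount "$loop_device" "$mount_point"
}

setup_loop /dev/loop0 straight /mnt/straight
setup_loop /dev/loop1 scrambled /mnt/scrambled twofish
```

Setting DRY_RUN=0 and running as root would execute the same commands for real; the only difference between the two devices is the -e twofish on the losetup line.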

To do a comparative test of the normal versus encrypted devices, you can write known text into each, then search for it in the two files. You'll find it in the unencrypted file but not the encrypted one. Above, the phrase "hello world" is placed in a file named hello.txt on each device (redirected "echo" command). The loopback devices are then undone-- the files lose their special role and go back to being ordinary files (umount/losetup commands). We know we wrote the word "hello" onto both devices, so we search for it in their backing files (grep command). It's found in the unencrypted file straight ("Binary file straight matches") but not in the encrypted file scrambled (grep returns silently). The file scrambled contains a transformed, ciphertext version of "hello" rather than the word itself.
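The search step can be wrapped in a small script, sketched here under the assumption that both devices have already been unmounted and detached. The helper names contains_plaintext and report_plaintext are invented for this example; the underlying tool is ordinary grep, used with -q so that only the exit status matters.

```shell
#!/bin/sh
# Succeed if the file contains the literal string, even amid binary data.
# Arguments: file, string.
contains_plaintext() {
    grep -qF -- "$2" "$1"
}

# Print one result line per backing file.
# Arguments: string to look for, then one or more files.
report_plaintext() {
    needle=$1
    shift
    for backing_file in "$@"; do
        if contains_plaintext "$backing_file" "$needle"; then
            echo "$backing_file: contains \"$needle\""
        else
            echo "$backing_file: no match"
        fi
    done
}

# After umount and losetup -d have released both devices:
#   report_plaintext hello straight scrambled
```

For the two files in this walkthrough, the expectation is a match in straight and none in scrambled.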

Where is the protection? If an interloper obtains your file, can he access your data? He can set it up as an encrypted loopback device as you did. He'll have to supply a password at that point, as you did. Any password he gives will be accepted and used: the command will still operate, and your file will be decrypted using his algorithm and password. However, if either one doesn't match yours, the data that comes out won't be what you put in. His password will recrypt, not decrypt. It will "operate," but it won't work. In particular, the mis-decrypted file content won't conform to the format of any physical filesystem (FAT32, ext, NTFS). So the point at which failure arises is when that garbage is presented to the mount command. Mounting is what the interloper will be unable to do. Consequently your data will be safely inaccessible to him.
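To make "won't conform to the format" concrete, here is a sketch of one check that a recognizable ext2 filesystem has to pass. The ext2 superblock begins 1024 bytes into the device and carries the magic number 0xEF53 at byte offset 1080. The helper name has_ext_magic is hypothetical, and this illustrates only one signature, not the full set of checks mount performs.

```shell
#!/bin/sh
# Succeed if a file or device carries the ext2/ext3/ext4 magic number.
# The 16-bit magic 0xEF53 is stored little-endian (bytes 53 ef) at
# offset 1080: superblock start (1024) plus field offset (56).
# NOTE: has_ext_magic is an illustrative helper, not a standard tool.
has_ext_magic() {
    magic=$(dd if="$1" bs=1 skip=1080 count=2 2>/dev/null \
        | od -An -tx1 | tr -d ' \t\n')
    [ "$magic" = "53ef" ]
}

# Example, with a hypothetical device attached via losetup -e:
#   if has_ext_magic /dev/loop1; then
#       echo "looks like an ext filesystem"
#   else
#       echo "no ext signature: wrong password or algorithm?"
#   fi
```

With the right password the decrypted view of the device shows the signature; with a wrong one those two bytes come out as effectively random data, so the check almost always fails.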