Thursday, February 28, 2008

Why don't I get spam?

I have an anti-spam trick. It won't work for most people, but there might be some people out there who are inclined to take advantage of it. For the rest, this might be educational.

The trick that I use depends on the fact that I have my own domain. That means I can run sendmail on my computer, and I can create email addresses quickly and easily. I will use the domains example.com and example.net for this document, as recommended in RFC 2606.

The basic idea is this: instead of having one email address, I have dozens. I create a new email address for every person with whom I exchange messages, as well as addresses for websites and companies when necessary. If an email address is accidentally revealed, or if one of the companies decides to start sending annoying amounts of unsolicited mail, I simply expire the email address and, if desired, contact the sending party to tell them about the new address. I don't have to contact all of my friends whenever I turn off one address, only the one person who uses that address to talk to me.

OK, how is this implemented? There are two things I have to do. First, I need my sendmail to accept the messages for the active addresses, and send them all to me. Second, I have to ensure that my outbound email has the correct Reply-To: address for the particular recipient of the message.

If you're familiar with sendmail, you can probably guess how I do the first thing. I set up a virtual user table. Here's the file used to make this work:

VERSIONID(` for version 01')
FEATURE(`nouucp', `reject')
FEATURE(`virtusertable', `hash /etc/sendmail/virtusertable')dnl
FEATURE(`genericstable', `hash /etc/sendmail/genericstable')dnl
FEATURE(`local_procmail', `/usr/local/bin/procmail')
FEATURE(`access_db', `hash -T<TMPF> /etc/mail/access')
define(`CERT_DIR', `MAIL_SETTINGS_DIR`'certs')dnl
define(`confCACERT_PATH', `CERT_DIR')dnl
define(`confCACERT', `CERT_DIR/CAcert.pem')dnl
define(`confSERVER_CERT', `CERT_DIR/MYcert.pem')dnl
define(`confSERVER_KEY', `CERT_DIR/MYkey.pem')dnl
define(`confCLIENT_CERT', `CERT_DIR/MYcert.pem')dnl
define(`confCLIENT_KEY', `CERT_DIR/MYkey.pem')dnl
Then, I create a file called /etc/mail/virtusertable.src. It contains entries similar to this (the addresses shown here are placeholders):

canary@example.com        error:nouser Spammers found this address
corr001ab@example.com     myself
corr002cd@example.com     myself
corr003ef@example.com     myself
shopping@example.com      myself
corr004gh@example.com     error:nouser Spammers found this address
corr005ij@example.com     myself

The addresses I create for regular correspondence are just successive numbers, plus an unpredictable sequence of two characters to avoid dictionary attacks.
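Here is a sketch of how such addresses can be minted. The counter file location, the address format, and the domain are all made up for illustration, not my actual setup:

```shell
#!/bin/sh
# Mint the next correspondent address: a running counter plus two
# random lowercase letters, to resist dictionary attacks.
next_address() {
    counterfile=${1:-$HOME/.addr-counter}
    n=0
    [ -f "$counterfile" ] && n=`cat "$counterfile"`
    n=$((n + 1))
    echo "$n" > "$counterfile"
    # two random lowercase letters drawn from /dev/urandom
    rnd=`tr -dc 'a-z' < /dev/urandom | head -c 2`
    printf 'corr%03d%s@example.com\n' "$n" "$rnd"
}

next_address
```

Each address this prints then gets its own "myself" line in virtusertable.src.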

Now, recall that sendmail doesn't read the virtusertable.src file, it reads another file called virtusertable.db. I've got a little Makefile in /etc/mail that I use to keep things up to date:
all : genericstable.db virtusertable.db mailertable.db aliases.db access.db

%.db : %.src
	makemap hash $* < $<

aliases.db : aliases
	newaliases

hup : all
	killall -HUP sendmail

Now, I can change the virtusertable file, and when it looks correct, issue (as root) the command:
make -C /etc/mail hup
This will update the appropriate database file, and send a SIGHUP to sendmail, telling that program to reload its databases.

So, that's the receiving side. How about sending? There may be a way to configure sendmail to rewrite the outbound addresses according to a database of recipients, but I haven't figured one out. Instead, I have written a bit of code for my email client, which is rmail mode in Emacs. Here are the relevant bits of Emacs Lisp:
(setq user-mail-address "toad@example.com")   ; placeholder address
(setq mail-specify-envelope-from t)

;; Map a recipient address to the From:/Reply-to: address I use with
;; that correspondent; the nil entry is the fallback. The addresses
;; below are placeholders -- the real table is much longer.
(setq outbound-address-alist
      '(("friend@example.net" "corr001ab@example.com")
        ("shop@example.org"   "corr002cd@example.com")
        (nil                  "fallback@example.com")))

(setq full-name "Winter Toad")

;; a function to parse out the header and send email as if from
;; different usernames. That way, I can obsolete a username if it
;; gets spam.
(add-hook 'mail-send-hook
          (lambda ()
            (narrow-to-region 1 (mail-header-end))
            (expand-mail-aliases 1 (mail-header-end))
            (re-search-forward "^To: ")
            ;; parse out the recipient address
            (let (recipient from-whom)
              (cond
               ((looking-at "\\([^ \t]*\\)$")
                (setq recipient (match-string 1)))
               ((looking-at "[^<]*<\\([^>]*\\)>$")
                (setq recipient (match-string 1))))
              (setq from-whom (or (cadr (assoc recipient outbound-address-alist))
                                  (cadr (assoc nil outbound-address-alist))))
              (beginning-of-line)
              (insert "From: " full-name " <" from-whom ">\n")

              (re-search-forward "^Reply-to: ")
              (let ((namestart (point-marker)))
                (end-of-line)
                (kill-region namestart (point-marker))
                (insert from-whom)))

            (narrow-to-region 1 (1+ (buffer-size)))))

What this does is insert a hook into the mail system that runs when I hit send. A bit of elisp locates the email address in the "To:" field and tries to match that string against the entries in outbound-address-alist. If it finds a match, it inserts the corresponding address into the "From:" and "Reply-to:" fields. If no match is found, or if there are multiple recipients, it uses the default fallback address.

It also sets the envelope sender address (from user-mail-address), which means that automated replies, such as sendmail daemon warnings and errors, will be delivered to that address. That address should be redirected in the virtusertable to some appropriate mailbox, so that you can be notified of problems at the recipient's end (though many systems no longer generate bounce messages, because of spam abuse).

Anyway, with all this, I get essentially no spam. Every few months I may get one message on one of my email addresses, typically one that I used for a forum post or to send a bug report or patch to a mailing list. I retire the address, set up a new one, and never get spam at that address again.

Some time later I'll describe the cryptographic certificates in the mail configuration, and how they allow secure relaying.

Tuesday, February 26, 2008

Installing in non-standard places

I mentioned earlier the possibility of choosing an install prefix like /usr/local/samba, which installs the Samba libraries in a directory that may not commonly exist on distribution-managed machines. One possible effect of this is that you may turn up bugs in configuration and compilation scripts of other packages.

A configure script for another package may accept arguments related to the location of Samba libraries and header files, but compiling the package with these options set might not work. This isn't very surprising: it's a compilation option that is probably rarely used, so bit rot has a tendency to set in. A change somewhere that accidentally breaks the compilation when Samba is installed in an unusual place might not be noticed for some time. By putting Samba in its own directory, you are setting yourself up to test a valid, but rarely exercised, option. You may find yourself submitting bug reports and patches to the package maintainers.

As I've said before, maintaining your box without a package manager and distribution is not easy. It's quite a bit more work, but it does force you to understand more about how the system is set up and what it's doing. For people who like the extra control and understanding this provides, this is a useful technique.

Sunday, February 24, 2008

Pharyngula readers in Ottawa

PZ over at Pharyngula reports that readers of his blog are meeting up in various places. Well, if there are any people in Ottawa who are interested in meeting, we can try to set it up here in the comments.

Any place I can get to by bus is fine with me, maybe a weekend lunch time? Possibilities would be
  • Lone Star at Baseline and Fisher
  • Sushi Kan at Baseline and Merivale
  • Some place in Chinatown
Or suggest somewhere else; I'm not very familiar with the spots to eat in the city, places where a group can sit, eat, and talk for a while.

Thursday, February 21, 2008

A Followup On Cryptographic Mounts, The Bad News

Previously, I discussed cryptographic mounts to hold sensitive data. It's worth pointing out an article that is making the rounds today, by nine authors from Princeton, in which the researchers describe an attack on disk-encryption techniques, including the one I've described.

The technique relies on the fact that modern memory can retain its information for several minutes after the computer stops sending it refresh signals. What this means is that a person with physical access to the computer can pull the power connector from the computer and then remove the memory chips, insert them in another computer, and read the cryptographic keys out of the memory. I don't know of a good way to avoid this attack. If the cryptographic volumes are mounted when the computer falls into the hands of the attacker, the data will be, in theory, recoverable.

So, what can be done to prevent the key from being resident in the computer's memory at the instant that the attacker unplugs it? The key has to be available to the operating system so that it can read and write that data in normal operation. Sure, you could get specially modified hardware that deliberately overwrites the main memory from batteries when the power connector is removed, but maybe there's a way to store 128 bits somewhere other than in main memory?

A cache line on a modern CPU is 64 bytes, big enough to hold two 128-bit keys. Could the operating system subvert the hardware's L1 caching mechanism sufficiently to pin a value in the cache and remove it from L2 and main memory? This attack won't recover data from the L1 cache, so if that's the only place the key is kept, maybe that would be enough. You sacrifice a cache line, but maybe it's worth it?

How about the TLB? That's another part of the CPU that holds data, and that one is explicitly designed to interact with the operating system. Could we find a way to store 128 bits in parts of the TLB, and then deliberately avoid overwriting them? Can the operating system read those numbers back out of the TLB?

Are there any registers that could be used? Probably not on 32-bit x86; there aren't many registers there. On 64-bit you'd probably have to use a special-purpose compiler to keep these registers from being touched by a context switch, and to keep them from being saved to memory when an interrupt handler runs.

What if you have fifteen keys, all of 128 bits? Well, I believe we could handle that if we had 256 bits of volatile storage space. The first 128 bits of volatile space holds an XOR key, that decodes all of the fifteen keys. The second 128 bits of volatile space holds the decoded key in active use.
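As a sanity check on the arithmetic, here is the scheme in bash, with hex strings standing in for the two volatile storage slots. All key values here are invented:

```shell
#!/bin/bash
# XOR two equal-length hex strings, byte by byte.
xor_hex() {
    local a=$1 b=$2 out="" i x
    for ((i = 0; i < ${#a}; i += 2)); do
        printf -v x '%02x' $(( 0x${a:i:2} ^ 0x${b:i:2} ))
        out+=$x
    done
    echo "$out"
}

mask=0123456789abcdef0123456789abcdef    # slot 1: the 128-bit XOR mask
key=00112233445566778899aabbccddeeff     # one of the fifteen disk keys
masked=$(xor_hex "$mask" "$key")         # this form can sit in ordinary RAM
active=$(xor_hex "$mask" "$masked")      # slot 2: the decoded key in use
echo "$active"                           # recovers the original key
```

Only the mask and the single active key need to live in the protected slots; the masked forms are useless to the attacker without the mask.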

Those are my thoughts, anyway.

Wednesday, February 20, 2008

Choosing an install prefix

As noted in this posting, you generally will have to choose an install prefix for software that you are compiling yourself. Most packages you encounter will be configured to install under /usr/local, though some will be configured for /usr.

The first thing you'll want to do is to see if you already have an older version of the software installed anywhere. If the software was previously installed under /usr/local, and you install the new package under /usr, not only will you needlessly consume disk space, but the version that is run will depend on the setting of your PATH environment variable. A user may report that he can't use a certain feature in the new version, and it may take you a while to notice that his environment variable differs from yours, and that he's still running the old software. So, find the name of an executable that you expect will be installed. For example, if you're installing the binutils software, you will expect that the ld binary should be installed somewhere. Next, type the command:
which ld
to see where it is currently installed. If you see it in "/usr/bin/ld", then you'll probably want to use a prefix of "/usr", so that your new versions install over top of the old ones. If, on the other hand, it's in "/usr/local/bin/ld", you'll want a prefix of "/usr/local".
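The prefix is just that path with the trailing /bin/<name> removed, which the shell can do directly. The path below is a stand-in for whatever `which ld` reports on your machine:

```shell
#!/bin/sh
# Derive an install prefix from an executable's location.
path=/usr/local/bin/ld     # example output of `which ld`
prefix=${path%/bin/*}      # strip the trailing /bin/<name>
echo "$prefix"             # -> /usr/local
```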

Sometimes a package installs only one or a few binaries. You may decide to install this into its own directory. For example, I install firefox into the prefix /usr/local/firefox, SBCL into the prefix /usr/local/sbcl, and the apache httpd into /usr/local/apache2. These get their own directories because, while they may install a very small number of executables, they come with a large set of ancillary files. Rather than installing over top of the old directory, I move the old directory to a new location, say "/usr/local/sbcl.old", and then install and test the new version. If the new version doesn't work properly, I can revert to the old one by deleting the new install and renaming the ".old" directory. Alternatively, I can compare the two installations, the previously working one against the new one, and see if there are any obvious differences that could account for problems.

Of course, you probably won't be able to type the command firefox and expect it to run if it's installed in /usr/local/firefox/bin/. You will either want to add that directory to the PATH variable, or, more conveniently, put a symbolic link to the appropriate executable from a directory that is in your PATH. This command:
ln -s /usr/local/firefox/bin/firefox /usr/X11/bin/firefox
puts the firefox executable into your PATH, piggy-backing on the /usr/X11/bin entry that is probably there already. Note, however, that if you re-install X11 (we'll get to that in another posting), you might destroy this symbolic link, and you'll have to re-create it then.

So, you really have a couple of choices: put the program into a standard place like /usr or /usr/local (and if upgrading, try to install over top of the old version by using the same prefix that was used before), or install the software in its own dedicated directory, like /usr/local/firefox or /usr/local/sbcl.

Now, when you set the prefix in an autoconf configure script, it also sets a number of derived values which can be separately overridden. Configuration files are, by default, put in <prefix>/etc, libraries in <prefix>/lib, headers in <prefix>/include, man pages in <prefix>/share/man (sometimes omitting the 'share' component), log files in <prefix>/var/log, and so on. The configure program lets you override these defaults separately, so that you can put configuration files into, say, /etc/http with the option "--sysconfdir=/etc/http", and so on. Think carefully about whether you want these additional directories to keep their defaults. You probably don't want your X-server log to be in /usr/X11/var/log; nobody will know where to look for it.
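For instance, a hypothetical apache build that keeps its bulk under its own prefix, but puts configuration and state in conventional places, might be configured like this (all paths illustrative):

```shell
./configure --prefix=/usr/local/apache2 \
            --sysconfdir=/etc/http \
            --localstatedir=/var
```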

Compiling and installing by hand

If you're not using a package manager, or if you are, but there is no package available for a piece of software you'd like to install, you'll find yourself compiling the software by hand. Generally, you start by locating the official web page of the software, downloading an appropriate version of the source code, and extracting the tar file to a directory somewhere.

At this point in the process, you are not doing anything as the root user. You'll become root much later in this process.

The next thing you'll do is look in the top level of the extracted directory for promising looking files, like README, INSTALL, or Makefile. It is likely that you will see an executable script called "configure". It's always a good idea to start by looking at the README and INSTALL files, if present. They may be in the toplevel directory, or in a documentation directory, which will often have a name like "doc", "docs", or "documentation", possibly with different capitalizations.

If The Package Came With A Makefile

If there's a Makefile in the toplevel, that's usually because the software package is fairly small. You will want to look over the Makefile to ensure that it is correct for your intended installation. The most important things to look for are the installation directory and any optional features that might have to be turned on by editing the Makefile. If you can't find the installation directory, type the command:
make -n install

This will ask "make" to print out the sequence of commands that it will be using to install the package. Since you haven't compiled anything yet, it will start with the sequence of commands required to compile your software, so look for the installation commands to occur near the end of the output generated by this command.
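To see the shape of that output, here is a throwaway Makefile run through make -n; everything in it is made up for illustration:

```shell
#!/bin/sh
# Build a toy Makefile in a temp directory and ask make to print,
# rather than run, the compile and install steps.
dir=`mktemp -d`
printf 'PREFIX = /usr/local\n\nprog: prog.c\n\tcc -o prog prog.c\n\ninstall: prog\n\tinstall -m 755 prog $(PREFIX)/bin/prog\n' > "$dir/Makefile"
touch "$dir/prog.c"
make -n -C "$dir" install
```

The compile command appears first, and the install command near the end, which is where you look for the installation directory.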

If your package came with a Makefile, you will now modify the Makefile if necessary, perhaps changing the installation directory of the product. You should do this before compiling, because sometimes character strings holding the pathnames of configuration files are inserted into the compiled binary, so changing the installation target after compiling may result in an installation that doesn't work correctly. Note that editing the Makefile will usually not force a recompilation of the objects under its control; that is, the Makefile is not, by default, treated as a dependency of its own targets.

After this, you will, still as your non-root user, compile the package. This is usually done by simply entering the command make. If errors are encountered during the compile, you'll have to figure out what happened and how to fix it. The most common causes of errors are:
  • missing include files - you might have to add a "-I<directory>" to the CFLAGS, CXXFLAGS, or CPPFLAGS variables in your Makefile.
  • missing libraries - you might have to add a "-L<directory>" to the LDFLAGS variable in your Makefile.
  • bad version - the compilation may depend on a library you have on your machine, but the version you have may not be compatible with the software package. You might have to download a different version of that library and install it before you can continue with the software package.
  • apparent code errors - the compiler may generate errors about missing variables, bad function declarations, or syntax errors. Resist the urge to correct these immediately, and try to understand why you are seeing them. Remember, this package probably compiled for somebody before they released it; why doesn't it work for you? Is your compiler a different version, one that flags as errors things that used to be warnings? Is the Makefile configured for the wrong architecture or platform? Something else?
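For the first two cases, the flags can often be supplied on the make command line without editing the Makefile at all, as in this hypothetical example for a library installed under /usr/local/samba:

```shell
make CFLAGS="-I/usr/local/samba/include" \
     LDFLAGS="-L/usr/local/samba/lib"
```

Variables assigned on the command line override assignments in the Makefile itself, so this is a good way to test a fix before committing it to the file.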
Once you get a clean compile, you're almost ready for the install. I usually prefer to run the command
make -n install | less
once and read through the output, just to make sure that the install isn't going to do something weird. Look for things like configuration files going into /usr/etc, which might not be what you expect, or binaries going into /bin (you should try to keep in that directory only those executables that are necessary to get the computer to boot through its startup scripts up to the point where the network starts up).

At this point, move down to the section of the text called "Installing The Software".

You Have A "configure.in" File, But No "configure" Script

If you have a "configure.in" (or "configure.ac") file, but no "configure" script, you'll have to generate the configure script. If there is an executable in this directory with a name like "autogen.sh", run it. This should be sufficient to set up the configure script. If you don't have an autogen script, you should run the commands automake then autoconf. This will often generate warnings, but unless the configure script you generate doesn't run, you can ignore them. Now that you have a configure script, continue to the next section.

You Have A "configure" Script

If you generated the configure script yourself, you know that it's an autoconf configure script. Sometimes, though, software is produced that has a completely different script that happens to be called "configure". This can be confusing if it doesn't recognize the switch "--help". Start by typing:
./configure --help | less
and look at the output. If it produces a list of options that are available to you, review them carefully and see if there are any optional behaviours that you would like to turn on, or unwanted options that you want to remove (possibly you don't have library support for these, and don't need them). If, instead, the configure script appears to run and do things, you don't have an autoconf configure script, go back and look at the documentation again to see how to use their particular configuration script.

There are a few things to look at in the options you get from "configure". One of them is the prefix location; choosing that properly can require some care, as discussed in "Choosing an install prefix". For now, let's assume that you've chosen a set of options that look suitable. Re-run the configure script with those options, and without "--help". It will do some things, and may take a considerable amount of time to run. Eventually, the script should exit, sometimes printing a list of all options and whether or not they are active. Examine this list if it is present; an option that you want to enable may have been turned off because the configure script failed to find a particular library, in which case you'll have to figure out why that option was disabled and how to get it working. When you're satisfied with the compilation options, type "make". If an error is encountered, see the possibilities mentioned in the earlier section on building from a Makefile. If you succeed in compiling the software package, go to the next section, "Installing The Software".

Installing The Software

Now, you can become the root user. Change directory to the location where you compiled the binary, and run
make install
If the thing you're installing has any shared objects (libraries, usually with names that end in ".so", possibly followed by more dots and numerals), you should type
ldconfig
to make sure that the dynamic linker knows where to find the libraries you've just installed.

Many packages these days produce a pkg-config file. This is usually a filename that ends in ".pc", and is installed in a directory like ".../lib/pkgconfig/". The pkg-config application often looks for these files when "configure" is being run, but it has a fairly definite idea of where to look. If your .pc file was installed into a directory where pkg-config doesn't normally look, you'll have to find some way to make this file visible to that program. There are three ways you can handle this:
  • Add the appropriate directory to the system-wide environment variable PKG_CONFIG_PATH. Usually this means editing /etc/profile. You likely want it set at least to "/usr/lib/pkgconfig:/usr/local/lib/pkgconfig:/usr/X11/lib/pkgconfig", but you may want to add more search directories to it, if you expect many packages to be installed in the same prefix.
  • Copy the .pc file into a directory that pkg-config searches. This is unwise: you may install another version of the software some time later, and unless you remember this step, your .pc file will still be the old one, causing much aggravation as "configure" insists you still have a version of the package that you know you just replaced.
  • Put a symbolic link to the file from a directory that is searched by pkg-config. Do this if you've got only one or two .pc files in this prefix, and don't expect to put in more.
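Concretely, the first and third options look like this; the samba prefix and the .pc filename are hypothetical:

```shell
# Option 1: extend the search path, usually in /etc/profile
export PKG_CONFIG_PATH=/usr/lib/pkgconfig:/usr/local/lib/pkgconfig:/usr/local/samba/lib/pkgconfig

# Option 3: a single symlink from a directory pkg-config already searches
ln -s /usr/local/samba/lib/pkgconfig/smbclient.pc /usr/local/lib/pkgconfig/smbclient.pc
```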
Test your newly-installed software. It's better to find problems now, when you've just finished installing it and remember what you did, than two weeks from now, when you'd have to go through the whole thing again just to figure out how it's set up.

Two more hints: "configure" writes its command line into a comment near the top of the file "config.log". If you need to remember how you last ran "configure", you will find the options you used there.

If you have a particularly detailed set of configure options, you might want to record them in a directory somewhere for future reference, both to see quickly what options you enabled when you compiled the software and to re-use the command the next time you recompile it after downloading a new version.

Tuesday, February 19, 2008

Making blogged source code readable

Just a quick note: I've reformatted my source code examples to make them more readable.

Monday, February 18, 2008

Keeping sensitive data on the crypto disks

Previously, I described how to create one or more cryptographic partitions. The data stored on those partitions is not retrievable without the 32-digit hexadecimal key that protects it, the key being constructed from a passphrase input by the user. It may seem that this is sufficient to protect sensitive data, so long as you create and edit your files only on that partition. However, there are some subtle details that have to be kept in mind.

Information stored on an unencrypted ext2 or ext3 partition has an unknown persistence. A file that was stored there, and later deleted, may be partially or fully recoverable at some time in the future. To be sure of the confidentiality of your data, you have to make sure that it has never been stored to an unencrypted partition.

If you start up your favourite text editor, telling it to create a new file in some place, let's call it /crypto/sensitive.txt, and then start typing, you may expect that the data never lands on an unencrypted partition. However, there are at least four things to be careful of:
  1. The editor may store information in your home directory, which may not be on the encrypted partition. It might store some of the file contents there, or some file metadata: a table of recently visited filenames, say, with the last line number visited in each. Your editor might also be configured to store crash-recovery autosave files in a directory under your home directory.
  2. The editor may sometimes store the contents of a working buffer to a file in /tmp.
  3. The computer may come under memory pressure, resulting in some of your data being sent to the swap device.
  4. Your backups may not be as well protected as the files on the cryptographic disk.
The first two points are probably best addressed by ensuring that all of the directories writable by the unprivileged user are on cryptographic partitions. If you only have write permission to the crypto drives, you won't store any files in plaintext. Note, however, that you typically need /tmp to exist and be writable during the bootup of your system, so that partition can't be protected with a passphrase if you care about the system successfully performing an unattended reboot.

So, what do we do about /tmp? Well, one simple solution is an overmount. While you normally mount a partition onto an empty directory, it is legal to mount onto a directory that is not empty. The files that were present in that directory are mostly inaccessible after that (a process with access to file descriptors that it opened before the mount will still be able to operate on those files, but they will be invisible to new open operations by pathname).

We're assuming you have at least one cryptographic partition. So, create a directory on that partition, let's say /crypto/tmp. After you have formatted and mounted your cryptographic partition, run this command. You only have to do this once, the first time you set up cryptographic disks.
mkdir --mode=01777 /crypto/tmp

Now, you can add the following command to the end of the script in the previous post, the script that mounts your formatted disks:
mount --bind /crypto/tmp /tmp

After you've done this, the system will still boot up as usual, using its unencrypted /tmp partition. Then, the root user can run the script from the previous post, now modified to have this extra mount line on the end of it. After entering the passphrase the script will do its work and exit, at which time your /tmp partition will have been replaced with the one in /crypto. Note that if your system starts up in X, with a graphical login screen, you will have to restart it after you have overmounted /tmp, otherwise you will find that X programs fail to work at all. I usually restart X by issuing a simple "killall X" command, and letting the xdm or gdm program start it back up again. This is a lot of trouble, but all manner of things can be stored on your /tmp disk. Firefox will store downloaded files such as PDFs there when there is a helper application ready to use them.

That leaves us with swap. Encrypting the swap space is actually very easy:

# Encrypt the swap partition
hashed=`dd if=/dev/urandom bs=1 count=64 | md5sum | awk ' { print $1 } '`
dmsetup create SWP <<DONE
0 `blockdev --getsize /dev/hda6` crypt aes-plain $hashed 0 /dev/hda6 0
DONE
mkswap /dev/mapper/SWP
swapon /dev/mapper/SWP

This can run unattended during the bootup. It creates a random cryptographic key using /dev/urandom, a device designed to supply unpredictable random data even during a system bootup sequence. This random key is used to create an encrypted interface to /dev/hda6. It is formatted as a swap partition, and then enabled. A new key will be generated each time the system boots, so nothing in swap space will survive a reboot. Note that there do exist suspend-to-disk procedures for Linux that store a memory image on the swap partition. If you intend to use such a suspend system, you will have to ensure that it does not attempt to write to the cryptographic swap partition, or you'll have to defer mounting the swap partition until the root user can enter a specific passphrase, thereby allowing you to preserve the contents across a reboot. If you're supplying a passphrase to handle encryption on the swap space, you should not run mkswap, except the first time you set up the partition (think of mkswap as being a reformat).

The question of how to protect your backup copies of sensitive files is entirely dependent on what system you use for backups. You may be able to pipe your backups through the des binary, or you may be able to store the backups on encrypted filesystems, but there are too many variations for me to offer much advice here. The security of your backups is not something that can be ignored, as has been made all too obvious by the various data-disclosure scares that occur with alarming regularity when shipments of tapes or CDs fail to arrive at their destinations.
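As one sketch of the pipe approach, here is a tar stream run through openssl's symmetric cipher instead of the old des binary. The paths and the passphrase handling are illustrative only; a real setup should think harder about where the passphrase lives:

```shell
#!/bin/sh
# Round-trip a small backup through symmetric encryption.
src=`mktemp -d`; dst=`mktemp -d`; enc=`mktemp`
echo "sensitive" > "$src/notes.txt"

# backup: pipe the tar stream through aes-256-cbc on its way to disk
tar cf - -C "$src" . | openssl enc -aes-256-cbc -salt -pass pass:secret -out "$enc"

# restore: decrypt and untar
openssl enc -d -aes-256-cbc -pass pass:secret -in "$enc" | tar xf - -C "$dst"
cat "$dst/notes.txt"
```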


See my followup article for a warning about a vulnerability in this technique.

Sunday, February 17, 2008

Cryptographic mounts

Some of the data on my computers is stuff that I'd rather not let into the hands of a random stranger. Work-related files, proprietary data or source code, banking information, or other sensitive files. A laptop can go missing, an entire desktop computer can be carried away. It would be nice if the sensitive data were inaccessible in that event.

This leads us to cryptographic mounts. Partitions whose contents cannot be read without the knowledge of a secret that is not stored in the computer. I use a passphrase, but if you are the kind of person who memorizes 32 digit hexadecimal numbers, you can skip the passphrase. The appropriate features to enable in the kernel, either as modules or compiled directly in, are MD (the same subsystem that controls RAID) and two features in that subsystem, BLK_DEV_MD, and DM_CRYPT. You also need a cryptographic algorithm available. I use AES encryption on my partitions, but there are many others available. I have activated the CRYPTO_AES module, plus the appropriate architecture specific module, CRYPTO_AES_X86_64 for my desktop machine and CRYPTO_AES_586 for my laptop.

So, let's say you have one or more blank partitions that you'd like to set up as cryptographic partitions, all with the same passphrase. You start with this script:
#! /bin/sh

# Partitions, mapper names, and mount points; adjust for your system.
partition=/dev/sda6
partition2=/dev/sda7
mapname1=Crypto1
mapname2=Crypto2
mtpt1=/crypto
mtpt2=/crypto2

echo -n "Enter the passphrase: "
read -s oneline

# Hash the passphrase inside a here document, so the secret never
# appears on a command line visible to ps.
{ hashed=`md5sum | awk ' { print $1 } '` ; } <<DONE
$oneline
DONE

dmsetup create $mapname1 <<DONE
0 `blockdev --getsize $partition` crypt aes-plain $hashed 0 $partition 0
DONE

dmsetup create $mapname2 <<DONE
0 `blockdev --getsize $partition2` crypt aes-plain $hashed 0 $partition2 0
DONE
What this script does is to prompt the user for a passphrase, without echoing it to the screen. Once the passphrase is entered, it is converted to a 32 character hexadecimal string with the MD5 program. I use a here document, marked with the << characters, because that way the hexadecimal string does not appear in the process status list. Simply using echo risks having the secret visible to any user who types ps at the correct moment. Then, the dmsetup program creates the cryptographic mapping, using the hex sequence as the cryptographic key.

You will have to change the values of the $partition and $partition2 variables to correspond to those on your system. Note that volume labels are unavailable, because the system can't read the label off a cryptographic partition before the passphrase has been supplied.

Run this script, entering the passphrase. It's important that you do this through the script, and not manually at the command line, because later you'll modify the script to mount your cryptographic partitions, and you want to ensure that exactly the same code read your passphrase when you created the partitions as will read your passphrase when you try to mount the partitions after a reboot some time in the future.

When the script exits, you will have two new objects in the /dev/mapper directory. In this case, they are /dev/mapper/Crypto1 and /dev/mapper/Crypto2. So, in this example, /dev/sda6 is the encrypted volume, and /dev/mapper/Crypto1 is the decrypted view of the same volume. You do all of your work on /dev/mapper/Crypto1. You format and mount that device, never /dev/sda6.

This command will create an ext3 filesystem with none of the blocks reserved for the superuser:
/sbin/mke2fs -j -m 0 /dev/mapper/Crypto1

Now, you can mount /dev/mapper/Crypto1 onto a mount point, and start copying files there as usual. Until you remove the cryptographic mapping, the data is available as a normal mounted partition. So, we now append some code to the script above to allow the partitions to be mounted by the root user after a reboot. Take the script above and add the following lines to the bottom:
# Change these to match your system:
mtpt1=/crypt1
mtpt2=/crypt2

/sbin/e2fsck -y /dev/mapper/$mapname1 || \
{ dmsetup remove $mapname1 ; echo "" ; echo "fsck failed" ; exit 1 ; }

/sbin/e2fsck -y /dev/mapper/$mapname2 || \
{ dmsetup remove $mapname1 ; dmsetup remove $mapname2 ; \
echo "" ; echo "fsck failed" ; exit 1 ; }

mount -onodiratime /dev/mapper/$mapname1 $mtpt1 || \
{ dmsetup remove $mapname1 ; dmsetup remove $mapname2 ; \
echo "" ; echo "Failed" ; exit 1 ; }

mount -onodiratime /dev/mapper/$mapname2 $mtpt2 || \
{ umount $mtpt1 ; \
dmsetup remove $mapname1 ; dmsetup remove $mapname2 ; \
echo "" ; echo "Failed" ; exit 1 ; }
echo ""
echo ""

This runs fsck on the partitions, if necessary (remember, the boot-time fsck pass driven by /etc/fstab can't check these partitions, because the passphrase isn't available until this script runs). Note that if you entered the wrong passphrase, you'll find out at this point, when e2fsck fails to identify the mapped device as an ext2 or ext3 partition.

It then manually mounts the cryptographic partitions onto the mountpoints in $mtpt1 and $mtpt2. In the event of a mount failure, it unmounts everything and removes the cryptographic mappings.

The next time the computer is rebooted, the root user will have to run this script and enter the correct passphrase before the data on those drives is readable. If somebody else obtains your laptop, any mounted cryptographic partitions will be unavailable if the computer is rebooted, or the drive removed from the laptop and inserted into another machine.
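Going the other way, when you want to lock the data without rebooting, the mappings have to be removed by hand. Here is a dry-run sketch using the hypothetical names from above; the run function just prints each command, so remove it to execute them for real:

```shell
#!/bin/sh
# Tear down the cryptographic mounts: unmount first, then remove the
# device-mapper mappings so the plaintext view disappears.
run() { echo "+ $*"; }

mtpt1=/crypt1 ; mtpt2=/crypt2        # hypothetical mount points
mapname1=Crypto1 ; mapname2=Crypto2

run umount $mtpt1
run umount $mtpt2
run dmsetup remove $mapname1
run dmsetup remove $mapname2
```

Once the mappings are removed, the passphrase is needed again before anything on those partitions can be read.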

This is only half the story. In a later post I'll describe the care you have to take to make sure your sensitive data does not wind up as readable plaintext somewhere on your filesystem.

Why do I have so many hard drives?

There are five hard drives in my main computer. There is no RAID setup. Why?

Hard drives fail. I've had the drive holding my root partition fail more than once. When that happens, I used to restore from backup. I would make a backup tape at least once a week, but a badly timed disk failure could still result in the loss of a lot of work.

My solution to this has been to buy my hard drives in matched pairs. I partition them equally, format them the same way, and install them both in the computer. One of them is the live disk, the other is the spare. The spare is kept unmounted and spun down. Every night around 3:00 AM, a cron job spins up the spare drives. Then, one partition at a time is fsck-ed, mounted, and copied to. The shell script uses rdist to synchronize the contents of the two partitions. Finally, I take special care to make the backup drive bootable. I use the LILO boot loader, so, when the root partition is mounted under /mnt/backup, the script executes the command:
/sbin/lilo -r /mnt/backup -b /dev/sdc

which, on my system, writes the LILO boot magic to the spare boot drive, which appears as /dev/sdc while it is the spare in my system. My lilo.conf file, on both the live system and the spare, refers to the boot drive as /dev/sda, but this '-b' switch overrides that, so that the information is written to the boot block of the current /dev/sdc, but written so that it is appropriate for booting the device as /dev/sda (which it will appear to be, should my live boot drive fail and be removed from the system).
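For reference, the relevant part of a lilo.conf set up this way might look like the following sketch (the kernel image name and root partition are hypothetical). The same file lives on both drives; only the '-b' switch at backup time differs:

```
boot=/dev/sda          # overridden by -b /dev/sdc when writing the spare
image=/boot/vmlinuz    # hypothetical kernel image
    label=linux
    root=/dev/sda1
```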

Next, I use volume labels to mount my partitions. You can't have duplicate labels in the system, so my spare drive has labels with the suffix "_bak" applied. That means that the /etc/fstab file suitable for the live drive would not work if the spare were booted with that fstab. To solve this problem, the copying script runs this command after it finishes copying the files in /etc:
sed -e 's|LABEL=\([^ \t]*\)\([ \t]\)|LABEL=\1_bak\2|' /etc/fstab > /mnt/backup/etc/fstab

which has the effect of renaming the labels in the fstab to their versions with the _bak suffix, so they match the volume labels on the spare hard drive.
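The substitution can be sanity-checked on a sample line; bak_fstab is a hypothetical wrapper around the same sed expression:

```shell
#!/bin/sh
# Append "_bak" to every LABEL= value followed by whitespace,
# exactly as the backup script does to the copied fstab.
bak_fstab() {
  sed -e 's|LABEL=\([^ \t]*\)\([ \t]\)|LABEL=\1_bak\2|'
}

echo 'LABEL=root / ext3 defaults 1 1' | bak_fstab
# prints: LABEL=root_bak / ext3 defaults 1 1
```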

OK, that sounds like a lot of work, why do I do it? What does it buy me?

First of all, it gives me automatic backups. Every night, every file is backed up. When I go to the computer at the beginning of the day, the spare drive holds a copy of the filesystem as it appeared when I went to sleep the night before. Now, if I do something really unwise, deleting a pile of important files, or similarly mess up the filesystem, I have a backup that I haven't deleted. If I were to use RAID, deleting a file would delete it immediately from my backup, which isn't what I want. As long as I realize there's a problem before the end of the evening, I can always recover the machine to the way it looked before I started changing things in the morning. If I don't have enough time to verify that the things I've done are OK, I turn off the backup for a night by editing the script.

Another important thing it allows me to do is to test really risky operations. For instance, replacing glibc on a live box can be tricky. In recent years, the process has been improved to the point that it's not really scary to type "make install" on a live system, but ten years ago that would almost certainly have confused the dynamic linker enough that you would be forced to go to rescue floppies. Now, though, I can test it safely. I prepare for the risky operation, and then before doing it, I run the backup script. When that completes, I mount the complete spare filesystem under a mountpoint, /mnt/chroot. I chroot into that directory, and I am now running in the spare. I can try the unsafe operation, installing a new glibc, or a new bash, or something else critical to the operation of the Linux box. If things go badly wrong, I type "exit", and I'm back in the boot drive, with a mounted image of the damage in /mnt/chroot. I can investigate that filesystem, figure out what went wrong and how to fix it, and avoid the problem when the time comes to do the operation "for real". Then, I unmount the partitions under /mnt/chroot and re-run my backup script, and everything on the spare drive is restored. Think of it as a sort of semi-virtual machine for investigating dangerous filesystem operations.
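The chroot-into-the-spare sequence looks roughly like this dry-run sketch, with hypothetical labels and mount points; the run function prints the commands instead of executing them:

```shell
#!/bin/sh
# Mount the spare's partitions under /mnt/chroot and chroot into them,
# so a risky install hits the backup copy instead of the live system.
run() { echo "+ $*"; }

run mount LABEL=root_bak /mnt/chroot
run mount LABEL=home_bak /mnt/chroot/home
run mount -t proc proc /mnt/chroot/proc
run chroot /mnt/chroot /bin/bash
```

Typing "exit" in that shell drops you back onto the live system, with the aftermath of the experiment still mounted for inspection.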

The other thing this gives me is a live filesystem on a spare drive. When my hard drive fails (not "if", "when", your hard drive will fail one day), it's a simple matter of removing the bad hardware from the box, re-jumpering the spare if necessary, and then rebooting the box. I have had my computer up and running again in less than ten minutes, having lost, at most, the things I did earlier in the same day. While you get this benefit with RAID, the other advantages listed above are not easily available with RAID.

Of course, this is fine, but it's not enough for proper safety. The entire computer could catch fire, destroying all of my hard drives at once. I still make periodic backups to writable DVDs. I use afio for my backups, asking it to break the archive into chunks a bit larger than 4 GB, then burn them onto DVDs formatted with the ext2 filesystem (you don't have to use a UDF filesystem on a DVD, ext2 works just fine, and it's certain to be available when you're using any rescue and recovery disk). Once I've written the DVDs, I put them in an envelope, mark it with the date, and give it to relatives to hang onto, as off-site backups.

So, one pair of drives is for my /home partition, one pair for the other partitions on my system. Why do I have five drives? Well, the fifth one isn't backed up. It holds large data sets related to my work. These are files I can get back by carrying them home from the office on my laptop, so I don't have a backup for this drive. Occasionally I put things on that drive that I don't want to risk losing, and in that case I have a script that copies the appropriate directories to one of my backed-up partitions, but everything else on that drive is expendable.
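That helper script can be as simple as this sketch; save_dirs and the directory names are hypothetical:

```shell
#!/bin/sh
# Copy a list of directories from the unbacked-up data drive onto a
# partition that the nightly job does back up.
# Usage: save_dirs DEST DIR...
save_dirs() {
  dest=$1 ; shift
  for d in "$@" ; do
    cp -a "$d" "$dest"/ || return 1
  done
}

# e.g. save_dirs /home/saved-data /data/keep-this /data/and-this
```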

There are two problems that can appear with large files.
  • rdist doesn't handle files larger than 2 GB. I looked through the source code to see if I could fix that shortcoming, and got a bit worried about the code. So I'm working on writing my own replacement for rdist with the features I want. In the meantime, I rarely have files that large, and when I do, they don't change often, so I've been copying the files to the backup manually.
  • Sometimes root's shells, even those spawned by cron, have ulimit settings. If you're not careful, you'll find that cron jobs cannot create a file in excess of some maximum size, often 1 GB. This is an inconvenient restriction, and one that I have removed on my system.

Breaking packages

I've used the term "breaking packages" a few times. As I said, I maintain my Linux boxes without a package manager. So, how did I get these Linux boxes?

My main Linux computer has over half a million files in its filesystem, and over 3000 separate executables. Where did they all come from? You need some way to start out, your computer isn't going to do much without a kernel, a shell, and a compiler.

In 1994, I installed Slackware on a 486-based computer. This computer had about 180 MB of hard drive space (nowadays that wouldn't even hold half of the kernel source tree) and 16 MB of RAM. At that time, Slackware didn't really have a package manager. It had packages, just compressed tar files of compiled binaries, grouped by function. If you weren't interested in networking, you didn't download the networking file. If you weren't interested in LaTeX, you didn't download that file. There were only a few dozen "packages", because of this very coarse granularity. Functions like "upgrade", "install", or "find the package owning a file" weren't present. An upgrade was the same as an install, just install the new package into the filesystem, and it would probably replace the old package. To find out which package provided a certain file, you could look in per-package lists of files.

So, I never really had a package manager on that system. When I needed new programs, I downloaded the source code, compiled it, and installed it. When I moved to a new system, I brought backup images or a live hard drive to the new computer. I didn't start with a blank hard drive, I started with the hard drive from the old computer I was replacing. Over the years, I have replaced every executable that was installed in 1994 (I know this is the case because all of the files installed then were a.out format, and I have only ELF binaries on my computer now).

Sometimes, though, I've started with a computer that had a distribution installed on it. At a previous job, my laptop came with Mandrake Linux installed on it. I tried to keep the distribution alive for a while, but eventually got impatient with the package management system and broke the packages.

So, if you give me a new Linux computer and tell me it's mine to modify, a good first step for me is to kill the package manager. On an RPM-based system, that's generally achieved by recursively deleting the directory /var/lib/rpm. After that, the rpm command will stop working, and I have the finer control and more difficult task of managing the box myself.

What do we have running on that box?

As I mentioned in my first post, when you install a distribution you sometimes have programs running without your knowledge. Because some users may need these features, you often get them as well.

Last week at the office, somebody came up to me to ask if I could figure out why it was no longer possible to log into one of the servers. This server has a history of flakiness, there's probably a bad memory module on the board, and sometimes it becomes unresponsive. So, my co-worker, upon realizing that he couldn't log in, had rebooted the computer. However, even after the reboot, he still couldn't log on, either as his regular user through SSH, or as root on the console.

The first step, before getting out of my chair, was to telnet to port 22 on the box. I got a "connected" message, and a text string indicating that I was attached to an SSH daemon. This told me that the kernel was alive, it was accepting new connections and passing them to the appropriate processes, which were themselves able to make forward progress. So, the box wasn't wedged. I went to the console, and tried to log in through the getty running on the text login screen. I entered 'root' at the username, and got a password prompt. When I entered the password and pressed ENTER, the getty process froze, and did not present me with a shell.

So, we had two very different authentication paths failing to allow logins: SSH sessions for regular users, and root logins on the console. Something seemed to be interfering with authentication in general. The first thought was that this might be a PAM problem, but it would have been a strange one: we didn't get authentication failure messages, we got a hang after authentication. Root's credentials were stored on the local drive, so it wasn't an LDAP issue, and in any case, the machine was on the network, and there weren't LDAP problems anywhere else in the office.

When multiple independent programs fail together, the next thought is that there's probably a full disk somewhere. If you fill up /tmp your system can start to behave very strangely. The login problems were a symptom of something, as yet unknown. So, the next thing to do is to check the hard drive to see if we had any full partitions. Because I didn't know what else might be misbehaving, I wanted to avoid all of the startup jobs, so I rebooted the machine with an appended kernel parameter, "init=/bin/bash". Instead of running the usual /sbin/init, and all of the various scripts under /etc/init.d, the computer would start up the kernel and then drop immediately to a root shell. No logins, no passwords, no startup scripts. I could then run 'df' at the prompt, and confirm that there were no partitions within 5% of full (remember that a default ext2 format will reserve 5% of the blocks for root, so a disk that's 96% full could actually be entirely full for some users). Checking with 'df -i' showed that we had not run out of inodes either.

So, what's next? I decided to reboot the machine into single-user mode so that I could easily modify files on the disk but still get onto the computer without a password. This is done by appending the parameter "S" on the kernel boot line. Again, I get a shell, but this time the disks are read-write mounted, and various services have started up. So, I modify the inittab. I replaced the getty on tty1 with /bin/bash. That means that when the computer is rebooted into multi-user mode, tty1 has a root shell while the other ttys are still running their gettys.
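The inittab change is a single line; before and after might look like this (field details vary between distributions, and a real setup may need redirections to attach the shell to the tty, so treat this as a sketch):

```
# /etc/inittab, before:
1:2345:respawn:/sbin/getty 38400 tty1
# after, giving tty1 a root shell with no login:
1:2345:respawn:/bin/bash
```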

Reboot into the usual multi-user mode. I have a root shell on tty1. I run "ps ax", and find the PID of the getty on tty2. Then, I run the command
strace -f -p <PID>

at the shell prompt of tty1. Changing virtual consoles to tty2 with the usual command, CTRL-ALT-F2, I am presented with a login prompt. I enter the username 'root', and enter the password. The program hangs. So, I change back to tty1 to see what strace has to say about the program. The last things the program did are on the screen. It opened a device called /dev/audit, did some things with it, then issued an ioctl() on the file descriptor. That ioctl was not returning to the caller, so the program was blocking waiting for a response from something associated with /dev/audit.

None of us had heard of /dev/audit, so it was time to do a bit of research. It turned out to belong to an auditing package included in the RHEL distribution installed on that computer. The device communicates with a daemon, and that daemon keeps logs, so I went to its logging directory to see what was there. I found 4 GB of data. Apparently the logs had reached some sort of internal limit, and the daemon responded by forbidding further auditable actions until some of the logs were removed by the administrator. Logins, being auditable actions, were blocked.

So, delete all of the logs in the directory, and reboot the computer. Everything returned to normal.

Now, a logging function like this is very useful for some users. There are some people who must know exactly who logged into the machine, what database entries they accessed or modified, and so on. We are not such people. A service we never knew about, enabled for everyone because it is useful to some, wound up locking us out of our own machine.

It's a security feature that logins are forbidden until the logs have been inspected and removed. If you're going to design a function like this, then this is the correct way to go about it. Of course, it was very easy for me to overcome this security feature with access to the console, but that's generally true. I probably would have set it up so that gettys are permitted to log in as root even when an audit failure occurs, but that level of flexibility may not be available, if the behaviour is driven by a special PAM module or a patched glibc.

Breaking packages on the MythTV box

As I mentioned earlier, I have a MythTV computer, installed from packages, but I've broken some of the packages. Here are some of the issues that I had with the packages, and how I solved them.

Two of the package manager drawbacks I've mentioned previously appear here: the one-size-fits-all approach to software packaging, and the failure to receive timely updates.

The MythTV box is on old hardware. Because it has hardware assistance for both MPEG encoding and decoding, I didn't need a new computer with a fast CPU. The fact that this is old hardware, with a 7-year-old BIOS, may be why I had problems, but I found it easier to break the packages than to try to solve the problems under the constraints of the package system.

First, the MythTV box controls an infra-red LED attached to its serial port, allowing it to change the channels on a digital cable box. This requires the use of the LIRC package, and the lirc_serial kernel module. Well, at the time I set this up, the lirc_serial module was having problems with the SMP kernel. The system would generate an oops quite regularly when it wanted to change channels. Looking at the oops logs, I could see that there were problems specifically with SMP. My MythTV box has only one CPU, so I didn't need an SMP kernel, but because some users will have SMP computers, the KnoppMyth distribution ships with an SMP kernel. I tried to find a non-SMP kernel for the system, without success. So, the easiest way to fix the problem was just to download a recent kernel source tree, copy the configuration file from the Knoppix kernel, and reconfigure it as non-SMP. The spontaneous reboots stopped occurring. The package manager still believes that it knows what kernel is running on the computer, but that isn't what is really installed.

When I installed the MythTV box, the software was still a bit immature, and a stability fix in the form of version 0.20 came out several months later. I waited a few weeks with no update to the distribution, and no word of when an update might become available. Eventually, I grew impatient and downloaded the source code of 0.20 myself, recompiled it on the MythTV box, and installed it over top of the existing programs.

There was one other impact of the one-size-fits-all approach that caused difficulties with the MythTV box. I was regularly recording a television show between 6:00AM and 6:30AM. A few minutes before the end of the show, the recording would have problems. The audio would break up, and the video would jump. It appeared that the program was losing frames of data, either because it was losing interrupts, or because it couldn't get the data to the disk quickly enough. Because it happened at about the same time every day, I expected it was probably a cron job. I got a root shell on the box, and asked for the list of all root-owned cron jobs with the command "crontab -l". This reported that there were no root-owned cron jobs. I mistrusted this result, and did more investigation. As I mentioned in the first post, distribution packagers often break up a configuration file into a set of separate files. They did that with cron jobs, which means that the command-line tool that ought to tell you all about root-owned cron jobs didn't report the full set of such processes. A bit of digging around in /etc showed that the slocate database update was being run at that time. This process scans the entire disk, making a list of the files on it. While probably useful in a general context, this is an unnecessary operation on an appliance box that isn't changing, particularly when it results in so much bus traffic that the primary function of the box is degraded. My solution was to change the /etc/crontab file (which is, itself, not viewed by "crontab -l") so that a cron job would be skipped if there were any users (reported by the 'fuser' command) of either of the two video input devices, /dev/video0 and /dev/video1.
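The guard can be factored into a small wrapper, sketched here; record_safe is a hypothetical name, and the device paths are the ones on my box:

```shell
#!/bin/sh
# Run the given command only if nothing currently holds either video
# capture device open; otherwise report that the job was skipped.
record_safe() {
  if fuser /dev/video0 /dev/video1 >/dev/null 2>&1 ; then
    echo "devices busy, skipping"
  else
    "$@"
  fi
}
```

In /etc/crontab itself the same test can be written inline, e.g. `fuser /dev/video0 /dev/video1 >/dev/null 2>&1 || /usr/bin/updatedb`.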

My hardware environment

I have two computers that I use for my work. One is an x86 laptop, a ThinkPad T42 with a built-in ATI video controller (Mobility Radeon 9600). The other is a quad-core x86_64 box with an NVidia card (GeForce 6600). My work involves a lot of scientific computation, sometimes multi-threaded, and I need hardware-accelerated 3D rendering to analyze the results. So, I'm running on two architectures, with two different video cards.

The laptop is fairly standard, so I won't discuss it further. My big box has the following hardware:
  • Intel DP35DP motherboard
  • Intel Core2 Quad CPU, Q6600, 2.4GHz per core
  • 4 GB RAM
  • Two 160 GB SATA disks
  • One 500 GB SATA disk
  • Two 120 GB EIDE disks

I'll discuss later why I have so many hard drives.

Because I sit next to this box all day, I've put a lot of effort into making it quiet. My laptop makes more noise than the big box.

Why not use a distribution and a package manager?

I have a few Linux computers, but they do not use a package manager. They're not "redhat" computers, or "debian", or "ubuntu". Once, 13 years ago, they were Slackware. Briefly. I administer these boxes manually, for lack of a better word.

Maintaining a Linux computer manually is a fair amount of work. Installing new software is not always trivial, and sometimes things break in subtle ways that may take some effort to debug. I plan to start recording my adventures here, in part so that I can come back and see what I did the next time I upgrade something and it misbehaves in a familiar manner. Because I do things manually, I tend to run into problems that the majority of Linux users don't experience. I often have to look on the web for answers to questions, so I hope my experiences can help out other people who, for whatever reason, come across one of these unusual problems.

What do I have against distributions and package managers? Nothing, really. They are very useful. I do have one computer that was installed from packages, a MythTV computer that I installed from a KnoppMyth CD. This is a good example of a place where package managers are useful. The computer is an appliance that I set up once, and then don't ever modify. It's not exposed to the Internet, and it isn't going to change much. I don't need to install new software on it, because it's a dedicated single-purpose machine that already does what I want it to do. And yet, I've "broken the packages" on the box. There are files ostensibly under control of the knoppix package manager that I have replaced with recompiled binaries, and which I am maintaining myself now. I'll talk about that in a later post.

Here are some of the things that I think are good and useful about distributions and package managers (note that there are some exceptions to these rules, but most package managers supply at least some of these benefits):
  • They supply the entire filesystem in compiled form, allowing a new computer to be set up and running in under an hour with reasonable defaults, usually after asking just a handful of questions.
  • They usually are associated with a good setup tool that can configure the software correctly for the hardware attached to your computer.
  • They have a good, general-purpose kernel with modules ready to handle many situations.
  • They keep track of dependencies to help to ensure that interdependent packages are correctly installed, so that the user doesn't end up with an installed package that fails to work correctly.
  • They provide a single location for access to updates and security fixes. A user can simply ask the package manager to do an "update to latest packages", and expect that they have all of the updates provided by the distribution.
  • If you have a dozen new computers to set up, possibly even on different architectures, it's not a very big job with the correct installation media available.
  • Probably most importantly, distributions and package managers provide an easy way for people to administer their Linux computer without having to become Linux experts. The computer is a tool used to perform other activities, and a distribution lets the person work with the tool, instead of spending a lot of time maintaining the tool.

So, why don't I use package managers? There are a few drawbacks to the use of package managers, and for me, they outweigh the benefits. Other people will have different priorities. I would never suggest to a newcomer to Linux that they should be going distribution-free. A person who maintains a large collection of computers on dissimilar hardware might also be poorly served by breaking the distributions (though I have actually done exactly that).

What don't I like about package managers and distributions? Well, here's a collection of drawbacks:
  • It isn't always clear what your computer is doing. There may be packages or services installed that you don't want, doing things you don't understand. Somewhere in the 200 packages that were installed when you set up the computer, you may have wound up with, say, an FTP daemon you didn't ask to have. When you're installing software manually, you're more likely to install only the things you really need.
  • Distributions tend to ship with older code. Distributors have to freeze their versions and do extensive testing, and by the time the packages are shipped there may have been improvements, bugfixes, or security fixes that didn't make it into the base media.
  • Bugfixes and security fixes can be delayed as you wait for the distributor to build updated packages. While most Linux distributors get security fixes out within a small number of days, there is still some delay between the time a fix is produced and the time that updated packages are available.
  • Distributions are set up to be good for the general case, but there will be times when they do the wrong thing for a particular special use.
  • Package installers are generally forbidden from interacting with the user, otherwise a new install would be a tedious exercise in configuring every package as it came along. Consequently, packages are usually dropped in with some default configuration.
  • Many programs come with multiple compile-time configuration options. A media player may have support for multiple codecs, output devices, companion devices, and so on. A distribution will usually turn on as many of these options as possible. Some of these options might not be of interest to a specific user, but that user is still forced to install other packages holding libraries he or she doesn't expect to use. These dependent libraries increase the interconnectedness of the packages, which can make what would be a simple upgrade of one package into a huge transaction that touches a dozen other packages and the kernel.
  • Because it's easier for a particular file to be owned by a specific package, even when that file controls the behaviour of multiple packages, distributions tend, when possible, to break up the file into fragments that are logically collated in some other place. This can make it hard to figure out exactly what a specific application is doing.
  • Distributions and package managers don't insulate the user in all cases. Some users with unusual requirements may still end up having to install software by hand, and figure out how to tie the new software into the system correctly, and sometimes the package management system makes such efforts more difficult.
  • Most importantly, for me, a package manager hides too much of what is happening. You don't have to learn how to configure a program, you don't know what files it's installing, it's a bit too much of a black box for my tastes.

Given all this, I've decided that I prefer not to use a package manager. Consequently, I've been manually maintaining my Linux computers for over 13 years now.