Showing posts with label hardware. Show all posts

Friday, October 28, 2011

Converting DVDs for viewing on a tablet, while inlining captions

Previously, I described how to convert HDTV videos for my EEE Pad Transformer.  Now I'll go over something a bit more difficult.

My wife and I have some DVDs of Bollywood films that we enjoy watching.  Aaja Nachle, Om Shanti Om, 3 Idiots, Billu, among others.  These films are mostly in Hindi, but there are English subtitles available.  As we don't understand Hindi, we watch the movies with the subtitles.  The Android media viewer that comes with the tablet doesn't have a way to select subtitles from an alternate video stream.

Now, I wanted to make files of these movies that I could watch on the Android tablet.  As noted in the previous article, the resulting files have to be H.264 Baseline profile, and under 2GB in size.

Here's how I did it.  Note that this procedure required at least 70 GB of free disk space to hold a large intermediate file: I wanted to avoid artefacts introduced by passing the video through multiple lossy codecs, so I used a lossless intermediate format.

First of all, I used the MythTV option to rip a perfect copy of the DVD.  That gave me a file, say 3IDIOTS.vob.

Next, I used mencoder to inline the captions directly into the video stream:

mencoder -ovc lavc -lavcopts vcodec=ljpeg:aspect=16/9 \
    -vobsubid 0 -oac lavc -lavcopts acodec=flac \
    -o 3idiots 3IDIOTS.vob

The output file, 3idiots, was, as noted, huge.  It consisted of a lossless JPEG video stream with subtitle track 0 burned directly into the picture.

Next, the file had to be converted to H.264 Baseline.  In this case, rather than setting a qmax, I decided to set a bitrate.  That way I could be certain ahead of time what the final size of the file would be, though at the cost of increased transcoding time.  To get a fixed bitrate, ffmpeg must be run in two passes: once to collect statistics, and a second time to generate the file itself.  Here's how this is run:

ffmpeg -pass 1 -i 3idiots -vcodec libx264 -vpre fast \
    -vpre baseline -b 1400k -acodec libfaac -ab 64k \
    -ac 2 -ar 44100 -threads 3 \
    -deinterlace -y junkfile.mp4

ffmpeg -pass 2 -i 3idiots -vcodec libx264 -vpre fast \
    -vpre baseline -b 1400k -acodec libfaac -ab 64k \
    -ac 2 -ar 44100 -threads 3 \
    -deinterlace 3idiots.mp4 

The "junkfile.mp4" file can be deleted.  The H.264 file, 3idiots.mp4, came in at 1.8 GB, and was of quite acceptable quality to view on the tablet.
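As a back-of-the-envelope check, a fixed bitrate lets you predict the file size before you encode: total bitrate times running time.  The figures below assume a running time of about 10,000 seconds (roughly 2¾ hours); that runtime is my assumption for illustration, not a measured value:

```shell
# Predicted size of a fixed-bitrate encode (approximate, integer math).
video_kbps=1400     # -b 1400k
audio_kbps=64       # -ab 64k
runtime_s=10000     # assumed running time, about 2h45m
total_kbit=$(( (video_kbps + audio_kbps) * runtime_s ))
echo "approx $(( total_kbit / 8 / 1024 )) MB"
# → approx 1787 MB
```

That's about 1.75 GB, which lines up with the 1.8 GB file actually produced, comfortably under the 2 GB limit.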

Converting HDTV videos for viewing on a tablet

I have an Android-based tablet computer, the EEE Pad Transformer.  My MythTV computer can record digital over-the-air broadcasts in high definition now that I have put an HDHomerun on my network.  So, it would be nice to be able to transfer some HDTV programs to the Android computer to watch them there while traveling.  The HDTV shows are 1080i, encoded as mpeg2 video, at a bitrate of close to 16000 kbits/sec.

So, what are our constraints?  The Android computer is not powerful enough to play videos without hardware assist, and that hardware assist is only available for H.264 videos encoded with the Baseline profile; it doesn't work on Main profile H.264.  Also, the Micro-SD card that I plug into the tablet must be formatted as VFAT (it isn't recognized when I reformat it with any more modern Linux filesystem), so our files are going to have to be under 2 GB in size.  Finally, the Android screen is only 1280x800, so there's no point copying a 1920x1080 file there: the machine would have to reduce the resolution anyway, so we might as well do it before we copy the file to the card.

So, a 1 hour show, recorded on the MythTV box, is about 8 GB and in the wrong format.  We convert it in two steps.  First, cut out any commercials and transcode it at high quality.  For network broadcast television that chops off about 25% of the file size, and you probably didn't want to watch the commercials while sitting on the train/airplane anyway.

Next, it has to be transcoded to H.264 Baseline.  This can be done with ffmpeg:

ffmpeg -i PROGRAM.mpg -vcodec libx264 -vpre fast \
     -vpre baseline -s hd720 -qmax 30 -acodec libfaac \
     -ab 128k -ac 2 -threads 4 -ar 44100 -deinterlace \
     PROGRAM.mp4

This takes the HDTV .mpg file from MythTV, "PROGRAM.mpg", and converts it.  We use the libx264 video codec with the fast preset and the Baseline profile, formatted for a high-definition 720-line screen.  "qmax" sets a limit on quality loss; I usually use a value between 25 and 30.  We use the FAAC audio codec at 128 kbits/sec, deinterlace the result, and write it to "PROGRAM.mp4".

The resulting file, about 45 minutes of air time, is about 600 MB in size.

Thursday, February 21, 2008

A Followup On Cryptographic Mounts, The Bad News

Previously, I discussed cryptographic mounts to hold sensitive data. It's worth pointing out an article making the rounds today, by nine authors from Princeton, in which the researchers describe an attack on cryptographic techniques, including the one I've described.

The technique relies on the fact that modern memory can retain its information for several minutes after the computer stops sending it refresh signals. What this means is that a person with physical access to the computer can pull the power connector from the computer and then remove the memory chips, insert them in another computer, and read the cryptographic keys out of the memory. I don't know of a good way to avoid this attack. If the cryptographic volumes are mounted when the computer falls into the hands of the attacker, the data will be, in theory, recoverable.

So, what can be done to prevent the key from being resident in the computer's memory at the instant that the attacker unplugs it? The key has to be available to the operating system so that it can read and write that data in normal operation. Sure, you could get specially modified hardware that deliberately overwrites the main memory from batteries when the power connector is removed, but maybe there's a way to store 128 bits somewhere other than in main memory?

A cache line on a modern CPU is 64 bytes, enough to hold four 128-bit keys. Could the operating system subvert the hardware's L1 caching mechanism sufficiently to pin a value in the cache and keep it out of L2 and main memory? This attack won't recover data from the L1 cache, so if that's the only place the key is kept, maybe that would be enough. You sacrifice a cache line, but maybe it's worth it?

How about the TLB? That's another part of the CPU that holds data, and that one is explicitly designed to interact with the operating system. Could we find a way to store 128 bits in parts of the TLB, and then deliberately avoid overwriting them? Can the operating system read those numbers back out of the TLB?

Are there any registers that could be used? Probably not on 32-bits, there aren't many registers there, and on 64-bits you'd probably have to use a special-purpose compiler to avoid these registers being touched by a context switch, and avoid them being saved to memory when an interrupt handler runs.

What if you have fifteen keys, all of 128 bits? Well, I believe we could handle that if we had 256 bits of volatile storage space. The first 128 bits of volatile space holds an XOR key, that decodes all of the fifteen keys. The second 128 bits of volatile space holds the decoded key in active use.
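The scheme is easy to sketch with shell arithmetic.  The 32-bit values below are toy stand-ins for 128-bit keys, and all the variable names are mine, purely for illustration:

```shell
# Toy sketch of XOR key wrapping: only 'master' and 'active' would need to
# live in the scarce volatile storage; 'wrapped' can sit in ordinary RAM,
# where it is useless to an attacker without the master key.
master=0xDEADBEEF                    # first volatile slot: the XOR key
volume_key=0x12345678                # one of the fifteen per-volume keys
wrapped=$(( master ^ volume_key ))   # safe to keep in normal memory
active=$(( master ^ wrapped ))       # second volatile slot: key in use
printf 'wrapped=0x%08X active=0x%08X\n' "$wrapped" "$active"
# → wrapped=0xCC99E897 active=0x12345678
```

An attacker who freezes and reads RAM recovers only the wrapped blobs, which reveal nothing without the master key held in the volatile slot.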

Those are my thoughts, anyway.

Sunday, February 17, 2008

Why do I have so many hard drives?

There are five hard drives in my main computer. There is no RAID setup. Why?

Hard drives fail. I've had the drive holding my root partition fail more than once. When that happens, I used to restore from backup. I would make a backup tape at least once a week, but a badly timed disk failure could still result in the loss of a lot of work.

My solution to this has been to buy my hard drives in matched pairs. I partition them equally, format them the same way, and install them both in the computer. One of them is the live disk, the other is the spare. The spare is kept unmounted and spun down. Every night around 3:00 AM, a cron job spins up the spare drives. Then, one partition at a time is fsck-ed, mounted, and copied to. The shell script uses rdist to synchronize the contents of the two partitions. Finally, I take special care to make the backup drive bootable. I use the LILO boot loader, so, when the root partition is mounted under /mnt/backup, the script executes the command:
/sbin/lilo -r /mnt/backup -b /dev/sdc

which, on my system, writes the LILO boot magic to the backup boot drive, which appears as /dev/sdc when it is the spare. My lilo.conf file, on both the live system and the spare, refers to the boot drive as /dev/sda, but the '-b' switch overrides that: the information is written to the boot block of the current /dev/sdc, but in a form appropriate for booting the device as /dev/sda (which is what it will appear to be should my live boot drive fail and be removed from the system).

Next, I use volume labels to mount my partitions. You can't have duplicate labels in the system, so my spare drive has labels with the suffix "_bak" applied. That means that the /etc/fstab file suitable for the live drive would not work if the spare were booted with that fstab. To solve this problem, the copying script runs this command after it finishes copying the files in /etc:
sed -e 's|LABEL=\([^ \t]*\)\([ \t]\)|LABEL=\1_bak\2|' /etc/fstab > /mnt/backup/etc/fstab

which has the effect of renaming the labels in the fstab to their versions with the _bak suffix, so they match the volume partitions on the spare hard drive.
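To see the effect, you can run the same sed expression over a toy fstab fragment (the labels and mount points here are made up for illustration):

```shell
# Demonstrate the label rewrite on a sample fstab fragment.
printf 'LABEL=home /home ext3 defaults 1 2\nLABEL=var /var ext3 defaults 1 2\n' |
  sed -e 's|LABEL=\([^ \t]*\)\([ \t]\)|LABEL=\1_bak\2|'
# → LABEL=home_bak /home ext3 defaults 1 2
# → LABEL=var_bak /var ext3 defaults 1 2
```

Only the label itself is touched; the rest of each line passes through unchanged, so the spare's fstab mounts the _bak-labelled partitions at the same mount points.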

OK, that sounds like a lot of work, why do I do it? What does it buy me?

First of all, it gives me automatic backups. Every night, every file is backed up. When I go to the computer at the beginning of the day, the spare drive holds a copy of the filesystem as it appeared when I went to sleep the night before. Now, if I do something really unwise, deleting a pile of important files or similarly messing up the filesystem, I have a backup that I haven't deleted. If I were to use RAID, deleting a file would delete it immediately from my backup, which isn't what I want. As long as I realize there's a problem before the end of the evening, I can always recover the machine to the way it looked before I started changing things in the morning. If I don't have enough time to verify that the things I've done are OK, I turn off the backup for a night by editing the script.

Another important thing it allows me to do is to test really risky operations. For instance, replacing glibc on a live box can be tricky. In recent years, the process has been improved to the point that it's not really scary to type "make install" on a live system, but ten years ago that would almost certainly have confused the dynamic linker enough that you would be forced to go to rescue floppies. Now, though, I can test it safely. I prepare for the risky operation, and then before doing it, I run the backup script. When that completes, I mount the complete spare filesystem under a mountpoint, /mnt/chroot. I chroot into that directory, and I am now running in the spare. I can try the unsafe operation, installing a new glibc, or a new bash, or something else critical to the operation of the Linux box. If things go badly wrong, I type "exit", and I'm back in the boot drive, with a mounted image of the damage in /mnt/chroot. I can investigate that filesystem, figure out what went wrong and how to fix it, and avoid the problem when the time comes to do the operation "for real". Then, I unmount the partitions under /mnt/chroot and re-run my backup script, and everything on the spare drive is restored. Think of it as a sort of semi-virtual machine for investigating dangerous filesystem operations.

The other thing this gives me is a live filesystem on a spare drive. When my hard drive fails (not "if", "when", your hard drive will fail one day), it's a simple matter of removing the bad hardware from the box, re-jumpering the spare if necessary, and then rebooting the box. I have had my computer up and running again in less than ten minutes, having lost, at most, the things I did earlier in the same day. While you get this benefit with RAID, the other advantages listed above are not easily available with RAID.

Of course, this is fine, but it's not enough for proper safety. The entire computer could catch fire, destroying all of my hard drives at once. I still make periodic backups to writable DVDs. I use afio for my backups, asking it to break the archive into chunks a bit larger than 4 GB, then burn them onto DVDs formatted with the ext2 filesystem (you don't have to use a UDF filesystem on a DVD, ext2 works just fine, and it's certain to be available when you're using any rescue and recovery disk). Once I've written the DVDs, I put them in an envelope, mark it with the date, and give it to relatives to hang onto, as off-site backups.

So, one pair of drives is for my /home partition, one pair for the other partitions on my system. Why do I have 5 drives? Well, the fifth one isn't backed up. It holds large data sets related to my work. These are files I can get back by carrying them home from the office on my laptop, so I don't have a backup for this drive. Occasionally I put things on that drive that I don't want to risk losing, and in that case I have a script that copies the appropriate directories to one of my backed-up partitions, but everything else on that drive is expendable.

There are two problems that can appear with large files.
  • rdist doesn't handle files larger than 2 GB. I looked through the source code to see if I could fix that shortcoming, and got a bit worried about the code. So I'm working on writing my own replacement for rdist with the features I want. In the meantime, I rarely have files that large, and when I do, they don't change often, so I've been copying the files to the backup manually.
  • Sometimes root's shells, even those spawned by cron, have ulimit settings. If you're not careful, you'll find that cron jobs cannot create a file in excess of some maximum size, often 1 GB. This is an inconvenient restriction, and one that I have removed on my system.
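A quick way to see what limit a shell (or a cron job, if you put this at the top of the job's script) actually inherits is to ask ulimit directly; this is a minimal sketch, just printing the current soft limit and trying to lift it:

```shell
# Print the maximum file size this shell may create ("unlimited" means no cap).
ulimit -f
# Try to remove the cap for this shell and its children; this only succeeds
# if the hard limit permits it.
ulimit -f unlimited 2>/dev/null && echo "file-size limit lifted"
```

If the first command prints a number rather than "unlimited", files created by the job are capped at that size, and an overnight copy of a large file will silently truncate or fail.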

My hardware environment

I have two computers that I use for my work. One is an x86 laptop, a ThinkPad T42 with a built-in ATI video controller (Mobility Radeon 9600). The other is a quad-core x86_64 box with an NVidia card (GeForce 6600). My work involves a lot of scientific computation, sometimes multi-threaded, and I need hardware-accelerated 3D rendering to analyze the results. So, I'm running on two architectures, with two different video cards.

The laptop is fairly standard, so I won't discuss it further. My big box has the following hardware:
  • Intel DP35DP motherboard
  • Intel Core2 Quad CPU, Q6600, 2.4GHz per core
  • 4 GB RAM
  • Two 160 GB SATA disks
  • One 500 GB SATA disk
  • Two 120 GB EIDE disks

I'll discuss later why I have so many hard drives.

Because I sit next to this box all day, I've put a lot of effort into making it quiet. My laptop makes more noise than the big box.