Unix/Linux: Finding and Killing Processes by User

If you've ever run ps aux | grep user to list processes and hunt for process IDs, you'll be happy to know there is a simpler alternative. Finding and killing processes owned by a particular user are both made simple by the handy pgrep and pkill utilities.

Listing Processes with pgrep

Listing all the processes owned by the user raam can be done like this (the -l switch causes the output to include the process name):

[bash]
$ pgrep -l -u raam
9614 screen
9628 bash
9644 irssi
16165 bash
16297 rtorrent
19462 ssh
19515 bash
19526 ssh
20964 sshd
[/bash]

You can also filter the list of results by appending a full (or partial) process name to the command:

[bash]
$ pgrep -l -u raam bash
9628 bash
16165 bash
19515 bash
[/bash]

Killing Processes with pkill

The pkill command does basically the same thing as pgrep, except it kills the matching processes instead of listing them. This is useful if a user has several runaway processes, or if you're deleting a user and want to kill any of their running processes first.

Killing all the processes owned by the user raam looks like this:

[bash]
$ pkill -u raam
[/bash]

And once again, if you only wanted to kill all the bash processes owned by raam, you would append the process name to the command:

[bash]
$ pkill -u raam bash
[/bash]
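
One caveat worth noting: by default pkill sends the SIGTERM signal, which a hung process may ignore. If a process refuses to die, you can specify a stronger signal (this sketch reuses the example user raam from above):

[bash]
# Send SIGKILL (signal 9) to all bash processes owned by raam
$ pkill -9 -u raam bash
[/bash]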

As always, check the man pages for pgrep and pkill for more information and switch options.

A Script to Install & Configure ifplugd on Debian

The default configuration on some older Linux systems is to only send a DHCP request while booting up. This means if the network cable gets unplugged, or if the router is powered off, the system may lose its IP configuration. To restore the network connection, the system may need to be manually rebooted or have someone at the local console run the dhclient command to request a DHCP lease.

For systems that are only accessed remotely via SSH, such a scenario can be painful. What is needed is a daemon that watches the link status of the Ethernet jack and reconfigures the network (or sends out another DHCP request) when it detects a cable is plugged in (or the power to the router is restored).

ifplugd does exactly that:

ifplugd is a Linux daemon which will automatically configure your ethernet device when a cable is plugged in and automatically unconfigure it if the cable is pulled.

On a Debian system, installing ifplugd is as simple as running apt-get install ifplugd. Once it's been installed, it needs to be configured by editing /etc/default/ifplugd. The most basic configuration is to simply set INTERFACES="auto" and HOTPLUG_INTERFACES="all". This configuration tells ifplugd to watch all network interfaces for a change in link status and automatically reconfigure them using the Debian network configuration defined in /etc/network/interfaces.
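
After making those changes, the relevant lines in /etc/default/ifplugd should look something like this (an excerpt showing just the two settings discussed here; the file contains other options as well):

[sourcecode lang="bash"]
# /etc/default/ifplugd (excerpt)
INTERFACES="auto"
HOTPLUG_INTERFACES="all"
[/sourcecode]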

I recently needed to automate the install and configuration of ifplugd on many remote Linux systems, so I wrote this simple script.

Download: install-ifplugd.tar.gz

[sourcecode lang="bash"]
#!/bin/sh

#########################################
# Author: Raam Dev
#
# This script installs ifplugd and configures
# it to automatically attempt to restore any
# lost connections.
#
# Must be run as root!
#########################################

# Check if we're running this as root
if [ "$(id -u)" -ne 0 ]; then
        echo "This script must be run as root" 1>&2
        exit 1
fi

# Files used when configuring ifplugd
OUTFILE=/tmp/outfile.$$
CONFIG_FILE=/etc/default/ifplugd

# Update package list and install ifplugd, assuming yes to any questions asked
# (to ensure the script runs without requiring manual intervention)
apt-get update --assume-yes ; apt-get install --assume-yes ifplugd

# Configure ifplugd to watch all interfaces and automatically attempt configuration
sed 's/INTERFACES=""/INTERFACES="auto"/g' < $CONFIG_FILE > $OUTFILE
mv $OUTFILE $CONFIG_FILE

sed 's/HOTPLUG_INTERFACES="auto"/HOTPLUG_INTERFACES="all"/g' < $CONFIG_FILE > $OUTFILE
mv $OUTFILE $CONFIG_FILE

[/sourcecode]

If you're interested in doing more with ifplugd, check out this article.

Saving Files as root From Inside VIM

Oftentimes I will be editing a Linux configuration file using vim only to discover that I cannot save it because the file requires root permission to write to it. This ends up looking something like this:

[sourcecode lang="bash"]
vi /path/to/some/file.conf
[make some edits]
:w
VIM Message: E45: 'readonly' option is set (add ! to override)
:q!
$ sudo vi /path/to/some/file.conf
[make all my edits AGAIN]
:w
[/sourcecode]

I have gone through this process so many times that I knew there must be an easy fix for it. (I know about sudo !! for running the previous command, but I only recently started developing the habit of using it.) After forgetting to use sudo while editing a configuration file yet again this morning, I finally decided to search Google and find a solution. Here it is:

[sourcecode lang="bash"]
vi /path/to/some/file.conf
[make some edits]
:w
VIM Message: E45: 'readonly' option is set (add ! to override)
:w !sudo tee %
[/sourcecode]

The :w !sudo tee % command tells Vim to pipe the contents of the buffer to an external command (:w !cmd) rather than writing the file directly. That command is sudo tee %: Vim substitutes the name of the file you're editing for %, and tee, running as root via sudo, reads the buffer from its standard input and writes it to that file.

After saving the file as root, you'll get a warning along these lines: "W12: Warning: File "/private/etc/smb.conf" has changed and the buffer was changed in Vim as well". You'll be given the option to reload the file, but since your buffer already matches what was just written, it doesn't much matter which option you choose (OK or Reload).

And last but not least, if you don't want to remember the syntax for this command, you can map it in your ~/.vimrc file (the >/dev/null simply discards the copy of the file that tee echoes to standard output):

[sourcecode lang="bash"]
cmap w!! w !sudo tee % >/dev/null
[/sourcecode]

Now, if you forget to edit a file with sudo, you can simply type :w!! to fix the problem!

Forcing fsck to Run on the Next Reboot

If you need to make sure fsck runs on the next reboot, here's a really simple way to do it:

[sourcecode lang="bash"]
$ sudo touch /forcefsck
[/sourcecode]

Alternatively (and depending on your version of Linux) you may also be able to pass an option to the shutdown command:

[sourcecode lang="bash"]
$ shutdown -rF now
[/sourcecode]
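
If you're on an ext2/ext3 filesystem, a third approach is to lower the filesystem's maximum mount count with tune2fs so that fsck runs automatically; the device name below is just an example, so substitute your own:

[sourcecode lang="bash"]
# Check the filesystem every time its mount count reaches 1
# since the last check (i.e., on every boot)
$ sudo tune2fs -c 1 /dev/sda1
[/sourcecode]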

Playing Tetris and Pong with Emacs

I'm a vi user, but I'm slowly trying to pick up Emacs too. I discovered today that Emacs has a built-in Tetris game. Simply launch Emacs, press Esc and then the letter x, type tetris, and press Enter (or start Emacs with emacs -q --no-splash -f tetris). You can also play Pong using the same method!
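
For reference, the one-liners for both games look like this (Pong ships with Emacs as well, as far as I can tell):

[sourcecode lang="bash"]
$ emacs -q --no-splash -f tetris
$ emacs -q --no-splash -f pong
[/sourcecode]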

Configuring Static DNS with DHCP on Debian/Ubuntu

Note: This article is outdated as of Ubuntu 12.04. Please see this article if you're using Ubuntu 12.04 or later.

Dynamic Host Configuration Protocol (DHCP) is a commonly used method of obtaining IP and DNS information automatically from the network. In some cases, you may wish to statically define the DNS servers instead of using the ones provided by the DHCP server. For example if your ISP commonly experiences DNS outages, you might want to use the DNS servers provided by OpenDNS instead of the ones provided by your ISP.

When using a static IP configuration on Linux, you normally add the DNS servers to /etc/resolv.conf. However, if you try to add a DNS server to /etc/resolv.conf under a DHCP configuration, you'll notice that your static entry disappears as soon as the DHCP client runs (usually on boot). To prevent this, you need to tell the DHCP client to prepend the static DNS server(s) to /etc/resolv.conf before adding the ones provided by the DHCP server (if any).

The configuration file you'll need to edit is the same on both Debian and Ubuntu; however, depending on your setup, the location of the file may vary. Here are the two common places I've found it:

Debian: /etc/dhclient.conf
Ubuntu: /etc/dhcp3/dhclient.conf

Open the file in your favorite editor and add one of the following two lines at the top, separating multiple DNS servers with commas and ending the entry with a semicolon:

If you simply want to add static DNS servers to be used in addition to the ones provided by DHCP, use a prepend entry:

prepend domain-name-servers 208.67.222.222, 208.67.220.220;

If you want to override the DNS servers provided by DHCP entirely and force the system to use the ones you provide, use the supersede entry:

supersede domain-name-servers 208.67.222.222, 208.67.220.220;

Before these static DNS servers will be added to your /etc/resolv.conf file, you'll need to re-run the DHCP client. The easiest way to do this is by running /etc/init.d/networking restart (sudo required), or you can try running the dhclient command directly.
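
For example (eth0 below is just a placeholder for your actual interface name):

[sourcecode lang="bash"]
$ sudo /etc/init.d/networking restart
# or request a new lease on a single interface:
$ sudo dhclient eth0
[/sourcecode]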

After re-running the DHCP client, check your /etc/resolv.conf file to confirm the static DNS servers have been added.

Blogging from the Command-Line

I'm a command-line person. If you can show me a command-line version of something I already do in a windowed environment, I'll get stuff done faster. I often look for command-line solutions to tasks that become repetitive and feel as though time could be saved by doing them on the console.

A recent example of this is the posting of asides on my blog. Asides are often very short (one or two sentences at most -- they appear on my blog without a title) and navigating the WordPress Administration interface in a web browser simply to post one or two sentences became very time-consuming and distracting. Since I'm constantly editing files and code on the console using my favorite editor (vi), being able to quickly create and post an aside from the same environment would be awesome.

Before writing a tool that allowed me to post to my WordPress blog, I searched Google to see if someone else had already written something. Sure enough, I found blogpost, a script written in Python by Stuart Rackham:

blogpost is a WordPress command-line weblog client. It creates and updates weblog entries directly from AsciiDoc (or HTML) source documents. You can also delete and list weblog entries from the command-line.

It uses XML-RPC to post to WordPress blogs and also supports automatically uploading media files (images, videos, audio, documents) that are referenced within the AsciiDoc (or HTML) post file. Check the blogpost man page for full details.

Remember, my main goal here is to make posting short asides easier. I'm perfectly happy using the WordPress web interface to write longer posts. In fact, I prefer the web interface for longer posts because I get things like automatic spell checking (through OS X) and automatic draft saving (through WordPress).

After installing blogpost and modifying the configuration file to include my WordPress login details, I created a file called post.txt using the vi editor and, after saving the file and closing vi, I published the aside using blogpost:

$ blogpost.py --title="My Test Aside Post" -U --doctype='html' create post.txt
creating published post 'My Test Aside Post'...
id: 2758
url: https://raamdev.com/2009/01/24/my-test-aside-post

$ blogpost.py cat --categories="Asides, Blog Entries, General" post.txt
assigning categories: Asides,Blog Entries,General

Note that I only need really basic formatting (i.e., HTML for links), so I use the --doctype='html' option. This allows me to type raw HTML in vi when I'm editing the post file, just as I do now in WordPress (I don't use the Visual Editor).

While the options and flexibility provided by blogpost are great, the process of publishing an aside needed to be more automated to solve my problem. Creating a new file in vi, typing all those options, running two separate commands, and then deleting the file every time I wanted to post a few sentences on my blog didn't make a whole lot of sense. So I whipped together this little shell script to help automate the steps above:

#!/bin/sh
##
## aside.sh - automates publishing asides using blogpost.py
##

# Open a temporary file in the vi editor
vi aside.$$

# Display new aside before publishing
echo "New Aside:"
cat aside.$$
echo

# Prompt for an aside title
echo "Enter a title for this Aside:"
read TITLE
echo "OK!"
echo

# Using the temp file saved above, post the Aside
blogpost.py --title="$TITLE" -U --doctype='html' create aside.$$
blogpost.py cat --categories="Blog Entries, Asides" aside.$$

# Remove the temporary file
rm aside.$$

Now posting an aside to my blog is as simple as running ./aside.sh, typing the aside in vi, saving and quitting (:wq), and then typing a title. The rest of the work, including cleanup, is taken care of by the script!

Stuart did an excellent job with blogpost and if you have a blog and use the console (and why shouldn't you?!) I recommend you check it out. The blogpost README is a great place to start, as it includes prerequisites and installation information.

Lazy Linux: 10 Essential tricks for admins

Lazy Linux: 10 Essential tricks for admins is an awesome list of cool things you can do with Linux. I learned about trick #3 (collaboration with screen) from an admin at the datacenter where one of my servers is hosted. I remember being thrilled sitting there watching him do stuff on my server while sharing the keyboard to type messages back and forth in vi (think Remote Desktop or VNC, but on the console). Trick #5 (SSH back door) is something I've been using for years at work for remote diagnostics. It is an invaluable trick for getting around firewalls. Very cool stuff!

Installing rTorrent on OS X Leopard (10.5) using Fink

I've been using Transmission as a BitTorrent client on my MacBook Pro for a while now, but after setting up rTorrent on my Linux server earlier today and seeing how awesome it was, I just had to install it on my laptop as well. I absolutely love text-based applications!

The easiest way to install rTorrent is by using Fink or MacPorts. (Both of these tools allow you to download software that has been ported from Unix/Linux to Mac OS X.) I'll use Fink since I'm a fan of Debian Linux and Fink uses the Debian dpkg and apt-get package management tools.

Apparently there is no Fink binary available yet for OS X Leopard (10.5), so it must be compiled from source. These directions (which also contain instructions for setting up rTorrent on earlier versions of OS X) helped explain the overall process presented here. Since you'll need to compile from source, you will need to have Xcode installed (a set of development tools from Apple).

The basic steps for setting up Fink are as follows:

  1. Download the latest Fink source
  2. Open up a terminal (Applications -> Utilities -> Terminal.app) and run the following commands
  3. $ cd /path/to/download/directory
  4. $ tar xvzf fink-x.xx.x.tar.gz
  5. $ cd fink-x.xx.x
  6. $ ./bootstrap
  7. You will now be presented with several questions. Answer using the defaults (press Enter) for everything except the question about whether you want to enable the unstable tree; you must answer Yes to this question (see here if you accidentally missed this step).
  8. When the script finishes, run /sw/bin/pathsetup.sh
  9. For good measure, run apt-get update

Great! Now that Fink is installed, installing rTorrent is really easy:

$ fink install rtorrent

You might be notified that a bunch of extra packages need to be installed (there were 46 needed on my system!) so just choose Yes. After the packages have been downloaded and compiled (this might take a while) rTorrent should be on your system and ready to use.

Getting Started with rTorrent

rTorrent

You can launch rTorrent by simply running rtorrent in a Terminal, but before you get started you should look over the man page (man rtorrent) and then set up a configuration file (~/.rtorrent.rc). There are only four lines in my configuration file (check the man page to see what these do):

port_range = 26000-26999
directory = ~/downloads/torrents/
session = ~/downloads/torrents/sessions/
encryption = allow_incoming,try_outgoing,enable_retry

If you're interested in a lot more options, you might want to grab a copy of the sample .rtorrent.rc config file (why this wasn't included in the package, I don't know) and place it in your home directory.

The rTorrent User Guide has information about all the stuff on the screen as well as various commands to navigate the interface.

Using the wonderful screen utility, rTorrent becomes even more powerful on remote systems. I leave rTorrent running on my server and whenever I remotely SSH into the box I can then reattach the screen session that rTorrent is running inside of and instantly have access to it!
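
If you haven't used screen this way before, the basic workflow looks something like this (the session name torrents is arbitrary):

[sourcecode lang="bash"]
# Start rTorrent inside a named screen session
$ screen -S torrents rtorrent

# Detach with Ctrl-a d, log out, SSH back in later, then reattach:
$ screen -r torrents
[/sourcecode]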

Being Greedy With Bash

Last night at my C/Unix class the professor quickly glossed over an interesting shell scripting technique that allows you to strip stuff off the beginning or end of a variable. I forgot about it until I saw the technique used again while editing a shell script at work today.

I didn't know what the technique was called but I remembered the professor saying something about "greedy clobbering" and, since I cannot search Google for special characters, I Googled "Bash greedy" and luckily found 10 Steps to Beautiful Shell Scripts, which just so happened to contain the technique I was looking for (#5).

There are basically four versions of this technique:

${var#pattern}
Delete the shortest match of pattern from the beginning of var and return the rest

${var##pattern}
Delete the longest match of pattern from the beginning of var (be greedy) and return the rest

${var%pattern}
Delete the shortest match of pattern from the end of var and return the rest

${var%%pattern}
Delete the longest match of pattern from the end of var (be greedy) and return the rest

Here's how it works. Let's say you have a variable that contains the path to a file:

FILE=/home/raam/bin/myscript.sh

Now let's say you wanted to extract the myscript.sh part from that variable. You could do some funky stuff with awk but there is a much easier solution built into Bash:

SCRIPTNAME=${FILE##*/}

Now $SCRIPTNAME will contain myscript.sh!

The ##*/ tells the shell to delete the longest match of */ from the beginning of the variable: the ## makes the match greedy, so it extends through the final slash, and whatever is left over is returned (in this case, myscript.sh is the only thing remaining after the last slash).
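
To see the difference greediness makes, here's what each of the four forms returns for that same variable (the comments show the output):

[sourcecode lang="bash"]
FILE=/home/raam/bin/myscript.sh

echo ${FILE#*/}   # home/raam/bin/myscript.sh (shortest */ match removed from front)
echo ${FILE##*/}  # myscript.sh (longest */ match removed from front)
echo ${FILE%/*}   # /home/raam/bin (shortest /* match removed from end)
echo ${FILE%%/*}  # (empty; the greedy /* match consumes the entire path)
[/sourcecode]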

These expansions are actually specified by POSIX, so they work in other POSIX-compliant shells too, not just Bash. It's amazing how four characters can do so much work so easily. The more I learn about what I can do with Bash, the more I wonder how I ever lived without all this knowledge!

I subconsciously converted a problem into a shell script

I have been writing a lot of shell scripts lately as part of the C/Unix class that I'm taking at Harvard Extension. My familiarity with how the Unix shell and the underlying system works has grown exponentially. When I came across a problem earlier today, I subconsciously turned the problem into a shell script without even thinking about it!

The problem: "How can I check to make sure my program is running every 30 minutes and restart it if it's not?"

Answer:

# If myscript isn't running, restart it
ONLINE=`ps aux | grep -c myscript`
# 2 because grep myscript also finds 'grep myscript'
if [ "$ONLINE" -ne 2 ]; then
        $MYSCRIPT_PATH/restart_service.sh
fi

I'm sure there are many better ways to solve this problem, but the fact that I instantly translated the problem into shell scripting code (and that it worked as expected on my first try) astonished me. I can see how good programmers who write in a particular language, and know its ins and outs like the back of their hand, can turn problems into code seamlessly (or know exactly where to look to find answers if they're unsure).
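
For instance, using pgrep (covered at the beginning of this post), the check becomes shorter and avoids the grep-matching-itself problem entirely; a sketch, with MYSCRIPT_PATH being the same placeholder as above:

[sourcecode lang="bash"]
# If myscript isn't running (exact process name match), restart it
if ! pgrep -x myscript > /dev/null; then
        $MYSCRIPT_PATH/restart_service.sh
fi
[/sourcecode]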

It's really amazing how easily you can solve simple problems when you have a deeper understanding of how the system works.

That's all. I just wanted to share my excitement. 🙂

Mounting HFS+ with Write Access in Debian

When I decided to reformat and install my Mac Mini with the latest testing version of Debian (lenny, at the time of this writing) I discovered that I couldn't mount my HFS+ OS X backup drive with write access:

erin:/# mount -t hfsplus /dev/sda /osx-backup
[ 630.769804] hfs: write access to a journaled filesystem is not supported, use the force option at your own risk, mounting read-only.

This warning puzzled me because I was able to mount fine before the reinstall and, since the external drive is to be used as the bootable backup for my MBP, anything with "at your own risk" was unacceptable.

I had already erased my previous Linux installation so I had no way of checking what might have previously given me write access to the HFS+ drive. A quick apt-cache search hfs revealed a bunch of packages related to the HFS filesystem. I installed the two that looked relevant to what I was trying to do:

hfsplus - Tools to access HFS+ formatted volumes
hfsutils - Tools for reading and writing Macintosh volumes

No dice. I still couldn't get write access without that warning. I tried loading the hfsplus module and then adding it to /etc/modules to see if that would make a difference. As I expected, it didn't. I was almost ready to give up but there was another HFS package in the list that, even though it seemed unrelated to what I was trying to do, seemed worth a shot:

hfsprogs - mkfs and fsck for HFS and HFS+ file systems

It worked! I have no idea how or why (and I'm not interested enough to figure it out), but after installing the hfsprogs package I was able to mount my HFS+ partition with write access.
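
To recap, the complete set of commands that finally gave me write access looked like this (hfsprogs being the key ingredient):

[sourcecode lang="bash"]
# Install the HFS+ tooling; hfsprogs is what made write access work
apt-get install hfsplus hfsutils hfsprogs
mount -t hfsplus /dev/sda /osx-backup
[/sourcecode]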

Update:

As Massimiliano and Matthias have confirmed in the comments below, the following solution seems to work with Ubuntu 8.04:

From Linux, after installing the tools suggested before, you must run:
mount -o force /dev/sdx /mnt/blabla

Otherwise, in my fstab, I have an entry like this:
UUID=489276e8-7f9b-3ae6-8c73-69b99ccaab9c /media/Leopard hfsplus defaults,force 0 0


Understanding the Linux Load Averages

I have been using Linux for several years now and although I have looked at the load averages from time to time (either using top or uptime), I never really understood what they meant. All I knew was that the three different numbers stood for averages over three different time spans (1, 5, and 15 minutes) and that under normal operation the numbers should stay under 1.00 (which I now know is only true for single-core CPUs).

Earlier this week at work I needed to figure out why a box was running slow. I was put in charge of determining the cause, whether it be excessive heat, low system resources, or something else. Here's what I saw for load averages when I ran the top command on the box:

load average: 2.86, 3.00, 2.89

I knew that looked high, but I had no idea how to explain what "normal" was and why. I quickly realized that I needed a better understanding of what I was looking at before I could confidently explain what was going on. A quick Google search turned up this very detailed article about Linux load averages, including a look at some of the C functions that actually do the calculations (this was particularly interesting to me because I'm currently learning C).

To keep this post shorter than the aforementioned article, I'll simply quote the two sentences that gave me a clear-as-day explanation of how to read Linux load averages:

The point of perfect utilization, meaning that the CPUs are always busy and, yet, no process ever waits for one, is the average matching the number of CPUs. If there are four CPUs on a machine and the reported one-minute load average is 4.00, the machine has been utilizing its processors perfectly for the last 60 seconds.

The machine I was checking at work was a single-core Celeron machine. This meant that with a sustained load of almost 3.00, the CPU was far more loaded than it should be. Theoretically, the same workload on a dual-core machine would amount to a per-core load of around 1.50, and on a quad-core around 0.75.

There is a lot more behind truly understanding Linux load averages, but the most important thing to understand is that they do not represent CPU usage. Rather, they represent the average number of processes that are running or waiting for their chance to use the CPU. If you still can't get your brain away from thinking in terms of percentages, consider 1.00 to be 100% load for a single-core CPU, 2.00 to be 100% load for a dual-core CPU, and so on.
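
If you want to make this comparison on your own machine, the raw numbers are easy to get (the output below is illustrative):

[sourcecode lang="bash"]
# The first three fields are the 1, 5, and 15 minute load averages
$ cat /proc/loadavg
2.86 3.00 2.89 1/119 2814

# Count the CPU cores to compare against
$ grep -c ^processor /proc/cpuinfo
1
[/sourcecode]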

Update: John Gilmartin had some insightful feedback and shared a link to Understanding Load Averages where there's a nice graphical description for how load averages work.

Creating a Bootable OS X Backup on Linux: Impossible?

I've had plans for a while now to set up a backup system using a Debian Linux server and rsync to back up my MacBook Pro laptop. At first glance, it seemed like it would be pretty straightforward. I've been able to make a bootable copy of my entire MBP using nothing but rsync (thanks to some very helpful directions by Mike Bombich, the creator of the popular, and free, Carbon Copy Cloner software). And by bootable copy I mean I could literally plug in the USB drive and boot my MBP from the drive (hold down the Alt/Option key while booting). Restoring a backup is as simple as running the rsync command again, but in the reverse direction. I know this solution works because I used it when I upgraded to a 320GB hard drive.

To start, I needed to create a big enough partition on the external USB drive using Disk Utility (formatted with Mac OS Extended (Journaled)). I then made a bootable copy of my MBP with one rsync command:

sudo rsync -aNHAXx --protect-args --fileflags --force-change \
    --rsync-path="/usr/local/bin/rsync" / /Volumes/OSXBackup

But my dream backup system needed to be unattended. I wanted something that would periodically (a couple of times a day) run that rsync command over SSH in the background and magically keep an up-to-date bootable copy of my MBP on a remote server.

I love Linux and I jump at any opportunity to use it for something new, especially in a heterogeneous network environment. So when I decided to set up a backup server, I naturally wanted to make use of my existing Debian Linux machine (which just so happens to be running on an older G4 Mac Mini).

So, after making a bootable copy of my MBP using the local method mentioned above, I plugged the drive into my Linux machine, created a mount point (/osx-backup), and added an entry to /etc/fstab to make sure it was mounted on boot (note the filesystem type is hfsplus):

/dev/sda /osx-backup hfsplus rw,user,auto 0 0

All that's left to do now is to run the same rsync command as earlier, but this time specifying the remote path in the destination (for example, user@host:/osx-backup/). This causes rsync to tunnel through SSH and run the sync. Unfortunately, this is where things started to fall apart.
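
For reference, the remote version of the command looked essentially like this (user@host is a placeholder for your own backup server, and --rsync-path points at the rsync binary on the remote machine):

[sourcecode lang="bash"]
sudo rsync -aNHAXx --protect-args --fileflags --force-change \
    --rsync-path="/path/to/remote/rsync" / user@host:/osx-backup/
[/sourcecode]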

OS X uses certain file metadata which must be copied for the backup to be complete (again, we're talking about a true bootable copy that looks no different than the original). Several of the flags used in the rsync command above are required to maintain this metadata and unfortunately Linux doesn't support all the necessary system calls to set this data. In particular, here are the necessary flags that don't work when rsyncing an OS X partition to Linux:

-X (rsync: rsync_xal_set: lsetxattr() failed: Operation not supported (95))
-A (recv_acl_access: value out of range: 8000)
--fileflags (on remote machine: --fileflags: unknown option)
--force-change (on remote machine: --force-change: unknown option)
-N (on remote machine: -svlHogDtNpXrxe.iL: unknown option)

According to the man page for rsync on my MBP, the -N flag is used to preserve create times (crtimes) and the --fileflags option requires the chflags system call. When I compiled the newer rsync 3.0.3 on my MBP, I had to apply two patches to the source that were relevant to preserving Mac OS X metadata:

patch -p1 <patches/fileflags.diff
patch -p1 <patches/crtimes.diff

I thought that maybe if I downloaded the source to my Linux server, applied those same patches, and then recompiled rsync, that it would be able to use those options. Unfortunately, those patches require system-level function calls (such as chflags) that simply don't exist in Linux (the patched source wouldn't even compile).

So I tried removing all unsupported flags even though I knew lots of OS X metadata would be lost. After the sync finished, I tried booting from the backup drive to see if everything worked. It booted into OS X, but when I logged into my account lots of configuration was gone and several things didn't work. My Dock and Desktop were both reset and accessing my Documents directory gave me a "permission denied" error. Obviously that metadata is necessary for a viable bootable backup.

So, where to from here? Well, I obviously cannot use Linux to create a bootable backup of my OS X machine using rsync. I read of other possibilities (like mounting my Linux drive as an NFS share on the Mac and then using rsync on the Mac to sync to the NFS share) but they seemed like a lot more work than I was looking for. I liked the rsync solution because it could easily be tunneled over SSH (secure) and it was simple (one command). I can still use the rsync solution, but the backup server will need to be OS X. I'll be setting that up soon, so look for another post with those details.

Ubuntu Live-CD on G4 Mac Mini

I've been trying to create a new partition on the 250GB drive I installed in my G4 (PowerPC) Mac Mini but I could not for the life of me find a Live CD that would boot. Finally this helpful post pointed me to Ubuntu 6.06 (Dapper Drake). After downloading the 'Mac (PowerPC) desktop CD' and burning it, I was pleasantly surprised to see it boot the Mac Mini beautifully (I used the live-powerpc kernel at the boot: prompt). Apparently the later PowerPC distributions of Ubuntu don't come with the necessary ATI drivers for the G4 Mac Mini!

C Variables: Eerily Close to the Machine

In C programming, things as simple as variable assignment are not quite as simple as using an assignment operator---they sometimes require entire functions. For example, this code will not even compile:

#include <stdio.h>
#include <string.h>

int main()
{
        char    a[10], b[10];

        a = "hello";
        b = "world!";

        printf("%s %s", a, b);

        return 0;
}
$ cc test.c
test.c: In function ‘main’:
test.c:8: error: incompatible types in assignment
test.c:9: error: incompatible types in assignment

In C all strings are arrays. To create a string variable, you must create a character array. The variable "a" names the array itself, and in expressions it decays to a pointer to the array's first memory location; an array is not a modifiable lvalue, so you cannot assign to it. That's why I got the "incompatible types in assignment" error when I tried compiling the above code---I was trying to assign a string literal's address to an array rather than copying its characters into the array!

The reason things are this way in C is for speed and simplicity. Sure, other languages automatically do the work of putting your five-character string into a variable and automatically allocate the necessary space in memory, but by doing that they spend a little more time behind the scenes---time and speed that may be precious to a systems-level programmer (who might be writing a program for, say, a tiny embedded device).

To copy a string into an array (i.e., assign a string to a variable), you can use the strcpy() function. This function does the work of taking each character in your string and putting it into the correct place in the given array:

#include <stdio.h>
#include <string.h>

int main()
{
        char    a[10], b[10];

        strcpy(a, "hello");
        strcpy(b, "world!");

        printf("%s %s", a, b);

        return 0;
}
$ cc test.c
$ ./a.out
hello world!

C was written in a time when assembly language was the norm. The problem with assembly language was that it was very tied to the hardware you were working on. Porting your work to other hardware, even if the changes in the hardware were only minor, required an entire rewrite of your code! Operating systems were also written in assembly at the time so creating a single operating system that worked on many different architectures was nearly impossible (unless you had an unlimited amount of time and money to have programmers constantly rewriting the operating system for every new hardware architecture that was released).

So the C programming language was created as a language one level higher than assembly. It was designed to maintain all the power and flexibility of assembly, while making it very easy to port to multiple architectures. This was made possible by using a compiler. The compiler simply took the C code and converted it into the necessary machine language for a specific architecture. If you wanted to port all your C code to a new architecture, all you needed to do was write a new compiler---not rewrite all your programs!

C lets you do stupid things not because it's stupid, but because flexibility and closeness to the physical hardware are necessary for writing operating systems. (As the programmer, it's your job to make sure what you're doing is possible with the hardware you're working on.) Whereas other high-level languages will automatically take your string and stick it in the correct place in memory, C does only what you tell it to do. This makes it extremely fast, which is very important when you're writing an operating system.

The basic example of why a string cannot be assigned directly to a character array helped me realize why C is still used for systems-level programming and why it continues to be in use more than 35 years after its invention. I have flipped through many C books but never quite gotten this explanation of how C works. Understanding things at this level really helps me put the language in perspective.