Reflections on Trusting Trust by Ken Thompson

Reflections on Trusting Trust by Ken Thompson (co-creator of Unix and designer of B, the direct predecessor of the C programming language) is an excellent paper. My favorite line was, "You can't trust code that you did not totally create yourself. (Especially code from companies that employ people like me.)"

Gustavo Duarte on the Anatomy of a Program in Memory

In his latest post, Anatomy of a Program in Memory, Gustavo Duarte beautifully explains how programs are laid out in memory. His explanations are clear and concise, and the diagrams do a great job of illustrating what he's talking about. Memory management is where my C programming class left off, so this post is extremely helpful in deepening my understanding.

Romance, sadness, humor, and fear all rolled into one

I don't know how to feel about this series of comics. The feeling is that of romance, sadness, humor, and fear all rolled into one. Coincidentally, I learned about linked lists in C class a few weeks ago (got a 94 on the assignment!), so I was able to fully appreciate the last strip.

My C Program Crashed the Terminal :(

As part of my homework assignment (due tonight!), I have to write two versions of the standard library function strncpy(), one using arrays and one using pointers.

The strncpy() function takes three arguments, dest, src, and n, and copies at most n bytes from src to dest.

#include <stdio.h>

// Function declarations
char *mystrncpy(char *dest, const char *src, size_t n);

int main(void)
{
    char a[] = "this is a string";
    char b[50];

    mystrncpy(b, a, 400); // here is the problem
    printf("%s\n", b);

    return 0;
}

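// My attempt to replicate the strncpy() function with pointers
char *mystrncpy(char *dest, const char *src, size_t n)
{
    char *start = dest;
    size_t i = 0;

    // Copies exactly n bytes; it never checks for the '\0' that ends src
    // or for the size of dest, so a large n runs past both arrays
    while (i < n) {
        *dest = *src;
        src++;
        dest++;
        i++;
    }

    return start;
}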

While testing this pointer version of the function, I passed a much larger size (400) to the function than had been allocated for the destination variable (b[50]).

eris:hw3 raam$ gcc problem8.c -o problem8
eris:hw3 raam$ ./problem8
this is a string
Segmentation fault

Segmentation Fault

There is no better way to tell me my function needs more work than to stick a big "Application quit unexpectedly" message in my face! (And thankfully, my entire iTerm app did not crash.)

I wrote a post a few weeks ago about how eerily close C variables are to the machine, and the way this program crashed further confirms that point. I can't imagine the kinds of nasty things I will be able to do once I learn more advanced C functions. 😀

C Variables: Eerily Close to the Machine

In C programming, things as simple as variable assignment are not quite as simple as using an assignment operator---they sometimes require entire functions. For example, this code will not even compile:

#include <stdio.h>
#include <string.h>

int main()
{
        char    a[10], b[10];

        a = "hello";
        b = "world!";

        printf("%s %s\n", a, b);

        return 0;
}

$ cc test.c
test.c: In function ‘main’:
test.c:8: error: incompatible types in assignment
test.c:9: error: incompatible types in assignment

In C, all strings are arrays of characters. To create a string variable, you must create an array. The name "a" does not behave like an ordinary variable that holds a value: in most expressions it is converted to a pointer to the array's first element, and C does not allow an array to be assigned to as a whole. That's why I got the "incompatible types in assignment" error when I tried compiling the above code. I was trying to assign a string to something that is effectively a fixed memory address!

The reason things are this way in C is for speed and simplicity. Sure, other languages automatically do the work of putting your five-character string into a variable and automatically allocate the necessary space in memory, but by doing that they spend a little more time behind the scenes---time and speed that may be precious to a systems-level programmer (who might be writing a program for, say, a tiny embedded device).

To copy a string into an array (i.e., assign a string to a variable), you can use the strcpy() function. This function does the work of taking each character in your string and putting it into the correct place in the given array:

#include <stdio.h>
#include <string.h>

int main()
{
        char    a[10], b[10];

        strcpy(a, "hello");
        strcpy(b, "world!");

        printf("%s %s\n", a, b);

        return 0;
}

$ cc test.c
$ ./a.out
hello world!

C was written at a time when assembly language was the norm. The problem with assembly language was that it was tightly tied to the hardware you were working on. Porting your work to other hardware, even if the changes in the hardware were only minor, required an entire rewrite of your code! Operating systems were also written in assembly at the time, so creating a single operating system that worked on many different architectures was nearly impossible (unless you had an unlimited amount of time and money to have programmers constantly rewriting the operating system for every new hardware architecture that was released).

So the C programming language was created as a language one level higher than assembly. It was designed to maintain all the power and flexibility of assembly, while making it very easy to port to multiple architectures. This was made possible by using a compiler. The compiler simply took the C code and converted it into the necessary machine language for a specific architecture. If you wanted to port all your C code to a new architecture, all you needed to do was write a new compiler---not rewrite all your programs!

C lets you do stupid things not because it's stupid, but because flexibility and closeness to the physical hardware are necessary for writing operating systems. (As the programmer, it's your job to make sure what you're doing is possible with the hardware you're working on.) Whereas other high-level languages will automatically take your string and stick it in the correct place in memory, C does only what you tell it to do. This makes it extremely fast, which is very important when you're writing an operating system.

This basic example, in which a string cannot be assigned directly to a character array because the array's name stands for a fixed memory location rather than a value that can be overwritten, helped me realize why C is still used for systems-level programming and why it continues to be in use more than 35 years after its invention. I have flipped through many C books but never quite gotten this explanation of how C works. Understanding things at this level really helps me put the language in perspective.