C Variables: Eerily Close to the Machine

In C programming, things as simple as variable assignment are not quite as simple as using an assignment operator---they sometimes require entire functions. For example, this code will not even compile:

#include        
#include        

int main()
{
        char    a[10], b[10];

        a = "hello";
        b = "world!";

        printf("%s %s", a, b);

        return 0;
}
$ cc test.c
test.c: In function ‘main’:
test.c:8: error: incompatible types in assignment
test.c:9: error: incompatible types in assignment

In C all strings are arrays. To create a string variable, you must create an array. The variable "a" is actually a pointer to the memory location of the character array, not the contents of the array itself! That's why I got the "incompatible types in assignment" error when I tried compiling the above code---I was trying to copy a string directly into a memory address!

The reason things are this way in C is for speed and simplicity. Sure, other languages automatically do the work of putting your five-character string into a variable and automatically allocate the necessary space in memory, but by doing that they spend a little more time behind the scenes---time and speed that may be precious to a systems-level programmer (who might be writing a program for, say, a tiny embedded device).

To copy a string into an array (i.e., assign a string to a variable), you can use the strcpy() function. This function does the work of taking each character in your string and putting it into the correct place in the given array:

#include        
#include        

int main()
{
        char    a[10], b[10];

        strcpy(a, "hello");
        strcpy(b, "world!");

        printf("%s %s", a, b);

        return 0;
}
$ cc test.c
$ ./a.out
hello world!

C was written in a time when assembly language was the norm. The problem with assembly language was that it was very tied to the hardware you were working on. Porting your work to other hardware, even if the changes in the hardware were only minor, required an entire rewrite of your code! Operating systems were also written in assembly at the time so creating a single operating system that worked on many different architectures was nearly impossible (unless you had an unlimited amount of time and money to have programmers constantly rewriting the operating system for every new hardware architecture that was released).

So the C programming language was created as a language one level higher than assembly. It was designed to maintain all the power and flexibility of assembly, while making it very easy to port to multiple architectures. This was made possible by using a compiler. The compiler simply took the C code and converted it into the necessary machine language for a specific architecture. If you wanted to port all your C code to a new architecture, all you needed to do was write a new compiler---not rewrite all your programs!

C lets you do stupid things not because it's stupid, but because flexibility and closeness to the physical hardware is necessary for writing operating systems. (As the programmer, it's your job to make sure what you're doing is possible with the hardware you're working on.) Where as other high-level languages will automatically take your string and stick it in the correct place in memory, C does only what you tell it to do. This makes it extremely fast, which is very important when you're writing an operating system.

The basic example of how a string cannot be assigned directly to the character variable because the variable is actually a pointer to a memory address, helped me realize why C is still used for systems-level programming and why it continues to be in use more than 35 years after its invention. I have flipped through many C books but never quite gotten this explanation of how C works. Understanding things at this level really helps me put the language in perspective.

Write a Comment

Comment