Casey’s interview question #2

By Sijmen J. Mulder, 3 August 2023.

Casey of Molly Rocket posted four interview questions he was asked for his 1994 Microsoft internship.

This page is about the second, implementing a string copy function:

Implement strcpy(), which copies a string into another buffer (we don’t care about the return value for now):
char * strcpy(char *src, char *dst);

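Let’s get this out of the way: using this function is usually a bad idea because you need to be absolutely sure that the destination buffer can fit the full source string, including its null terminator. Instead, use strcpy_s() if available, strlcpy(), or even snprintf() (snprintf(dst, sizeof(dst), "%s", src), which only works when dst is an array, so that sizeof yields the buffer size rather than the size of a pointer).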
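To make the snprintf() suggestion concrete, here is a minimal sketch of a bounded copy that takes the buffer size as an explicit parameter, so it also works when dst is a plain pointer. The name copy_bounded and its return convention are made up for this example and are not part of the interview question.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the interview question): copy src into
 * dst, writing at most dstsize bytes including the null terminator.
 * Returns 1 if the whole string fit, 0 if it was truncated or if
 * dstsize is 0.
 */
int copy_bounded(char *dst, size_t dstsize, const char *src)
{
        int n;

        if (dstsize == 0)
                return 0;

        n = snprintf(dst, dstsize, "%s", src);

        /* snprintf returns the length it wanted to write, not counting
           the terminator, so anything >= dstsize means truncation. */
        return n >= 0 && (size_t)n < dstsize;
}
```

With a 4-byte buffer and the source "hello", this should leave "hel" in the buffer and return 0, so the caller can tell that the string was cut short.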

Let’s do a simple for-loop copy first:

#include <string.h>     /* strlen, size_t */

void strcpy_1(char *src, char *dst)
{
        size_t len, i;

        len = strlen(src);

        for (i=0; i < len; i++)
                dst[i] = src[i];

        dst[len] = '\0';
}

We find the length of the string, then copy it over one char at a time. Don’t forget the null termination!

But since strlen has to walk the string to find the null terminator, we’re looping over the string twice! You can see this by expanding strlen:

void strcpy_2(char *src, char *dst)
{
        size_t len=0, i;

        for (i=0; src[i] != '\0'; i++)
                len++;
        for (i=0; i < len; i++)
                dst[i] = src[i];

        dst[len] = '\0';
}

Let’s put that in one loop:

void strcpy_3(char *src, char *dst)
{
        size_t i;

        for (i=0; src[i] != '\0'; i++)
                dst[i] = src[i];

        dst[i] = '\0';
}

This works perfectly fine, but we can avoid having the i variable altogether by incrementing src and dst directly. Walking a pointer like that is a common idiom in C:

void strcpy_4(char *src, char *dst)
{
        while (*src != '\0') {
                *dst = *src;
                src++;
                dst++;
        }

        /* The loop stops before copying the terminator, so add it here. */
        *dst = '\0';
}

Our final change is to simplify this version by removing the explicit \0 comparison and folding the increments into the assignment statement. That may make your hair stand on end, but this access-and-increment pattern is so common that it’s a useful trick to know:

void strcpy_5(char *src, char *dst)
{
        while (*src)
                *dst++ = *src++;

        *dst = '\0';
}

Let’s dissect that: *dst++ is *(dst++). The ++ here is post-increment, which means the expression yields the old pointer value, while the pointer itself is incremented as a side effect. So *dst = *src is performed with the old pointers, and by the time the statement has finished both dst and src have moved one position along, just like in the previous version.
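If the post-increment behaviour is hard to picture, a small experiment may help. The sketch below performs a single *dst++ = *src++ step on local pointers and reports what happened; the struct and function names are invented for this demonstration.

```c
#include <stddef.h>

/* Hypothetical demo types and names, not part of the solutions above. */
struct step_result {
        char      copied;       /* the char that landed in buf[0] */
        ptrdiff_t src_moved;    /* how far src advanced */
        ptrdiff_t dst_moved;    /* how far dst advanced */
};

struct step_result one_step(const char *text, char *buf)
{
        const char *src = text;
        char *dst = buf;
        struct step_result r;

        /* Reads through the OLD src, stores through the OLD dst,
           and leaves both pointers one position further along. */
        *dst++ = *src++;

        r.copied = buf[0];
        r.src_moved = src - text;
        r.dst_moved = dst - buf;
        return r;
}
```

Calling one_step("hi", buf) should report that 'h' was stored at the start of buf and that each pointer advanced by exactly one.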

Perhaps a lot to grasp for those unfamiliar with this pattern, but again, it’s an often-used solution to this common situation. For a different take, here’s how OpenBSD implements the function with a for loop instead (also returning the destination pointer, per spec):

char *
strcpy(char *to, const char *from)
{
	char *save = to;

	for (; (*to = *from) != '\0'; ++from, ++to);
	return(save);
}
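Note that the standard function takes the destination first, the reverse of the argument order in the interview question. Returning the destination pointer lets a call be nested inside another expression. A small sketch using the standard library strcpy() (the helper name copy_and_measure is made up here, and the caller is assumed to provide a large enough buffer):

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy text into buf and return its length.
 * Works because the standard strcpy() returns its destination argument.
 * The caller must guarantee that buf can hold text plus the terminator.
 */
size_t copy_and_measure(char *buf, const char *text)
{
        return strlen(strcpy(buf, text));
}
```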

Addendum

Casey’s video on this question is out now. His solution was:

for (int i=0; (dst[i] = src[i]); i++) ;

I hadn’t thought to use the result of the assignment as the loop condition. Now you also don’t need the extra \0 assignment because that happens in the last iteration of the loop.

Another version from the video:

while (*dst++ = *src++) ;
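Wrapped in functions with the (src, dst) argument order used on this page, the two loops might look like the sketch below. The names strcpy_6 and strcpy_7 are mine, not Casey’s, and I’ve used size_t for the index and an explicit comparison with '\0' to keep compilers from warning about an assignment used as a condition.

```c
#include <stddef.h>

/* Hypothetical wrapper around the indexed loop from the video. */
void strcpy_6(const char *src, char *dst)
{
        size_t i;

        /* The assignment's value is the char just copied, so the '\0'
           is stored in dst before it ends the loop. */
        for (i = 0; (dst[i] = src[i]) != '\0'; i++)
                ;
}

/* Hypothetical wrapper around the pointer-walking loop from the video. */
void strcpy_7(const char *src, char *dst)
{
        /* Copy, advance both pointers, then test the copied char. */
        while ((*dst++ = *src++) != '\0')
                ;
}
```

Both should copy the terminator and touch nothing beyond it, which is easy to check by pre-filling the destination with a marker character.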

With a Borland C compiler of the time, the first version generates faster code.

Comments welcome by email or on Mastodon.