By Sijmen J. Mulder, 3 August 2023.
Casey of Molly Rocket posted four interview questions he was asked for his 1994 Microsoft internship.
This page is about the second, implementing a string copy function:
Implement strcpy(), which copies a string into another buffer (we don’t care about the return value for now):
char * strcpy(char *src, char *dst);
Let’s get this out of the way: using this function is usually a bad idea because you need to be absolutely sure that the destination buffer can fit the full source string, including its null terminator. Instead, use strcpy_s() if available, strlcpy(), or even snprintf() (snprintf(dst, sizeof(dst), "%s", src), where dst has to be an actual array for sizeof to give the buffer size).
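To make the snprintf() suggestion concrete, here is a minimal sketch of a bounded copy. The helper name copy_bounded and its return convention are my own assumptions for illustration, not part of the interview question or of any library. The buffer size is passed in explicitly because sizeof on a pointer parameter would only give the size of the pointer.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from the original question): copy src into
 * dst, writing at most dst_size bytes including the terminator.
 * Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int n;

    if (dst_size == 0)
        return 0;   /* no room even for the terminator */

    /* snprintf terminates dst whenever dst_size > 0 and returns the
     * length the full output would have had. */
    n = snprintf(dst, dst_size, "%s", src);
    return n >= 0 && (size_t)n < dst_size;
}
```

A caller with a local array would write copy_bounded(buf, sizeof(buf), src) and check the result to detect truncation.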
Let’s do a simple for-loop copy first:
#include <string.h>  /* strlen, size_t */

void strcpy_1(char *src, char *dst)
{
    size_t len, i;

    len = strlen(src);
    for (i=0; i < len; i++)
        dst[i] = src[i];
    dst[len] = '\0';
}
We find the length of the string, then copy it over one char at a time. Don’t forget the null termination!
But since strlen has to walk the string to find the null terminator, we’re looping over the string twice! You can see this by expanding strlen:
void strcpy_2(char *src, char *dst)
{
    size_t len=0, i;

    for (i=0; src[i] != '\0'; i++)
        len++;
    for (i=0; i < len; i++)
        dst[i] = src[i];
    dst[len] = '\0';
}
Let’s put that in one loop:
void strcpy_3(char *src, char *dst)
{
    size_t i;

    for (i=0; src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';
}
This works perfectly fine, but we can avoid having the i variable altogether by incrementing src and dst directly. Walking a pointer like that is a common idiom in C:
void strcpy_4(char *src, char *dst)
{
    while (*src != '\0') {
        *dst = *src;
        src++;
        dst++;
    }
    *dst = '\0';  /* the loop stops before copying the terminator */
}
Our final change is to simplify this version by removing the explicit \0 comparison and folding the increment expressions into the assignment statement. That may make your hair stand on end, but this access-and-increment pattern is so common that it’s a useful trick to know:
void strcpy_5(char *src, char *dst)
{
    while (*src)
        *dst++ = *src++;
    *dst = '\0';
}
Let’s dissect that: *dst++ is *(dst++). The ++ here is post-increment, which means that the expression yields the old value of dst, and dst only takes its new value afterwards. So in effect *dst = *src is performed first, and only then are dst and src incremented, just like in the previous version.
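If that still feels slippery, here is a single *dst++ = *src++ step spelled out long-hand. This is only a sketch for illustration: the helper copy_one_step and its pointer-to-pointer parameters are my own invention, chosen so the caller can watch both pointers move.

```c
#include <stddef.h>

/* Hypothetical helper for illustration: performs one
 * `*dst++ = *src++` step on the caller's pointers, written out
 * explicitly. Returns the character that was copied. */
char copy_one_step(char **dstp, const char **srcp)
{
    char *old_dst = *dstp;          /* the value the expression dst++ yields */
    const char *old_src = *srcp;    /* the value the expression src++ yields */

    *dstp = old_dst + 1;            /* side effect of dst++ */
    *srcp = old_src + 1;            /* side effect of src++ */

    *old_dst = *old_src;            /* the store goes through the OLD pointers */
    return *old_dst;
}
```

After one call, the character has landed at the position dst pointed to before the call, and both pointers have advanced by one.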
Perhaps a lot to grasp for those unfamiliar with this pattern, but again it’s an often-used solution to this common situation. For a different take, here’s how OpenBSD implements the function with a for loop instead (and also returning the copied string, per spec):
char *
strcpy(char *to, const char *from)
{
    char *save = to;

    for (; (*to = *from) != '\0'; ++from, ++to);
    return(save);
}
One version from Casey’s video:
for (int i=0; (dst[i] = src[i]); i++) ;
I hadn’t thought to use the result of the assignment as the loop
condition. Now you also don’t need the extra \0 assignment
because that happens in the last iteration of the loop.
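To convince yourself that the terminator really is copied by the loop itself, here is a small sketch that counts the bytes stored. The name strcpy_count and its return value are assumptions of mine for illustration; the loop is the same assignment-as-condition idea, with the comparison against \0 written out.

```c
#include <stddef.h>

/* Illustrative variant (name and return value are assumptions):
 * copies src to dst and returns how many bytes were stored,
 * terminator included. */
size_t strcpy_count(const char *src, char *dst)
{
    size_t i;

    /* The assignment expression has the value just stored, so the
     * loop ends right after '\0' has been written to dst[i]. */
    for (i = 0; (dst[i] = src[i]) != '\0'; i++)
        ;
    return i + 1;   /* i characters plus the terminator */
}
```

Copying "abc" this way stores four bytes, and nothing past dst[3] is touched.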
Another version from the video:
while (*dst++ = *src++) ;
In a Borland C compiler of the time the first version generates faster code.