K&R Exercise 3-3: Expand Shorthand Notations (e.g. a-d to abcd)

The following code is my answer for exercise 3-3 in K&R:

#include <stdio.h>
#include <ctype.h>

#define MAXLINE 1000    /* maximum output line size */

int my_getline(char s[], int lim);
void expand(char s1[], char s2[]);

/* Exercise 3-3. Write a function expand(s1,s2) that expands shorthand
    notations like a-z in the string s1 into the equivalent complete list
    abc...xyz in s2. Allow for letters of either case and digits, and be
    prepared to handle cases like a-b-c and a-z0−9 and -a-z. Arrange that a
    leading or trailing - is taken literally. */
main()
{
    char s1[MAXLINE], s2[MAXLINE];

    my_getline(s1, MAXLINE);
    expand(s1, s2);
    printf("Original: %sExpanded: %s", s1, s2);
    return 0;
}

/* getline:  read a line into s, return length */
int my_getline(char s[], int lim)
{
    int c, i;

    for (i=0; i<lim-1 && (c=getchar())!=EOF && c!='\n'; ++i)
        s[i] = c;
    if (c == '\n') {
        s[i] = c;
        ++i;
    }
    s[i] = '\0';
    return i;
}

/* expand:  expand short hand notations in s1 into the equivalent list in s2 */
void expand(char s1[], char s2[])
{
    int i, j, k, valid;

    for (i = j = 0; s1[i] != '\0'; ++i, ++j) {
        valid = (islower(s1[i-1]) && islower(s1[i+1]) ||
            isupper(s1[i-1]) && isupper(s1[i+1]) ||
            isdigit(s1[i-1]) && isdigit(s1[i+1])) && s1[i-1] <= s1[i+1];
        if (i != 0 && s1[i+1] != '\0' && s1[i] == '-' && valid) {
            for (k = s1[i-1] + 1; k < s1[i+1]; ++k)
                s2[j++] = k;
            s2[j] = s1[++i];
            if (s1[i+1] != '\0') {
                if (s1[i+1] == '-') /* take trailing - literally */
                    s2[++j] = s1[++i];
            } else
                --j;    /* otherwise j will increment by 2 bypassing '\0' */
        } else
            s2[j] = s1[i];
    }
    s2[j] = '\0';
}

Are there any bugs with my approach? Are there ways I could make my code clearer?

Answer

I strongly recommend to avoid K&R since it teaches bad style and is very much outdated. This review will focus on general C programming rather than commenting on your specific algorithm.

Invalid C

  • main() is an incorrect signature, this style went obsolete 23 years ago. You must use int main (void).

Bugs

  • for (i = j = 0; ... then s1[i-1]. This accesses s1 out of bounds at the first lap of the loop.

Dangerous practice

  • char s1[MAXLINE], s2[MAXLINE]; it’s not a good idea to allocate arrays of thousands of bytes in local scope – these will end up on the stack and eventually you risk stack overflow. Declaring them as static char s1[MAXLINE] would have avoided that.
  • Incrementing/decrementing loop iterators anywhere else but in the third clause of a for loop is poor and dangerous practice, the code turns completely unreadable when you do. You should reconsider writing this loop differently, as close to the idiomatic for(int i=0; s1[i]!='\0'; i++) as possible.
  • Assignment inside conditions such as c=getchar() inside the 2nd for clause is bad style. K&R does this a lot but it’s universally regarded as bad and bug-prone, why all mainstream compilers force an additional parenthesis when you do.

Poor style

  • Avoid declaring multiple variables on a single line, it is harder to read and you can get subtle bugs that way too. Instead, declare each variable on a line of its own.
  • Loop iterators should be declared inside the for loop clause 1 whenever possible – K&R is completely outdated here. That is for(int i=0; ....
  • s1 isn’t modified by the expand function so that function should be written with const correctness:
    void expand (const char s1[], char s2[])
  • valid should ideally have been declared as bool.

Attribution
Source : Link , Question Author : vim_overlord , Answer Author : Lundin

Leave a Comment