# The Practice of Programming (Addison-Wesley Professional Computing Series)

I would be tempted to read The Practice of Programming and Programming Pearls. Both are quite terse books and C-oriented.

One good example of why named constants are beneficial comes from the excellent book The Practice of Programming by Kernighan and Pike (1999).

### §1.5 Magic Numbers

[...] This excerpt from a program to print a histogram of letter frequencies on a 24 by 80 cursor-addressed terminal is needlessly opaque because of a host of magic numbers:

``````
...
fac = lim / 20;
if (fac < 1)
    fac = 1;
for (i = 0, col = 0; i < 27; i++, j++) {
    col += 3;
    k = 21 - (let[i] / fac);
    star = (let[i] == 0) ? ' ' : '*';
    for (j = k; j < 22; j++)
        draw(j, col, star);
}
draw(23, 2, ' ');
for (i = 'A'; i <= 'Z'; i++)
    printf("%c  ", i);
``````

The code includes, among others, the numbers 20, 21, 22, 23, and 27. They're clearly related...or are they? In fact, there are only three numbers critical to this program: 24, the number of rows on the screen; 80, the number of columns; and 26, the number of letters in the alphabet. But none of these appears in the code, which makes the numbers that do even more magical.

By giving names to the principal numbers in the calculation, we can make the code easier to follow. We discover, for instance, that the number 3 comes from (80 - 1)/26 and that let should have 26 entries, not 27 (an off-by-one error perhaps caused by 1-indexed screen coordinates). Making a couple of other simplifications, this is the result:

``````
enum {
    MINROW   = 1,                 /* top row */
    MINCOL   = 1,                 /* left edge */
    MAXROW   = 24,                /* bottom edge (<=) */
    MAXCOL   = 80,                /* right edge (<=) */
    LABELROW = 1,                 /* position of labels */
    NLET     = 26,                /* size of alphabet */
    HEIGHT   = (MAXROW - 4),      /* height of bars */
    WIDTH    = (MAXCOL - 1)/NLET  /* width of bars */
};

...
fac = (lim + HEIGHT - 1) / HEIGHT;
if (fac < 1)
    fac = 1;
for (i = 0; i < NLET; i++) {
    if (let[i] == 0)
        continue;
    for (j = HEIGHT - let[i]/fac; j < HEIGHT; j++)
        draw(j+1 + LABELROW, (i+1)*WIDTH, '*');
}
draw(MAXROW-1, MINCOL+1, ' ');
for (i = 'A'; i <= 'Z'; i++)
    printf("%c  ", i);
``````

Now it's clearer what the main loop does; it's an idiomatic loop from 0 to NLET, indicating that the loop is over the elements of the data. Also the calls to `draw` are easier to understand because words like MAXROW and MINCOL remind us of the order of arguments. Most important, it's now feasible to adapt the program to another size of display or different data. The numbers are demystified and so is the code.

The revised code doesn't actually use MINROW, which is interesting; one wonders which of the residual 1's should be MINROW.
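The arithmetic behind the named constants can be checked directly. The sketch below copies the relevant constants from the revised excerpt and wraps the scale-factor calculation in a function. It assumes that `lim` is the largest letter count, which the excerpt does not state; `scale_factor` and `bar_height` are hypothetical names introduced only for illustration.

```c
/* Named constants copied from the revised excerpt. */
enum {
    MAXROW = 24,                 /* bottom edge (<=) */
    MAXCOL = 80,                 /* right edge (<=) */
    NLET   = 26,                 /* size of alphabet */
    HEIGHT = (MAXROW - 4),       /* height of bars: 20 */
    WIDTH  = (MAXCOL - 1)/NLET   /* width of bars: 79/26 == 3 */
};

/* Scale factor as computed in the revised excerpt.
 * Assumption: lim is the largest letter count. */
static int scale_factor(int lim)
{
    int fac = (lim + HEIGHT - 1) / HEIGHT;   /* ceiling of lim/HEIGHT */
    return (fac < 1) ? 1 : fac;
}

/* Hypothetical helper: number of rows a bar occupies for one count. */
static int bar_height(int count, int fac)
{
    return count / fac;
}
```

Under that assumption, a largest count of 39 gives `scale_factor(39) == 2` and a bar of 19 rows, which fits within `HEIGHT`. The original `lim / 20` would give a factor of 1 for the same input and a bar of 39 rows, so the rounding-up in the revised version appears to be one of the "other simplifications" that also fixes a scaling problem.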

There's a simple CSV parser library that's described in the excellent book The Practice of Programming by Kernighan and Pike, and the source is available from the site linked to.

``````
int n = sscanf("string", "%s %[^, ]%*[, ]%s", word1, word2, word3);
``````

The return value in `n` tells you how many assignments were made successfully. The `%[^, ]` is a negated character-class match that finds a word not including either commas or blanks (add tabs if you like). The `%*[, ]` is a match that finds a comma or space but suppresses the assignment.

I'm not sure I'd use this in practice, but it should work. It is, however, untested.
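As a small check of how the return value reports partial matches, here is a sketch that wraps the format in a function. The name `split_words` is hypothetical, and the field widths (19, 31, 63) and matching buffer sizes are an added assumption so that the sketch cannot overflow its buffers.

```c
#include <stdio.h>

/* Hypothetical wrapper around the format discussed above.
 * Assumption: the caller supplies buffers of 20, 32 and 64 bytes,
 * so each width in the format is one less than its buffer size.
 * Returns whatever sscanf() returns: the number of successful
 * assignments, or EOF if input ran out before the first one. */
static int split_words(const char *data,
                       char word1[20], char word2[32], char word3[64])
{
    return sscanf(data, "%19s %31[^, ]%*[, ]%63s", word1, word2, word3);
}
```

With this wrapper, `"alpha beta, gamma"` and `"alpha beta gamma"` both yield 3, `"alpha beta"` yields 2, `"alpha"` yields 1, and an empty string yields `EOF`. Checking the result against 3 is therefore what distinguishes a complete line from a short one.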

Maybe a tighter specification is:

``````
int n = sscanf("string", "%s %[^, ]%*[,]%s", word1, word2, word3);
``````

The difference is that the non-assigning character class only accepts a comma. `sscanf()` stops at any space (or EOS, end of string) after `word2`, and skips spaces before assigning to `word3`. The previous version allowed a space between the second and third words in lieu of a comma, which the question does not strictly allow.

As pmg suggests in a comment, the assigning conversion specifications should be given a length to prevent buffer overflow. Note that the length does not include the null terminator, so the value in the format string must be one less than the size of the arrays in bytes. Also note that whereas `printf()` allows you to specify sizes dynamically with `*`, `sscanf()` et al use `*` to suppress assignment. That means you have to create the string specifically for the task at hand:

``````
char word1[20], word2[32], word3[64];
int n = sscanf("string", "%19s %31[^, ]%*[,]%63s", word1, word2, word3);
``````

(Kernighan & Pike suggest formatting the format string dynamically in their (excellent) book 'The Practice of Programming' (1999).)
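One way to build such a format string at run time is with `snprintf()`, deriving each width from the size of its destination buffer. This is a minimal sketch of the general idea, not necessarily the approach shown in the book; `make_format` is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: write a scanf format into fmt whose field widths
 * come from the destination buffer sizes.  Each width is size - 1,
 * leaving room for the terminating null byte.
 * Returns 0 on success, -1 if a size is too small or fmt would overflow. */
static int make_format(char *fmt, size_t fmtsize,
                       size_t size1, size_t size2, size_t size3)
{
    int len;

    if (size1 < 2 || size2 < 2 || size3 < 2)
        return -1;
    /* "%%" emits a literal '%', so "%%%zus" becomes e.g. "%19s". */
    len = snprintf(fmt, fmtsize, "%%%zus %%%zu[^, ]%%*[,]%%%zus",
                   size1 - 1, size2 - 1, size3 - 1);
    return (len < 0 || (size_t)len >= fmtsize) ? -1 : 0;
}
```

Called as `make_format(fmt, sizeof fmt, sizeof word1, sizeof word2, sizeof word3)` with 20-, 32- and 64-byte arrays, this produces the same format as the hand-written one above, and the widths follow automatically if the array sizes change. Passing a non-literal format to `sscanf()` may trigger compiler warnings, which is one cost of this design.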

Just found a problem: given `"word1 word2 ,word3"`, it doesn't read `word3`. Is there a cure?

Yes, there's a cure, and it is actually trivial, too. Add a space in the format string before the non-assigning, comma-matching conversion specification. Thus:

``````
#include <stdio.h>

static void tester(const char *data)
{
    /* Each array is one byte larger than its width in the format. */
    char word1[20], word2[32], word3[64];
    int n = sscanf(data, "%19s %31[^, ] %*[,]%63s", word1, word2, word3);
    printf("Test data: <<%s>>\n", data);
    printf("n = %d; w1 = <<%s>>, w2 = <<%s>>, w3 = <<%s>>\n", n, word1, word2, word3);
}

int main(void)
{
    const char *data[] =
    {
        "word1 word2 , word3",
        "word1 word2 ,word3",
        "word1 word2, word3",
        "word1 word2,word3",
        "word1 word2       ,       word3",
    };
    /* Number of test strings: total size divided by the size of one element. */
    enum { DATA_SIZE = sizeof(data)/sizeof(data[0]) };
    size_t i;
    for (i = 0; i < DATA_SIZE; i++)
        tester(data[i]);
    return(0);
}
``````

Example output:

``````
Test data: <<word1 word2 , word3>>
n = 3; w1 = <<word1>>, w2 = <<word2>>, w3 = <<word3>>
Test data: <<word1 word2 ,word3>>
n = 3; w1 = <<word1>>, w2 = <<word2>>, w3 = <<word3>>
Test data: <<word1 word2, word3>>
n = 3; w1 = <<word1>>, w2 = <<word2>>, w3 = <<word3>>
Test data: <<word1 word2,word3>>
n = 3; w1 = <<word1>>, w2 = <<word2>>, w3 = <<word3>>
Test data: <<word1 word2       ,       word3>>
n = 3; w1 = <<word1>>, w2 = <<word2>>, w3 = <<word3>>
``````

Once the 'non-assigning character class' only accepts a comma, you can abbreviate that to a literal comma in the format string:

``````
int n = sscanf(data, "%19s %31[^, ] , %63s", word1, word2, word3);
``````

Plugging that into the test harness produces the same result as before. Note that all code benefits from review; it can often (essentially always) be improved even after it is working.

Just about anything. Too many coworkers don't read any technical books at all. Some will pick up books to learn specific tools or languages, which is better, but not enough...

If you made me pick one, though...

The Practice of Programming - Brian W. Kernighan, Rob Pike - http://www.amazon.com/Practice-Programming-Brian-W-Kernighan...

Many of these are elaborated on in the more recent (1999) "The Practice of Programming." Excellent book. http://www.amazon.com/Practice-Programming-Brian-W-Kernighan...

The efficiency and comments parts read like a tl;dr version of Kernighan & Pike's The Practice of Programming, which, I might add, is an excellent read. As for coding conventions, I quite like OpenBSD's (man style), which are also present on FreeBSD. Though I rarely write C these days, I have to read some every now and then, and code from BSD projects following those conventions feels very readable.