One of the most basic approaches when you need to count how many times each member of a given set occurs within a collection is to use a Frequency Array. Very simply, a Frequency Array has one element for each value that can occur, and each element holds the number of times the value mapped to that index has been seen (e.g. with what frequency does the thing mapped to each index occur?).
For example, if you want to know the number of times any one alpha-character occurs within a line of text (ignoring case), your collection has 26 elements (the letters in the alphabet). So you would simply declare an array of 26 elements (usually int or size_t), initialized all zero, and then loop over the characters in your line of text, and for each alpha-character, you increment the corresponding element of the array.
For example, with the array int chars[26] = {0}; and for the line containing "Dog", you would simply increment:
chars[toupper(line[0]) - 'A']++; /* increments index chars[3]++ */
chars[toupper(line[1]) - 'A']++; /* increments index chars[14]++ */
chars[toupper(line[2]) - 'A']++; /* increments index chars[6]++ */
(see ASCII Table and Description to understand how toupper(line[0]) - 'A' maps the character to the corresponding index. Also note the cast to unsigned char is intentionally omitted for clarity)
After looping over all characters in the line, you can output the results by looping over all elements of the frequency array, and if the value at an index is greater than zero, you output the character corresponding to that index and the value held as the count, e.g.
for (int i = 0; i < 26; i++)
    if (chars[i])
        printf ("[%c] => %d\n", i + 'A', chars[i]);
To handle the digit-characters in the line, you simply use a second 10-element frequency array and do the same thing, but this time subtracting '0' to map the ASCII digits to elements 0-9 of the array.
The other aspect of the problem you have omitted is validating every input and conversion. You cannot use any input function correctly unless you check the return to determine whether the operation succeeded or failed. The same applies to every conversion to a numeric type and any other expression critical to the continued defined operation of your code.
Putting the pieces together, you could use a global enum to define needed constants for your program (or do the same thing with multiple #define statements) and open your file and read and convert the first line to an integer value with:
#include <stdio.h>
#include <ctype.h>

enum { DIGITS=10, CHARS=26, MAXC=1024 };    /* constants for use in program */

int main (int argc, char **argv) {

    char line[MAXC];                        /* buffer to hold each line of input */
    int ncases = 0;                         /* integer for number of cases */
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }

    if (!fgets (line, MAXC, fp)) {          /* read/validate 1st line */
        fputs ("(user canceled input)\n", stdout);
        return 0;
    }
    if (sscanf (line, "%d", &ncases) != 1) {    /* convert/validate to int */
        fputs ("error: invalid integer input for ncases.\n", stderr);
        return 1;
    }
Now with the number of cases in ncases, you simply loop that number of times, reading the next line from the file, looping over each character in line and incrementing the corresponding elements of two frequency arrays, chars and digits, to count the occurrences of each character type (alpha and digit). You then output the results by looping over each frequency array in sequence to output the counts from each. You can do that as follows:
    for (int i = 0; i < ncases; i++) {          /* loop ncases times */
        int chars[CHARS] = {0},                 /* frequency array for characters */
            digits[DIGITS] = {0};               /* frequency array for digits */

        if (!fgets(line, MAXC, fp)) {           /* read/validate line */
            fputs ("error: reading line from file.\n", stderr);
            return 1;
        }

        for (int j = 0; line[j]; j++) {         /* loop over each char in line */
            int c = toupper((unsigned char)line[j]);    /* convert to uppercase */
            if (isalpha((unsigned char)c))      /* check if A-Z */
                chars[c-'A']++;                 /* increment chars at index */
            else if (isdigit((unsigned char)c)) /* check if 0-9 */
                digits[c-'0']++;                /* increment digits at index */
        }

        printf ("\nCase #%d:\n", i + 1);        /* output case no. */
        for (int j = 0; j < CHARS; j++)         /* loop over chars array */
            if (chars[j])                       /* if value at index non-zero */
                printf ("[%c] => %d\n", j + 'A', chars[j]); /* output count */
        for (int j = 0; j < DIGITS; j++)        /* loop over digits array */
            if (digits[j])                      /* if value at index non-zero */
                printf ("[%d] => %d\n", j, digits[j]);      /* output count */
    }
(note: for outputting the digits counts, there is no need to map the indexes back to their ASCII value -- you can simply output the integer representation with %d instead of mapping j + '0' to output the ASCII chars with %c -- your choice really. Also see man 3 isalpha for the requirement that each argument to any of the classification macros be representable as an unsigned char (or be EOF) -- explaining the purpose of the casts above to (unsigned char))
That is essentially the complete code for approaching your problem with frequency arrays. All you need is to tidy up and return success to the shell, e.g.
    if (fp != stdin)    /* close file if not stdin */
        fclose (fp);

    return 0;
}
Example Use/Output
Placing your sample input in the file dat/charcountlines.txt and then providing that as input to the program would result in the following:
$ ./bin/charcount dat/charcountlines.txt
Case #1:
[A] => 4
[B] => 1
[E] => 1
[I] => 3
[N] => 4
[R] => 2
[S] => 2
[T] => 2
[U] => 2
[V] => 1
[Y] => 1
Case #2:
[B] => 1
[I] => 1
[N] => 1
[S] => 1
[U] => 1
[0] => 2
[2] => 2
Case #3:
[A] => 1
[C] => 2
[D] => 1
[I] => 2
[N] => 1
[O] => 3
[R] => 2
[S] => 1
[U] => 1
[V] => 2
[1] => 1
[9] => 1
The above matches the output you specify and the ordering. Look things over and let me know if you have further questions.