Redirecting the result to a file
In this recipe, we will learn how to redirect the output of a program to two different files. We are also going to learn some best practices when writing a filter, a program specifically made to be connected with other programs with a pipe.
The program we will build in this recipe is a new version of the program from the previous recipe. The mph-to-kph program in the previous recipe had one drawback: it always stopped when it found a non-numeric character. Often, when we run filters on long input data, we want the program to continue running, even if it has detected some erroneous data. This is what we are going to fix in this version.
We will keep the default behavior just as it was previously; that is, the program will abort when it encounters a non-numeric value. However, we will add an option (-c) so that it can continue running even if a non-numeric value is detected. Then, it's up to the end user to decide how they want to run it.
Getting ready
All the requirements listed in the Technical requirements section of this chapter apply here (the GCC compiler, the Make tool, and the Bash shell).
How to do it…
This program is a bit longer than the previous one, but if you like, you can download it from GitHub at https://github.com/PacktPublishing/Linux-System-Programming-Techniques/blob/master/ch2/mph-to-kph_v2.c. Since the code is longer, I will split it up into several steps. However, all of the code still goes into a single file called mph-to-kph_v2.c. Let's get started:
- Let's start with the feature test macro and the required header files. Since we are going to use getopt(), we need the _XOPEN_SOURCE macro, as well as the unistd.h header file:

    #define _XOPEN_SOURCE 500
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

- Next, we will add the function prototype for the help function. We will also start writing the main() function body:

    void printHelp(FILE *stream, char progname[]);

    int main(int argc, char *argv[])
    {
        char mph[10] = { 0 };
        int opt;
        int cont = 0;

- Then, we will add the getopt() function inside a while loop. This is similar to the Writing a program that parses command-line options recipe from Chapter 1, Getting the Necessary Tools and Writing Our First Linux Programs:

        /* Parse command-line options */
        while ((opt = getopt(argc, argv, "ch")) != -1)
        {
            switch(opt)
            {
                case 'h':
                    printHelp(stdout, argv[0]);
                    return 0;
                case 'c':
                    cont = 1;
                    break;
                default:
                    printHelp(stderr, argv[0]);
                    return 1;
            }
        }

- Then, we must create another while loop, where we will fetch data from stdin with fgets():

        while(fgets(mph, sizeof(mph), stdin) != NULL)
        {
            /* Check if mph is numeric
             * (and do conversion) */
            if( strspn(mph, "0123456789.-\n") == strlen(mph) )
            {
                printf("%.1f\n", (atof(mph)*1.60934) );
            }
            /* If mph is NOT numeric, print error
             * and return */
            else
            {
                fprintf(stderr, "Found non-numeric "
                    "value\n");
                if (cont == 1) /* Check if -c is set */
                {
                    continue; /* Skip and continue if
                               * -c is set */
                }
                else
                {
                    return 1; /* Abort if -c is not set */
                }
            }
        }
        return 0;
    }

- Finally, we must write the function body for the help function:

    void printHelp(FILE *stream, char progname[])
    {
        fprintf(stream, "%s [-c] [-h]\n", progname);
        fprintf(stream, " -c continues even though a non"
            "-numeric value was detected in the input\n"
            " -h print help\n");
    }

- Compile the program using Make:

    $> make mph-to-kph_v2
    cc mph-to-kph_v2.c -o mph-to-kph_v2

- Let's try it out, without any options, by giving it some numeric values and a non-numeric value. The result should be the same as what we received previously:

    $> ./mph-to-kph_v2
    60
    96.6
    40
    64.4
    hello
    Found non-numeric value

- Now, let's try it out using the -c option so that we can continue running the program even though a non-numeric value has been detected. Type some numeric and non-numeric values into the program:

    $> ./mph-to-kph_v2 -c
    50
    80.5
    90
    144.8
    hello
    Found non-numeric value
    10
    16.1
    20
    32.2

- That worked just fine! Now, let's add some more data to the avg.txt file and save it as avg-with-garbage.txt. This time, there will be more lines with non-numeric values. You can also download the file from https://github.com/PacktPublishing/Linux-System-Programming-Techniques/blob/master/ch2/avg-with-garbage.txt:

    10-minute average: 61 mph
    30-minute average: 55 mph
    45-minute average: 54 mph
    60-minute average: 52 mph
    90-minute average: 52 mph
    99-minute average: nn mph
    120-minute average: 49 mph
    160-minute average: 47 mph
    180-minute average: nn mph
    error reading data from interface
    200-minute average: 43 mph

- Now, let's run awk on that file again to see only the values:

    $> cat avg-with-garbage.txt | awk '{ print $3 }'
    61
    55
    54
    52
    52
    nn
    49
    47
    nn
    data
    43

- Now comes the moment of truth. Let's add the mph-to-kph_v2 program at the end with the -c option. This should convert all the mph values into kph values and continue running, even though non-numeric values will be found:

    $> cat avg-with-garbage.txt | awk '{ print $3 }' \
    > | ./mph-to-kph_v2 -c
    98.2
    88.5
    86.9
    83.7
    83.7
    Found non-numeric value
    78.9
    75.6
    Found non-numeric value
    Found non-numeric value
    69.2

- That worked! The program continued, even though there were non-numeric values. Since the error messages are printed to stderr and the values are printed to stdout, we can redirect the output to two different files. That leaves us with a clean output file and a separate error file:

    $> (cat avg-with-garbage.txt | awk '{ print $3 }' \
    > | ./mph-to-kph_v2 -c) 2> errors.txt 1> output.txt

- Let's take a look at the two files:

    $> cat output.txt
    98.2
    88.5
    86.9
    83.7
    83.7
    78.9
    75.6
    69.2
    $> cat errors.txt
    Found non-numeric value
    Found non-numeric value
    Found non-numeric value

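The strspn() check in Step 4 accepts any line that consists only of digits, dots, minus signs, and newlines. This means that input such as 1.2.3 or -- passes the check, and an empty line is converted as 0. If stricter validation is wanted, one possible alternative (an assumption on my part, not the code from this recipe) is to let strtod() do the parsing and inspect its end pointer. The function name parse_mph() is hypothetical:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the recipe's program): parse one
 * input line as a number. Returns 1 and stores the value in *out on
 * success, or 0 if the line is not a single well-formed number. */
int parse_mph(const char *line, double *out)
{
    char *end;
    double value;

    assert(line != NULL && out != NULL);

    value = strtod(line, &end);
    if (end == line)   /* nothing could be converted */
    {
        return 0;
    }
    /* Allow trailing whitespace, such as the newline kept by fgets() */
    while (*end == ' ' || *end == '\t' || *end == '\n' || *end == '\r')
    {
        end++;
    }
    if (*end != '\0')  /* leftover characters, for example "1.2.3" */
    {
        return 0;
    }
    *out = value;
    return 1;
}
```

Keep in mind that strtod() also accepts forms such as 1e3, inf, and nan, which may or may not be what you want in a filter like this one.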
How it works…
The code itself is similar to what we had in the previous recipe, except for the added getopt() and the help function. We covered getopt() in detail in Chapter 1, Getting the Necessary Tools and Writing Our First Linux Programs, so there's no need to cover it again here.
To continue reading data from stdin when a non-numeric value is found (while using the -c option), we use continue to skip one iteration of the loop. Instead of aborting the program, we print an error message to stderr and then move on to the next iteration, leaving the program running.
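To see the two behaviors side by side, the loop can be pulled out into a function that takes the streams and the -c flag as parameters. This is only a sketch of the same logic, not how the recipe's program is structured; the function name convert_stream() and its parameters are hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refactoring of the recipe's loop. Reads lines from in,
 * writes converted values to out and error messages to err.
 * Returns 0 if all input was read, or 1 if it aborted on a
 * non-numeric line (cont == 0). */
int convert_stream(FILE *in, FILE *out, FILE *err, int cont)
{
    char mph[10] = { 0 };

    assert(in != NULL && out != NULL && err != NULL);

    while (fgets(mph, sizeof(mph), in) != NULL)
    {
        if (strspn(mph, "0123456789.-\n") == strlen(mph))
        {
            fprintf(out, "%.1f\n", atof(mph) * 1.60934);
        }
        else
        {
            fprintf(err, "Found non-numeric value\n");
            if (cont == 1)
            {
                continue;  /* -c is set: skip this line, keep going */
            }
            else
            {
                return 1;  /* default: abort on the first bad line */
            }
        }
    }
    return 0;
}
```

Calling convert_stream(stdin, stdout, stderr, cont) would behave like the loop in the recipe, while passing other streams makes the function easy to try out in isolation.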
Also, note that we passed two arguments to the printHelp() function. The first argument is a FILE pointer. We use this to pass stderr or stdout to the function. Both stdout and stderr are streams, and each is accessed through a FILE pointer. This way, we can choose whether the help message should be printed to stdout (in case the user asked for help) or to stderr (in case there was an error).
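Because the function only sees a FILE pointer, the same code can write to stdout, stderr, or any other open stream, such as a temporary file. The following is a minimal sketch of that idea; print_usage() is a hypothetical, shortened stand-in for printHelp():

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, shortened variant of printHelp(): writes the usage
 * line to whatever stream the caller passes in. Returns the value
 * returned by fprintf(), that is, the number of characters written
 * or a negative value on error. */
int print_usage(FILE *stream, const char *progname)
{
    assert(stream != NULL && progname != NULL);

    return fprintf(stream, "%s [-c] [-h]\n", progname);
}
```

A caller would use print_usage(stdout, argv[0]) when the user asked for help and print_usage(stderr, argv[0]) when an unknown option was given.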
The second argument is the name of the program, as we have seen already.
We then compiled and tested the program. Without the -c option, it works just as it did previously.
After that, we tried the program with data from a file that contains some garbage. That's usually how data looks; it's often not "perfect". That's why we added the option to continue, even though non-numeric values were found.
Just like in the previous recipe, we used awk to select only the third field (print $3) from the file.
The exciting part is Step 12, where we redirected both stderr and stdout. We separated the two outputs into two different files. That way, we have a clean output file with only the km/h values. We can then use that file for further processing since it doesn't contain any error messages.
We could have written the program to do all the steps for us, such as filtering out the values from the text file, doing the conversions, and then writing the result to a new file. But that's an anti-pattern in Linux and Unix. Instead, we want to write small tools that do one thing only, and do it well. That way, the program can be used on other files with a different structure, or for a completely different purpose. We could even grab the data straight from a device or modem if we wanted to and pipe it into our program. The tools for extracting the correct fields from the file (or device) have already been created; there's no need to reinvent the wheel.
Notice that we enclosed the entire command, pipes and all, in parentheses before redirecting the output and error messages. That way, the redirections apply to the whole pipeline rather than only to its last command.
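Roughly speaking, a redirection such as 2> errors.txt means that the file is opened and its descriptor is duplicated onto file descriptor 2 before the command runs. The following is a minimal sketch of that idea using open() and dup2(). It is an illustration under that assumption, not a description of how Bash is implemented, and the helper name redirect_fd() is hypothetical:

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make target_fd refer to the file at path,
 * creating or truncating it, much like "N> path" in the shell.
 * Returns 0 on success, -1 on error. */
int redirect_fd(int target_fd, const char *path)
{
    int fd;

    assert(path != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
    {
        return -1;
    }
    if (fd == target_fd)  /* already on the right descriptor */
    {
        return 0;
    }
    if (dup2(fd, target_fd) == -1)
    {
        close(fd);
        return -1;
    }
    close(fd);            /* target_fd now refers to the file */
    return 0;
}
```

A program could call redirect_fd(STDERR_FILENO, "errors.txt") and redirect_fd(STDOUT_FILENO, "output.txt") before producing any output to get a result similar to the redirections in Step 12. Letting the shell do this, as we did in the recipe, keeps the filter itself simpler.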
There's more…
Eric S. Raymond has written some excellent rules to stick to when developing software for Linux and Unix. They can all be found in his book, The Art of Unix Programming. Two of the rules apply to us in this recipe. The first is the Rule of Modularity, which says that we should write simple parts that are connected by clean interfaces. The second is the Rule of Composition, which says that we should design programs to be connected to other programs.
His book is available for free online at http://www.catb.org/~esr/writings/taoup/html/.