Time for action – creating counters, task states, and writing log output
We'll modify our UFORecordValidationMapper to report statistics about skipped records and also highlight some other facilities for recording information about a job:
Create the following as the UFOCountingRecordValidationMapper.java file:

    import java.io.IOException;

    import org.apache.hadoop.io.*;
    import org.apache.hadoop.mapred.*;
    import org.apache.hadoop.mapred.lib.*;

    public class UFOCountingRecordValidationMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, LongWritable, Text> {

        // One counter per category of rejected line
        public enum LineCounters {
            BAD_LINES,
            TOO_MANY_TABS,
            TOO_FEW_TABS
        };

        public void map(LongWritable key, Text value,
                OutputCollector<LongWritable, Text> output,
                Reporter reporter) throws IOException {
            String line = value.toString();

            // Emit only the records that pass validation
            if (validate(line, reporter))
                output.collect(key, value);
        }

        private boolean validate(String str...
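The counting idea behind the LineCounters enum can be tried without a cluster. The following is a minimal sketch, not the mapper's actual validate method: the class name LineCounterSketch, the EXPECTED_FIELDS value of 6, and the EnumMap that stands in for Hadoop's counters are all assumptions made for illustration. In a real mapper written against the org.apache.hadoop.mapred API, each increment would instead go through a call such as reporter.incrCounter(LineCounters.BAD_LINES, 1).

```java
import java.util.EnumMap;
import java.util.Map;

public class LineCounterSketch {
    // Same counter names as the mapper's LineCounters enum.
    public enum LineCounters { BAD_LINES, TOO_MANY_TABS, TOO_FEW_TABS }

    // Assumption for this sketch: a valid record has six tab-separated fields.
    static final int EXPECTED_FIELDS = 6;

    // Hypothetical stand-in for the counters Hadoop would aggregate across tasks.
    private final Map<LineCounters, Long> counters = new EnumMap<>(LineCounters.class);

    private void increment(LineCounters counter) {
        counters.merge(counter, 1L, Long::sum);
    }

    public long get(LineCounters counter) {
        return counters.getOrDefault(counter, 0L);
    }

    // Returns true for a well-formed line; otherwise records why it was rejected.
    public boolean validate(String line) {
        // A limit of -1 keeps trailing empty fields, so every tab is counted.
        int fields = line.split("\t", -1).length;
        if (fields == EXPECTED_FIELDS) {
            return true;
        }
        increment(fields > EXPECTED_FIELDS
                ? LineCounters.TOO_MANY_TABS
                : LineCounters.TOO_FEW_TABS);
        increment(LineCounters.BAD_LINES);
        return false;
    }

    public static void main(String[] args) {
        LineCounterSketch sketch = new LineCounterSketch();
        sketch.validate("a\tb\tc\td\te\tf");     // six fields: accepted
        sketch.validate("a\tb");                 // too few tabs
        sketch.validate("a\tb\tc\td\te\tf\tg");  // too many tabs
        for (LineCounters counter : LineCounters.values()) {
            System.out.println(counter + " = " + sketch.get(counter));
        }
    }
}
```

Using an enum for the counter names keeps them type-checked at compile time, and every rejected line increments both a specific reason counter and the overall BAD_LINES total, so the totals can be cross-checked after the job completes.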