java - Using MultipleOutputs without context.write results in empty files
I don't know how to use the MultipleOutputs class. I'm using it to create multiple output files. The following is my driver class's code snippet:
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf);
    job.setJarByClass(CustomKeyValueTest.class); // class with mapper and reducer
    job.setOutputKeyClass(CustomKey.class);
    job.setOutputValueClass(Text.class);
    job.setMapOutputKeyClass(CustomKey.class);
    job.setMapOutputValueClass(CustomValue.class);
    job.setMapperClass(CustomKeyValueTestMapper.class);
    job.setReducerClass(CustomKeyValueTestReducer.class);
    job.setInputFormatClass(TextInputFormat.class);

    Path in = new Path(args[1]);
    Path out = new Path(args[2]);
    out.getFileSystem(conf).delete(out, true);

    FileInputFormat.setInputPaths(job, in);
    FileOutputFormat.setOutputPath(job, out);

    MultipleOutputs.addNamedOutput(job, "islnd", TextOutputFormat.class, CustomKey.class, Text.class);
    LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
    MultipleOutputs.setCountersEnabled(job, true);

    boolean status = job.waitForCompletion(true);
And in the reducer, I used MultipleOutputs like this:
    private MultipleOutputs<CustomKey, Text> multipleOutputs;

    @Override
    public void setup(Context context) throws IOException, InterruptedException {
        multipleOutputs = new MultipleOutputs<>(context);
    }

    @Override
    public void reduce(CustomKey key, Iterable<CustomValue> values, Context context) throws IOException, InterruptedException {
        ...
        multipleOutputs.write("islnd", key, pop, key.toString());
        //context.write(key, pop);
    }

    public void cleanup() throws IOException, InterruptedException {
        multipleOutputs.close();
    }
}
When I use context.write, the output files have data in them. When I remove context.write, the output files are empty. But I don't want to call context.write, because it creates the file part-r-00000. As stated here (last paragraph in the description of the class), I have used LazyOutputFormat to avoid the part-r-00000 file. But it still didn't work.
    LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
This means that if you are not creating any output, empty files are not created.
Can you please look at the Hadoop counters and find the following, to verify whether your mappers are sending any data to the reducers:
1. map.output.records
2. reduce.input.groups
3. reduce.input.records
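If it is easier to check these from the driver than from the job UI, here is a minimal sketch. It assumes the three counters listed above correspond to Hadoop's built-in `TaskCounter` values `MAP_OUTPUT_RECORDS`, `REDUCE_INPUT_GROUPS` and `REDUCE_INPUT_RECORDS`; the class name `CounterCheck` and the method name `printRecordCounters` are made up for illustration, and the code needs the Hadoop MapReduce client libraries on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.TaskCounter;

// Hypothetical helper, not part of the original driver.
public final class CounterCheck {

    private CounterCheck() { }

    /** Prints the record counters of a job that has already finished. */
    public static void printRecordCounters(Job job) throws IOException {
        Counters counters = job.getCounters();
        if (counters == null) {
            // Counters may be unavailable, for example if the job has been retired.
            System.out.println("No counters available for this job");
            return;
        }

        long mapOutputRecords =
                counters.findCounter(TaskCounter.MAP_OUTPUT_RECORDS).getValue();
        long reduceInputGroups =
                counters.findCounter(TaskCounter.REDUCE_INPUT_GROUPS).getValue();
        long reduceInputRecords =
                counters.findCounter(TaskCounter.REDUCE_INPUT_RECORDS).getValue();

        System.out.println("Map output records:   " + mapOutputRecords);
        System.out.println("Reduce input groups:  " + reduceInputGroups);
        System.out.println("Reduce input records: " + reduceInputRecords);
    }
}
```

In the driver from the question this would be called as `CounterCheck.printRecordCounters(job);` right after `job.waitForCompletion(true)` returns. If the reduce-side counters are zero, the reducer never receives anything to write.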
Code for MultipleOutputs: http://bytepadding.com/big-data/map-reduce/multipleoutputs-in-map-reduce/
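One more thing that may be worth checking in the reducer from the question: `cleanup()` is declared there without a `Context` parameter and without `@Override`. Hadoop's `Reducer.run` calls `cleanup(Context)`, so a no-argument `cleanup()` is a separate overload that the framework never invokes, and `multipleOutputs.close()` would then never run. The sketch below reproduces that effect in plain Java only. `MiniReducer`, `Context`, `BrokenReducer` and `FixedReducer` are hypothetical stand-ins, not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.List;

public class CleanupOverloadDemo {

    /** Hypothetical stand-in for Hadoop's Reducer.Context. */
    static class Context { }

    /** Hypothetical stand-in for the framework base class that drives the lifecycle. */
    static class MiniReducer {
        final List<String> log = new ArrayList<>();

        protected void setup(Context context) { }

        protected void cleanup(Context context) { } // default hook does nothing

        // The framework only ever calls the one-argument hooks.
        final void run(Context context) {
            setup(context);
            // reduce(...) calls would happen here
            cleanup(context);
        }
    }

    /** Declares cleanup() with no parameter: an overload, not an override. */
    static class BrokenReducer extends MiniReducer {
        public void cleanup() {
            log.add("closed");
        }
    }

    /** Overrides the hook that run() actually calls. */
    static class FixedReducer extends MiniReducer {
        @Override
        protected void cleanup(Context context) {
            log.add("closed");
        }
    }

    public static List<String> runBroken() {
        BrokenReducer reducer = new BrokenReducer();
        reducer.run(new Context());
        return reducer.log;
    }

    public static List<String> runFixed() {
        FixedReducer reducer = new FixedReducer();
        reducer.run(new Context());
        return reducer.log;
    }

    public static void main(String[] args) {
        System.out.println("broken: " + runBroken());
        System.out.println("fixed: " + runFixed());
    }
}
```

Running `main` prints `broken: []` and `fixed: [closed]`. In the real reducer the equivalent change would be to declare the method as `@Override protected void cleanup(Context context) throws IOException, InterruptedException` and keep `multipleOutputs.close()` inside it. With `@Override` present, the compiler rejects a signature that does not match the framework hook.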