java - Using MultipleOutputs without context.write results in empty files


I don't know how to use the MultipleOutputs class. I'm using it to create multiple output files. The following is my driver class's code snippet:

    Configuration conf = new Configuration();

    Job job = Job.getInstance(conf);
    job.setJarByClass(CustomKeyValueTest.class); // class containing the mapper and reducer
    job.setOutputKeyClass(CustomKey.class);
    job.setOutputValueClass(Text.class);
    job.setMapOutputKeyClass(CustomKey.class);
    job.setMapOutputValueClass(CustomValue.class);
    job.setMapperClass(CustomKeyValueTestMapper.class);
    job.setReducerClass(CustomKeyValueTestReducer.class);
    job.setInputFormatClass(TextInputFormat.class);

    Path in = new Path(args[1]);
    Path out = new Path(args[2]);
    out.getFileSystem(conf).delete(out, true);

    FileInputFormat.setInputPaths(job, in);
    FileOutputFormat.setOutputPath(job, out);

    MultipleOutputs.addNamedOutput(job, "islnd", TextOutputFormat.class, CustomKey.class, Text.class);
    LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
    MultipleOutputs.setCountersEnabled(job, true);

    boolean status = job.waitForCompletion(true);

And in the reducer, I used MultipleOutputs like this:

    private MultipleOutputs<CustomKey, Text> multipleOutputs;

    @Override
    public void setup(Context context) throws IOException, InterruptedException {
        multipleOutputs = new MultipleOutputs<>(context);
    }

    @Override
    public void reduce(CustomKey key, Iterable<CustomValue> values, Context context) throws IOException, InterruptedException {
        ...

        multipleOutputs.write("islnd", key, pop, key.toString());
        //context.write(key, pop);
    }

    public void cleanup() throws IOException, InterruptedException {
        multipleOutputs.close();
    }

    }

When I use context.write, the output files have data in them. But when I remove context.write, the output files are empty. I don't want to call context.write because it creates the file part-r-00000. As stated here (last paragraph in the description of the class), I used LazyOutputFormat to avoid the part-r-00000 file, but it still didn't work.

    LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);

This means that if you are not writing any output, no empty files are created.

Can you please look at the Hadoop counters and find:

1. map.output.records
2. reduce.input.groups
3. reduce.input.records

to verify whether the mappers are sending data to the reducer.

Code for multi-output: http://bytepadding.com/big-data/map-reduce/multipleoutputs-in-map-reduce/
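
One more thing that may be worth checking: the reducer in the question declares `cleanup()` with no `Context` parameter and no `@Override`. Hadoop's `Reducer` calls `cleanup(Context)`, so a no-argument `cleanup()` is only an overload that the framework never invokes, which would mean `multipleOutputs.close()` never runs. Whether that fully explains the empty files here is an assumption. The sketch below shows the effect in plain Java, where `BaseReducer` and `Context` are hypothetical stand-ins and not the Hadoop classes:

```java
// Plain-Java illustration of override vs. overload; no Hadoop classes are used.
public class CleanupOverrideDemo {

    /** Hypothetical stand-in for Reducer.Context. */
    static class Context {}

    /** Hypothetical stand-in for Reducer: run() calls setup, then cleanup(Context). */
    static class BaseReducer {
        protected void setup(Context context) {}

        protected void cleanup(Context context) {} // no-op by default

        public final void run(Context context) {
            setup(context);
            // reduce(...) would be called once per key here
            cleanup(context);
        }
    }

    /** Mirrors the question: cleanup() has no Context parameter, so it is an overload. */
    static class OverloadingReducer extends BaseReducer {
        boolean closed = false;

        public void cleanup() {
            closed = true; // never reached through run()
        }
    }

    /** Matching signature plus @Override, so run() reaches this method. */
    static class OverridingReducer extends BaseReducer {
        boolean closed = false;

        @Override
        protected void cleanup(Context context) {
            closed = true; // this is where multipleOutputs.close() would go
        }
    }

    public static void main(String[] args) {
        OverloadingReducer a = new OverloadingReducer();
        a.run(new Context());
        OverridingReducer b = new OverridingReducer();
        b.run(new Context());
        System.out.println("overload closed: " + a.closed); // prints "overload closed: false"
        System.out.println("override closed: " + b.closed); // prints "override closed: true"
    }
}
```

In the real reducer, the equivalent change would be to declare `cleanup(Context context)` with `@Override` so the compiler rejects a signature that does not match.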

