Apache Pig - Pig filter and getting the original dataset
I have a Pig input file that looks like this:
1, cornflakes, regular, post, 10
2, cornflakes, regular, general mills, 12
3, cornflakes, mixed nuts, post, 14
4, chocolate syrup, regular, hersheys, 5
5, chocolate syrup, no high fructose, hersheys, 8
6, chocolate syrup, regular, ghirardeli, 6
7, chocolate syrup, strawberry flavor, ghirardeli, 7
I need to filter out the cornflakes priced at less than 12, and then I need to use the original set of data for the next step of filtering.
total = LOAD 'location_of_file' USING PigStorage('\t') AS (item_sl:int, item:chararray, type:chararray, manufacturer:chararray, price:int);
filter1 = FILTER total BY item == 'cornflakes' AND price < 12;
Now I need to use the original dataset, after filter1, for the next step of filtering.
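Use SPLIT. It sends each row of total to filter1 when the condition holds and to filter2 otherwise, and total itself stays available for later steps: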
total = LOAD '/output/systemhawk/file_inventory/test34.txt' USING PigStorage(',') AS (item_sl:int, item:chararray, type:chararray, manufacturer:chararray, price:int);
SPLIT total INTO filter1 IF (item == 'cornflakes' AND price < 12), filter2 OTHERWISE;
DUMP filter2;
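Since a Pig relation is not modified by FILTER, the original dataset can also be reused simply by referring to total again. Below is a minimal sketch of that alternative, not a tested script. The second condition and the alias filter_syrup are hypothetical, made up only to stand in for the "next step of filtering". It also assumes the fields carry no whitespace around the commas; if they do, as in the sample above, comparing TRIM(item) instead of item may be necessary.

```pig
-- Load the comma-separated file once; 'location_of_file' is the placeholder path from the question.
total = LOAD 'location_of_file' USING PigStorage(',')
        AS (item_sl:int, item:chararray, type:chararray, manufacturer:chararray, price:int);

-- First filter: produces a new relation and leaves total unchanged.
filter1 = FILTER total BY item == 'cornflakes' AND price < 12;

-- Hypothetical next step: filter the original relation again.
filter_syrup = FILTER total BY item == 'chocolate syrup' AND price > 6;

DUMP filter1;
DUMP filter_syrup;
```

SPLIT is the better fit when the next step should see only the rows that did not match the first condition; referring to total again fits when the next step needs every row.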