Apache Pig - Pig filter and getting original dataset


I have a Pig input file that looks like this:

1, cornflakes, regular, post, 10
2, cornflakes, regular, general mills, 12
3, cornflakes, mixed nuts, post, 14
4, chocolate syrup, regular, hersheys, 5
5, chocolate syrup, no high fructose, hersheys, 8
6, chocolate syrup, regular, ghirardeli, 6
7, chocolate syrup, strawberry flavor, ghirardeli, 7

I need to filter out cornflakes priced less than 12, and I need to use the original set of data for the next step of filtering.

total = LOAD 'location_of_file' USING PigStorage('\t') AS (item_sl: int, item: chararray, type: chararray, manufacturer: chararray, price: int);
filter1 = FILTER total BY item == 'cornflakes' AND price < 12;

Now I need to use the original dataset, after filter1, for the next step of filtering.

Use SPLIT:

total = LOAD '/output/systemhawk/file_inventory/test34.txt' USING PigStorage(',') AS (item_sl: int, item: chararray, type: chararray, manufacturer: chararray, price: int);
SPLIT total INTO filter1 IF (item == 'cornflakes' AND price < 12), filter2 OTHERWISE;
DUMP filter2;
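A FILTER statement defines a new relation and leaves its input alias unchanged, so the original relation can also be filtered again directly, without SPLIT. The following is a minimal sketch, assuming `total` has been loaded with the schema shown in this post; the alias `filter_next` and its condition are hypothetical and chosen only for illustration.

```pig
-- FILTER does not modify 'total'; it only defines the new relation 'filter1'.
filter1 = FILTER total BY item == 'cornflakes' AND price < 12;

-- 'total' still refers to the full dataset, so it can be reused here.
-- (Hypothetical second condition, for illustration only.)
filter_next = FILTER total BY item == 'chocolate syrup' AND price > 6;

DUMP filter_next;
```

SPLIT is the better fit when the second step should see only the rows that the first condition rejected. A second FILTER on `total` fits when the second step should start again from every row.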

output

