unix - Merging (joining) 2 huge flat files in Solaris, using an index column (first field)

September 15, 2010

I have 2 huge flat files in Unix (Solaris), each 500-600 GB. I need to join and merge the 2 files into a single flat file, using the first column as the key index column. How can I do this in an optimized way?

Basically, it should be an inner join between the 2 flat files. The reason I am trying to use flat files is that I have 2 huge tables that have been split into 2 separate tables and extracted to 2 flat files, and I am trying to join them at the Unix level instead of at the database level.

I used the commands below:

sort -n file1 > file_temp1; sort -n file2 > file_temp2; join -j 1 -t';' file_temp1 file_temp2 > final

The sort on the 1st field (the index column) works fine. But when the join happens, hardly 2% of the data ends up in the final file. I am trying to understand what mistake I am making in the join command. Both files contain .2 million records, and the records match between the 2 files. I want to check whether a join made at the Unix level performs better than one at the database level.

Sorry for the incomplete question! The first field is a numeric index field. Does join have a "-n" switch to indicate that the first field is a numeric index?

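You should not use sort -n, since join has no corresponding -n flag. join expects both inputs to be sorted on the join field as text, in the collating order of the current locale, so a numerically sorted file (1, 2, 10) looks out of order to it and most matches are silently dropped. Keep any leading/trailing whitespace as is and sort the field lexically: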

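#!/bin/sh

# Sort each file as text on the first ';'-separated field only.
# -k 1,1 ends the key at field 1; a bare -k 1 would extend the key to the end of the line.
sort -t';' -k 1,1 file1 > file1.srt
sort -t';' -k 1,1 file2 > file2.srt

# Inner join on field 1 of both sorted files.
join -t';' -1 1 -2 1 file1.srt file2.srt > both

#cat both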
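To see the difference on something small before running it against the real files, here is a minimal sketch. The sample rows and the temporary directory are made up for illustration, and it assumes mktemp is available on your system. LC_ALL=C is my addition, to force plain byte ordering so that sort and join agree on the order. It first checks a numerically sorted copy with sort -c (which only verifies order and exits non-zero when the input is not sorted), then runs the lexical sort and the join:

```shell
#!/bin/sh
# Hypothetical sample data: the keys 1, 2 and 10 appear in both files.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '10;a\n2;b\n1;c\n' > "$tmp/file1"
printf '1;x\n10;y\n2;z\n' > "$tmp/file2"

# Numeric order is 1, 2, 10, but the text order join expects is 1, 10, 2.
sort -n "$tmp/file1" > "$tmp/file1.num"
if LC_ALL=C sort -c -t';' -k 1,1 "$tmp/file1.num" 2>/dev/null; then
    numeric_ok=yes
else
    numeric_ok=no
fi
echo "numerically sorted file is in join order: $numeric_ok"

# Text sort on field 1 only, then an inner join on that field.
LC_ALL=C sort -t';' -k 1,1 "$tmp/file1" > "$tmp/file1.srt"
LC_ALL=C sort -t';' -k 1,1 "$tmp/file2" > "$tmp/file2.srt"
LC_ALL=C join -t';' -1 1 -2 1 "$tmp/file1.srt" "$tmp/file2.srt" > "$tmp/both"
cat "$tmp/both"
# prints:
# 1;c;x
# 10;a;y
# 2;b;z
```

All three keys survive the join once both inputs are sorted as text. Running the same sort -c check on your own file_temp1 and file_temp2 should tell you whether the sort order is what is losing your rows.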
