unix - Merging (joining) 2 huge flat files in Solaris, using an index column (first field)
I have 2 huge flat files in Unix (Solaris), each about 500-600 GB. I need to join and merge the 2 files into a single flat file, using the first column as the key index column. How can I do this in an optimized way?
Basically, it should be an inner join between the 2 flat files. The reason I am trying to use flat files is that we have huge table data that has been split into 2 separate tables and extracted into 2 flat files, and I am trying to join them at the Unix level instead of at the database level.
I used the commands below:

    sort -n file1 > file_temp1
    sort -n file2 > file_temp2
    join -j 1 -t';' file_temp1 file_temp2 > final

The sort on the 1st field (the index column) works fine. But when the join happens, hardly 2% of the data ends up in the final file. I am trying to understand what mistake I am making in the join command. Both files contain .2 million records, and most of the records match between the 2 files. I want to do a performance check to see whether a join made at the Unix level performs better than one at the database level.

Sorry for the incomplete question! The first field is a numeric index field. Does join have a "-n" switch to indicate that the first field is a numeric index?
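You should not use sort -n, since join has no corresponding -n flag. join expects both inputs to be sorted on the join field as text (in the collating sequence of the current locale), not numerically, so numerically sorted input makes it skip most of the matching lines. Sort both files as plain text on the first field, keeping any leading/trailing whitespace as is: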
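    #!/bin/sh
    # Sort as text on the first ';'-separated field only. Use -k 1,1 rather
    # than -k 1, so the key stops at field 1 instead of running to end of line.
    sort -t';' -k 1,1 file1 > file1.srt
    sort -t';' -k 1,1 file2 > file2.srt
    # Inner join on field 1 of both files
    join -t';' -1 1 -2 1 file1.srt file2.srt > both
    #cat both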
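To see the difference on something smaller than 500 GB, here is a minimal sketch on made-up data. The file names (demo1, demo2, demo_both) and their contents are hypothetical, and setting LC_ALL=C is my own assumption so that sort and join agree on a plain byte-order collation. It uses sort -c to check whether numerically sorted input is in the order join expects, then runs the text sort and the join.

```shell
#!/bin/sh
# Assumption: force the C locale so sort and join use the same byte ordering.
LC_ALL=C; export LC_ALL

# Hypothetical sample data: numeric key in field 1, ';' as separator.
printf '%s\n' '9;a' '10;b' '100;c' '2;d' > demo1
printf '%s\n' '2;w' '100;x' '9;y' '10;z' > demo2

# Numeric order is 2, 9, 10, 100. sort -c only checks ordering (exit status
# 0 if sorted), so this tests whether that order is valid text order on field 1.
if sort -n demo1 | sort -c -t';' -k 1,1 2>/dev/null; then
    echo "numeric order matches text order"
else
    echo "numeric order is NOT the order join expects"
fi

# Text sort on field 1 only, then inner join on field 1.
sort -t';' -k 1,1 demo1 > demo1.srt
sort -t';' -k 1,1 demo2 > demo2.srt
join -t';' -1 1 -2 1 demo1.srt demo2.srt > demo_both

cat demo_both
```

With this sample, the text order of the keys is 10, 100, 2, 9, which looks odd for numbers but is what join needs. On the real files, running sort -c with the same -t and -k options on each sorted file is a cheap single pass to confirm the ordering before starting the join.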