unix - Merging (joining) 2 huge flat files in Solaris, using an index column (first field)


I have 2 huge flat files in Unix (Solaris), each 500-600 GB. I need to join and merge the 2 files into a single flat file, using the first column as the key (index) column. How can I do this in an optimized way?

Basically it should be an inner join between the 2 flat files. The reason I am trying to use flat files is that I have 2 huge tables that have been split into 2 separate tables and extracted to 2 flat files, and I am trying to join them at the Unix level instead of at the database level.

I used the commands below:

sort -n file1 > file_temp1
sort -n file2 > file_temp2
join -j 1 -t';' file_temp1 file_temp2 > final

Sorting on the 1st field (the index column) works fine. But when the join happens, hardly 2% of the data ends up in the final file. I am trying to understand what mistake I am making in the join command. Both files contain .2 million records, and the records match between the 2 files. I want to do a performance check to see whether the join made at the Unix level performs better than the one at the database level. Sorry for the incomplete question! The first field is a numeric index field. Is there a "-n" switch to tell join that the first field is a numeric index?

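You should not use sort -n, since join has no corresponding -n flag. join expects both inputs to be sorted lexically on the join field, so with numerically sorted input it misses matching lines. Also keep an eye on leading/trailing whitespace in the key field. Something like this: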

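#!/bin/sh
# Sort both files lexically on the first ';'-separated field only (-k 1,1);
# a bare -k 1 would extend the sort key to the end of the line.
sort -t';' -k 1,1 file1 > file1.srt
sort -t';' -k 1,1 file2 > file2.srt

# Inner join on field 1 of each file.
join -t';' -1 1 -2 1 file1.srt file2.srt > both

#cat both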
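To try the approach above on a small scale before running it on the real files, here is a minimal sketch. The sample data (keys 1 to 12), the temporary directory from mktemp -d, and the LC_ALL=C setting are all assumptions added for illustration, not part of the original answer. LC_ALL=C is set so that sort and join use the same byte-wise ordering, and sort -c checks that each file is in the order join expects.

```shell
#!/bin/sh
# Hypothetical sample data: keys 1..12 in both files, ';'-separated.
tmp=$(mktemp -d)
i=1
while [ "$i" -le 12 ]; do
    echo "$i;a$i" >> "$tmp/file1"
    echo "$i;b$i" >> "$tmp/file2"
    i=$((i + 1))
done

# Assumption: force byte-wise collation so sort and join agree on ordering.
LC_ALL=C
export LC_ALL

# Lexical sort on the first field only.
sort -t';' -k 1,1 "$tmp/file1" > "$tmp/file1.srt"
sort -t';' -k 1,1 "$tmp/file2" > "$tmp/file2.srt"

# sort -c exits non-zero if a file is not in the expected order.
sort -c -t';' -k 1,1 "$tmp/file1.srt" || echo "file1.srt is not sorted"
sort -c -t';' -k 1,1 "$tmp/file2.srt" || echo "file2.srt is not sorted"

# Inner join on field 1 of each file.
join -t';' -1 1 -2 1 "$tmp/file1.srt" "$tmp/file2.srt" > "$tmp/both"

# Count the joined rows; every key is present in both files here.
joined=$(wc -l < "$tmp/both" | tr -d ' ')
echo "joined rows: $joined"
```

Note that the sorted files come out in the order 1, 10, 11, 12, 2, ... rather than numeric order; that is the ordering join works with. Comparing the joined row count against the input row counts is a quick way to spot the "only 2% joined" symptom from the question.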
