python - Conditionally join a pandas DF on multiple indexes
I have 2 pandas dataframes of different lengths, and I need to conditionally overwrite a value in one from the other when 2 columns match.

    df1.val = df2.val where df1.val == null and df1.key1 == df2.key1 and df1.key2 == df2.key2

df1 and df2 have different sets of columns and row lengths, save for key1, key2, and val. The only thing that's guaranteed is that for every (key1, key2) in df1, there is 1 (key1, key2) in df2.

So far I've been going down the path of

    df1.loc[df1.val.isnull(), "val"] = df2.val

and trying to set the indexes to match, but I haven't got anywhere on that.

I know there are join-ons, but I haven't made progress on that front either. Syntactical help would be appreciated.
Edit: data

df1:

    first  last   val  random1  ...
    john   smith  4    x
    todd   smith  5    nan
    john   todd   nan  z

df2:

    first  last   val  random2  ...
    john   smith  7    4
    todd   smith  6    9
    john   todd   3    3
    eric   smith  5    2

Result:

    first  last   val  random1  ...otherdf1cols...
    john   smith  4    x
    todd   smith  5    nan
    john   todd   3    z
Set the indices first, then fillna:

    df1.set_index(['first', 'last']).fillna(df2.set_index(['first', 'last']))

                 val
    first last
    john  smith  4.0
    todd  smith  5.0
    john  todd   3.0

Use combine_first to include rows from both dataframes:

    df1.set_index(['first', 'last']).combine_first(df2.set_index(['first', 'last']))

                 val
    first last
    eric  smith  5.0
    john  smith  4.0
          todd   3.0
    todd  smith  5.0

Or, to update only the val column and limit the result to the rows of the first dataframe:

    d1 = df1.set_index(['first', 'last'])
    d2 = df2.set_index(['first', 'last'])
    print(d1.combine_first(d2[['val']]).loc[d1.index].reset_index())

      first   last  val
    0  john  smith  4.0
    1  todd  smith  5.0
    2  john   todd  3.0

An option using update, where overwrite=False leaves the existing non-null values in d1 untouched:

    d1 = df1.set_index(['first', 'last'])
    d2 = df2.set_index(['first', 'last'])
    d1.update(d2.val, overwrite=False)
    d1.reset_index()

      first   last  val
    0  john  smith  4.0
    1  todd  smith  5.0
    2  john   todd  3.0
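For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample frames from the question (including the extra random1 and random2 columns) and fills only the missing val entries. The frame contents are typed in from the tables above. The names keys, filled, merged and the val_df2 suffix are made up for illustration. The second half is a merge-based variant, since the question asked about joins. It is not part of the answer above, just one way it might be done, and it relies on the stated guarantee that each (first, last) pair appears only once in df2.

```python
import numpy as np
import pandas as pd

# Sample frames reconstructed from the tables in the question.
df1 = pd.DataFrame({
    "first": ["john", "todd", "john"],
    "last": ["smith", "smith", "todd"],
    "val": [4, 5, np.nan],
    "random1": ["x", np.nan, "z"],
})
df2 = pd.DataFrame({
    "first": ["john", "todd", "john", "eric"],
    "last": ["smith", "smith", "todd", "smith"],
    "val": [7, 6, 3, 5],
    "random2": [4, 9, 3, 2],
})

keys = ["first", "last"]  # hypothetical helper name for the join columns

# Index-based variant: align on (first, last) and fill only the val column,
# so nulls in other df1 columns such as random1 are left alone.
d1 = df1.set_index(keys)
d2 = df2.set_index(keys)
d1["val"] = d1["val"].fillna(d2["val"])
filled = d1.reset_index()

# Merge-based variant: a left join keeps df1's rows and order, and brings
# df2's val in under a temporary, suffixed column name.
merged = df1.merge(df2[keys + ["val"]], on=keys, how="left",
                   suffixes=("", "_df2"))
merged["val"] = merged["val"].fillna(merged["val_df2"])
merged = merged.drop(columns="val_df2")

print(filled)
print(merged)
```

Both variants keep the existing 4 and 5 in val, fill the missing john/todd entry with 3 from df2, and ignore the eric/smith row that only exists in df2.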