python - Removing duplicates in pandas data frame if one column differs, but is in a given list


I have a dataframe with duplicate entries coming from two sources. The values should be unique, but one column is not formatted the same way in both sources, so I want to remove duplicates that differ only in that one column, provided the differing names are within a given list.

Technically, I want to remove a row in a pandas dataframe if there exists another row with the same a and b values, and one row's z value is 'bar' while the other's z value is 'foo'.

An example might be clearer:

I have the given dataframe df

a     b     z
'a'   'a'   'foo'
'a'   'a'   'bar'
'b'   'a'   'bar'
'c'   'c'   'foo'
'd'   'd'   'blb'

and I would like to get

a     b     z
'a'   'a'   'foo'
'b'   'a'   'bar'
'c'   'c'   'foo'
'd'   'd'   'blb'

Note that:

  • Rows with values other than 'foo' and 'bar' in the z column should not be touched.
  • It's not important whether 'foo' or 'bar' stays, because both will be changed to the same value afterwards.
  • It would be great to generalize the duo 'foo', 'bar' to a list.

My attempts so far: here is my best guess, which doesn't work though… I don't understand what groupby returns. I'm sure there is a magical pandas one-liner that I can't find.

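# Non-working attempt, kept as posted: iterating over a groupby yields
# (key, sub-DataFrame) tuples, not individual rows.
new_df = []
for row in df.groupby('a'):
    if row.loc['z'].isin('foo'):
        if not row['z'].isin('bar'):
            new_df.append(row)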

Thanks!

I think you can get the expected result by concatenating two subsets of the original dataframe:

  • one where the z values are neither foo nor bar
  • and the other one where z is foo or bar, with duplicates according to a and b dropped

Here's an example that gives me the expected output:

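import pandas as pd
from io import StringIO

data = """a   b   z
a   a   foo
a   a   bar
b   a   bar
c   c   foo
d   d   blb"""
df = pd.read_csv(StringIO(data), sep=r'\s+')

ls = ['foo', 'bar']  # values of z that count as the same thing
df1 = pd.concat((df.loc[~df.z.isin(ls)],  # no foos or bars here
                 # foos and bars only, one row kept per (a, b) pair
                 df.loc[df.z.isin(ls)].drop_duplicates(subset=['a', 'b'])
                 )).sort_index()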
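The two-subset approach above can be wrapped in a small helper so that the key columns and the list of interchangeable values become parameters. This is only a sketch under assumptions: the function name `drop_equivalent_duplicates` and its parameter names are made up for illustration, and the frame is built directly from the sample data in the question.

```python
import pandas as pd


def drop_equivalent_duplicates(df, keys, column, equivalents):
    """Hypothetical helper: deduplicate on `keys`, but only among rows
    whose `column` value is one of `equivalents`."""
    in_list = df[column].isin(equivalents)
    # Rows with any other value in `column` are passed through untouched.
    untouched = df.loc[~in_list]
    # Among the listed values, keep the first row for each key combination.
    deduped = df.loc[in_list].drop_duplicates(subset=keys)
    return pd.concat([untouched, deduped]).sort_index()


# Sample data from the question.
df = pd.DataFrame({
    'a': ['a', 'a', 'b', 'c', 'd'],
    'b': ['a', 'a', 'a', 'c', 'd'],
    'z': ['foo', 'bar', 'bar', 'foo', 'blb'],
})

result = drop_equivalent_duplicates(df, keys=['a', 'b'], column='z',
                                    equivalents=['foo', 'bar'])
print(result)
```

Because `drop_duplicates` keeps the first occurrence by default, which of 'foo' or 'bar' survives depends on row order, which the question says does not matter.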

A simpler option might be to replace foo with bar everywhere in z, and then drop duplicates:

df1 = df.replace({'z':{'foo':'bar'}}).drop_duplicates() 

You can also replace both foo and bar with whatever other value you're going to use:

df1 = df.replace({'z':{'foo':'xyz', 'bar':'xyz'}}).drop_duplicates() 
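To generalize the replace approach from the pair foo, bar to a list, the nested mapping can be built from the list instead of being typed out. The following is a minimal sketch, assuming the sample data from the question; `ls` is the list of interchangeable values and `target` is a placeholder name for whatever final value you plan to use.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'a': ['a', 'a', 'b', 'c', 'd'],
    'b': ['a', 'a', 'a', 'c', 'd'],
    'z': ['foo', 'bar', 'bar', 'foo', 'blb'],
})

ls = ['foo', 'bar']  # values of z to treat as equivalent
target = 'xyz'       # placeholder for the final value

# Build {'z': {'foo': 'xyz', 'bar': 'xyz'}} from the list.
mapping = {'z': {name: target for name in ls}}
df1 = df.replace(mapping).drop_duplicates()
print(df1)
```

One difference from the two-subset approach: `drop_duplicates()` without `subset` compares all columns, so exact duplicate rows whose z value is outside the list would be dropped as well.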
