python - Find pairs in pandas DataFrames and perform operations on them


I have a DataFrame like this, and I want to find the pairs first and then perform operations on them.

    fname     lname     time of entry   ... other columns
    adrian    peter     1
    john      adrian    3
    peter     rusk      4
    rusk      anton     10
    gile      john      12
    angela    gomes     13
    gomes     angela    14
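For anyone who wants to reproduce this, the sample rows can be put into a DataFrame roughly as follows. The column name `toe` for the time of entry is an assumption borrowed from the answer's code, the second first name is spelled john as in the expected output, and the other columns are left out.

```python
import pandas as pd

# Sample rows from the question; 'toe' (time of entry) is an assumed column name.
df = pd.DataFrame({
    'fname': ['adrian', 'john', 'peter', 'rusk', 'gile', 'angela', 'gomes'],
    'lname': ['peter', 'adrian', 'rusk', 'anton', 'john', 'gomes', 'angela'],
    'toe':   [1, 3, 4, 10, 12, 13, 14],
})

print(df.to_string(index=False))
```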

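Now I want the culprit, which is the value that appears in both fname and lname. If, for example, both values of a row appear in fname and lname, as with angela and gomes in the case below, then the culprit has to be angela on one line and gomes on the other.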

    pair  fname   lname   culprit  time diff  ... other columns
    1     adrian  peter   adrian   -2
    1     john    adrian  adrian    2
    2     peter   rusk    rusk     -6
    2     rusk    anton   rusk      6
    3     angela  gomes   angela   -1
    3     gomes   angela  gomes     1

From the above we know that in pair number 3 both angela and gomes are culprits. The time should be sorted in ascending order.
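One way to read "culprit" (an interpretation of the expected table, not something the question spells out in code) is as the names that two consecutive rows have in common. The helper `culprits` below is hypothetical and only illustrates that definition on the three pairs.

```python
def culprits(row_a, row_b):
    """Return the names shared by two (fname, lname) rows, sorted alphabetically."""
    return sorted(set(row_a) & set(row_b))

print(culprits(('adrian', 'peter'), ('john', 'adrian')))   # ['adrian']
print(culprits(('peter', 'rusk'), ('rusk', 'anton')))      # ['rusk']
print(culprits(('angela', 'gomes'), ('gomes', 'angela')))  # ['angela', 'gomes']
```

Under this reading, pair 3 is the only one with two culprits, which is why each of its rows gets a different one.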

I'm not in love with this, and there's probably a better way, but it works and doesn't use Python iteration or lists.

code:

    # Assumes df has the columns 'fname', 'lname' and 'toe' (time of entry).

    # Find and number the pairs, and filter out rows that don't belong to one
    df = df.loc[df['fname'].isin(df['lname'].shift())
                | df['fname'].isin(df['lname'].shift(-1))].reset_index(drop=True)
    df['pair'] = df.index // 2 + 1

    # Find the culprit: an fname that equals the lname of the row above or below
    df['culprit'] = df.loc[(df['fname'] == df['lname'].shift(-1))
                           | (df['fname'] == df['lname'].shift(1)), 'fname']
    df.sort_values(by=['pair', 'culprit'], inplace=True)
    df = df.ffill()  # rows without a match inherit the culprit of their pair

    # Calculate the time difference within each pair
    df['time_diff'] = df.loc[df['pair'] == df['pair'].shift(1), 'toe'] - df['toe'].shift(1)
    df['time_diff'] = df['time_diff'].fillna(df['time_diff'].shift(-1) * -1).astype(int)

    # Sort
    df.sort_values(by=['pair', 'time_diff'], inplace=True)

    print(df[['pair', 'fname', 'lname', 'culprit', 'time_diff']].to_string(index=False))

output:

    pair   fname   lname culprit  time_diff
       1  adrian   peter  adrian         -2
       1    john  adrian  adrian          2
       2   peter    rusk    rusk         -6
       2    rusk   anton    rusk          6
       3  angela   gomes  angela         -1
       3   gomes  angela   gomes          1
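One caveat about the filter step: `isin` tests membership against the whole shifted column rather than comparing row by row, so a row can be kept just because its fname shows up as an lname somewhere else in the frame. If that matters, a stricter variant could compare each row only with its direct neighbours. The sketch below is one way to do that, not part of the answer itself; `adjacent_pair_mask` is a hypothetical helper and `toe` is the assumed time-of-entry column.

```python
import pandas as pd

def adjacent_pair_mask(df):
    """True where a row shares a name, crosswise, with the row directly above or below."""
    f, l = df['fname'], df['lname']
    return (
        (f == l.shift(1)) | (f == l.shift(-1))
        | (l == f.shift(1)) | (l == f.shift(-1))
    )

df = pd.DataFrame({
    'fname': ['adrian', 'john', 'peter', 'rusk', 'gile', 'angela', 'gomes'],
    'lname': ['peter', 'adrian', 'rusk', 'anton', 'john', 'gomes', 'angela'],
    'toe':   [1, 3, 4, 10, 12, 13, 14],
})

# Keep only rows that pair up with a neighbour, then number the pairs.
pairs = df.loc[adjacent_pair_mask(df)].reset_index(drop=True)
pairs['pair'] = pairs.index // 2 + 1

print(pairs.to_string(index=False))
```

Numbering with `index // 2` still assumes the surviving rows come in neat twos. A chain in which one row matches both of its neighbours would need different handling.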
