python - Find Pairs in Pandas Data Frames and Perform operations on them
I have a dataframe like this, and I want to first find the pairs and then perform operations on them.
fname    lname    time of entry    ... other columns
adrian   peter    1
john     adrian   3
peter    rusk     4
rusk     anton    10
gile     john     12
angela   gomes    13
gomes    angela   14
Now I want the culprit, i.e. the value that appears in both fname and lname. If, for example, both values appear in both fname and lname, as with angela and gomes in the case below, then the culprit has to be angela on one line and gomes on the other.
pair   fname    lname    culprit   time diff   ... other columns
1      adrian   peter    adrian    -2
1      john     adrian   adrian    2
2      peter    rusk     rusk      -6
2      rusk     anton    rusk      6
3      angela   gomes    angela    -1
3      gomes    angela   gomes     1
From the above we know that in pair number 3 both angela and gomes are culprits. Within each pair, the time diff should be sorted in ascending order.
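To pin the rules down before vectorizing anything, here is a plain-Python reference sketch. It assumes a pair is always two adjacent rows (already sorted by time) that share at least one name, and that each row is a (fname, lname, time) tuple. `find_pairs` is a hypothetical helper name, not part of pandas. It is meant as a readable check for small inputs, not as an efficient solution.

```python
def find_pairs(rows):
    """rows: list of (fname, lname, time) tuples, sorted by time (assumed)."""
    result = []
    pair_no = 0
    i = 0
    while i < len(rows) - 1:
        a, b = rows[i], rows[i + 1]
        # Names that occur in both adjacent rows.
        shared = set(a[:2]) & set(b[:2])
        if not shared:
            i += 1  # no pair starts here, try the next row
            continue
        pair_no += 1
        for row, other in ((a, b), (b, a)):
            # One shared name: both lines get it.
            # Two shared names: each line gets its own fname.
            culprit = row[0] if len(shared) == 2 else next(iter(shared))
            result.append((pair_no, row[0], row[1], culprit, row[2] - other[2]))
        i += 2  # both rows are consumed by this pair
    return result


rows = [
    ("adrian", "peter", 1),
    ("john", "adrian", 3),
    ("peter", "rusk", 4),
    ("rusk", "anton", 10),
    ("gile", "john", 12),
    ("angela", "gomes", 13),
    ("gomes", "angela", 14),
]

for line in find_pairs(rows):
    print(line)
```

Because the rows are sorted by time, the earlier row of each pair always gets the negative difference, so each pair comes out in ascending time diff order without an extra sort.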
I'm not in love with this, and there's probably a better way, but it works and it doesn't use Python iteration or lists.
code:
# Find and number the pairs, and filter out the rows that don't belong to one
df = df.loc[
    df['fname'].isin(df['lname'].shift())
    | df['fname'].isin(df['lname'].shift(-1))
].reset_index(drop=True)
df['pair'] = df.index // 2 + 1

# Find the culprit: the fname that matches the lname of an adjacent row
df['culprit'] = df.loc[
    (df['fname'] == df['lname'].shift(-1)) | (df['fname'] == df['lname'].shift(1)),
    'fname',
]
# NaN culprits sort last within each pair, so ffill copies the pair's culprit down
df = df.sort_values(by=['pair', 'culprit'])
df = df.ffill()

# Calculate the time difference ('toe' holds the time of entry)
df['time_diff'] = df.loc[df['pair'] == df['pair'].shift(1), 'toe'] - df['toe'].shift(1)
df['time_diff'] = df['time_diff'].fillna(df['time_diff'].shift(-1) * -1).astype(int)

# Sort
df = df.sort_values(by=['pair', 'time_diff'])
print(df[['pair', 'fname', 'lname', 'culprit', 'time_diff']].to_string(index=False))
output:
pair   fname    lname    culprit   time_diff
1      adrian   peter    adrian    -2
1      john     adrian   adrian    2
2      peter    rusk     rusk      -6
2      rusk     anton    rusk      6
3      angela   gomes    angela    -1
3      gomes    angela   gomes     1
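One caveat worth checking on real data: `isin` in the filtering step tests whether an fname occurs anywhere in the shifted lname column, not whether it matches the neighbouring row. The sketch below compares that test with a strictly adjacent-row test. The column name `toe` is taken from the code above, while `membership_filter`, `adjacent_filter` and the replacement surname "smith" are hypothetical, introduced only for illustration.

```python
import pandas as pd


def membership_filter(df):
    # Same test as the filtering step above: isin() checks membership in the
    # whole shifted column, not equality with the row directly above or below.
    return df["fname"].isin(df["lname"].shift()) | df["fname"].isin(df["lname"].shift(-1))


def adjacent_filter(df):
    # Hypothetical alternative: flag a row if it shares a name with the next row...
    next_f, next_l = df["fname"].shift(-1), df["lname"].shift(-1)
    shares_next = (
        df["fname"].eq(next_f) | df["fname"].eq(next_l)
        | df["lname"].eq(next_f) | df["lname"].eq(next_l)
    )
    # ...or with the previous row (the same flag moved down one position).
    return shares_next | shares_next.shift(1, fill_value=False)


sample = pd.DataFrame({
    "fname": ["adrian", "john", "peter", "rusk", "gile", "angela", "gomes"],
    "lname": ["peter", "adrian", "rusk", "anton", "john", "gomes", "angela"],
    "toe": [1, 3, 4, 10, 12, 13, 14],
})

# Variant in which "john" no longer appears anywhere in lname.
variant = sample.assign(lname=sample["lname"].replace("john", "smith"))

print(membership_filter(sample).tolist())
print(adjacent_filter(sample).tolist())
print(membership_filter(variant).tolist())
print(adjacent_filter(variant).tolist())
```

On the sample data both filters keep the same six rows. In the variant, the membership test drops the "john adrian" row, because it was only kept thanks to "john" appearing as an lname in the unrelated "gile john" row, while the adjacent-row test still keeps it.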