Python 3.x - Plot SVM with Matplotlib?


I have some interesting user data. It gives information on the timeliness of certain tasks the users were asked to perform. I am trying to find out whether late, which tells me if users are on time (0), a little late (1), or quite late (2), is predictable/explainable. I generate late from a column giving traffic light information (green = not late, red = super late).

Here is what I do:

    # imports
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn import preprocessing
    from sklearn import svm
    import sklearn.metrics as sm

    # load user data, skipping malformed lines
    df = pd.read_csv('april.csv', on_bad_lines='skip', encoding='iso8859_15', delimiter=';')

    # convert objects to datetime data types
    cols = ['planned start', 'actual start', 'planned end', 'actual end']
    df = df[cols].apply(
        pd.to_datetime, dayfirst=True, errors='ignore'
    ).join(df.drop(columns=cols))

    # convert datetime to numeric data types
    cols = ['planned start', 'actual start', 'planned end', 'actual end']
    df = df[cols].apply(
        pd.to_numeric, errors='ignore'
    ).join(df.drop(columns=cols))

    # add a Likert scale for the green, yellow and red traffic lights
    df['late'] = 0
    df.loc[df['end time traffic light'].isin(['yellow']), 'late'] = 1
    df.loc[df['end time traffic light'].isin(['red']), 'late'] = 2

    # supervised learning

    # x and y arrays
    # x = np.array(df.drop(['late'], axis=1))
    x = df[['planned start', 'actual start', 'planned end', 'actual end',
            'measure package', 'measure', 'responsible user']].to_numpy()
    y = np.array(df['late'])

    # preprocess the data
    x = preprocessing.scale(x)

    # support vector machine
    clf = svm.SVC(decision_function_shape='ovo')
    clf.fit(x, y)

    print(clf.score(x, y))

I am trying to understand how to plot the decision boundaries. My goal is to plot a two-way scatter of actual end against planned end. Naturally, I checked the documentation, but I can't wrap my head around it. How does this work?

As a heads-up for the future, you'll generally get faster (and better) responses if you provide a publicly available dataset along with your attempted plotting code, since we don't have 'april.csv'. You can also leave out the data-wrangling code for 'april.csv'. With that said...

Sebastian Raschka created the mlxtend package, which has a pretty awesome plotting function for doing this. It uses matplotlib under the hood.

    import numpy as np
    import pandas as pd
    from sklearn import svm
    from mlxtend.plotting import plot_decision_regions
    import matplotlib.pyplot as plt

    # create an arbitrary dataset for the example
    df = pd.DataFrame({'planned_end': np.random.uniform(low=-5, high=5, size=50),
                       'actual_end':  np.random.uniform(low=-1, high=1, size=50),
                       # randint's upper bound is exclusive, so this draws 0, 1 or 2
                       'late':        np.random.randint(low=0, high=3, size=50)})

    # fit the support vector machine classifier
    x = df[['planned_end', 'actual_end']]
    y = df['late']

    clf = svm.SVC(decision_function_shape='ovo')
    clf.fit(x.values, y.values)

    # plot the decision regions using mlxtend's awesome plotting function
    plot_decision_regions(X=x.values,
                          y=y.values,
                          clf=clf,
                          legend=2)

    # update the plot with x/y axis labels and a figure title
    plt.xlabel(x.columns[0], size=14)
    plt.ylabel(x.columns[1], size=14)
    plt.title('SVM decision region boundary', size=16)
    plt.show()

[Figure: decision regions of the fitted SVM plotted over planned_end and actual_end]
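If you would rather not add a dependency, the same kind of picture can be drawn with matplotlib alone using the usual mesh-grid approach: predict a class for every point on a fine grid, then fill the regions with contourf. The following is only a rough sketch of that idea. The synthetic data, the column meanings, the thresholds used to derive the late labels, and the output file name are all assumptions made for illustration, not anything taken from 'april.csv'.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

# Synthetic stand-in data (assumption: the real 'april.csv' is not available)
rng = np.random.default_rng(0)
planned_end = rng.uniform(-5, 5, size=60)
actual_end = planned_end + rng.normal(0, 1.5, size=60)

# Hypothetical labelling rule: 0 = on time, 1 = a little late, 2 = quite late
late = np.digitize(actual_end - planned_end, bins=[0.0, 1.5])

# Fit the classifier on exactly the two features that will be plotted
X = np.column_stack([planned_end, actual_end])
clf = svm.SVC(decision_function_shape="ovo")
clf.fit(X, late)

# Build a grid that covers the data range plus a small margin
pad = 1.0
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min() - pad, X[:, 0].max() + pad, 200),
    np.linspace(X[:, 1].min() - pad, X[:, 1].max() + pad, 200),
)

# Predict a class for every grid point, then reshape back to the grid
grid_pred = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Filled contours show the decision regions; the scatter shows the samples
fig, ax = plt.subplots()
ax.contourf(xx, yy, grid_pred, levels=[-0.5, 0.5, 1.5, 2.5], alpha=0.3)
ax.scatter(X[:, 0], X[:, 1], c=late, edgecolors="k")
ax.set_xlabel("planned_end")
ax.set_ylabel("actual_end")
ax.set_title("SVM decision regions (matplotlib only)")
fig.savefig("svm_decision_regions.png")
```

Swap fig.savefig(...) for plt.show() when working interactively. Note that the classifier here is fitted on two features only. A model trained on all seven columns, as in the question, cannot be evaluated on a two-dimensional grid without fixing values for the remaining features.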

