python - Use sklearn's FunctionTransformer with string data?
I'm using sklearn's FunctionTransformer to preprocess some of my data, which are date strings such as "2015-01-01 11:09:15".
My customized function takes a string as input, but I found out that FunctionTransformer cannot deal with strings: in the source code it doesn't implement fit_transform. Therefore, the call got routed to the parent class as:

    def fit(self, X, y=None):
        if self.validate:
            check_array(X, self.accept_sparse)  # <-- the error is raised here
        return self

check_array seems to work only with numeric ndarrays. Of course I can do this in the pandas domain, but I wonder if there's a better way of dealing with it in sklearn, especially given that I'll possibly use a Pipeline in the future?
Thanks!
It seems as if the validate parameter is what you're looking for: http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.FunctionTransformer.html
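Here is an example where it may make sense to leave the data as a string instead of converting it to float, as mentioned in the comment. Let's say you want to add time zone info to the date string:

    import pandas as pd
    from sklearn.preprocessing import FunctionTransformer

    def add_tz(df):
        # Append a "Z" (UTC) suffix to every date string.
        df['date'] = df['date'].astype(str) + "Z"
        # Return the frame so the transformer has something to hand back.
        return df

    data = {'date': ["2015-01-01 11:00:00", "2015-01-01 11:15:00", "2015-01-01 11:30:00"],
            'value': [4., 3., 2.]}
    df = pd.DataFrame(data)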
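This will fail, as noted, because of the validation check (validate defaulted to True in scikit-learn releases before 0.22):

    ft = FunctionTransformer(func=add_tz)
    ft.fit_transform(df)

Output:

    ValueError: could not convert string to float: '2015-01-01 11:30:00'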
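The error comes from the validation step rather than from the custom function. As a rough sketch, calling check_array directly on a similar frame should reproduce it. The frame below is made up for illustration, and the exact error message may differ between scikit-learn versions, so only the exception type is checked:

```python
import pandas as pd
from sklearn.utils import check_array

# Illustrative frame: one string column, one numeric column.
df = pd.DataFrame({
    "date": ["2015-01-01 11:00:00", "2015-01-01 11:15:00"],
    "value": [4.0, 3.0],
})

# check_array tries to coerce the whole frame to a numeric array,
# which cannot succeed for the date strings.
try:
    check_array(df)
    rejected = False
except ValueError as exc:
    rejected = True
    print(f"check_array rejected the frame: {exc}")

# The numeric column on its own passes validation.
numeric = check_array(df[["value"]])
print(numeric.shape)  # (2, 1)
```

So any transformer that validates its input this way will refuse string columns, no matter what the wrapped function does.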
This works:

    ft = FunctionTransformer(func=add_tz, validate=False)
    ft.fit_transform(df)

Output:

                       date  value
    0  2015-01-01 11:00:00Z    4.0
    1  2015-01-01 11:15:00Z    3.0
    2  2015-01-01 11:30:00Z    2.0
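Since the question mentions possibly using a Pipeline later, here is a minimal sketch of how the same idea might be chained. The second step, parse_dates, is a hypothetical helper that is not part of the original answer. Both functions work on a copy so the input frame is left untouched, which is a design choice of this sketch rather than something sklearn requires:

```python
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer

def add_tz(df):
    # Work on a copy so the caller's frame is not modified.
    out = df.copy()
    out["date"] = out["date"].astype(str) + "Z"
    return out

def parse_dates(df):
    # Hypothetical follow-up step: turn the strings into UTC timestamps.
    out = df.copy()
    out["date"] = pd.to_datetime(out["date"], utc=True)
    return out

# validate=False keeps each step from running the numeric input check.
pipe = Pipeline([
    ("add_tz", FunctionTransformer(add_tz, validate=False)),
    ("parse_dates", FunctionTransformer(parse_dates, validate=False)),
])

df = pd.DataFrame({
    "date": ["2015-01-01 11:00:00", "2015-01-01 11:15:00", "2015-01-01 11:30:00"],
    "value": [4.0, 3.0, 2.0],
})

result = pipe.fit_transform(df)
print(result.dtypes)
```

Passing validate=False explicitly should be harmless on newer releases where it is already the default, and it keeps the behavior the same on older ones.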