azure data lake - How to parse a big string in U-SQL with Regex

April 15, 2015

I have big CSV files that contain long strings, and I want to parse them in U-SQL.

    // Attempt 1: keep the Match object in a column and read its groups later.
    @t1 =
        SELECT
            Regex.Match("id=881cf2f5f474579a:t=1489536183:s=alni_mzsmmpa4voge4kqmyxoocew2aor0q", "id=(?<id>\\w+):t=(?<t>\\w+):s=(?<s>[\\w\\d_]*)") AS p
        FROM (VALUES(1)) AS fe(n);

    @t2 =
        SELECT
            p.Groups["id"].Value AS gads_id,
            p.Groups["t"].Value AS gads_t,
            p.Groups["s"].Value AS gads_s
        FROM @t1;

    OUTPUT @t2
    TO "/inhabit/test.csv"
    USING Outputters.Csv();

This fails to compile with the following error:

    Error E_CSC_USER_INVALIDCOLUMNTYPE: 'System.Text.RegularExpressions.Match' cannot be used as column type.

I know how to do this the SQL way, with EXPLODE / CROSS APPLY / GROUP BY. Is it possible without these dances?

One more update:

    // Attempt 2: call Regex.Match once per output column and extract a string each time.
    @t1 =
        SELECT
            Regex.Match("id=881cf2f5f474579a:t=1489536183:s=alni_mzsmmpa4voge4kqmyxoocew2aor0q", "id=(?<id>\\w+):t=(?<t>\\w+):s=(?<s>[\\w\\d_]*)").Groups["id"].Value AS id,
            Regex.Match("id=881cf2f5f474579a:t=1489536183:s=alni_mzsmmpa4voge4kqmyxoocew2aor0q", "id=(?<id>\\w+):t=(?<t>\\w+):s=(?<s>[\\w\\d_]*)").Groups["t"].Value AS t,
            Regex.Match("id=881cf2f5f474579a:t=1489536183:s=alni_mzsmmpa4voge4kqmyxoocew2aor0q", "id=(?<id>\\w+):t=(?<t>\\w+):s=(?<s>[\\w\\d_]*)").Groups["s"].Value AS s
        FROM (VALUES(1)) AS fe(n);

    OUTPUT @t1
    TO "/inhabit/test.csv"
    USING Outputters.Csv();

This variant works fine, but it raises a question: is the regex evaluated three times per row? Is there any way to hint to the U-SQL engine that the function Regex.Match is deterministic?

You should be using something more efficient than Regex.Match. But to answer your original question:

System.Text.RegularExpressions.Match is not one of the built-in U-SQL types.

Thus you need to convert it to a built-in type, such as string or SqlArray<string>, or wrap it in a UDT that provides an IFormatter to make it a user-defined type.
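As a rough illustration of the SqlArray<string> option, the sketch below runs Regex.Match once per row and stores the group values in an array column, which a second SELECT then indexes. It is not taken from the answer itself: the column name groups is made up for the example, and it assumes every input row matches the pattern, so that positions 1 to 3 of the array hold the id, t and s groups (position 0 is the whole match).

```usql
// Sketch only: convert the Match into a built-in type (SqlArray<string>)
// so that it can be used as a column. The regex is written once per row.
@t1 =
    SELECT
        new SqlArray<string>(
            Regex.Match("id=881cf2f5f474579a:t=1489536183:s=alni_mzsmmpa4voge4kqmyxoocew2aor0q", "id=(?<id>\\w+):t=(?<t>\\w+):s=(?<s>[\\w\\d_]*)")
                 .Groups
                 .Cast<Group>()          // GroupCollection is a non-generic collection
                 .Select(g => g.Value)   // keep only the captured text
        ) AS groups
    FROM (VALUES(1)) AS fe(n);

// Index 0 is the full match; the named groups follow in pattern order.
@t2 =
    SELECT
        groups[1] AS gads_id,
        groups[2] AS gads_t,
        groups[3] AS gads_s
    FROM @t1;

OUTPUT @t2
TO "/inhabit/test.csv"
USING Outputters.Csv();
```

If some rows might not match, the array may contain fewer elements and the indexing in @t2 could fail, so such rows would need to be filtered out or guarded against first.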
