python - Lookahead assertion with multiple values
I have the following text:
[red] aaa [bbb] hello [blue] aaa [green] ccc
I want to extract the text between section headers. I tried a lookahead assertion that matches from a particular section header until the next header in a list of headers:
    keys = ('red', 'blue', 'green')
    for key in keys:
        match = re.search(r'\[' + key + r'\](.*)(?=(?:' + '|'.join(keys) + r'|$))', text, flags=re.DOTALL)
        print(key, match.group(1))
I'm missing something, though, since it doesn't match anything. Any ideas?
You can use re.findall! You can capture the section name and its contents as groups, like this:
    >>> import re
    >>> print(re.findall(r'\[(\w*)\]([\w \n]*)', text))
    [('red', '\n\naaa '), ('bbb', ' hello\n\n'), ('blue', '\n\naaa\n\n'), ('green', '')]
Here \[(\w*)\] matches the section name, and ([\w \n]*) matches the contents of that section. With the result in hand, you can strip or replace the redundant newlines!
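As a minimal sketch of that last step: the exact line breaks in `text` below are an assumption (the sample in the question is shown on one line), and the name `sections` is only illustrative. The pairs returned by re.findall can be stripped and collected into a dict:

```python
import re

# Sample text from the question; the exact line breaks are assumed for illustration.
text = "[red]\n\naaa [bbb] hello\n\n[blue]\n\naaa\n\n[green]\n\nccc"

# Each match is a (section name, raw contents) pair.
pairs = re.findall(r'\[(\w*)\]([\w \n]*)', text)

# Strip surrounding spaces and newlines from each section's contents.
sections = {name: body.strip() for name, body in pairs}

print(sections)
```

Note that this treats every bracketed word as a header, so [bbb] becomes a section of its own. If only certain headers should count, filter the pairs by name afterwards.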
Hope it helps!
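If you would rather keep the lookahead approach from the question, two things in that pattern stand out: the alternatives inside the lookahead lack the surrounding brackets, and the greedy (.*) with re.DOTALL runs to the end of the text, where $ already satisfies the lookahead. Here is a hedged sketch of one possible fix, again assuming the line breaks in `text`; the names `headers` and `results` are illustrative:

```python
import re

# Sample text from the question; the exact line breaks are assumed for illustration.
text = "[red]\n\naaa [bbb] hello\n\n[blue]\n\naaa\n\n[green]\n\nccc"
keys = ('red', 'blue', 'green')

# Alternation of the known headers including their brackets,
# i.e. \[red\]|\[blue\]|\[green\]
headers = '|'.join(r'\[' + re.escape(key) + r'\]' for key in keys)

results = {}
for key in keys:
    # The lazy (.*?) stops at the first following known header,
    # or at the end of the text (\Z) if there is none.
    pattern = r'\[' + re.escape(key) + r'\](.*?)(?=' + headers + r'|\Z)'
    match = re.search(pattern, text, flags=re.DOTALL)
    if match:
        results[key] = match.group(1).strip()

print(results)
```

Unlike the findall version, this only splits on the headers listed in keys, so [bbb] stays part of the red section's contents.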