utf 8 - Can grep output the result in UTF-8? -


is possible encode output of grep command in utf-8 no matter encoding of input file was?

i execute grep statement in python script (subprocess) , want guarantee resulting bytes utf-8.

example:

grep -p "Äa" -m -1 file.txt 

i dont know input encoding of file...

grep follows the unix philosophy, is, one thing, , 1 thing well. file encoding not part of 1 thing.

that's other tools for. there tool character decoding , encoding well, called iconv. use change encoding of input file utf-8.

this require know input file encoding. if don't know, have guess, based on heuristic analysis of input file (it'll hard certain, recognising has been decoded using wrong codec requires human verify result). there tool too, called enca. tool can conversion once guess has been made. separate install (it not part of common default posix toolset). see how auto detect text file encoding? on over super user more options.

note however, given codec guessing tools need using statistical analysis, better guessing on input file, not on output of grep.

none of has python, of course. except if wanted encoding detection in python instead, @ point you'd want @ chardet library.


Comments

Popular posts from this blog

Command prompt result in label. Python 2.7 -

javascript - How do I use URL parameters to change link href on page? -

amazon web services - AWS Route53 Trying To Get Site To Resolve To www -