python - Parse text file from content-type=application/zip and base64 encoding in AWS SES
On Amazon SES, I have a rule that saves incoming emails to S3 buckets. Amazon saves these in MIME format.
These emails have a .txt attachment, which shows up in the MIME file as content-type=text/plain, content-disposition=attachment ... .txt, and content-transfer-encoding=quoted-printable or base64.
I am able to parse these fine using Python.
I have a problem decoding the content of the .txt file attachment when it is compressed (i.e., content-type: application/zip), as if the encoding wasn't base64.
My code:
    import base64
    s = unicode(base64.b64decode(attachment_content), "utf-8")

throws the error:
    Traceback (most recent call last):
      File "<input>", line 796, in <module>
    UnicodeDecodeError: 'utf8' codec can't decode byte 0xcf in position 10: invalid continuation byte

Below are the first few lines of the "base64" string in attachment_content, which btw has length 53683 + "==" at the end, and I thought the length of a base64 string should be a multiple of 4 (??). Maybe the decoding is failing because the compression is changing attachment_content and I need some other operation before/after decoding it? I have no idea..
    uesdbbqaaaaiam9ah0otgkpwx5oaadmtagajaaaax2noyxqudhh0tl3bjirjkix23sd+g0u3ioxu
    rewgu8c1l2ag8lkd0v2zwajm3kluc6hubu5ufezm3nyjl6+n4t4ry8eodwcsmyqxbrblgmq+7cp5
    qpbj5gdyn0cri6jqfxwv7hlyszursijv1g6qoni5cmqyet6dpp9cncat6yvp5yvz6xfje7cp8p/k
    1sbl8xfju0osvuvr2q3tonfvwjxrknwzfeuk2vrlu978s19mrvnmrhneov51sozlgutmlynfp0nd
    ...

I have tried using "latin-1", but I just get gibberish.
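On the length question, here is a minimal Python 3 sketch. The sample bytes are made up for illustration and are not the attachment above. A base64 string on a single line is always padded to a multiple of 4 characters, but MIME bodies wrap it across lines, so a character count that includes the line breaks need not be a multiple of 4. The decoder ignores those line breaks by default.

```python
import base64

# Arbitrary binary data (769 bytes), chosen so the encoding ends in "==".
raw = bytes(range(256)) * 3 + b"\x01"

# Single-line base64: length is always a multiple of 4.
one_line = base64.b64encode(raw).decode("ascii")

# MIME-style base64: a newline is inserted after every 76 characters.
wrapped = base64.encodebytes(raw).decode("ascii")

padded_ok = len(one_line) % 4 == 0
# b64decode skips characters outside the base64 alphabet (such as "\n")
# unless validate=True is passed, so the wrapped form decodes identically.
same_bytes = base64.b64decode(wrapped) == raw

print(len(one_line), len(wrapped), padded_ok, same_bytes)
```

So an odd-looking length is probably not the cause of the UnicodeDecodeError by itself.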
The problem was that, after the base64 conversion, I was dealing with a zipped file in the format "PK \x03 \x04 \x3c \xa \x0c ...", and I needed to unzip it before transforming it to UTF-8 unicode.
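One way to confirm this kind of diagnosis is to look at the first bytes after base64 decoding. The sketch below is Python 3 and builds its own small zip in memory; the helper names `make_zip_b64` and `looks_like_zip` and the file name `sample.txt` are hypothetical, introduced only for illustration.

```python
import base64
import io
import zipfile


def make_zip_b64(name, text):
    """Build a one-file zip in memory and return it base64-encoded."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, text)
    return base64.b64encode(buf.getvalue()).decode("ascii")


def looks_like_zip(data):
    """True if the bytes start with the zip local-file-header signature."""
    return data[:4] == b"PK\x03\x04"


payload_b64 = make_zip_b64("sample.txt", "hello from a zipped attachment\n")
decoded = base64.b64decode(payload_b64)

print(payload_b64[:4])          # base64 form of the "PK\x03\x04" signature
print(looks_like_zip(decoded))  # the decoded bytes are a zip, not text
```

If the check is true, the bytes are compressed data and have to go through `zipfile` before any text decoding is attempted.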
This code worked for me:

    # Python 2 (uses cStringIO and unicode)
    import base64
    import email
    import zipfile
    from cStringIO import StringIO

    # Parse the results from the email
    received_email = email.message_from_string(email_text)
    for part in received_email.walk():
        c_type = part.get_content_type()
        c_enco = part.get('content-transfer-encoding')
        attachment_content = part.get_payload()
        if c_enco == 'base64':
            # Undo the transfer encoding first
            decoded_file = base64.b64decode(attachment_content)
            print("File decoded from base64")
            if c_type == "application/zip":
                # The decoded bytes are a zip archive: read its first member
                zfp = zipfile.ZipFile(StringIO(decoded_file), "r")
                unzipped_list = zfp.open(zfp.namelist()[0]).readlines()
                decoded_file = "".join(unzipped_list)
                print("And un-zipped")
            # Only now convert the raw bytes to unicode text
            result = unicode(decoded_file, "utf-8")
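For anyone on Python 3, where `unicode` and `cStringIO` no longer exist, the same idea can be sketched roughly as follows. This is not the code from the answer above, just an adaptation under some assumptions: the raw MIME message is available as bytes, the text files are UTF-8, and the function names `extract_text_attachments` and `build_sample_email` plus the sample message are hypothetical.

```python
import email
import io
import zipfile
from email import policy
from email.message import EmailMessage


def extract_text_attachments(raw_bytes):
    """Return {filename: text} for plain and zipped text attachments."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    texts = {}
    for part in msg.walk():
        if part.is_multipart():
            continue
        # decode=True undoes base64 or quoted-printable transfer encoding
        data = part.get_payload(decode=True)
        if data is None:
            continue
        if part.get_content_type() == "application/zip":
            # Unzip first, then decode each member as UTF-8 (assumed)
            with zipfile.ZipFile(io.BytesIO(data)) as zf:
                for name in zf.namelist():
                    texts[name] = zf.read(name).decode("utf-8")
        elif part.get_content_disposition() == "attachment":
            texts[part.get_filename()] = data.decode("utf-8")
    return texts


def build_sample_email():
    """Build a made-up message with one zipped .txt attachment."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("report.txt", "caf\u00e9 report\n")
    msg = EmailMessage()
    msg["Subject"] = "zipped attachment"
    msg.set_content("see attachment")
    msg.add_attachment(buf.getvalue(), maintype="application",
                       subtype="zip", filename="report.zip")
    return msg.as_bytes()


attachments = extract_text_attachments(build_sample_email())
print(attachments)
```

Letting `get_payload(decode=True)` handle the transfer encoding avoids branching on the `Content-Transfer-Encoding` header by hand, so quoted-printable and base64 parts go through the same path.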