csv - Parse a list-looking string having dict-type elements in Python


I want to parse the list-looking string below (I call it a string because its type is str) into dict elements:

 "[{""isin"": ""us51817r1068"", ""name"": ""latam airlines group sa""}, {""isin"": ""cl0000000423"", ""name"": ""latam airlines group sa""}, {""isin"": null, ""name"": ""latam airlines group sa""}, {""isin"": ""brlatmbdr001"", ""name"": ""latam airlines group sa""}]" 

I used the ast package and literal_eval to convert it to a list and then parse it, but I encounter a "ValueError: malformed string" error.

The code is below:

    company_list = ast.literal_eval(line[18])
    print company_list
    for i in company_list:
        #print type(i)
        print i["isin"]

Here line[18] is the string above.

Or how can I ignore such a list-looking string if it contains a null value, as this one does?

PS: line[18] is the column of the CSV that I want to read.

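OK, I'm going to start off by saying: wow, this was way harder than I thought it was going to be!

So there are two problems with the string:

  1. When Python prints the string, the double quotes are gone because the parser gets confused (adjacent string literals are concatenated), so we have to add them back in.
  2. The null type doesn't exist in Python, so we need to change it to None.

So here's the code: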
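Both problems can be reproduced in isolation. The snippet below is a minimal sketch written for Python 3 (the shortened sample string and the variable names are made up for illustration); it shows the doubled quotes vanishing from a string literal and `ast.literal_eval` rejecting `null` even when the quotes are intact.

```python
import ast

# Problem 1: adjacent string literals are concatenated, so "" inside a
# double-quoted literal ends one literal and starts the next.
data_in = "[{""isin"": null}]"
print(data_in)  # the quotes around isin are gone

# Problem 2: null is not a Python literal, so literal_eval refuses it
# even when the quotes are present.
raised = False
try:
    ast.literal_eval('[{"isin": null}]')
except ValueError:
    raised = True
print(raised)
```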

    import re
    import ast

    data_in = "[{""isin"": ""us51817r1068"", ""name"": ""latam airlines group sa""}, {""isin"": ""cl0000000423"", ""name"": ""latam airlines group sa""}, {""isin"": null, ""name"": ""latam airlines group sa""}, {""isin"": ""brlatmbdr001"", ""name"": ""latam airlines group sa""}]"

    # Make a copy for modification.
    formatted_data = data_in

    # Captures positional information of adding and removing characters.
    offset = 0

    # Finds the keys and values.
    p = re.compile(r"[\{\:,]([\w\s\d]{2,})")
    for m in p.finditer(data_in):
        # Counts the number of characters removed via strip().
        strip_val = len(m.group(1)) - len(m.group(1).strip())
        # Adds in quotes for a single match.
        formatted_data = formatted_data[:m.start(1)+offset] + "\"" + m.group(1).strip() + "\"" + formatted_data[m.end(1)+offset:]
        # Offset will always add 2 ("+name+"), minus the whitespace removed.
        offset += 2 - strip_val

    company_list = ast.literal_eval(formatted_data)

    # Finds 'null' values and replaces them with None.
    for item in company_list:
        for k, v in item.iteritems():
            if v == 'null':
                item[k] = None

    print company_list

It was written in Python 3 and I changed the bits I remembered for Python 2, so there might be small errors.

The result is a list of dict objects:

    [{'isin': 'us51817r1068', 'name': 'latam airlines group sa'}, {'isin': 'cl0000000423', 'name': 'latam airlines group sa'}, {'isin': None, 'name': 'latam airlines group sa'}, {'isin': 'brlatmbdr001', 'name': 'latam airlines group sa'}]

For more info on the regex used, see here.
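To see what the regex actually captures, here is a small sketch for Python 3 that runs the same pattern over a shortened sample (the sample string and the `tokens` name are illustrative only). Each match is a run of word and whitespace characters that follows a `{`, `:` or `,`, which is exactly the set of keys and values that need quotes.

```python
import re

# Same pattern as in the answer: '{', ':' or ',' followed by at least two
# word/whitespace characters, which are captured in group 1.
pattern = re.compile(r"[\{\:,]([\w\s\d]{2,})")

# Shortened sample, as Python sees it once the doubled quotes have vanished.
sample = "[{isin: us51817r1068, name: latam airlines group sa}]"

# Strip the leading space that follows ':' and ','.
tokens = [m.group(1).strip() for m in pattern.finditer(sample)]
print(tokens)
```

Note that the character class only covers word characters and whitespace, so a value containing other punctuation (a period, for example) would be cut short at that character.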
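As an alternative to the regex approach, and assuming the row is read with the standard csv module, the doubled quotes are already unescaped by the reader and the field is valid JSON, so `json.loads` can parse it and map `null` to `None` directly. This is a sketch for Python 3 under that assumption; the one-column `raw_row` is a made-up stand-in for the real file, where the field sits at index 18.

```python
import csv
import io
import json

# Hypothetical one-column CSV row standing in for the real file.
raw_row = (
    '"[{""isin"": ""us51817r1068"", ""name"": ""latam airlines group sa""}, '
    '{""isin"": null, ""name"": ""latam airlines group sa""}]"\n'
)

# csv.reader turns each "" inside the quoted field back into a single ".
line = next(csv.reader(io.StringIO(raw_row)))

# The unescaped field is JSON; null becomes None.
company_list = json.loads(line[0])

for company in company_list:
    print(company["isin"])
```

To ignore entries with a null value, the loop can skip any `company` whose `"isin"` is `None`.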

