python - How to convert a list into float for using the '.join' function?
I have to compress a file into a list of words and a list of positions to recreate the original file. The program should also be able to take a compressed file and recreate the full text, including punctuation and capitalization, of the original file. I have everything correct apart from the recreation: using the map function, the program can't convert my list of positions into floats because of the '[', as it is a list.
My code is:
    text = open("speech.txt")
    charactersunique = []
    listofpositions = []
    downline = False

    while True:
        line = text.readline()
        if not line:
            break

        twolist = line.split()
        for word in twolist:
            if word not in charactersunique:
                charactersunique.append(word)
            listofpositions.append(charactersunique.index(word))
        if not downline:
            charactersunique.append("\n")
            downline = True
        listofpositions.append(charactersunique.index("\n"))

    w = open("list_wordspos.txt", "w")
    for c in charactersunique:
        w.write(c)
    w.close()

    x = open("list_wordspos.txt", "a")
    x.write(str(listofpositions))
    x.close()

    with open("list_wordspos.txt", "r") as f:
        newwordsunique = f.readline()
        f.close()

    h = open("list_wordspos.txt", "r")
    lines = h.readlines()
    newlistofpositions = lines[1]
    newlistofpositions = map(float, newlistofpositions)

    print("recreated text:\n")
    recreation = " " .join(newwordsunique[pos] for pos in (newlistofpositions))
    print(recreation)
The error is:
    task 3 code.py", line 42, in <genexpr>
        recreation = " " .join(newwordsunique[pos] for pos in (newlistofpositions))
    ValueError: could not convert string to float: '['
I am using Python IDLE 3.5 (32-bit). Does anyone have any ideas on how to fix this?
Why do you want to turn the position values in the list into floats, since they are list indices, and those must be integers? I suspected this might be an instance of what is called an XY problem.
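To make the cause of the error concrete, here is a minimal sketch. The file contents and word list are hypothetical stand-ins, and using ast.literal_eval to parse the line is only one possible option, not what the rewritten code further down does. The point is that a line written with str(list) is read back as a plain string, so map() walks over it one character at a time, starting with '['.

```python
import ast

# Hypothetical second line of the compressed file, as written by str(list).
positions_line = "[0, 1, 2, 0]"

# Iterating over a string yields single characters, so float() is handed '[' first.
try:
    list(map(float, positions_line))
except ValueError as exc:
    first_error = str(exc)
    print(first_error)

# Even valid floats would not help: list indices must be integers.
words = ["to", "be", "or"]  # hypothetical word list
try:
    words[1.0]
except TypeError as exc:
    print(exc)

# One way to get a real list of ints back from the text (an assumption, not the
# approach used in the answer's code): parse the literal with ast.literal_eval.
positions = ast.literal_eval(positions_line)
recreated = " ".join(words[pos] for pos in positions)
print(recreated)
```

With integer positions the generator expression inside join() works as intended.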
I also found your code difficult to understand because you haven't followed PEP 8 - Style Guide for Python Code. In particular, many (although not all) of the variable names are CamelCased, which, according to the guidelines, should be reserved for class names.
In addition, some of your variables had misleading names, like charactersunique, which actually [mostly] contained unique words.
So, one of the first things I did was transform all the CamelCased variables into lowercase underscore-separated words, like camel_case. In several instances I also gave them better names to reflect their actual contents or role. For example, charactersunique became unique_words.
The next step was to improve the handling of files by using Python's with statement to ensure they are closed automatically at the end of the block. In other cases I consolidated multiple file open() calls into one.
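As a small illustration of that point, the sketch below writes and reads a scratch file using with. The temporary directory and the sample lines are assumptions made only so the example is self-contained; they are not part of the original program.

```python
import os
import tempfile

# Hypothetical scratch location so the sketch does not touch any real data.
path = os.path.join(tempfile.mkdtemp(), "list_wordspos.txt")

# One open() call handles both writes; no explicit close() is needed.
with open(path, "w") as out_file:
    out_file.write("first line\n")
    out_file.write("second line\n")

# The file object is already closed once the block has ended.
file_closed = out_file.closed

# Likewise, a single open() is enough to read both lines back.
with open(path, "r") as in_file:
    first = in_file.readline()
    second = in_file.readline()

print(file_closed, repr(first), repr(second))
```

Compared with separate open()/close() pairs, this keeps each file's lifetime visible in the indentation and avoids leaving a handle open if an exception occurs inside the block.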
After all that I had it working, but that's when I discovered a problem with the approach of treating newline "\n" characters as separate words of the input text file. It caused a problem when the file was being recreated by the expression:

    " ".join(newwordsunique[pos] for pos in (newlistofpositions))

because it adds one space before and after every "\n" character encountered, and those spaces aren't there in the original file. To work around that, I ended up writing out the for loop that recreates the file instead of using a generator expression inside join(), because doing so allows the newline "words" to be handled properly.
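The extra spaces are easy to see with a tiny, made-up word list (the words and positions here are hypothetical, chosen only for the demonstration). The sketch compares the single join() call with a loop that joins each line separately.

```python
# Hypothetical data: index 0 is the newline "word".
unique_words = ["\n", "Hello", "world", "again"]
positions = [1, 2, 0, 1, 3, 0]

# Joining everything with spaces also puts spaces around each newline.
joined = " ".join(unique_words[p] for p in positions)
print(repr(joined))

# Collecting words per line and joining the lines with "\n" avoids that.
lines = []
current = []
for p in positions:
    word = unique_words[p]
    if word == "\n":
        lines.append(" ".join(current))
        current = []
    else:
        current.append(word)

rebuilt = "\n".join(lines)
print(repr(rebuilt))
```

The first result contains " \n " between the lines, while the second has a bare "\n", which matches how the text was laid out before compression.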
At any rate, here's the resulting rewritten (and working) code:
    input_filename = "speech.txt"
    compressed_filename = "list_wordspos.txt"

    # Two lists to represent the contents of the input file.
    unique_words = ["\n"]  # preload with the newline "word"
    word_positions = []

    with open(input_filename, "r") as input_file:
        for line in input_file:
            for word in line.split():
                if word not in unique_words:
                    unique_words.append(word)
                word_positions.append(unique_words.index(word))

            word_positions.append(unique_words.index("\n"))  # add newline at end of each line

    # Write representations of the two data-structures to the compressed file.
    with open(compressed_filename, "w") as compr_file:
        words_repr = " ".join(repr(word) for word in unique_words)
        compr_file.write(words_repr + "\n")
        positions_repr = " ".join(repr(posn) for posn in word_positions)
        compr_file.write(positions_repr + "\n")

    def strip_quotes(word):
        """Strip the first and last characters from the string (assumed to be quotes)."""
        tmp = word[1:-1]
        return tmp if tmp != "\\n" else "\n"  # newline "words" are a special case

    # Recreate the input file from the data in the compressed file.
    with open(compressed_filename, "r") as compr_file:
        line = compr_file.readline()
        new_unique_words = list(map(strip_quotes, line.split()))
        line = compr_file.readline()
        new_word_positions = map(int, line.split())  # using int, not float here

    words = []
    lines = []
    for posn in new_word_positions:
        word = new_unique_words[posn]
        if word != "\n":
            words.append(word)
        else:
            lines.append(" ".join(words))
            words = []

    print("recreated text:\n")
    recreation = "\n".join(lines)
    print(recreation)
I created my own speech.txt test file from the first paragraph of your question and ran the script on it with these results:
    recreated text:

    I have to compress a file into a list of words and a list of positions to recreate the original file. The program should also be able to take a compressed file and recreate the full text, including punctuation and capitalization, of the original file. I have everything correct apart from the recreation: using the map function, the program can't convert my list of positions into floats because of the '[', as it is a list.