python - How to load file created with numpy.savez_compressed?
I am saving a numpy array using the export_vectors function defined below. In this function, I load string values separated by spaces and store them as floats in a numpy array.
import numpy as np

def export_vectors(vocab, input_filename, output_filename, dim):
    # One row per vocabulary entry; rows for words missing from the file stay zero.
    embeddings = np.zeros([len(vocab), dim])
    with open(input_filename) as f:
        for line in f:
            line = line.strip().split(' ')
            word = line[0]
            embedding = line[1:]
            if word in vocab:
                word_idx = vocab[word]
                embeddings[word_idx] = np.asarray(embedding).astype(float)
    np.savez_compressed(output_filename, embeddings=embeddings)
Here, embeddings is an ndarray of type float64.
However, when I try to load the file using:
def get_vectors(filename):
    with open(filename) as f:
        return np.load(f)["embeddings"]
I get the following error:
File "/usr/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x99 in position 10: invalid start byte
Why is this?
You are using open incorrectly. I suspect you need to pass a flag so that the file is opened in binary mode (docs):
open(filename, 'rb') # r: read-only; b: binary
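The docs explain the default behaviour: normally, files are opened in text mode, which means that you read and write strings from and to the file, and these are encoded in a specific encoding. A compressed .npz file is not valid text, so decoding it as UTF-8 fails.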
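As a minimal sketch of that fix, the loader from the question can be rewritten to open the file in binary mode. The demo array and the temporary file name vectors.npz are assumptions made only to exercise the function; they are not part of the original question.

```python
import os
import tempfile

import numpy as np


def get_vectors(filename):
    # Binary mode: an .npz file is a zip archive, not UTF-8 text.
    with open(filename, "rb") as f:
        # The array is read while the file is still open.
        return np.load(f)["embeddings"]


# Hypothetical demo data, used only to try the function out.
demo = np.arange(6, dtype=float).reshape(3, 2)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "vectors.npz")
np.savez_compressed(path, embeddings=demo)

loaded = get_vectors(path)
```

The only change compared to the question's version is the 'rb' mode argument.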
But you can keep it simple and just pass the file path (np.load accepts a file-like object, a string, or a pathlib.Path):
np.load(filename)  # more natural
# it's kind of the direct inverse of the save code;
# -> no manual file handling
(A simplified rule: anything that uses general-purpose compression always works with binary files, not text files!)
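The path-based approach can be sketched as a small round trip. The array contents, the temporary directory, and the file name vectors are hypothetical. One detail worth keeping in mind: when np.savez_compressed is given a file name as a string that does not end in .npz, it appends that extension, so the path used for loading has to include it.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical stand-in for the embeddings built in export_vectors.
embeddings = np.zeros([4, 3])
embeddings[1] = np.asarray(["0.1", "0.2", "0.3"]).astype(float)

out_dir = Path(tempfile.mkdtemp())
target = out_dir / "vectors"  # no extension given
np.savez_compressed(str(target), embeddings=embeddings)

# numpy appended ".npz" to the name when saving.
saved = target.with_suffix(".npz")

# np.load takes the path directly; no manual open() is needed.
with np.load(saved) as data:
    restored = data["embeddings"]
```

Using the returned archive object in a with statement closes the underlying file once the array has been read.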