Python - How to import a module from a .so file inside an egg file without using an absolute file path
I built the pyahocorasick library with the python setup.py bdist_egg command and uploaded it to Spark for a PySpark job.
However, the .so file inside pyahocorasick can't be imported through the pkg_resources.resource_filename() method on the Spark cluster, for security reasons.
Traceback (most recent call last):
  File "spark_datawash.py", line 251, in <module>
    import ahocorasick
  File "build/bdist.linux-x86_64/egg/ahocorasick.py", line 7, in <module>
  File "build/bdist.linux-x86_64/egg/ahocorasick.py", line 4, in __bootstrap__
  File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 1152, in resource_filename
  File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 1696, in get_resource_filename
  File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 1726, in _extract_resource
  File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 1219, in get_cache_path
  File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 1199, in extraction_error
pkg_resources.ExtractionError: Can't extract file(s) to egg cache

The following error occurred while trying to extract file(s) to the Python egg cache:

  [Errno 13] Permission denied: '/home/.python-eggs'

The Python egg cache directory is currently set to:

  /home/.python-eggs

Perhaps your account does not have write access to this directory? You can change the cache directory by setting the PYTHON_EGG_CACHE environment variable to point to an accessible directory.
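The last sentence of the error message hints at one possible workaround: point PYTHON_EGG_CACHE at a directory the job can write to before the import runs. A minimal sketch, assuming the job is allowed to create a temporary directory on the node; the helper name use_writable_egg_cache is made up for illustration:

```python
import os
import tempfile


def use_writable_egg_cache(prefix="python-eggs-"):
    """Hypothetical helper: point PYTHON_EGG_CACHE at a fresh temp directory."""
    # mkdtemp creates a directory that only the current user can access.
    cache_dir = tempfile.mkdtemp(prefix=prefix)
    # pkg_resources consults this variable when it needs to extract a resource.
    os.environ["PYTHON_EGG_CACHE"] = cache_dir
    return cache_dir


cache_dir = use_writable_egg_cache()
# import ahocorasick  # the .so would then be extracted under cache_dir
```

On a cluster the variable would presumably have to be set on the executors as well, not only on the driver, for example through Spark's spark.executorEnv.PYTHON_EGG_CACHE setting or by calling the helper inside the function that runs on the workers.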
This is how pyahocorasick imports the .so:
def __bootstrap__():
    global __bootstrap__, __loader__, __file__
    import sys, pkg_resources, imp
    __file__ = pkg_resources.resource_filename(__name__, 'ahocorasick.so')
    __loader__ = None; del __bootstrap__, __loader__
    imp.load_dynamic(__name__, __file__)
__bootstrap__()
Can I import the .so with resource_stream() instead of resource_filename(), or is there some other way that doesn't need to read an absolute file path? Thanks, all.
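For what it's worth, here is a rough sketch of what the resource_stream() route might look like. As far as I know the dynamic loader still needs a real file on disk, so the bytes have to be copied to a temporary file first; this only swaps the egg cache for the system temp directory and does not avoid a path altogether. The names copy_stream_to_temp and load_so_from_egg are made up for illustration, and the loader uses imp only because the generated stub does (imp was removed in Python 3.12):

```python
import io
import os
import shutil
import tempfile


def copy_stream_to_temp(stream, suffix=".so"):
    """Copy a binary file-like object to a temp file and return its path."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as out:
        shutil.copyfileobj(stream, out)
    return path


def load_so_from_egg(package, resource, module_name):
    """Hypothetical helper: read a .so out of an egg and load it."""
    # Imported lazily so the copy step above works without these modules.
    import imp
    import pkg_resources

    stream = pkg_resources.resource_stream(package, resource)
    try:
        path = copy_stream_to_temp(stream)
    finally:
        stream.close()
    # Same call the generated stub makes, but with a temp-file path.
    return imp.load_dynamic(module_name, path)


# Demonstrate only the copy step, with in-memory bytes standing in for the egg.
demo_path = copy_stream_to_temp(io.BytesIO(b"\x7fELF-demo"))
```

Something like load_so_from_egg(__name__, 'ahocorasick.so', __name__) would have to replace the body of the generated __bootstrap__ stub inside the egg, which I have not tried.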
BTW, I can't install pyahocorasick on every node of the Spark cluster for other reasons. I have to upload an egg-zipped distribution for later use.