multithreading - Are locks needed for multithreaded Python scraping?
I have a list of zipcodes and want to pull business listings for each using the Yelp Fusion API. Each zipcode requires at least one API call (often more), so I want to track my API usage against the daily limit of 25,000. I have defined each zipcode as an instance of a user-defined Locale class. The Locale class has a class variable, Locale.pulls, which acts as a global counter for the number of pulls.
I want to multithread this using the multiprocessing module, but I'm not sure whether I need locks, and if so, how to use them. My concern is race conditions: I need to be sure each thread sees the current number of pulls, as tracked by the Locale.pulls class variable in the pseudocode below.
    import multiprocessing.dummy as mt

    class Locale:
        pulls = 0
        max_pulls = 20000

        def __init__(self, x, y):
            # initialize the instance with the arguments needed to complete the API call
            pass

        def pull(self):
            if Locale.pulls > Locale.max_pulls:
                return None
            # make the request, store the returned data, and increment the counter
            self.data = self.call_yelp()
            Locale.pulls += 1

    def main():
        # zipcodes below is a list of the arguments needed to initialize each Locale object
        pool = mt.Pool(len(zipcodes) // 100)  # let each thread work on ~100 zipcodes
        data = pool.map(Locale, zipcodes)
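For what it's worth, `Locale.pulls += 1` is a read-modify-write, so two threads can read the same value and lose an increment; a `threading.Lock` around the check-and-increment closes that window. A minimal sketch of how that might look (the `call_yelp` body and the `fetch` wrapper here are placeholders, not the real Yelp Fusion call):

```python
import threading
import multiprocessing.dummy as mt  # thread pool with the multiprocessing API

class Locale:
    pulls = 0
    max_pulls = 20000
    _lock = threading.Lock()  # guards the shared class-level counter

    def __init__(self, zipcode):
        self.zipcode = zipcode
        self.data = None

    def pull(self):
        # Reserve a slot under the lock so the limit check and the
        # increment happen atomically across threads.
        with Locale._lock:
            if Locale.pulls >= Locale.max_pulls:
                return None
            Locale.pulls += 1
        # Do the slow network call outside the lock so threads
        # only serialize on the counter, not on I/O.
        self.data = self.call_yelp()
        return self.data

    def call_yelp(self):
        # Placeholder for the real Yelp Fusion request (hypothetical).
        return {"zip": self.zipcode}

def fetch(zipcode):
    loc = Locale(zipcode)
    loc.pull()
    return loc

zipcodes = ["10001", "10002", "10003"]
pool = mt.Pool(4)
locales = pool.map(fetch, zipcodes)
pool.close()
pool.join()
```

Note that `pool.map` needs a callable that actually performs the work; mapping the class constructor alone, as in the pseudocode, would build the objects without ever calling `pull()`.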
A simple solution is to check that len(zipcodes) < max_pulls before running map().
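A minimal sketch of that pre-check, assuming each zipcode costs exactly one API call (the `fetch` function and the zipcode list are placeholders):

```python
import multiprocessing.dummy as mt

MAX_PULLS = 20000  # daily API budget

def fetch(zipcode):
    # Placeholder for the real per-zipcode API call (hypothetical).
    return {"zip": zipcode}

zipcodes = ["10001", "10002", "10003"]

# If every zipcode costs one call, a single up-front check replaces
# any per-call locking: the pool can never exceed the budget.
if len(zipcodes) < MAX_PULLS:
    pool = mt.Pool(4)
    data = pool.map(fetch, zipcodes)
    pool.close()
    pool.join()
else:
    data = None  # trim the list, or wait for the next day's quota
```

The caveat is the question's own "at least 1 API call (maybe more)": if a zipcode can trigger a variable number of calls, the up-front check is only a lower bound and a locked counter (or a conservative per-zipcode estimate) is still needed.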