multithreading - Are locks needed for multithreaded Python scraping?


I have a list of zipcodes and want to pull business listings for each one using the Yelp Fusion API. Each zipcode requires at least one API call (often more), so I want to track my API usage against the daily limit of 25,000 calls. I have defined each zipcode as an instance of a user-defined locale class. The locale class has a class variable, locale.pulls, which acts as a global counter for the number of pulls.

I want to multithread this using the multiprocessing.dummy module, but I am not sure whether I need locks, and if so, how to use them. My concern is race conditions: I need to be sure that every thread sees the current number of pulls, as tracked by the locale.pulls class variable in the pseudocode below.

import multiprocessing.dummy as mt

class locale():
    pulls = 0
    max_pulls = 20000

    def __init__(self, x, y):
        # initialize the instance with the arguments needed to complete the API call
        ...

    def pull(self):
        if locale.pulls > locale.max_pulls:
            return None
        else:
            # make the request, store the returned data, and increment the counter
            self.data = self.call_yelp()
            locale.pulls += 1

def main():
    # zipcodes below is a list of the arguments needed to initialize each zipcode as a locale object
    pool = mt.Pool(len(zipcodes) // 100)  # let each thread work on ~100 zipcodes
    data = pool.map(locale, zipcodes)
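On the locking question itself: locale.pulls += 1 is a read-modify-write sequence, so two threads can read the same value and lose an increment, which means yes, a shared counter like this does need a lock. Below is a minimal sketch of guarding the counter with threading.Lock, reusing the class and counter names from the pseudocode above (call_yelp is assumed to exist, as in the question):

import threading

class locale():
    pulls = 0
    max_pulls = 20000
    _lock = threading.Lock()  # one lock shared by all instances, protects the counter

    def pull(self):
        # check and increment under the lock so two threads cannot
        # read the same value of pulls and both count the same slot
        with locale._lock:
            if locale.pulls >= locale.max_pulls:
                return None
            locale.pulls += 1
        # do the slow network request outside the lock so threads are
        # not serialized while waiting on the API
        self.data = self.call_yelp()  # call_yelp assumed from the question

Incrementing before making the request reserves a slot against the limit, so the count can never overshoot max_pulls even if many threads hit the check at once.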

A simple solution is to check len(zipcodes) < max_pulls before running map(), since each zipcode accounts for at least one call.
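A sketch of that pre-check, assuming the mt pool and zipcodes list from the pseudocode above and exactly one call per zipcode:

def main():
    # every zipcode costs at least one call, so if the whole batch
    # cannot fit under the limit, refuse to start instead of counting
    if len(zipcodes) >= locale.max_pulls:
        raise RuntimeError("batch would exceed the daily API limit")
    pool = mt.Pool(max(1, len(zipcodes) // 100))  # ~100 zipcodes per thread
    return pool.map(locale, zipcodes)

If some zipcodes require more than one call, this check is only a lower bound on usage, and the locked counter shown earlier remains the safe option.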

