python - How to divide up web scraping workload


I have a massive web scraping project (part 1 involves scraping 300k+ separate data entries from a website). I may have more of these in the future, so scraping one data entry at a time won't suffice. I have been using Selenium to enter data on the JS site and BeautifulSoup to parse the results. I have looked at Selenium Grid, but I don't believe it will accomplish what I want, because I'm not trying to have every instance perform the same function.

I'd like to take the ~300k separate data entries and split them up so that, for example, 8+ searches run at a time.

Is my only option at this point (in Python) to set up several VMs and execute a Python script in each? The current time to finish the scrape is 30 hours.
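You don't necessarily need separate VMs for this: the standard library's `multiprocessing.Pool` can split a list of entries across N worker processes on one machine. Below is a minimal sketch of that pattern; `scrape_entry` is a hypothetical placeholder for the real Selenium + BeautifulSoup work on a single entry (each worker would create its own WebDriver, since driver objects can't be shared across processes).

```python
from multiprocessing import Pool

def scrape_entry(entry):
    # Placeholder: in the real project this would drive Selenium to
    # submit one entry and parse the result page with BeautifulSoup.
    return entry * 2  # dummy result so the sketch is runnable

def scrape_all(entries, workers=8):
    # Pool.map divides `entries` into chunks and distributes them to
    # `workers` processes; chunksize controls how many entries each
    # task message carries (larger chunks mean less IPC overhead).
    with Pool(processes=workers) as pool:
        return pool.map(scrape_entry, entries, chunksize=100)

if __name__ == "__main__":
    results = scrape_all(list(range(1000)))
    print(len(results))
```

With 8 workers and roughly linear scaling, a 30-hour serial run would drop to around 4 hours on one box; separate VMs only become necessary once a single machine can't sustain enough concurrent browser instances.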

