python - How to divide up webscraping workload


I have a massive web scraping project (part 1 involves scraping 300k+ separate data entries from a website). I may have more of these in the future, so scraping one data entry at a time won't suffice. I have been using Selenium to enter data into the JavaScript site and BeautifulSoup to parse the results. I have looked at Selenium Grid, but I don't believe it will accomplish what I want, because I'm not trying to have every instance perform the same function.
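For context, a rough sketch of the current per-entry flow. The URL, field name, and result selector are placeholders, not the real ones from the project:

```python
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By

def scrape_entry(driver, entry):
    """Submit one data entry through the JS form and parse the rendered result."""
    driver.get("https://example.com/search")         # hypothetical JS search page
    box = driver.find_element(By.NAME, "query")      # hypothetical input field
    box.clear()
    box.send_keys(entry)
    box.submit()
    # Hand the rendered page to BeautifulSoup for parsing.
    soup = BeautifulSoup(driver.page_source, "html.parser")
    result = soup.select_one("#result")              # hypothetical result node
    return result.get_text(strip=True) if result else None

driver = webdriver.Chrome()
print(scrape_entry(driver, "some data entry"))
driver.quit()
```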

I want to take the ~300k separate data entries and split them up so that, for example, 8+ searches run at a time.
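A minimal sketch of what that split might look like on a single machine, using `concurrent.futures` so each worker process runs its own browser. It assumes the `scrape_entry()` helper sketched above and a hypothetical `entries.txt` input file (one entry per line); it leaves out the retries, error handling, and throttling a real 300k-entry run would need:

```python
from concurrent.futures import ProcessPoolExecutor
from selenium import webdriver

NUM_WORKERS = 8

def scrape_chunk(entries):
    """Each worker process gets its own browser and works through its slice."""
    # Assumes scrape_entry() from the sketch above is defined in this module.
    driver = webdriver.Chrome()
    try:
        return [scrape_entry(driver, e) for e in entries]
    finally:
        driver.quit()

def split(items, n):
    """Deal items into n roughly equal chunks."""
    return [items[i::n] for i in range(n)]

if __name__ == "__main__":
    # Hypothetical input file holding the ~300k data entries, one per line.
    with open("entries.txt") as f:
        entries = [line.strip() for line in f if line.strip()]

    chunks = split(entries, NUM_WORKERS)
    with ProcessPoolExecutor(max_workers=NUM_WORKERS) as pool:
        results = [r for chunk_result in pool.map(scrape_chunk, chunks)
                   for r in chunk_result]
    print(len(results), "entries scraped")
```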

Is my only option at this point (in Python) to set up several VMs and execute the Python script in each? The current time to finish the scrape is about 30 hours.

