html - Log in to a website using Python requests
I'm trying to log in to https://www.voxbeam.com/login using requests so I can scrape data. I'm a Python beginner; I have done some tutorials and some web scraping on my own with BeautifulSoup.
Looking at the HTML:
    <form id="loginform" action="https://www.voxbeam.com//login" method="post" autocomplete="off">
      <input name="username" id="username" class="text auto_focus" placeholder="username" autocomplete="off" type="text">
      <input name="password" id="password" class="password" placeholder="password" autocomplete="off" type="password">
      <input id="challenge" name="challenge" value="78ed64f09c5bcf53ead08d967482bfac" type="hidden">
      <input id="hash" name="hash" type="hidden">

I understand I should be using the POST method and sending the username and password.
I'm trying this:

    import requests
    import webbrowser

    url = "https://www.voxbeam.com/login"
    login = {'username': 'xxxxxxxxx', 'password': 'yyyyyyyyy'}

    print("original url:", url)
    r = requests.post(url, data=login)
    print("\nnew url", r.url)
    print("status code:", r.status_code)
    print("history:", r.history)

    print("\nredirection:")
    for i in r.history:
        print(i.status_code, i.url)

    # Open r in the browser to check whether I logged in
    new = 2  # open in a new tab, if possible
    webbrowser.open(r.url, new=new)

I'm expecting that, after a successful login, r.url will be the dashboard URL, so I can begin scraping the data I need.
When I run the code with my authentication information in place of xxxxxx and yyyyyy, I get the following output:

    original url: https://www.voxbeam.com/login

    new url https://www.voxbeam.com/login
    status code: 200
    history: []

    redirection:

    Process finished with exit code 0

and I get a new browser tab showing www.voxbeam.com/login.
Is there anything wrong in the code? Am I missing something in the HTML? Is it OK to expect the dashboard URL in r, or to be redirected, and to open the URL in a browser tab to check the response visually, or should I be doing things in a different way?
I have been reading many similar questions here for a couple of days, but it seems every website's authentication process is a little bit different. I checked http://docs.python-requests.org/en/latest/user/authentication/, which describes other methods, but I haven't found anything in the HTML that would suggest I should be using one of those instead of POST.
I tried this too:

    r = requests.get(url, auth=('xxxxxxxx', 'yyyyyyyy'))

but it doesn't seem to work either.
As said above, you should send the values of all the fields of the form. You can find them in the web inspector of your browser. This form sends 2 additional hidden values:
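    import requests

    url = "https://www.voxbeam.com//login"
    headers = {'User-Agent': "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36"}
    data = {'username': 'xxxxxxxxx', 'password': 'yyyyyyyyy', 'challenge': 'zzzzzzzzz', 'hash': ''}
    # Note: the web inspector shows the '@' of an email address encoded as '%40'
    # (uuuuuuu%40gmail.com). requests URL-encodes dict values itself, so pass the plain '@' here.

    session = requests.Session()
    r = session.post(url, headers=headers, data=data)

Also, many sites protect themselves against bots with hidden form fields, JavaScript, encoded values, and so on. As alternatives, you could: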
1) Use the cookies from a manual login:
    import requests

    url = "https://www.voxbeam.com"
    headers = {'User-Agent': "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36"}
    # Copy the cookie values from your browser after logging in by hand
    cookies = {'PHPSESSID': 'zzzzzzzzzzzzzzz', 'loggedin': 'yes'}

    s = requests.Session()
    r = s.post(url, headers=headers, cookies=cookies)

2) Use the Selenium module:
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys

    url = "https://www.voxbeam.com//login"
    driver = webdriver.Firefox()
    driver.get(url)

    # find_element(By.NAME, ...) replaces find_element_by_name, which was removed in Selenium 4.3
    u = driver.find_element(By.NAME, 'username')
    u.send_keys('xxxxxxxxx')
    p = driver.find_element(By.NAME, 'password')
    p.send_keys('yyyyyyyyy')
    p.send_keys(Keys.RETURN)
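
The challenge value in the form looks like a one-time token, so copying it from the inspector by hand will probably only work once. A possible way to send every form field automatically is sketched below, assuming the token is present in the HTML of the login page and that the hash field may stay empty. Neither assumption has been checked against this site; if hash is filled in by JavaScript, this will not be enough and the Selenium route is the safer one. The names FormFieldParser, extract_form_fields and login are made up for this illustration, and session is expected to be a requests.Session (or anything with compatible get and post methods).

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect name/value pairs from the <input> tags of one form."""

    def __init__(self, form_id):
        super().__init__()
        self.form_id = form_id
        self.in_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == self.form_id:
            self.in_form = True
        elif tag == "input" and self.in_form and attrs.get("name"):
            # Inputs without a value attribute (such as "hash") become "".
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def extract_form_fields(html, form_id="loginform"):
    """Return a dict with every named input found inside the given form."""
    parser = FormFieldParser(form_id)
    parser.feed(html)
    return parser.fields


def login(session, login_url, username, password):
    """Fetch the login page, copy its form fields, then POST the credentials."""
    page = session.get(login_url)
    data = extract_form_fields(page.text)
    data.update({"username": username, "password": password})
    return session.post(login_url, data=data)
```

With requests installed, it could be called as `r = login(requests.Session(), "https://www.voxbeam.com/login", "xxxxxxxxx", "yyyyyyyyy")`. Using the same session for both requests keeps any cookie set by the login page, which the server may need to match against the challenge.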