Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
424 views
in Technique[技术] by (71.8m points)

python - Getting the final destination of a javascript redirect on a website

I parse a website with python. They use a lot of redirects and they do them by calling javascript functions.

So when I just use urllib to parse the site, it doesn't help me, because I can't find the destination url in the returned html code.

Is there a way to access the DOM and call the correct javascript function from my python code?

All I need is the url, where the redirect takes me.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

I looked into Selenium. And if you are not running a pure script (meaning you don't have a display and can't start a "normal" browser) the solution is actually quite simple:

from selenium import webdriver

driver = webdriver.Firefox()
link = "http://yourlink.com"
driver.get(link)

#this waits for the new page to load
while(link == driver.current_url):
  time.sleep(1)

redirected_url = driver.current_url

For my usecase this is more than enough. Selenium can also interact with forms and send keystrokes to the website.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

57.0k users

...