Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Python code to authenticate to website, navigate through links and download files

I'm working on something that may interest others as well. I'm developing a Python feature that should authenticate (with a user ID and password, and/or another preferred authentication method), connect to a specified website, navigate through it, and download the file found under a specific option. Later I need to schedule the finished code so that it runs automatically.

Has anyone come across this scenario and written the code in Python? Please suggest any Python libraries that would help.

What I have achieved so far:

I can download a file from a specific URL.

I know how to authenticate and download the file.

I'm able to pull the links from a specific website.

This could be done with Selenium, but I want to write it in plain Python.


1 Answer

After five days of research, I found what I wanted. Your urlLogin and urlAuth could be the same; it depends entirely on what the Login button does, that is, on the form's action. I used Chrome's Inspect option to find out the actual GET or POST request the portal uses.

Here is the answer to my own question:
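
The form's action can also be checked from Python rather than only in the browser. The following is a minimal sketch using the standard library's html.parser; the `FormFinder` class, the `find_forms` helper, and the sample markup (including the hidden `token` field) are hypothetical and only illustrate the idea. A real login page may build its request with JavaScript, in which case the browser's network view remains the more reliable source.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormFinder(HTMLParser):
    """Collect each <form>'s action, method and named inputs from a page."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.forms.append({
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            })
        elif tag == "input" and self.forms and attrs.get("name"):
            # Hidden inputs often carry values that must be posted back;
            # this simple version attaches every input to the latest form
            self.forms[-1]["fields"][attrs["name"]] = attrs.get("value") or ""


def find_forms(html, page_url):
    finder = FormFinder()
    finder.feed(html)
    for form in finder.forms:
        # An empty or relative action resolves against the page's own URL
        form["action"] = urljoin(page_url, form["action"])
    return finder.forms


# Hypothetical login page markup, for illustration only
sample = """
<form action="/CheckLoginServlet" method="POST">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="hidden" name="token" value="abc123">
</form>
"""
forms = find_forms(sample, "https://example.com/jsp/login.jsp")
print(forms[0]["action"], forms[0]["method"], forms[0]["fields"])
```

With a live site, the `html` argument would be the text of the response to `s.get(urlLogin)`. If the resolved action equals the login page URL, then urlLogin and urlAuth are the same.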

import requests

urlLogin = 'https://example.com/jsp/login.jsp'
urlAuth = 'https://example.com/CheckLoginServlet'
urlBd = 'https://example.com/jsp/batchdownload.jsp'
payload = {
    "username": "username",
    "password": "password"
}

# Session will be closed at the end of with block
with requests.Session() as s:
    # 1. Load the login page so the server sets its session cookies
    s.get(urlLogin)
    cookies = s.cookies.get_dict()
    print(f"Session cookies {cookies}")

    # 2. Post the credentials; the session sends the stored cookies itself,
    #    so they do not need to be passed again as headers
    r1 = s.post(urlAuth, data=payload)
    print(f'Login status:::: {r1.status_code}')  #200

    # 3. Again cookies will be used through session to access batch download page
    #    (payload kept from the original request; it may not be needed once logged in)
    r2 = s.post(urlBd, data=payload)
    print(f'Batch Download status:::: {r2.status_code}')  #200
    source_code = r2.text
    # print(f'Batch Download source:::: {source_code}')
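
The code above stops once it has the page source. As a rough sketch of the remaining steps in the question (pulling links from the page and downloading the files), something like the following might work. `LinkFinder`, `find_download_links`, `download_file`, the extension list, and the sample markup are all hypothetical names and values chosen for illustration; the real page may expose its files differently (for example through a form post rather than plain links).

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkFinder(HTMLParser):
    """Collect the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_download_links(html, page_url, extensions=(".zip", ".csv", ".pdf")):
    """Return absolute URLs of links whose path ends with one of `extensions`."""
    finder = LinkFinder()
    finder.feed(html)
    links = []
    for href in finder.hrefs:
        url = urljoin(page_url, href)  # resolve relative links against the page
        if urlparse(url).path.lower().endswith(tuple(extensions)):
            links.append(url)
    return links


def download_file(session, url, dest_dir="."):
    """Stream `url` to disk through an already authenticated requests.Session."""
    filename = os.path.basename(urlparse(url).path) or "download.bin"
    dest = os.path.join(dest_dir, filename)
    with session.get(url, stream=True) as resp:
        resp.raise_for_status()  # fail loudly on 4xx/5xx responses
        with open(dest, "wb") as fh:
            for chunk in resp.iter_content(chunk_size=8192):
                fh.write(chunk)
    return dest


# Hypothetical page source, standing in for the batch download page text
sample = '<a href="/files/report.zip">Report</a> <a href="help.jsp">Help</a>'
links = find_download_links(sample, "https://example.com/jsp/batchdownload.jsp")
print(links)
```

Inside the `with requests.Session() as s:` block, this could be used as `for url in find_download_links(r2.text, urlBd): download_file(s, url)`, so that every download reuses the logged-in session's cookies.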
