Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Clean URL with BeautifulSoup

My script

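import sys
from BeautifulSoup import BeautifulSoup

# Path to the HTML file containing the links, passed as the first argument
url_list = sys.argv[1]
# Collect the href attribute of every <a> tag
urls = [tag['href'] for tag in
    BeautifulSoup(open(url_list)).findAll('a')]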

returns

[u'http://www.youtube.com/watch?v=Gg81zi0pheg', u'http://www.youtube.com/watch?v=pP9VjGmmhfo', u'http://www.youtube.com/watch?v=yTA1u6D1fyE', u'http://www.youtube.com/watch?v=4v8HvQf4fgE', u'http://www.youtube.com/watch?v=e9zG20wQQ1U', u'http://www.youtube.com/watch?v=khL4s2bvn-8', u'http://www.youtube.com/watch?v=XTndQ7bYV0A', u'http://www.youtube.com/watch?v=xTT2MqgWRRc', u'http://www.youtube.com/watch?v=J2ZYQngwSUw', u'http://www.youtube.com/watch?v=9RZwvg7unrU', u'http://www.youtube.com/watch?v=vz3qOYWwm10', u'http://www.youtube.com/watch?v=yarv52QX_Yw', u'http://www.youtube.com/watch?v=LRREY1H3GCI']

I would like it to return this:

http://www.youtube.com/watch?v=Gg81zi0pheg
http://www.youtube.com/watch?v=pP9VjGmmhfo
http://www.youtube.com/watch?v=yTA1u6D1fyE
http://www.youtube.com/watch?v=4v8HvQf4fgE
http://www.youtube.com/watch?v=e9zG20wQQ1U
http://www.youtube.com/watch?v=khL4s2bvn-8
http://www.youtube.com/watch?v=XTndQ7bYV0A
http://www.youtube.com/watch?v=xTT2MqgWRRc
http://www.youtube.com/watch?v=J2ZYQngwSUw
http://www.youtube.com/watch?v=9RZwvg7unrU
http://www.youtube.com/watch?v=vz3qOYWwm10
http://www.youtube.com/watch?v=yarv52QX_Yw
http://www.youtube.com/watch?v=LRREY1H3GCI

I am having a really hard time wrapping my head around BeautifulSoup. Anything would help. Thank you for your time.


1 Answer


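This is basic Python rather than a BeautifulSoup problem. Your script already produces a list of URLs; the brackets and u'...' prefixes are just how Python 2 displays a list of unicode strings. To output one URL per line, loop over the list and print each item: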

for url in urls:
    print url
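
For readers on Python 3, where print is a function, the same idea can be sketched by joining the list into one newline-separated string. This is only an illustration under assumptions: the helper name one_per_line is hypothetical, and the list is a shortened, hard-coded copy of the question's output rather than something parsed with BeautifulSoup.

```python
# Python 3 sketch; the list below is a shortened copy of the question's output.
urls = [
    "http://www.youtube.com/watch?v=Gg81zi0pheg",
    "http://www.youtube.com/watch?v=pP9VjGmmhfo",
    "http://www.youtube.com/watch?v=yTA1u6D1fyE",
]


def one_per_line(items):
    """Return the items joined into a single newline-separated string.

    Hypothetical helper, named here only for illustration.
    """
    return "\n".join(items)


# Printing the list itself shows its repr, with brackets and quotes.
print(urls)

# Printing the joined string shows one bare URL per line.
print(one_per_line(urls))
```

Joining first can be handy when the result needs to go to a file or another function as a single string; for simply writing to the terminal, the plain loop above does the same job.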
