This Python script fails to output the email address [email protected] in this case.
This was my previous post:
How can I use BeautifulSoup or Slimit on a site to output the email address from a JavaScript variable
#!/usr/bin/env python
from bs4 import BeautifulSoup
import re
soup = '''
<script LANGUAGE="JavaScript">
function something()
{
var ptr;
ptr = "";
ptr += "<table><td class=france></td></table>";
ptr += "<table><td class=france><a href=mail";
ptr += "to:[email protected]>email</a></td></table>";
document.all.something.innerHTML = ptr;
}
</script>
'''
soup = BeautifulSoup(soup, 'html.parser')
for script in soup.find_all('script'):
    reg = r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>)'
    reg2 = r'mailto:.*'
    secondHalf = re.search(reg, script.text)
    firstHalf = re.search(reg2, script.text)
    secondHalfEmail = secondHalf.group()
    firstHalfEmail = firstHalf.group()
    firstHalfEmail = firstHalfEmail.replace('mailto:', '')
    firstHalfEmail = firstHalfEmail.replace('";', '')
    if firstHalfEmail == secondHalfEmail:
        email = secondHalfEmail
    else:
        if ('>') not in firstHalfEmail:
            if ('>') not in secondHalfEmail:
                if firstHalfEmail != secondHalfEmail:
                    email = firstHalfEmail + secondHalfEmail
                else:
                    email = firstHalfEmail
            else:
                email = secondHalfEmail
    print(email)
It would be nice if someone could help me.
Thank you.
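
One likely cause, offered as a guess rather than a confirmed diagnosis: the JavaScript builds the link from two string literals (one ending in `href=mail`, the next starting with `to:`), so the text `mailto:` never appears contiguously in the script and `re.search('mailto:.*', ...)` returns `None`. A minimal sketch of one way around this is to join the literals appended to `ptr` first and only then search for the address. The helper name `extract_mailto` and the sample address `user@example.com` are hypothetical, and the sketch assumes the fragments are plain double-quoted literals with no escaped quotes.

```python
import re


def extract_mailto(script_text, var_name="ptr"):
    """Join the string literals appended to var_name and return the mailto address, or None."""
    # Collect every double-quoted literal assigned or appended to the variable,
    # e.g.  ptr = "...";  or  ptr += "...";
    pattern = r'\b' + re.escape(var_name) + r'\s*\+?=\s*"([^"]*)"\s*;'
    fragments = re.findall(pattern, script_text)
    joined = "".join(fragments)
    # After joining, "mail" + "to:" forms one contiguous "mailto:" token.
    match = re.search(r'mailto:([^\s>"]+)', joined)
    return match.group(1) if match else None


# Hypothetical sample mirroring the structure of the script in the question.
sample = '''
function something()
{
var ptr;
ptr = "";
ptr += "<table><td class=france></td></table>";
ptr += "<table><td class=france><a href=mail";
ptr += "to:user@example.com>email</a></td></table>";
document.all.something.innerHTML = ptr;
}
'''

print(extract_mailto(sample))
```

Inside the existing loop, the same helper could be called on the text of each `<script>` tag found by BeautifulSoup. Checking for `None` before calling `.group()` also avoids the `AttributeError` that a failed `re.search` would otherwise raise.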