python - Simple regex for simple xml string


asked Jan 31, 2022 in Technique by 深蓝 (71.8m points)

I have a string consisting of <tag> elements. The text of each element may contain "pear" or "apple". I can get all the elements using:

s = '<tag>uTSqUYRR8gapple</tag><tag>K9VGTZM3h8</tag><tag>pearTYysnMXMUc</tag><tag>udv5NZQdpzpearz5a4oS85mD</tag>'
import re; re.findall("<tag>.*?</tag>", s)

However, I want to get the last element that contains "pear". What would be the easiest/quickest way to do this? Is this a good way:

# named "elements" rather than "list" to avoid shadowing the built-in
elements = re.findall("<tag>.*?</tag>", s)
elements.reverse()
last = next(x for x in elements if re.match('.*pear', x))
re.match('<tag>(.*)</tag>', last).group(1)

or should I use a parser instead?

1 Answer

深蓝 · Answer 1 · 2022-01-31T07:26:40+0000

Use a parser such as BeautifulSoup instead:

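import re
from bs4 import BeautifulSoup

s = '<tag>uTSqUYRR8gapple</tag><tag>K9VGTZM3h8</tag><tag>pearTYysnMXMUc</tag><tag>udv5NZQdpzpearz5a4oS85mD</tag>'
# the "html5lib" parser requires the html5lib package to be installed
soup = BeautifulSoup(s, "html5lib")
# string= is the current name of the older text= argument
tags = soup.find_all(string=re.compile(r'pear'))
print(tags)
# ['pearTYysnMXMUc', 'udv5NZQdpzpearz5a4oS85mD']
print(tags[-1])  # the last match, which is what the question asks for
# udv5NZQdpzpearz5a4oS85mD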

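This builds the DOM and finds every text node whose content matches the regex pear (that is, contains "pear" literally). The matches come back in document order, so the last item of the list (tags[-1]) is the text of the last element containing "pear".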
See a demo on ideone.com.
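
If installing BeautifulSoup is not an option, roughly the same thing can be sketched with the standard library's xml.etree.ElementTree. This is only a sketch under some assumptions: the fragment must be well-formed XML, and because it has no single root element it is wrapped in a made-up <root> element first. The helper name last_text_containing is hypothetical, not part of any library.

```python
import xml.etree.ElementTree as ET

s = '<tag>uTSqUYRR8gapple</tag><tag>K9VGTZM3h8</tag><tag>pearTYysnMXMUc</tag><tag>udv5NZQdpzpearz5a4oS85mD</tag>'

def last_text_containing(fragment, needle, tag="tag"):
    """Return the text of the last <tag> element containing needle, or None.

    Hypothetical helper for illustration only.
    """
    # The fragment has several top-level elements, so wrap it in an
    # assumed <root> element to make it parseable as one XML document.
    root = ET.fromstring("<root>" + fragment + "</root>")
    # iter() walks the tree in document order; keep texts that contain needle.
    matches = [el.text for el in root.iter(tag) if el.text and needle in el.text]
    return matches[-1] if matches else None

print(last_text_containing(s, "pear"))
```

Unlike the regex version, this raises xml.etree.ElementTree.ParseError on malformed input instead of silently skipping it, which may or may not be what you want.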
