Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
4.7k views
in Technique[技术] by (71.8m points)

regex - python regular expression not matching file contents with re.match and re.MULTILINE flag

I'm reading in a file and storing its contents as a multiline string. Then I loop through some values I get from a django query to run regexes based on the query results values. My regex seems like it should be working, and works if I copy the values returned by the query, but for some reason isn't matching when all the parts are working together that ends like this

My code is:

with open("/path_to_my_file") as myfile:
    data=myfile.read()

#read saved settings then write/overwrite them into the config
items = MyModel.objects.filter(some_id="s100009")
for item in items:
    regexString = "^s*"+item.feature_key+":"

    print regexString #to verify its what I want it to be, ie debug
    pq = re.compile(regexString, re.M)

    if pq.match(data):
        #do stuff

So basically my problem is that the regex isn't matching. When I copy the file contents into a big old string, and copy the value(s) printed by the print regexString line, it does match, so I'm thinking theres some esoteric python/django thing going on (or maybe not so esoteric as python isnt my first language).

And for examples sake, the output of print regexString is :

^s*productDetailOn:

File contents:

    productDetailOn:true,
    allOff:false,
    trendingWidgetOn:true,
    trendingWallOn:true,
    searchResultOn:false,
    bannersOn:true,
    homeWidgetOn:true,
}

Running Python 2.7. Also, dumped the types of both item.feature and data, and both were unicode. Not sure if that matters? Anyway, I'm starting to hit my head off the desk after working this for a couple hours, so any help is appreciated. Cheers!


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

According to documentation, re.match never allows searching at the beginning of a line:

Note that even in MULTILINE mode, re.match() will only match at the beginning of the string and not at the beginning of each line.

You need to use a re.search:

regexString = r"^s*"+item.feature_key+":"
pq = re.compile(regexString, re.M)
if pq.search(data):

A small note on the raw string (r"^s+"): in this case, it is equivalent to "s+" because there is no s escape sequence (like or ), thus, Python treats it as a raw string literal. Still, it is safer to always declare regex patterns with raw string literals in Python (and with corresponding notations in other languages, too).


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...