Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

.net - Regex using word boundary but word ends with a . (period)

I want to match the word i.v., case-insensitively.

I have this pattern:

(?i)\bi\.v\.

but I want a word boundary at the end.
The above pattern fails in that it matches
i.v.x

But if I try to add a word boundary to the end

(?i)\bi\.v\.\b

it fails in that it does not even match i.v., as I think the \b is eating the literal . since . is a word break.
I need the \. to be greedy.

I want to match
sam i.v. sam

I do not want to match
sam.i.v.
i.v.sam

This gets closer:

(?i)\bi\.v\.\s$

But it fails to find i.v. at the end of a line.


1 Answer


\b only matches between a word character (alphanumeric or underscore) and a non-word character, or at the start/end of the string when the adjacent character is a word character. Therefore, it doesn't match after a ., unless a word character immediately follows that dot.
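As a rough illustration of that behaviour, here is a minimal sketch using Python's re module rather than .NET (an assumption made for convenience; \b and (?i) behave the same way for these inputs). The variable name with_boundary is made up for the example.

```python
import re

# The question's pattern with a trailing \b added.
with_boundary = re.compile(r"(?i)\bi\.v\.\b")

# The final dot is followed by a space: "." and " " are both non-word
# characters, so there is no word boundary between them and nothing matches.
print(with_boundary.search("sam i.v. sam"))  # None

# The final dot is at the end of the string: still no word character next
# to it, so again there is no boundary.
print(with_boundary.search("i.v."))  # None

# The final dot is followed by the word character "x", so \b succeeds here,
# which is exactly the case the question wanted to exclude.
print(with_boundary.search("i.v.x").group())  # i.v.
```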

If your intent is to make sure that no non-whitespace character follows after the dot, then you can specify that using a negative lookahead assertion:

(?i)\bi\.v\.(?!\S)

(?!\S) means "Assert that the next character is not a non-whitespace character".

This may sound a bit convoluted: why the double negative? Why not (?=\s), which means "Assert that the next character is a whitespace character"? Well, there is a subtle difference: the second version requires a whitespace character to be there, which means the regex would fail to match at the end of the string. The first regex handles that corner case as well.
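The difference between the two lookaheads can be checked with a short sketch. It again assumes Python's re module as a stand-in for .NET, and the names negative_lookahead and positive_lookahead are only illustrative.

```python
import re

# "Not followed by a non-whitespace character": also succeeds at end of string.
negative_lookahead = re.compile(r"(?i)\bi\.v\.(?!\S)")
# "Followed by a whitespace character": needs an actual character to be there.
positive_lookahead = re.compile(r"(?i)\bi\.v\.(?=\s)")

for text in ["sam i.v. sam", "sam i.v.", "i.v.x"]:
    print(
        repr(text),
        bool(negative_lookahead.search(text)),
        bool(positive_lookahead.search(text)),
    )
# 'sam i.v. sam' True True
# 'sam i.v.' True False
# 'i.v.x' False False
```

Only the middle input separates the two: at the end of the string there is no character at all, so the negative lookahead succeeds while the positive one fails.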

If you generally want the concept of "word boundary" to mean "space-delimited", then you need to replace the first \b as well:

(?i)(?<!\S)i\.v\.(?!\S)

or the regex will match sam.i.v. which you don't seem to want it to.
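To see how the fully space-delimited pattern treats the examples from the question, here is a small sketch, again assuming Python's re module in place of .NET (both support these fixed-width lookarounds). The names space_delimited and leading_boundary are hypothetical.

```python
import re

# Space-delimited on both sides: lookbehind and lookahead instead of \b.
space_delimited = re.compile(r"(?i)(?<!\S)i\.v\.(?!\S)")
# Earlier variant that still uses \b at the start, kept for comparison.
leading_boundary = re.compile(r"(?i)\bi\.v\.(?!\S)")

for text in ["sam i.v. sam", "sam I.V.", "sam.i.v.", "i.v.sam"]:
    print(repr(text), bool(space_delimited.search(text)))
# 'sam i.v. sam' True
# 'sam I.V.' True
# 'sam.i.v.' False
# 'i.v.sam' False

# With \b at the start, "sam.i.v." is still accepted, because there is a
# word boundary between "." and "i".
print(bool(leading_boundary.search("sam.i.v.")))  # True
```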

