I found a few references to using a regex to filter out non-English text, but none of them is in Java, and they all address somewhat different problems from the two I am trying to solve:
- Replace all non-English characters with a space.
- Create a method that returns true if a string contains any non-English character.
By "English text" I mean not only actual letters and numbers but also punctuation.
So far, what I have been able to come up with for goal #1 is quite simple:
String.replaceAll("\\W", " ")
In fact, so simple that I suspect that I am missing something... Do you spot any caveats in the above?
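To make the behavior concrete, here is a minimal sketch of that call on a sample string, assuming the regex is written as the Java literal "\\W". The class name, method name, and sample input are hypothetical. By default \W matches anything outside [a-zA-Z_0-9], so punctuation gets replaced too, which may not match the definition of "English text" given above.

```java
public class ReplaceNonWordDemo {
    // Replaces every character matched by \W (anything outside [a-zA-Z_0-9]) with a space.
    public static String replaceNonWord(String input) {
        return input.replaceAll("\\W", " ");
    }

    public static void main(String[] args) {
        // \u00f6 is "ö"; the comma, the ö, and the exclamation mark each become a space.
        System.out.println(replaceNonWord("Hello, w\u00f6rld!"));
    }
}
```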
As for goal #2, I could simply trim() the string after the above replaceAll(), then check if it's empty. But... Is there a more efficient way to do this?
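For comparison, one possible shape for the goal #2 method is sketched below. It rests on an assumption that is not settled in this post: that "English" means printable ASCII, from space (U+0020) through tilde (U+007E), which covers letters, digits, and punctuation but treats tabs and newlines as non-English. The class, method, and constant names are hypothetical. The pattern is compiled once and find() stops at the first offending character, so no modified copy of the string is built.

```java
import java.util.regex.Pattern;

public class NonEnglishCheck {
    // Assumption: "English" means printable ASCII, U+0020 (space) through U+007E (~).
    private static final Pattern NON_ENGLISH = Pattern.compile("[^\\x20-\\x7E]");

    // Returns true as soon as one character outside the assumed range is found.
    public static boolean containsNonEnglish(String input) {
        return NON_ENGLISH.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsNonEnglish("Hello, world!"));
        System.out.println(containsNonEnglish("Hello, w\u00f6rld!"));
    }
}
```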