regex - Regular expression generator/reducer?

Question

asked Oct 17, 2021 in Technique by 深蓝 (71.8m points)

A colleague posed an interesting question about an operational pain point we currently have, and I am curious whether there is anything out there (utility/library/algorithm) that might help automate this.

Say you have a list of literal values (in our case, they are URLs). Based on this list, we want to come up with a single regex that matches all of those literal items.

So, if my list is:

http://www.example.com
http://www.example.com/subdir
http://foo.example.com

The simplest answer is

^(http://www.example.com|http://www.example.com/subdir|http://foo.example.com)$

but this gets large for lots of data, and we have a length limit we're trying to stay under.

Currently we write the regexes manually, but this doesn't scale well, nor is it a great use of anyone's time. Is there a more automated way of decomposing the source data to come up with a length-optimal regex that matches all of the source values?

1 Answer

深蓝 · Answer 1 · 2021-10-17T02:47:37+0000

The Aho-Corasick matching algorithm constructs a finite automaton that matches multiple strings at once. You could convert the automaton to an equivalent regex, but it is simpler to use the automaton directly, which is what the algorithm itself does.
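If a regex is still required, one rough way to get from the list to a shorter pattern is to build a trie (the prefix-tree skeleton that Aho-Corasick starts from, without its failure links) and write it back out as a regex with shared prefixes factored. The sketch below is only an illustration under that assumption: the function names `build_trie`, `trie_to_regex`, and `literals_to_regex` are made up for this example, it merges common prefixes only (not common suffixes such as `.example.com`), and it does not claim to produce a length-optimal result.

```python
import re


def build_trie(words):
    """Build a nested-dict trie; the key '' marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return root


def trie_to_regex(node):
    """Return a regex fragment matching every suffix stored under node."""
    is_end = "" in node
    branches = [
        re.escape(ch) + trie_to_regex(node[ch])
        for ch in sorted(k for k in node if k != "")
    ]
    if not branches:
        # Leaf: nothing more to match.
        return ""
    if len(branches) == 1 and not is_end:
        # Single continuation, no word ends here: no group needed.
        return branches[0]
    body = "(?:" + "|".join(branches) + ")"
    # If a word also ends at this node, the continuation is optional.
    return body + "?" if is_end else body


def literals_to_regex(words):
    """Anchor the factored pattern so only whole strings match."""
    return "^" + trie_to_regex(build_trie(words)) + "$"


urls = [
    "http://www.example.com",
    "http://www.example.com/subdir",
    "http://foo.example.com",
]
pattern = literals_to_regex(urls)
print(pattern)
```

For the three URLs in the question, this factors out the shared `http://` prefix and makes `/subdir` an optional group, so the pattern comes out shorter than the plain alternation. If only membership testing is needed and the length limit does not apply, checking against a `set` of the literals (or walking the trie directly) avoids the regex altogether.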
