I am working on https://github.com/F-Bergemann/RegexSplitter.
Purpose: parse a regular expression string and split it into breakable and unbreakable top-level substrings. Breakable substrings can be broken up again. Unbreakable substrings must remain as they are. 'Unbreakable' applies to 'groups' and 'character classes'.
I am currently working on the 'character classes'. For those, I mainly use qi::rule<Iterator, std::string()> parsers, and only a single qi::rule<Iterator, ASTNode*> parser for the root parser. That is, only the root parser should create an AST result. The child parsers should just validate.
When testing the compiled regex-splitter, I get this:
> ./regex-splitter "[1]"
TEST:[1]
### ASTNode c'tor (std::string &) #1: Unbreakable
### ASTNode c'tor (std::string &) #2: U:[11]
### ASTNode c'tor (ASTNode const *, std::vector<ASTNode *> &) #1: Collection
### ASTNode d'tor #1: Unbreakable
### ASTNode d'tor #2: U:[11]
### ASTNode c'tor (ASTNode const *, std::vector<ASTNode *> &) #2: Collection
U:[11],
### ASTNode d'tor #1: Collection
### ASTNode d'tor #2: C:[11]
I.e. instead of "[1]", I get "[11]" as the result.
I know it has to do with the following part of the code:
tok_set_item =
    tok_range | tok_char
    ;
tok_range =
    tok_char >> qi::char_('-') >> tok_char
    ;
tok_char =
    qi::alnum // TODO BNF: <char> ::= any non metacharacter | "\" metacharacter
    ;
It seems to try tok_range first, and then switches to tok_char.
But why do I get "[11]" here?
It should just validate the syntax and return the original data.
I tried to find out what the parser action does here.
I have no explicit parser action here.
What is it using implicitly? boost::variant<...>?
Does it make a difference when I use qi::as_string[...] wrappers?
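To make the suspected mechanism concrete, here is a small model in plain C++. It is not Spirit code and not taken from the repository: parse_char, parse_range, parse_set_item, run and demo are hypothetical stand-ins for the rules above. The assumption it illustrates is that the failed tok_range branch has already appended '1' to the shared std::string attribute, and that only the input position is restored before tok_char runs, not the attribute.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-ins for the rules in the question; this is NOT Spirit code.
using Iter = std::string::const_iterator;

// Models tok_char: one alphanumeric character, appended to the shared attribute.
bool parse_char(Iter& it, Iter end, std::string& attr) {
    if (it == end || !std::isalnum(static_cast<unsigned char>(*it))) return false;
    attr += *it++;
    return true;
}

// Models tok_range: char '-' char, all appending to the same attribute.
bool parse_range(Iter& it, Iter end, std::string& attr) {
    if (!parse_char(it, end, attr)) return false;
    if (it == end || *it != '-') return false;  // fails after the first char was appended
    attr += *it++;
    return parse_char(it, end, attr);
}

// Models tok_set_item = tok_range | tok_char.
// rollback == false: only the input position is restored on failure (assumed behaviour).
// rollback == true:  the attribute is restored as well.
bool parse_set_item(Iter& it, Iter end, std::string& attr, bool rollback) {
    const Iter saved_pos = it;
    const std::string::size_type saved_size = attr.size();
    if (parse_range(it, end, attr)) return true;
    it = saved_pos;
    if (rollback) attr.resize(saved_size);
    return parse_char(it, end, attr);
}

// Runs the model on an input and returns the collected attribute.
std::string run(const std::string& input, bool rollback) {
    std::string attr;
    Iter it = input.begin();
    parse_set_item(it, input.end(), attr, rollback);
    return attr;
}

// Small self-check of the model.
void demo() {
    assert(run("1", false) == "11");    // duplicate, as in the output above
    assert(run("1", true) == "1");      // attribute rolled back before the second branch
    assert(run("a-z", false) == "a-z"); // first branch succeeds, nothing is duplicated
}
```

If this model matches what Qi does, then a directive that gives the first alternative its own attribute copy might avoid the duplicate. Qi documents qi::hold[...] for that purpose, and qi::raw[...] exposes the matched input range instead of the synthesized attribute. Whether either one fits these rules is untested here.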
question from:
https://stackoverflow.com/questions/66050263/boostspirit-alternative-parsers-return-duplicates