Given an NLP parse tree like:
(ROOT (S (NP (PRP You)) (VP (MD could) (VP (VB say) (SBAR (IN that) (S (NP (PRP they)) (ADVP (RB regularly)) (VP (VB catch) (NP (NP (DT a) (NN shower)) (, ,) (SBAR (WHNP (WDT which)) (S (VP (VBZ adds) (PP (TO to) (NP (NP (PRP$ their) (NN exhilaration)) (CC and) (NP (FW joie) (FW de) (FW vivre))))))))))))) (. .)))
The original sentence is "You could say that they regularly catch a shower, which adds to their exhilaration and joie de vivre."
How can the clauses be extracted and turned back into plain text?
The split would be at S and SBAR (to preserve the type of clause, e.g. subordinate):
- (S (NP (PRP You)) (VP (MD could) (VP (VB say)
- (SBAR (IN that) (S (NP (PRP they)) (ADVP (RB regularly)) (VP (VB catch) (NP (NP (DT a) (NN shower))
- (, ,) (SBAR (WHNP (WDT which)) (S (VP (VBZ adds) (PP (TO to) (NP (NP (PRP$ their) (NN exhilaration)) (CC and) (NP (FW joie) (FW de) (FW vivre))))))))))))) (. .)))
to arrive at
- You could say
- that they regularly catch a shower
- , which adds to their exhilaration and joie de vivre.
Splitting at S and SBAR seems easy. The hard part is stripping all the POS tags and chunk labels from the fragments, since each fragment is no longer a balanced bracket expression once the tree has been cut.
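
One possible way around the unbalanced fragments is to not cut the bracket string at all: parse the whole tree first, walk it, and start a new word list whenever an S or SBAR node is entered. The sketch below is only an illustration in plain Python under a few assumptions of mine: an S directly under an SBAR stays with that SBAR, a comma immediately before a clause moves to that clause, and words following a closed clause (here the final period) stay with the most recent fragment. The function names `parse`, `split_clauses` and `detokenize` are made up for this example. If NLTK is available, `nltk.Tree.fromstring` and `Tree.leaves()` would cover the parsing and tag-stripping steps instead of the hand-written parser.

```python
import re

TREE = (
    "(ROOT (S (NP (PRP You)) (VP (MD could) (VP (VB say) "
    "(SBAR (IN that) (S (NP (PRP they)) (ADVP (RB regularly)) "
    "(VP (VB catch) (NP (NP (DT a) (NN shower)) (, ,) "
    "(SBAR (WHNP (WDT which)) (S (VP (VBZ adds) (PP (TO to) "
    "(NP (NP (PRP$ their) (NN exhilaration)) (CC and) "
    "(NP (FW joie) (FW de) (FW vivre))))))))))))) (. .)))"
)

CLAUSE_LABELS = {"S", "SBAR"}  # node labels that open a new fragment


def parse(text):
    """Parse a bracketed tree into nested lists [label, child, ...]; words are str."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return [label] + children

    return node()


def split_clauses(tree):
    """Return a list of fragments, each a list of (pos_tag, word) pairs."""
    clauses = []

    def walk(n, parent_label):
        label, children = n[0], n[1:]
        # Assumption: an S directly under SBAR belongs to that SBAR fragment.
        if label in CLAUSE_LABELS and parent_label != "SBAR":
            carried = []
            # Assumption: a comma right before the clause moves along with it.
            while clauses and clauses[-1] and clauses[-1][-1][0] == ",":
                carried.insert(0, clauses[-1].pop())
            clauses.append(carried)
        for child in children:
            if isinstance(child, str):
                if not clauses:
                    clauses.append([])
                clauses[-1].append((label, child))
            else:
                walk(child, label)

    walk(tree, None)
    return [c for c in clauses if c]


def detokenize(pairs):
    """Join the words and remove the space before common punctuation marks."""
    text = " ".join(word for _, word in pairs)
    return re.sub(r"\s+([,.;:!?])", r"\1", text)


clause_texts = [detokenize(c) for c in split_clauses(parse(TREE))]
for line in clause_texts:
    print("-", line)
```

For the tree above this prints the three fragments listed earlier. Because only the leaf words are kept, the POS tags and chunk labels never need to be stripped from a fragment. Escaped bracket tokens such as -LRB- and -RRB- are not converted back here, and the punctuation rule in `detokenize` is a rough heuristic rather than a full detokenizer.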