I'm trying to find information (documentation, advice, etc.) on how IDE code templates (e.g. in Eclipse, IntelliJ, and NetBeans) are instantiated internally by the IDEs, and I'm having some trouble.
I'm hoping, perhaps optimistically, that I can automatically generate multiple (at least two) distinct samples of each pattern from templates written in the associated grammars.
Every pattern-parameter (including cursors) must be filled, and samples for the same pattern should only have non-pattern-parameter content in common.
At this stage, the samples need to be syntactically valid so that they can be parsed, but they do not need to be fully semantically valid/compilable snippets.
If anyone knows how any of these IDEs work internally, and can tell me if/how I might be able to do this (or can point me towards sufficient documentation), I would greatly appreciate it.
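To make the goal concrete, here is a naive sketch of the kind of instantiation I have in mind. It is not how any of the IDEs do it internally (that is what I'm asking about). It assumes a simplified Eclipse-style syntax in which a parameter is written as ${name} or ${name:type(args)}, and it ignores escaping such as $$. The class and method names (TemplateSampler, instantiate, samples) and the "_s<index>" suffix scheme are my own illustrative choices. Every parameter, including ${cursor}, is replaced by an identifier unique to the sample, so the result is only syntactically valid where an identifier is allowed at that position.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateSampler {
    // Assumption: a parameter looks like ${name} or ${name:type(args)}.
    // Escapes such as $$ are not handled in this sketch.
    private static final Pattern PARAMETER =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)(?::[^}]*)?\\}");

    // Replace every parameter with "<name>_s<sampleIndex>". The same name
    // always maps to the same identifier within one sample.
    static String instantiate(String template, int sampleIndex) {
        Matcher matcher = PARAMETER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String fill = matcher.group(1) + "_s" + sampleIndex;
            matcher.appendReplacement(out, Matcher.quoteReplacement(fill));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    // Produce `count` samples that share only the non-parameter text.
    static List<String> samples(String template, int count) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            result.add(instantiate(template, i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical template, written for illustration only.
        String template = "for (int ${index} = 0; ${index} < ${array}.length; ${index}++) "
                + "{ ${method}(${array}[${index}]); }";
        for (String sample : samples(template, 2)) {
            System.out.println(sample);
        }
    }
}
```

What this sketch cannot do is choose a fill that is valid for the parameter's syntactic position (e.g. a statement for ${cursor} inside a block), or honour typed parameters, which is why I'd like to know how the IDEs resolve them.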
Background/Context
I'm trying to create a research dataset for a pattern mining task - specifically, for mining code templates. I've been looking into it for some time and, as far as I'm aware, there isn't a suitable precedent dataset, so I have to make one.
Rather than painstakingly defining every feature of every pattern myself, I'm writing tools to partially automate the process: specifically, deriving candidate patterns from samples, and filtering out any candidates not observed in the actual corpus. The tools are input-language-agnostic, but I am initially targeting Java ASTs via the Eclipse JDT.
My thinking is that well-established patterns such as idioms and IDE code templates, from sufficiently reputable sources, are rational and intuitive pattern candidates with which I can, at least, evaluate recall. I can, and will, define some target-sample sets manually. However, I would prefer to generate them automatically, so that I can collect more complicated templates en masse (e.g. those published by IDE community members).
Thanks in advance,
Marcos C-S