There is a fundamental difference between running a topology in LocalCluster and submitting it remotely via StormSubmitter (which is the default setting in the project).
The scope of storm-core is set to <scope>provided</scope> by default because those class files are available in the cluster anyway. The provided scope tells Maven that those classes must not be included in the jar file that is assembled, which reduces the size of the jar. Furthermore, this avoids conflicts if files are provided multiple times. That is what happens with defaults.yaml if you change the scope to compile. In that case, all files from storm-core are packaged into your jar and submitted to the cluster. Storm then finds the file defaults.yaml both "locally" (i.e., on the worker machine in the cluster) and in your jar. Thus, Storm does not know which one to use and raises an error.
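For reference, the default setup described above corresponds to a dependency entry in pom.xml roughly like the following. This is only a sketch: the ${storm.version} property is a placeholder assumed here, and it should match the Storm version installed on your cluster.

```xml
<dependency>
  <groupId>org.apache.storm</groupId>
  <artifactId>storm-core</artifactId>
  <!-- assumption: storm.version is defined in the pom's <properties> -->
  <version>${storm.version}</version>
  <!-- provided: on the compile classpath, but not packaged into the jar -->
  <scope>provided</scope>
</dependency>
```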
However, provided excludes those class files if you run locally, too. Locally, of course, those files are not available automatically; they must be on the CLASSPATH when the local JVM starts. Because provided excludes the files from storm-core, you get the ClassNotFound exception.
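If you do switch scopes between environments, one way to avoid editing the pom by hand is to drive the scope from a property and override it in a profile. This is a different technique from the one described next, and only a sketch: the property name storm.scope and the profile id local are made up for illustration, and ${storm.version} is again a placeholder.

```xml
<properties>
  <!-- hypothetical property; the default suits submission to a cluster -->
  <storm.scope>provided</storm.scope>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-core</artifactId>
    <version>${storm.version}</version>
    <!-- resolved from the property above, or from the active profile -->
    <scope>${storm.scope}</scope>
  </dependency>
</dependencies>

<profiles>
  <profile>
    <!-- hypothetical profile id for LocalCluster runs -->
    <id>local</id>
    <properties>
      <storm.scope>compile</storm.scope>
    </properties>
  </profile>
</profiles>
```

Under these assumptions, a plain mvn package would build with the provided scope for cluster submission, while mvn package -Plocal would put storm-core on the compile scope for LocalCluster runs.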
As an alternative to changing the scope each time you want to submit to a different environment, you can set the scope to compile and include your topology Main/Bolt/Spout classes explicitly in your maven-jar-plugin settings. This explicit inclusion automatically excludes all other files from the jar, i.e., all files from storm-core:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <version>2.6</version>
  <executions>
    <execution>
      <id>MyTopology</id>
      <phase>package</phase>
      <goals>
        <goal>jar</goal>
      </goals>
      <configuration>
        <includes>
          <include>my/topology/package/**/*.class</include>
        </includes>
      </configuration>
    </execution>
  </executions>
</plugin>