Generally speaking, you can monitor a nodetool repair operation with two nodetool commands, compactionstats and netstats, because the repair has two distinct phases. First it calculates the differences between the nodes (the repair work to be done), and then it acts on those differences by streaming data to the appropriate nodes.
The first phase builds Merkle trees. This checks on the active Merkle tree calculations:
$ nodetool compactionstats
pending tasks: 0
Active compaction remaining time : n/a
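While a repair is building Merkle trees, the work shows up in compactionstats as a compaction of type Validation. Here is a minimal sketch for filtering on that, assuming the rows printed by your Cassandra version contain the word "Validation" (the column layout differs between versions, and the helper name validation_rows is made up for illustration):

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the rows of `nodetool compactionstats`
# output that mention validation compactions (the Merkle tree builds).
# Reads from stdin so it can be fed saved output as well as a live command.
validation_rows() {
    # `|| true` keeps the exit status at 0 when nothing matches,
    # which is the normal case when no repair is validating.
    grep -i 'validation' || true
}

# Usage (assumes nodetool is on PATH and the node is reachable):
#   nodetool compactionstats | validation_rows
```

If this prints nothing while a repair is supposedly running, the repair has either moved on to streaming or has not reached this node yet.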
The repair streams (the second phase) can be monitored with:
$ nodetool netstats
In fact, TheLastPickle's Aaron Morton suggests using the following Bash script/command to monitor any active repair streams:
while true; do date; diff <(nodetool -h localhost netstats) <(sleep 5 && nodetool -h localhost netstats); done
DataStax has a posting in their support forums about troubleshooting hanging repairs. If you have any hung repair streams, you should be able to see them with netstats. This can happen if one of your nodes becomes unavailable during the repair process. To monitor the specific repair operations, you can check your log file for entries like this:
DEBUG [WRITE-/172.30.77.197] 2013-05-03 12:43:09,107 OutboundTcpConnection.java (line 165) error writing to /172.30.77.197
java.net.SocketException: Connection reset
Note that repair sessions should also be denoted in your system.log:
[repair #02fc68f0-210c-11e7-aa88-c35a9a02c19a] Starting...
[repair #02fc68f0-210c-11e7-aa88-c35a9a02c19a] Completed...
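To spot sessions that started but never finished, you can pair up those log lines by session ID. The following is only a sketch: it assumes the lines contain the words "Starting" and "Completed" as in the abbreviated examples above (the exact wording depends on your Cassandra version), and the function name unfinished_repairs and the log path in the usage comment are illustrative rather than anything Cassandra provides:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the IDs of repair sessions that have a
# "Starting" line but no later "Completed" line in the given log file.
# $1: path to a Cassandra system.log (location varies by install).
unfinished_repairs() {
    awk '
        match($0, /\[repair #[0-9a-f-]+\]/) {
            # "[repair #" is 9 characters; drop it and the closing "]".
            id = substr($0, RSTART + 9, RLENGTH - 10)
            if ($0 ~ /Starting/)  started[id] = 1
            if ($0 ~ /Completed/) delete started[id]
        }
        END { for (id in started) print id }
    ' "$1" | sort
}

# Usage (the path is an assumption; adjust to your installation):
#   unfinished_repairs /var/log/cassandra/system.log
```

An ID in the output is not proof of a hang, since the session may simply still be running, but it tells you which session to look for in netstats.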