nosql - How much data per node in Cassandra cluster?

1 Answer

深蓝 · Answer 1 · 2021-10-23T20:03:25+0000

1 TB is a reasonable rule of thumb for how much data a single node can handle, but in practice a node is limited far less by the size of its data than by the rate of operations it has to serve.

A node might have only 80 GB of data on it, but if you absolutely pound it with random reads and it doesn't have a lot of RAM, it might not be able to serve those requests at a reasonable rate. Similarly, a node might have 10 TB of data, but if you rarely read from it, or only a small portion of your data is hot (so that it can be cached effectively), it will do just fine.
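As a rough back-of-the-envelope model of the two scenarios above, assume every read is served either from memory or from disk. The function name, the latency figures, and the hit ratios below are illustrative assumptions, not measurements from a real cluster:

```python
def expected_read_latency_ms(hit_ratio, cache_ms=0.1, disk_ms=10.0):
    """Average read latency when a fraction `hit_ratio` of reads is served from RAM.

    cache_ms and disk_ms are assumed, illustrative latencies.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


# Hypothetical node A: 80 GB, uniformly random reads, little RAM -> few cache hits.
small_but_random = expected_read_latency_ms(hit_ratio=0.10)

# Hypothetical node B: 10 TB, but a small hot set that fits in cache.
large_but_hot = expected_read_latency_ms(hit_ratio=0.99)

print(f"80 GB, random reads: {small_but_random:.2f} ms per read")
print(f"10 TB, hot subset:   {large_but_hot:.2f} ms per read")
```

Under these assumed numbers the small node is the slower one, because what matters is how many reads miss the cache, not how many bytes sit on disk.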

Compaction certainly is an issue to be aware of when you have a large amount of data on one node, but there are a few things to keep in mind:

First, the "biggest" compactions, the ones whose result is a single huge SSTable, happen rarely, and become rarer still as the amount of data on your node increases. (The number of minor compactions that must occur before a top-level compaction grows exponentially with the number of top-level compactions you have already performed.)

Second, your node will still be able to handle requests while compacting; reads will just be slower.

Third, if your replication factor is above 1 and you aren't reading at consistency level ALL, other replicas will be able to respond quickly to read requests, so you shouldn't see a large difference in latency from a client perspective.

Last, there are plans to improve the compaction strategy that may help with some larger data sets.
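The first point can be illustrated with a toy simulation. This is a simplified sketch, assuming a size-tiered scheme in which a fixed number of similarly sized SSTables (four here) are merged into one larger one. The function name and the `threshold` parameter are illustrative, and real compaction also depends on actual SSTable sizes and on configuration:

```python
from collections import Counter


def simulate_size_tiered(flushes, threshold=4):
    """Simulate a simplified size-tiered compaction.

    Every flush writes one tier-0 SSTable. Whenever `threshold` SSTables share
    a tier, they are merged into a single SSTable one tier higher.
    Returns {tier: flush count at which that tier was first produced}.
    """
    tables = Counter()   # tier -> number of SSTables currently in that tier
    first_seen = {}      # tier -> flush count when the tier first appeared
    for flush in range(1, flushes + 1):
        tables[0] += 1
        tier = 0
        # A merge can cascade: the new, larger SSTable may complete the next tier.
        while tables[tier] >= threshold:
            tables[tier] -= threshold
            tier += 1
            tables[tier] += 1
            first_seen.setdefault(tier, flush)
    return first_seen


first_seen = simulate_size_tiered(flushes=300)
for tier, flush in sorted(first_seen.items()):
    print(f"tier {tier} SSTable first produced after {flush} flushes")
```

In this model each new top tier takes four times as many flushes to produce as the previous one, so the largest merges become increasingly rare as data accumulates on the node.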
