I have a Data model like below,
CREATE TABLE appstat.nodedata (
nodeip text,
timestamp timestamp,
flashmode text,
physicalusage int,
readbw int,
readiops int,
totalcapacity int,
writebw int,
writeiops int,
writelatency int,
PRIMARY KEY (nodeip, timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC)
where, nodeip - primary key and timestamp - clustering key (Sorted by descinding oder to get the latest),
Sample data in this table,
SELECT * from nodedata WHERE nodeip = '172.30.56.60' LIMIT 2;
nodeip | timestamp | flashmode | physicalusage | readbw | readiops | totalcapacity | writebw | writeiops | writelatency
--------------+---------------------------------+-----------+---------------+--------+----------+---------------+---------+-----------+--------------
172.30.56.60 | 2017-12-08 06:13:07.161000+0000 | yes | 34 | 57 | 19 | 27 | 8 | 89 | 57
172.30.56.60 | 2017-12-08 06:12:07.161000+0000 | yes | 70 | 6 | 43 | 88 | 79 | 83 | 89
This is properly available and whenever I need to get the statistics I am able to get the data using the partition key like below,
SELECT nodeip,readbw,timestamp FROM nodedata WHERE nodeip = '172.30.56.60' AND timestamp < 1512652272989 AND timestamp > 1512537899000;
Also successfully aggregate the data like below,
SELECT sum(readbw) FROM nodedata WHERE nodeip = '172.30.56.60' AND timestamp < 1512652272989 AND timestamp > 1512537899000;
Now comes the next use case, Where I need to get the cluster data (All the data's of the four nodes),
Like below,
SELECT nodeip,readbw,timestamp FROM nodedata WHERE nodeip IN ('172.30.56.60','172.30.56.61','172.30.56.62','172.30.56.63') AND timestamp < 1512652272989 AND timestamp > 1512537899000;
But It clearly mentioned in number of sites that, 'IN query' has lots of performance hiccups, So what is your suggestion in this Data Model of 'nodedata' table mentioned above? (NOTE: Doing Multiple queries in different partitions are okay which I feel like a last option)
Do you have a better approach (or) re-designing this data model in a better way (or) Any better solution to retrieve the data from multiple partitions?
Any help would be really appreciable.
Thanks,
Harry
See Question&Answers more detail:
os