Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


hadoop - HBase get returns old values even with max versions = 1

I want to find the columns that have not been updated for more than a specific period of time.

So I want to scan the columns with a time range. The normal behaviour of HBase is that you then get the latest value within that time range, which is not what I want.

As far as I understand, if you set the maximum number of versions for a column family to 1, HBase should retain only the last value that was put into the cell.
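
To make that expectation concrete, here is a minimal sketch in Python. It is a toy model of the behaviour I expected, not HBase code: the class `SingleVersionCell` and its methods are hypothetical names made up for illustration. The only HBase detail it borrows is that a time range includes its lower bound and excludes its upper bound.

```python
# Toy model of the expected behaviour only; this is not HBase code.
class SingleVersionCell:
    """Hypothetical cell that keeps only the newest (timestamp, value) pair."""

    def __init__(self):
        self.latest = None  # (timestamp, value), or None if never written

    def put(self, value, timestamp):
        # Keep the write only if it is at least as new as the stored one.
        if self.latest is None or timestamp >= self.latest[0]:
            self.latest = (timestamp, value)

    def get(self, time_range=None):
        # time_range is (min_ts, max_ts): lower bound inclusive, upper exclusive.
        if self.latest is None:
            return None
        if time_range is not None:
            min_ts, max_ts = time_range
            if not (min_ts <= self.latest[0] < max_ts):
                return None
        return self.latest


cell = SingleVersionCell()
cell.put("One", 1000)
cell.put("Two", 2000)
cell.put("Three", 3000)

latest = cell.get()
in_range = cell.get(time_range=(0, 1500))
print(latest)    # (3000, 'Three')
print(in_range)  # None
```

Under this model, a read restricted to the range 0 to 1500 returns nothing, because the only retained value has timestamp 3000.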

What I found is different.

If I run the following commands in the HBase shell:

create 't1', {NAME => 'c1', VERSIONS => 1}
put 't1', 'r1', 'c1', 'One', 1000
put 't1', 'r1', 'c1', 'Two', 2000
put 't1', 'r1', 'c1', 'Three', 3000
get 't1', 'r1'
get 't1', 'r1' , {TIMERANGE => [0,1500]}

the result is this:

get 't1', 'r1'
COLUMN                     CELL
 c1:                       timestamp=3000, value=Three
1 row(s) in 0.0780 seconds

get 't1', 'r1' , {TIMERANGE => [0,1500]}
COLUMN                     CELL
 c1:                       timestamp=1000, value=One
1 row(s) in 0.1390 seconds

Why does the second query return a value even though I've set the max versions to only 1?

The HBase version I currently have installed is HBase 0.94.6-cdh4.4.0.


1 Answer


It turns out to be a bug in HBase: https://issues.apache.org/jira/browse/HBASE-10102
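
One possible way to reach the original goal without relying on a time-range read is to fetch the newest version of each column with no time range (which returned the correct value in the question) and compare its timestamp against a cutoff on the client. The sketch below is plain Python over already-fetched data, not an HBase client call. The function name `stale_columns`, the dictionary layout, and the second column `c2:` are assumptions made up for illustration.

```python
# Hypothetical helper; the input mimics what a get WITHOUT a time range returns.
def stale_columns(latest_cells, now_ms, max_age_ms):
    """Return the columns whose newest timestamp is older than max_age_ms.

    latest_cells maps a column name to (timestamp_ms, value), i.e. the
    newest version of each column as read without a time range.
    """
    cutoff = now_ms - max_age_ms
    return sorted(
        column
        for column, (timestamp, _value) in latest_cells.items()
        if timestamp < cutoff
    )


# Illustrative data: 'c1:' matches the question, 'c2:' is invented.
row = {"c1:": (3000, "Three"), "c2:": (1000, "One")}
stale = stale_columns(row, now_ms=3500, max_age_ms=1000)
print(stale)  # ['c2:']
```

This moves the filtering to the client, so it reads more data than a server-side time range would, but it depends only on the unrestricted get returning the latest value.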

