Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

hadoop - How to update table in Hive 0.13?

My Hive version is 0.13. I have two tables, table_1 and table_2.

table_1 contains:

customer_id | items | price | updated_date
------------+-------+-------+-------------
10          | watch | 1000  | 20170626
11          | bat   | 400   | 20170625

table_2 contains:

customer_id | items    | price | updated_date
------------+----------+-------+-------------
10          | computer | 20000 | 20170624

I want to update the records in table_2 when the customer_id already exists there; otherwise, the row should be appended to table_2.

As Hive 0.13 does not support UPDATE, I tried using a join, but it fails.

1 Answer

You can use row_number() or a FULL JOIN. Here is an example using row_number():

-- table_1 holds the new rows (new_flag = 1); table_2 is the target (new_flag = 0).
-- rn = 1 keeps the table_1 row whenever a customer_id exists in both tables.
insert overwrite table table_2
select customer_id, items, price, updated_date
from
(
select customer_id, items, price, updated_date,
       row_number() over(partition by customer_id order by new_flag desc) rn
from 
    (
     select customer_id, items, price, updated_date, 1 as new_flag
       from table_1
     union all
     select customer_id, items, price, updated_date, 0 as new_flag
       from table_2
    ) all_data
)s where rn=1;

Also see this answer for update using FULL JOIN: https://stackoverflow.com/a/37744071/2700344
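
For reference, the FULL JOIN variant could look roughly like the sketch below. It is not taken from the linked answer. It assumes, as in the question, that table_1 holds the new rows and table_2 is the target, and that customer_id is unique within each table (otherwise the join would multiply rows). The aliases n and o are only illustrative.

```sql
-- Sketch under the assumptions above: n = new rows (table_1), o = old rows (table_2).
-- A FULL OUTER JOIN keeps customers that appear in either table.
INSERT OVERWRITE TABLE table_2
SELECT
    COALESCE(n.customer_id, o.customer_id) AS customer_id,
    -- When a new row exists for the customer, take every column from it;
    -- otherwise keep the existing table_2 row unchanged.
    CASE WHEN n.customer_id IS NOT NULL THEN n.items        ELSE o.items        END AS items,
    CASE WHEN n.customer_id IS NOT NULL THEN n.price        ELSE o.price        END AS price,
    CASE WHEN n.customer_id IS NOT NULL THEN n.updated_date ELSE o.updated_date END AS updated_date
FROM table_2 o
FULL OUTER JOIN table_1 n
    ON o.customer_id = n.customer_id;
```

With the sample data from the question, this should leave table_2 with the watch row for customer 10 (taken from table_1) and the appended bat row for customer 11. The CASE on n.customer_id is used instead of a per-column COALESCE so that a NULL value in a new row is not silently replaced by the old value.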
