Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Store nested JSON whose fields are separated by \ in a Hive external table

I have nested JSON whose fields are separated by \. When I save that JSON to a Hive external table, I get an error.

{"value":"{\"DUUID\": 67, \"GUUID\": 514, \"EOT\": 219.0, \"cc\": 3, \"enghr\": 20.0, \"battvolt\": 0.0, \"EOP\": 120.0, \"ts\": \"2020-12-31T14:22:37\", \"ts1\": 1609404757.2771647}"}

The above is my JSON message, which is stored in the HDFS directory /lambda3/test.

I wrote the following query in Hive:

CREATE EXTERNAL TABLE demo1.json11(
  value struct<
    DUUID: INTEGER,
    GUUID: INTEGER,
    EOT: double,
    cc: double,
    enghr: double,
    battvolt: double,
    EOP: double,
    ts: timestamp,
    ts1: timestamp>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 'hdfs://localhost:9000/lambda3/test/';

When I then run select * from json11, I get the following error message:

Exception in thread "main" java.lang.Error: Data is not JSONObject  but java.lang.String with value {"DUUID": 67, "GUUID": 514, "EOT": 219.0, "cc": 3, "enghr": 20.0, "battvolt": 0.0, "EOP": 120.0, "ts": "2020-12-31T14:22:37", "ts1": 1609404757.2771647}
        at org.openx.data.jsonserde.objectinspector.JsonStructObjectInspector.getStructFieldData(JsonStructObjectInspector.java:73)
        at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:366)
        at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:202)
        at org.apache.hadoop.hive.serde2.DelimitedJSONSerDe.serializeField(DelimitedJSONSerDe.java:61)
        at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:231)
        at org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55)
        at org.apache.hadoop.hive.serde2.DefaultFetchFormatter.convert(DefaultFetchFormatter.java:67)
        at org.apache.hadoop.hive.serde2.DefaultFetchFormatter.convert(DefaultFetchFormatter.java:36)
        at org.apache.hadoop.hive.ql.exec.ListSinkOperator.process(ListSinkOperator.java:94)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:438)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:430)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2208)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:141)

Kindly tell me how I can store this JSON in a Hive table. Thank you in advance.



1 Answer


Your JSON "value" is a STRING that contains JSON, i.e. {"value": string}, not a nested JSON struct. That is why the SerDe reports "Data is not JSONObject but java.lang.String". A nested JSON struct should look like this:

{"value": {"DUUID": 67, "GUUID": 514, "EOT": 219.0, "cc": 3, "enghr": 20.0, "battvolt": 0.0, "EOP": 120.0, "ts": "2020-12-31T14:22:37", "ts1": 1609404757.2771647}}
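If the files can be rewritten before Hive reads them, the double encoding can be checked for and undone with a small script. The following is only a minimal sketch in Python (not Hive): `unwrap_value` is a hypothetical helper name, and the sample record is an abbreviated version of the one in the question, assuming the inner quotes are backslash-escaped in the stored file.

```python
import json


def unwrap_value(line: str) -> str:
    """Return the record with "value" as a nested object instead of a JSON string."""
    record = json.loads(line)
    value = record.get("value")
    # A double-encoded payload parses to a Python str here,
    # which corresponds to the java.lang.String the SerDe complains about.
    if isinstance(value, str):
        record["value"] = json.loads(value)
    return json.dumps(record)


# Abbreviated sample record (assumption: inner quotes are escaped in the file).
raw = '{"value": "{\\"DUUID\\": 67, \\"GUUID\\": 514, \\"EOT\\": 219.0}"}'

before = json.loads(raw)["value"]
after = json.loads(unwrap_value(raw))["value"]

print(type(before).__name__)  # str
print(type(after).__name__)   # dict
```

Records that already have a nested object under "value" pass through unchanged, so the helper could be applied line by line to a mixed file.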

If you cannot fix the JSON, create the table with value as STRING and parse it using json_tuple:

CREATE EXTERNAL TABLE demo1.json11(
value string 
) 
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' 
LOCATION 'hdfs://localhost:9000/lambda3/test/';


select DUUID, GUUID, EOT, cc, enghr, battvolt, EOP, ts, ts1
  from demo1.json11 j
       lateral view json_tuple(j.value, 'DUUID', 'GUUID', 'EOT', 'cc', 'enghr', 'battvolt', 'EOP', 'ts', 'ts1') e
                           as DUUID, GUUID, EOT, cc, enghr, battvolt, EOP, ts, ts1;

Convert types if necessary, like this:

CAST(DUUID as int) as DUUID,
...
CAST(ts as timestamp) as ts,
CAST(ts1 as timestamp) as ts1
