Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


importing nested dictionary data in pandas

If my JSON Lines file looks like this...

!head test.json

{"Item":{"title":{"S":"https://medium.com/media/d40eb665beb374c0baaacb3b5a86534c/href"}}}
{"Item":{"title":{"S":"https://fasttext.cc/docs/en/autotune.html"}}}
{"Item":{"title":{"S":"https://nlp.stanford.edu/~socherr/EMNLP2013_RNTN.pdf"}}}
{"Item":{"title":{"S":"https://github.com/avinashbarnwal/GSOC-2019/tree/master/AFT/test/data/neuroblastoma-data-master/data/H3K27ac-H3K4me3_TDHAM_BP"}}}

I can import the data in pandas using...

import pandas as pd
df = pd.read_json("test.json", lines=True, orient="columns")

But the data looks like this...

Item
0   {'title': {'S': 'https://medium.com/media/d40e...
1   {'title': {'S': 'https://fasttext.cc/docs/en/a...
2   {'title': {'S': 'https://nlp.stanford.edu/~soc...
3   {'title': {'S': 'https://github.com/avinashbar...

I need all the URLs in a single column.

question from:https://stackoverflow.com/questions/65839626/importing-nested-dictionary-data-in-pandas


1 Answer

  • In this case, the easiest approach is to use pandas.json_normalize on the 'Item' column of df. It flattens the nested dictionaries into a single column named 'title.S'.
  • Since you have a column of links, I've included code to display them as clickable links in a notebook, or to save them to an HTML file.
import pandas as pd
from IPython.display import HTML, display  # used to show clickable links in a notebook

# read the file in as you are already doing
df = pd.read_json("test.json", lines=True, orient="columns")

# normalize the Item column; the nested keys become a single 'title.S' column
df = pd.json_normalize(df.Item)

# optional steps
# make the links clickable (quote the href so the attribute is well-formed)
df['title.S'] = '<a href="' + df['title.S'] + '">' + df['title.S'] + '</a>'

# display clickable dataframe in notebook
display(HTML(df.to_html(escape=False)))

# save to html file (to_html writes the file itself when given a path)
df.to_html('test.html', escape=False)
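
To try the flattening step without test.json on disk, here is a minimal sketch that feeds two of the sample lines from the question through an in-memory buffer. The names sample, raw, flat and urls, and the renamed column 'url', are illustrative choices and not part of the original answer. Converting the column with .tolist() is an assumption made so that json_normalize receives a plain list of dicts.

```python
import io

import pandas as pd

# Two records in the same JSON Lines shape as test.json (URLs copied from the question)
sample = io.StringIO(
    '{"Item":{"title":{"S":"https://fasttext.cc/docs/en/autotune.html"}}}\n'
    '{"Item":{"title":{"S":"https://nlp.stanford.edu/~socherr/EMNLP2013_RNTN.pdf"}}}\n'
)

# Each line becomes one row; the 'Item' column holds nested dicts
raw = pd.read_json(sample, lines=True)

# Flatten the nested dicts; nested keys are joined with "." by default,
# so the result has a single column named 'title.S'
flat = pd.json_normalize(raw["Item"].tolist())

# Hypothetical friendlier column name for the URLs
urls = flat.rename(columns={"title.S": "url"})
print(urls)
```

If a different separator is preferred in the flattened column name, json_normalize accepts a sep argument (for example sep="_" would give 'title_S').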
