Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Populating JSON data from API in Python pandas DataFrame - TypeError and IndexError

I am trying to populate a pandas DataFrame with select information from JSON output fetched from an API.

candidate_list = []

for candidate in candidate_response['data']:
    if 'error' not in candidate_response:
        candidate_list.append([candidate['id'], candidate['attributes']['first_name'],
                               candidate['attributes']['last_name'],
                               candidate['relationships']['educations']['data']['id']])

The DataFrame populates fine until I add candidate['relationships']['educations']['data']['id'], which throws TypeError: list indices must be integers or slices, not str.

When trying to get the values of the indexes for ['id'] by using candidate['relationships']['educations']['data'][0]['id'] instead, I get IndexError: list index out of range.

The JSON output looks something like:

"data": [
    {
        "attributes": {
            "first_name": "Tester",
            "last_name": "Testman",
            "other stuff": "stuff"
        },
        "id": "732887",
        "relationships": {
            "educations": {
                "data": [
                    {
                        "id": "605372",
                        "type": "educations"
                    },
                    {
                        "id": "605371",
                        "type": "educations"
                    },
                    {
                        "id": "605370",
                        "type": "educations"
                    }
                ]
            }
        },

How would I go about successfully filling a column in the DataFrame with the 'id's under 'relationships'>'educations'>'data'?


He who fights dragons for too long becomes a dragon himself; gaze into the abyss for too long, and the abyss will gaze back…

1 Answer


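Note that candidate['relationships']['educations']['data']['id'] raises the TypeError because data holds a list, not a dictionary, and a list can only be indexed by an integer position or a slice, not by a string key. Indexing with [0] avoids that error, but raises IndexError for any candidate whose data list is empty, which is most likely the case for some records in your real response.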
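To make both errors concrete, here is a minimal sketch that reproduces them. The second candidate record, with an empty educations list, is a hypothetical stand-in for whatever the real API returns, and first_education_id is a helper name invented for this illustration.

```python
# Hypothetical records: the second candidate has no education entries.
# This is an assumption about the real API response, not something shown in the question.
candidates = [
    {"id": "732887",
     "relationships": {"educations": {"data": [{"id": "605372", "type": "educations"}]}}},
    {"id": "732888",
     "relationships": {"educations": {"data": []}}},
]


def first_education_id(candidate):
    """Return the first education id, or None when the list is empty."""
    educations = candidate["relationships"]["educations"]["data"]
    return educations[0]["id"] if educations else None


errors = []
for candidate in candidates:
    educations = candidate["relationships"]["educations"]["data"]
    try:
        educations["id"]       # string key on a list
    except TypeError as exc:
        errors.append(type(exc).__name__)
    try:
        educations[0]["id"]    # position 0 does not exist in an empty list
    except IndexError as exc:
        errors.append(type(exc).__name__)

print(errors)
print([first_education_id(c) for c in candidates])
```

Guarding with a truthiness check, as first_education_id does, is one way to keep a single value per candidate; looping over the list instead keeps every id.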

Assuming you want one row per entry in data[].relationships.educations.data, iterate over that list with an inner loop. Complete working code:

import json

json_string = """{
    "data": [
        {
            "attributes": {
                "first_name": "Tester",
                "last_name": "Testman",
                "other stuff": "stuff"
            },
            "id": "732887",
            "relationships": {
                "educations": {
                    "data": [
                        {
                            "id": "605372",
                            "type": "educations"
                        },
                        {
                            "id": "605371",
                            "type": "educations"
                        },
                        {
                            "id": "605370",
                            "type": "educations"
                        }
                    ]
                }
            }
        }
    ]
}"""

candidate_response = json.loads(json_string)

candidate_list = []

# Check for an API-level error once, before touching the records.
if 'error' not in candidate_response:
    for candidate in candidate_response['data']:
        # One row per education entry; an empty list simply adds no rows.
        for education in candidate['relationships']['educations']['data']:
            candidate_list.append(
                [
                    candidate['id'],
                    candidate['attributes']['first_name'],
                    candidate['attributes']['last_name'],
                    education['id'],
                ]
            )

print(candidate_list)

Code run available at ideone.
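Since the goal is a DataFrame, the rows built by the loop can be passed straight to pandas. The following is a sketch only: the column names are illustrative choices rather than anything dictated by the API, and the rows are copied from the sample JSON. The second step, collapsing to one row per candidate with the ids gathered into a list, is an optional alternative layout.

```python
import pandas as pd

# Rows in the same shape the loop produces (values taken from the sample JSON).
candidate_list = [
    ["732887", "Tester", "Testman", "605372"],
    ["732887", "Tester", "Testman", "605371"],
    ["732887", "Tester", "Testman", "605370"],
]

# Column names are assumptions chosen for readability.
df = pd.DataFrame(
    candidate_list,
    columns=["candidate_id", "first_name", "last_name", "education_id"],
)

# Optional: one row per candidate, with all education ids collected in a list.
per_candidate = (
    df.groupby(["candidate_id", "first_name", "last_name"])["education_id"]
    .agg(list)
    .reset_index()
)

print(df)
print(per_candidate)
```

Keeping one row per education id is usually easier to filter and join on; the grouped form is handy when each candidate must stay on a single line.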



...