How to read a MultiIndex DataFrame in Python

Question


asked Feb 5, 2021 in Technique[技术] by 深蓝 (71.8m points)

Here is my DataFrame, which is called df:

University  Subject  Colour
Melb        Math     Red
            English  Blue
Sydney      Math     Green
            Arts     Yellow
            English  Green
Ottawa      Med      Blue
            Math     Yellow

Both University and Subject are index levels of this DataFrame (together they form a MultiIndex).

When I do this

print(df.to_dict('index'))

I get

{('Melb', 'Math'): {'Colour': 'Red'}, ('Melb', 'English'): {'Colour': 'Blue'}, ...

When I do this

print(df["Colour"])

I get this

University  Subject  Colour
Melb        Math     Red
            English  Blue
Sydney      Math     Green
            Arts     Yellow
            English  Green
Ottawa      Med      Blue
            Math     Yellow

When I do

print(df["University"])

I get an error

KeyError: 'University'

What I want is a way to read each value separately.

I want one read for the University, another for the Subject, and a third for the Colour.

How can I do that?

1 Answer

深蓝 · Answer 1 · 2021-02-05T04:19:54+0000

A quick way to do this is to use Python's built-in zip function. It is concise and typically faster than building the lists by hand in a for loop.

Quick answer to your question:

index_columns = list(zip(*df.index))  # transpose the index once: one tuple per index level
university_list = list(index_columns[0])
subject_list = list(index_columns[1])
colour_list = list(df['Colour'])

Explanation

To get the index levels as a list:

index_list = list(zip(*df.index))

Output:

[('Melb', 'Melb', 'Sydney', 'Sydney', 'Sydney', 'Ottawa', 'Ottawa'), ('Math', 'English', 'Math', 'Arts', 'English', 'Med', 'Math')]

You get a list of tuples, one per index level. Each tuple holds that level's value for every row, so values repeat (for example, 'Melb' appears twice).

(The tuples follow the index levels from left to right: the first index level is the first tuple, the second index level is the second tuple, and so on.)
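
As a concrete check, here is a minimal sketch that rebuilds the DataFrame from the question and transposes its index with zip. How the original df was created is not shown in the question, so the construction via set_index is an assumption.

```python
import pandas as pd

# Rebuild the example data; how the original df was created is an assumption.
rows = [
    ("Melb", "Math", "Red"),
    ("Melb", "English", "Blue"),
    ("Sydney", "Math", "Green"),
    ("Sydney", "Arts", "Yellow"),
    ("Sydney", "English", "Green"),
    ("Ottawa", "Med", "Blue"),
    ("Ottawa", "Math", "Yellow"),
]
df = pd.DataFrame(rows, columns=["University", "Subject", "Colour"])
df = df.set_index(["University", "Subject"])

# Iterating a MultiIndex yields one (University, Subject) tuple per row;
# zip(*...) transposes those row tuples into one tuple per index level.
index_list = list(zip(*df.index))

print(index_list[0])  # all University values, one per row
print(index_list[1])  # all Subject values, one per row
```

Each tuple has seven entries here, one for every row of df, not one per distinct university or subject.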

Now, to get a separate list for each index level, you can simply do:

Universities = list(index_list[0])  # one entry per row: ['Melb', 'Melb', 'Sydney', ...]
Subjects = list(index_list[1])  # one entry per row: ['Math', 'English', 'Math', 'Arts', ...]
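
If you would rather address the levels by name than by position, pandas also offers get_level_values on the index. The sketch below is an alternative to the zip approach, not part of the original answer; the DataFrame construction is an assumption made so the example runs on its own.

```python
import pandas as pd

# Example data matching the question; the construction itself is an assumption.
df = pd.DataFrame(
    {"Colour": ["Red", "Blue", "Green", "Yellow", "Green", "Blue", "Yellow"]},
    index=pd.MultiIndex.from_tuples(
        [
            ("Melb", "Math"), ("Melb", "English"),
            ("Sydney", "Math"), ("Sydney", "Arts"), ("Sydney", "English"),
            ("Ottawa", "Med"), ("Ottawa", "Math"),
        ],
        names=["University", "Subject"],
    ),
)

# One value per row, selected by level name instead of tuple position.
universities = df.index.get_level_values("University").tolist()
subjects = df.index.get_level_values("Subject").tolist()

# Distinct values only, in order of first appearance.
unique_universities = df.index.get_level_values("University").unique().tolist()

print(universities)
print(unique_universities)
```

Use the unique() variant when you only want each university once; the plain lists stay aligned row by row with the Colour column.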

Getting data as a list from Non-Index Columns

You can do this with:

column_data = list(df['column_name'])

# which in your case will be

colour_list = list(df['Colour'])
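
To read all three values together, row by row, one option is to iterate over the Colour column, whose index entries are (University, Subject) tuples. This is a sketch under the assumption that df looks like the one in the question; the variable name records is made up for illustration.

```python
import pandas as pd

# Example data matching the question; the construction itself is an assumption.
df = pd.DataFrame(
    {
        "University": ["Melb", "Melb", "Sydney", "Sydney", "Sydney", "Ottawa", "Ottawa"],
        "Subject": ["Math", "English", "Math", "Arts", "English", "Med", "Math"],
        "Colour": ["Red", "Blue", "Green", "Yellow", "Green", "Blue", "Yellow"],
    }
).set_index(["University", "Subject"])

# Series.tolist() is equivalent to list(series) for this purpose.
colour_list = df["Colour"].tolist()

# Series.items() yields (index, value) pairs; with a two-level MultiIndex
# the index part is a (University, Subject) tuple that can be unpacked.
records = []  # hypothetical name, used only for this illustration
for (university, subject), colour in df["Colour"].items():
    records.append((university, subject, colour))

print(records[0])
```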

I am extending the answer to address one of the comments.

Now imagine a case where you need the whole DataFrame as a list of tuples, where each tuple holds the data of one column (index levels included).

The list will look something like:

[(Col-1_data, ...), (Col-2_data, ...), ...]

To achieve this, reset the index, fetch the data, and then set the index again. The code below does that:

index_names = list(df.index.names)  # save the current index level names so they can be restored later
df.reset_index(inplace=True)  # turn the index levels into ordinary columns
dataframe_raw_list = df.values.tolist()  # a list of lists, where each inner list is one row of the DataFrame
df.set_index(index_names, inplace=True)  # restore the original MultiIndex

dataframe_columns_list = list(zip(*dataframe_raw_list))  # a list of tuples, where each tuple is one column of the DataFrame

Output:

[(Col-1_data, ...), (Col-2_data, ...), ...]
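
A variation that avoids modifying df at all: reset_index() without inplace=True returns a new DataFrame, so the original keeps its MultiIndex and nothing has to be restored. This is a sketch of that alternative, assuming df looks like the one in the question; the name flat is made up for illustration.

```python
import pandas as pd

# Example data matching the question; the construction itself is an assumption.
df = pd.DataFrame(
    {
        "University": ["Melb", "Melb", "Sydney", "Sydney", "Sydney", "Ottawa", "Ottawa"],
        "Subject": ["Math", "English", "Math", "Arts", "English", "Med", "Math"],
        "Colour": ["Red", "Blue", "Green", "Yellow", "Green", "Blue", "Yellow"],
    }
).set_index(["University", "Subject"])

# reset_index() returns a copy with the index levels as ordinary columns;
# df itself is left untouched.
flat = df.reset_index()  # hypothetical name, used only for this illustration

# One tuple per column, in left-to-right column order.
dataframe_columns_list = [tuple(flat[name]) for name in flat.columns]

print(dataframe_columns_list[0])
print(list(df.index.names))  # the original index is still in place
```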
