>>> info
0 (dataset, license, sources, weight)
1 (dataset, license, sources, weight)
2 (dataset, license, sources, weight)
3 (dataset, license, sources, weight)
4 (dataset, license, sources, weight)
...
491877 (dataset, license, sources, surfaceEnd, surfac...
491878 (dataset, license, sources, surfaceEnd, surfac...
491879 (dataset, license, sources, surfaceEnd, surfac...
491880 (dataset, license, sources, surfaceEnd, surfac...
491881 (dataset, license, sources, surfaceEnd, surfac...
Name: edge_info, Length: 491882, dtype: object
>>> info.drop_duplicates()
0 (dataset, license, sources, weight)
1 (dataset, license, sources, weight)
70 (dataset, license, sources, surfaceEnd, surfac...
71 (dataset, license, sources, surfaceEnd, surfac...
Name: edge_info, dtype: object
>>> info.iloc[0]==info.iloc[1]
True
>>> info.iloc[0]==info.iloc[2]
True
>>> info.iloc[0]
dict_keys(['dataset', 'license', 'sources', 'weight'])
>>>
The commands above call drop_duplicates() on a Series object to remove its duplicate items.
However, the result still appears to contain duplicate values, as shown above.
The first row, info.iloc[0], and the second row, info.iloc[1], compare equal,
but drop_duplicates() does not remove the second one.
Does anyone know the reason for this result?
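
For reference, here is a minimal sketch of one thing that could be tried. It assumes the elements of the Series are dict_keys views, as the output of info.iloc[0] suggests. The idea is to store a plain hashable value for each row, here a tuple of the sorted key names, so that drop_duplicates() compares rows by value. The `records` list is hypothetical sample data invented for illustration; it is not the real edge_info data.

```python
import pandas as pd

# Hypothetical sample records standing in for the real edge_info dicts.
records = [
    {"dataset": "a", "license": "x", "sources": [], "weight": 1.0},
    {"dataset": "b", "license": "y", "sources": [], "weight": 2.0},
    {"dataset": "c", "license": "z", "sources": [], "surfaceEnd": "e", "weight": 1.0},
]

# Store a tuple of sorted key names for each record instead of a dict_keys view.
# Tuples of strings are hashable, so rows with the same key set are seen as equal.
info = pd.Series([tuple(sorted(r)) for r in records], name="edge_info")

# Rows 0 and 1 share the same key set, so only the first of them is kept.
unique_info = info.drop_duplicates()
print(unique_info)
```

With this sample data, rows 0 and 1 collapse into one entry and row 2 is kept. Whether the same approach resolves the behaviour on the real data is untested here.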