show distinct column values in pyspark dataframe: python

Question

How do I show the distinct values of a column in a PySpark DataFrame (Python)?

1 Answer

深蓝 · Answer 1 · 2021-10-23T18:28:27+0000

This should help to get distinct values of a column:

df.select('column1').distinct().collect()

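Note that .collect() has no built-in limit on how many rows it returns, so on a column with many distinct values it can be slow and can exhaust memory on the driver. Use .show() instead (it prints only the first 20 rows by default), or add .limit(20) before .collect() to cap the result.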
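To make the .limit() suggestion concrete, here is a minimal sketch that wraps it in a helper. The names distinct_values, max_rows and demo are made up for illustration and are not part of PySpark. The demo assumes a local Spark installation and a small hand-built DataFrame.

```python
def distinct_values(df, column, max_rows=20):
    """Return up to max_rows distinct values of `column` as a plain Python list.

    `df` is expected to be a PySpark DataFrame; `max_rows` is a hypothetical
    parameter added here to cap how much data reaches the driver.
    """
    rows = (
        df.select(column)
        .distinct()
        .limit(max_rows)  # cap the result before collecting it
        .collect()
    )
    # Each collected Row has a single field, so take it by position.
    return [row[0] for row in rows]


def demo():
    # Imported inside the demo so the helper above does not need Spark to load.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("distinct-demo")  # arbitrary name, chosen for this sketch
        .getOrCreate()
    )
    # Illustrative data only; 'column1' matches the name used in the answer.
    df = spark.createDataFrame([("a", 1), ("b", 2), ("a", 3)], ["column1", "n"])

    # Print the distinct values as a table without collecting them.
    df.select("column1").distinct().show()

    # Or pull a bounded number of them back as Python values.
    print(sorted(distinct_values(df, "column1")))

    spark.stop()
```

Calling demo() runs both variants. Spark does not guarantee the order of distinct rows, which is why the demo sorts the list before printing it.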
