# How to Get Distinct Combinations of Multiple Columns in a PySpark DataFrame

How can we get all unique combinations of multiple columns in a PySpark DataFrame?

Suppose we have a DataFrame `df` with columns `col1` and `col2`.

We can easily return all distinct values for a single column using `distinct()`.

```
df.select('col1').distinct().collect()
# OR
df.select('col1').distinct().rdd.map(lambda r: r[0]).collect()
```

How can we get only distinct pairs of values in these two columns?

## Get distinct pairs

We can simply pass the second column name as another argument to `select()` before calling `distinct()`.

```
df.select('col1','col2').distinct().collect()
# OR
df.select('col1','col2').distinct().rdd.map(lambda r: (r[0], r[1])).collect()
```

## Get distinct combinations for all columns

We can also get the unique combinations for all columns in the DataFrame using the asterisk `*`.

```
df.select('*').distinct().collect()
# OR
df.select('*').distinct().rdd.map(tuple).collect()
```