PySpark noob here. I have a dataset that looks like this (with thousands of different startID and endID values):

startID   endID
1         1
1         2
1         3
2         3
1         1

I need to count the number of rows in which each combination of startID and endID occurs, and get something like this:

startID   endID  count
1         1      2
1         2      1

