Check if dataframe row in colA, colB matches any row in colC, colD

Using PySpark, working with only one dataframe. I want to flag whether the values in one set of columns, taken row by row, also appear as a row in another set of columns. An illustration of the transformation is here.

The focus columns here are column1, column2, and column4, column5.

Row 1 has 0, 0 in column1, column2. This combination exists in column5, column6, so it's True. We also have Null, Null in three consecutive rows, and that pair can also be found in column5, column6, so each of those rows is True as well.

submitted by /u/ILikePlanning

Content Attribution

This content was originally published by /u/ILikePlanning at Microsoft Azure, and is syndicated here via their RSS feed. You can read the original post over there.
