Filter dictionary in pyspark with key names

I'm new to PySpark. Given a dictionary-like column in a dataset, I want to get the value of one key whenever the value of another key satisfies a condition.

Example: say I have a column 'statistics' in a dataset, where each row looks like this:

array
0: {"hair": "black", "eye": "white", "metric": "feet"}
1: {"hair": "blue", "eye": "white", "metric": "m"}
2: {"hair": "red", "eye": "brown", "metric": "feet"}
3: {"hair": "yellow", "eye": "white", "metric": "cm"}

I want to get the value of 'eye' whenever 'hair' is 'black'.

I tried:

select
filter(statistics.hair, x -> x == "black") as Eyecolor
from arrayData

but I'm unable to get the value of 'eye'. Please assist.
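
To make the intended logic concrete, here is a minimal plain-Python sketch of "filter on one key, then read another key". It is not PySpark code: the helper name `eyes_for_hair` is hypothetical, and the sample list simply copies the rows shown above. The Spark SQL string at the end is an untested guess at an equivalent query, assuming 'statistics' is an array of structs and a Spark version that supports the higher-order functions `filter` and `transform` (2.4 or later). A likely reason the attempt above loses 'eye' is that `statistics.hair` already reduces each entry to its hair value before the filter runs.

```python
# Sample data copied from the question: one row's 'statistics' array.
statistics = [
    {"hair": "black", "eye": "white", "metric": "feet"},
    {"hair": "blue", "eye": "white", "metric": "m"},
    {"hair": "red", "eye": "brown", "metric": "feet"},
    {"hair": "yellow", "eye": "white", "metric": "cm"},
]


def eyes_for_hair(entries, hair):
    """Return the 'eye' value of every entry whose 'hair' equals `hair`.

    Hypothetical helper, for illustration only.
    """
    # Step 1: filter the whole entries (not just their 'hair' values),
    # so the other keys are still available afterwards.
    matching = [entry for entry in entries if entry.get("hair") == hair]
    # Step 2: project each surviving entry down to its 'eye' value.
    return [entry["eye"] for entry in matching]


# Assumption: an equivalent Spark SQL query if 'statistics' is an
# array of structs. For an array of maps, x['hair'] and x['eye']
# would be used instead of x.hair and x.eye.
SPARK_SQL_SKETCH = """
select
  transform(filter(statistics, x -> x.hair = 'black'), x -> x.eye) as Eyecolor
from arrayData
"""

eye_colors = eyes_for_hair(statistics, "black")
print(eye_colors)  # ['white']
```

The result is a list rather than a single value because more than one entry may match the condition.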



Read more here: https://stackoverflow.com/questions/68474081/filter-dictionary-in-pyspark-with-key-names

Content Attribution

This content was originally published by user1783739 at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
