Using Python, I'd like to write some code that labels each row "IN" as long as the cumulative sum of the Miles column is <= 2.5, and labels every remaining row "OUT". Are there any suggestions on where to start?
Example data set:
Rank  Name  Miles
1     A     0.5
2     A     1
3     B     1
4     B     1
5     C     2
Desired output:
Rank  Name  Miles  Assign
1     A     0.5    IN
2     A     1      IN
3     B     1      IN
4     B     1      OUT
5     C     2      OUT
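One possible starting point, as a minimal sketch rather than a definitive answer: keep a running total of Miles and compare it to the limit row by row. The sketch below uses only the standard library (itertools.accumulate). The helper name assign_in_out and the LIMIT constant are hypothetical, and it assumes the rows are already sorted by Rank.

```python
from itertools import accumulate

# Sample rows from the question: (Rank, Name, Miles), assumed sorted by Rank.
rows = [
    (1, "A", 0.5),
    (2, "A", 1.0),
    (3, "B", 1.0),
    (4, "B", 1.0),
    (5, "C", 2.0),
]

LIMIT = 2.5  # threshold taken from the question


def assign_in_out(rows, limit=LIMIT):
    """Hypothetical helper: add "IN"/"OUT" based on the running total of Miles."""
    # Running totals of the Miles column, one per row.
    running = accumulate(miles for _, _, miles in rows)
    return [
        (rank, name, miles, "IN" if total <= limit else "OUT")
        for (rank, name, miles), total in zip(rows, running)
    ]


result = assign_in_out(rows)
for row in result:
    print(*row)
```

With a pandas DataFrame the same idea would likely be a cumulative sum of the Miles column (Series.cumsum) compared against 2.5. In PySpark, a running sum over a window ordered by Rank is the usual analogue, though neither variant is shown or tested here.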
Read more here: https://stackoverflow.com/questions/65713474/use-cumulative-sum-to-assign-a-value-in-python-pyspark
Content Attribution
This content was originally published by steppermotor at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.