How to process information on one column, and then build a logical statement?

I'm new to pandas and data processing, and I'm struggling with something that seems a bit complex to me. Here is an example.

Let's say that I have a DataFrame df as follows:

    A    B    C    
a  237   1    0
b  243   0    -10candy
c  243   0    -15candy
d  243   0    again
e  243   0    -15candy+10of30000
f  243   1    0
g  246   0    -10
h  246   0    -20candy+10of20
i  246   0    -10
j  246   1    0
h  248   0    newcandybagopened
i  248   1    0
j  260   0    candybag70with74%~andmaybemorespecialchars)
k  260   0    +20candyof30sunnyday
l  260   1    0

I would like to process the changes in the quantity of candies: I want to create another column that accumulates the candy subtractions. Column A represents a group, and I only want to sum the subtractions within each group. Column B marks where a group "ends":

    A    B    C                          candy_substractions
a  237   1    0                                   0
b  243   0    -10candy                           10
c  243   0    -15candy                           25
d  243   0    again                              25
e  243   0    -15candy+10of30000                 40
f  243   1    0                                  40
g  246   0    -10                                10
h  246   0    -20candy+10of20                    30
i  246   0    -10                                40
j  246   1    0                                  40
h  248   0    newcandybagopened                   0
i  248   1    0                                   0
j  260   0    candybag70with74%~morespecialchars) 0 
k  260   0    +20candyof30sunnyday                0
l  260   1    0                                   0
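To make the expected column concrete, here is a minimal sketch of the behaviour I have in mind. It assumes that every minus sign followed directly by digits is a subtraction and that everything else can be ignored. The helper name `subtracted` and the regex are only my guess at an approach, and I use a default integer index because the row labels in my sample repeat.

```python
import re
import pandas as pd

# First two groups of the sample data (default integer index).
df = pd.DataFrame({
    "A": [237, 243, 243, 243, 243, 243, 246, 246, 246, 246],
    "B": [1, 0, 0, 0, 0, 1, 0, 0, 0, 1],
    "C": ["0", "-10candy", "-15candy", "again", "-15candy+10of30000", "0",
          "-10", "-20candy+10of20", "-10", "0"],
})

# Assumption: a subtraction is a "-" immediately followed by digits.
SUBTRACTION = re.compile(r"-(\d+)")

def subtracted(text):
    """Sum every number that directly follows a minus sign in one cell."""
    return sum(int(n) for n in SUBTRACTION.findall(str(text)))

# Per-row amounts, then a running total that restarts for each value of A.
df["candy_substractions"] = df["C"].map(subtracted).groupby(df["A"]).cumsum()
print(df)
```

For groups 237, 243 and 246 this reproduces the column shown above. Column B is not needed here, since A already identifies the groups.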

I also want to compute how many candies have been added to each group. There are several candy types, which I know in advance and which can be defined before processing.

    A    B    C                              candy_30000   candy_20000
a  237   1    0                                   0             0
b  243   0    -10candy                            0             0
c  243   0    -15candy                            0             0
d  243   0    again                               0             0
e  243   0    -15candy+10of3000067                10            0
e'  243   0   +10of30000+5of20                    20            5
f  243   1    0                                   20            5
g  246   0    -10                                 0             0
h  246   0    -20candy+10of20                     0             10
i  246   0    -10                                 0             10
j  246   1    0                                   0             10
h  248   0    newcandybagopened                   0             0
i  248   1    0                                   0             0
j  260   0    candybag70with74%~morespecialchars) 0             0
k  260   0    +20candyof30sunnyday                20            0
l  260   1    0                                   20            0

The problem here is that candies of type 20000 are sometimes written as 20, or maybe even 200, so I would like to keep those spellings in a set that I can modify easily, and the same for the other candy type. Whether it's a subtraction or an addition is always indicated by its arithmetic operator, (-) or (+) respectively. There are also random numbers and messages that I don't want to take into account, as in rows e or j.
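As a rough sketch of the parsing rules I mean, assuming an addition always looks like `+<amount>of<type>` or `+<amount>candyof<type>`, and that a type counts as a match when it starts with one of the known spellings (so `3000067` is read as 30000): the names `CANDY_ALIASES` and `added` are hypothetical, and the prefix rule is only my assumption based on the sample rows.

```python
import re

# Hypothetical alias table: edit these sets to add or remove spellings.
CANDY_ALIASES = {
    "candy_30000": {"30000", "30"},
    "candy_20000": {"20000", "200", "20"},
}

# Assumption: "+", an amount, an optional "candy", "of", then the type digits.
ADDITION = re.compile(r"\+(\d+)(?:candy)?of(\d+)")

def added(text):
    """Return the amount added per candy type for a single cell of column C."""
    totals = dict.fromkeys(CANDY_ALIASES, 0)
    for amount, kind in ADDITION.findall(str(text)):
        for column, aliases in CANDY_ALIASES.items():
            # Prefix match, so trailing noise such as "67" is ignored.
            if kind.startswith(tuple(aliases)):
                totals[column] += int(amount)
                break
    return totals

print(added("+10of30000+5of20"))
print(added("candybag70with74%~andmaybemorespecialchars)"))
```

The per-row dictionaries could then presumably be turned into columns and accumulated per group with `groupby("A")` and `cumsum()`, in the same way as the subtraction column, but I have not verified that on the real data.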

Sorry, but I have no idea where to start. Any advice?

Thank you so much.



Read more here: https://stackoverflow.com/questions/66275783/how-to-process-information-on-one-column-and-then-build-a-logical-statement

Content Attribution

This content was originally published by enon97 at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
