Cloud Stack Ninja

While splitting data into columns, a glitch left me with some noisy data: in a few rows, a fragment that belongs at the end of site ended up at the start of code.

    site          code
    ---           ---
0   apple_123     45
1   apple_456     xy_33
2   facebook_123  24
3   google_123    NaN
4   google_123    pq_51

I need to clean the data so that I get the following result:

    site            code
    ---             ---
0   apple_123       45
1   apple_456_xy    33
2   facebook_123    24
3   google_123      NaN
4   google_123_pq   51

I have been able to obtain the rows that need to be modified, but am unable to progress further:

import numpy as np
import pandas as pd

# Rebuild the noisy sample shown above
site = ['apple_123', 'apple_456', 'facebook_123', 'google_123', 'google_123']
code = [45, 'xy_33', 24, np.nan, 'pq_51']
df = pd.DataFrame({'site': site, 'code': code})

# Rows whose code is neither purely numeric nor missing, i.e. the rows to fix
df[~df.code.astype(str).str.isdigit() & df.code.notna()]
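
One possible way to go from that selection to the desired result is sketched below. It is not taken from the original thread, and it assumes every noisy code has the form "<prefix>_<digits>", as in the sample; codes in any other shape are left untouched. The names parts and noisy are only illustrative.

```python
import numpy as np
import pandas as pd

site = ['apple_123', 'apple_456', 'facebook_123', 'google_123', 'google_123']
code = [45, 'xy_33', 24, np.nan, 'pq_51']
df = pd.DataFrame({'site': site, 'code': code})

# Assumption: a noisy code looks like "<prefix>_<digits>".
# Rows that do not match the pattern get NaN in both extracted columns.
parts = df['code'].astype(str).str.extract(r'^(?P<prefix>.+)_(?P<number>\d+)$')

# Boolean mask of the rows where the pattern matched
noisy = parts['prefix'].notna()

# Move the stray prefix to the end of site and keep only the number in code
df.loc[noisy, 'site'] = df.loc[noisy, 'site'] + '_' + parts.loc[noisy, 'prefix']
df.loc[noisy, 'code'] = parts.loc[noisy, 'number'].astype(int)

print(df)
```

Because the regular expression both detects the noisy rows and splits them, the separate isdigit/isna filter is not needed in this variant. The code column still holds mixed values afterwards; pd.to_numeric(df['code']) could be applied if a numeric dtype is wanted.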


Read more here: https://stackoverflow.com/questions/64400328/pandas-modifying-values-in-dataframe-from-another-column

Content Attribution

This content was originally published by SaadH at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
