I have two data frames that are similar in structure, and I need to update one of them based on the other.

Below are the two tables as pandas DataFrames. I am trying to merge DF1 and DF2 on the columns cat, sub_cat and date, so that these columns act as the merge keys and only the count column is updated. DF2 may also contain rows that have no match in DF1 (based on those columns); these should simply be added to the resulting DF.

DF1

   cat    sub_cat    date        count
1  cat_1  sub_cat_1  2020-02-01  1
2  cat_2  sub_cat_1  2020-02-01  2
3  cat_2  sub_cat_1  2020-01-20  8
4  cat_1  sub_cat_1  2020-02-02  0

DF2

   cat    sub_cat    date        count
1  cat_1  sub_cat_1  2020-02-01  3
2  cat_2  sub_cat_1  2020-02-01  2
3  cat_3  sub_cat_1  2020-02-02  5

Here is the resulting DF3

   cat    sub_cat    date        count
1  cat_1  sub_cat_1  2020-02-01  3
2  cat_2  sub_cat_1  2020-02-01  2
3  cat_3  sub_cat_1  2020-02-02  5
4  cat_2  sub_cat_1  2020-01-20  8
5  cat_1  sub_cat_1  2020-02-02  0

For reference, rows 1 to 3 of DF3 are the ones merged or added from DF2; rows 4 and 5 are carried over unchanged from DF1.

I looked at other questions and answers and tried using df.set_index as below:

df1.set_index(['cat', 'sub_cat', 'date'])
df2.set_index(['cat', 'sub_cat', 'date'])
df1.update(df2)

But the code above also replaces the dates, which I do not want.
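
A likely reason for this behaviour: set_index returns a new DataFrame rather than modifying the original, so df1 and df2 keep their default integer index and update aligns rows on that index, overwriting every column, including the dates. update also never adds rows that are missing from the frame it is called on. Below is a minimal sketch of one possible alternative, not the only way to do it. It assumes each (cat, sub_cat, date) combination appears at most once within each frame. The names keys and df3 are illustrative, and the frames are rebuilt from the sample tables with dates kept as plain strings.

```python
import pandas as pd

# Columns that identify a row (the merge criteria from the question).
keys = ["cat", "sub_cat", "date"]

# Sample frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "cat": ["cat_1", "cat_2", "cat_2", "cat_1"],
    "sub_cat": ["sub_cat_1"] * 4,
    "date": ["2020-02-01", "2020-02-01", "2020-01-20", "2020-02-02"],
    "count": [1, 2, 8, 0],
})
df2 = pd.DataFrame({
    "cat": ["cat_1", "cat_2", "cat_3"],
    "sub_cat": ["sub_cat_1"] * 3,
    "date": ["2020-02-01", "2020-02-01", "2020-02-02"],
    "count": [3, 2, 5],
})

# Put DF2 first so its rows win when the key columns collide,
# then keep only the first occurrence of each key combination.
# Assumption: keys are unique within each individual frame.
df3 = (
    pd.concat([df2, df1], ignore_index=True)
      .drop_duplicates(subset=keys, keep="first")
      .reset_index(drop=True)
)

print(df3)
```

With the sample data this keeps DF2's rows wherever the keys match, retains the unmatched DF1 rows, and produces the same row order as DF3 above; the index starts at 0 rather than 1.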



Read more here: https://stackoverflow.com/questions/65705046/i-have-two-data-frames-which-are-similar-in-structure-i-need-to-update-one-of-t

Content Attribution

This content was originally published by user3801185 at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
