Deal with carriage return while filtering rows of pandas dataframe with str.contains and AND operation

I would like to filter my dataframe on the rows whose "Libelle" string column contains every element in a specific list of strings.

I found some great information about how to filter on rows that contain every string in a list (see here).

The solution is to use the following regex, with one lookahead per word:

df[df['Libelle'].str.contains(r'(?=.*word1)(?=.*word2)...(?=.*word3)', case = False, regex = True)]
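
For illustration, here is a minimal sketch of how such a pattern could be built from a list of words instead of being typed by hand. The sample data and the `words` list are hypothetical; only the column name "Libelle" comes from my dataframe.

```python
import re
import pandas as pd

# Hypothetical sample data; only the column name "Libelle" is from the question.
df = pd.DataFrame({"Libelle": ["red apple pie", "apple tart", "Pie with RED apple"]})
words = ["apple", "red", "pie"]  # hypothetical search terms

# One lookahead per word; re.escape protects against regex metacharacters in a term.
pattern = "".join(f"(?=.*{re.escape(w)})" for w in words)

# A row is kept only if every lookahead succeeds, i.e. every word is present.
mask = df["Libelle"].str.contains(pattern, case=False, regex=True)
result = df[mask]
```

Each lookahead checks for one word without consuming any text, so the order of the words in the cell does not matter.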

Now I am facing an issue: it works, except for the rows where the string contains a carriage return.

So I tried the following regexes, in vain:

r'(?=.*word1[.\r\n]*)(?=.*word2[.\r\n]*)...(?=.*word3[.\r\n]*)'

or

r'(?=[.\r\n]*word1[.\r\n]*)(?=[.\r\n]*word2[.\r\n]*)...(?=[.\r\n]*word3[.\r\n]*)'
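
A possible explanation, offered as a sketch rather than a confirmed answer: by default `.` does not match `\n`, and inside a character class such as `[.\r\n]` the dot is a literal dot, so neither attempt lets the lookahead scan across a line break. One option that may help is passing `flags=re.DOTALL` to `str.contains`. The sample data below is hypothetical and assumes the cells contain `\r\n` line breaks.

```python
import re
import pandas as pd

# Hypothetical data: the second cell contains a line break, as an Excel cell might.
df = pd.DataFrame({"Libelle": ["word1 and word2", "word1\r\nword2", "word1 only"]})
words = ["word1", "word2"]

# One lookahead per word, as in the original pattern.
pattern = "".join(f"(?=.*{re.escape(w)})" for w in words)

# Default behaviour: "." stops at "\n", so the multi-line row is missed.
default_mask = df["Libelle"].str.contains(pattern, case=False, regex=True)

# With re.DOTALL, "." also matches "\n", so the lookaheads can cross lines.
dotall_mask = df["Libelle"].str.contains(
    pattern, case=False, flags=re.DOTALL, regex=True
)
```

Under these assumptions, `default_mask` misses the second row and `dotall_mask` keeps it. Replacing `.*` with `[\s\S]*` in each lookahead should behave the same way without needing the flag.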

I do not know if this helps, but the dataframe contains data loaded from an Excel file.

Has anyone faced a similar problem? Any help is welcome. Many thanks!



Read more here: https://stackoverflow.com/questions/67169241/deal-with-carriage-return-while-filtering-rows-of-pandas-dataframe-with-str-cont

Content Attribution

This content was originally published by Etienne Numérogliss Drt at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
