Match keyword overlap and assign the highest-scoring category and subcategory from another DataFrame in pandas

Input:

import pandas as pd

df = pd.DataFrame([[121,'Customer Comments xxxx ttttt','loan, mortgage, payment, refinance, rate, new, time, credit, pay, current'],
[34,'Customer Comments xxxx','loan, mortgage, payment, refinance, rate, new, time, credit, pay, services'],
[356,'Customer Comments xxxx','loss, make, payment, refinance, rate, new, time, credit, pay, current'],
[908,'Customer Comments aaaaa','portal, improve, online, top, covid, web, deal, competitive, take, lost'],
[4356,'Customer Comments aaassds','portal, improve, website, top, covid, web, deal, competitive, take, care'],
[3333,'Customer Comments xxxx','communication, make, sure, process, rate, company, timely, interest, customer, know'],
[33456,'Customer Comments xxxx','communication, make, sure, process, rate, company, timely, interest, customer, lot']],
columns=['Loan Number','Commetns','Topic_Keywords'])


df2 = pd.DataFrame([[0,'loan, mortgage, payment, refinance, rate, new, time, credit, pay, current','Servicing','Refinance'],
[5,'closing, survey, time, notary, company, date, title, day, close, cost','Origination','Loan closing'],
[9,'service, customer, keep, good, work, excellent, great, continue, job, company','Servicing','good service'],
[6,'loan, phone, call, process, person, email, contact, time, processor, communication','Servicing','phone call process'],
[4, 'loan, helpful, processor, officer, professional, staff, knowledgeable, hire, work, process','Servicing','Staff/Agent behaviour'],
[3, 'process, easy, nothing, refinance, entire, whole, experience, time, everything, start','Origination','OnBoarding'],
[8, 'great, experience, everything, job, overall, company, nothing, work, mortgage, everyone','Servicing','good service'],
[1, 'portal, improve, online, top, covid, web, deal, competitive, take, care','Servicing','websites'],
[2, 'communication, make, sure, process, rate, company, timely, interest, customer, know',  'Origination','OnBoarding'],
[7, 'process, anything, website, app, change, think, easy, thing, use, mobile', 'Servicing','websites']]
,columns=['Dominant_Topic','Topic_Keywords','Cate','SubCategory'])

Expected output:

outdf=pd.DataFrame([[121,'Customer Comments xxxx ttttt','loan, mortgage, payment, refinance, rate, new, time, credit, pay, current','Servicing','Refinance',10,100],
[34,'Customer Comments xxxx','loan, mortgage, payment, refinance, rate, new, time, credit, pay, services','Servicing','Refinance',9,90],
[356,'Customer Comments xxxx','loss, make, payment, refinance, rate, new, time, credit, pay, current','Servicing','Refinance',8,80],
[908,'Customer Comments aaaaa','portal, improve, online, top, covid, web, deal, competitive, take, lost','Servicing','websites',9,90],
[4356,'Customer Comments aaassds','portal, improve, website, top, covid, web, deal, competitive, take, care','Servicing','websites',10,100],
[3333,'Customer Comments xxxx','communication, make, sure, process, rate, company, timely, interest, customer, know','Origination','OnBoarding',9,90],
[33456,'Customer Comments xxxx','communication, make, sure, process, rate, company, timely, interest, customer, lot','Origination','OnBoarding',9,90]],
columns=['Loan Number','Commetns','Topic_Keywords','Category','subCategory','String_match','match_score'])

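I ran topic modeling and extracted the topic keywords for each comment. Now I want to assign each comment a category and subcategory from the second data frame (df2), picking the df2 row whose Topic_Keywords share the most words with the comment's Topic_Keywords. String_match is the number of matching words, and match_score is that count as a percentage (10 of 10 words = 100).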

Let me know if you have any questions. Please don't downvote.
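
One possible approach, offered as a minimal sketch rather than a definitive answer: split each keyword string into a set, count the intersection with every topic row, and keep the row with the largest count. The helper names `keyword_set` and `assign_best_topic` are hypothetical, the demo uses only three rows from each frame above, and ties are assumed to go to the first matching topic. Note that under this rule loan 4356 scores 9 (it has "website" where topic 1 has "online") and loan 3333 scores 10, which differs from the expected output posted above.

```python
import pandas as pd

def keyword_set(text):
    """Split a comma-separated keyword string into a set of stripped words."""
    return {w.strip() for w in text.split(",") if w.strip()}

def assign_best_topic(comments, topics, key="Topic_Keywords"):
    """For each comment row, pick the topic row that shares the most keywords."""
    topic_sets = [keyword_set(t) for t in topics[key]]
    rows = []
    for text in comments[key]:
        words = keyword_set(text)
        overlaps = [len(words & ts) for ts in topic_sets]
        # Assumption: on a tie, the first topic with the maximum overlap wins.
        best = max(range(len(overlaps)), key=overlaps.__getitem__)
        rows.append({
            "Category": topics["Cate"].iloc[best],
            "subCategory": topics["SubCategory"].iloc[best],
            "String_match": overlaps[best],
            # Percentage of the comment's own keywords that matched.
            "match_score": 100 * overlaps[best] // max(len(words), 1),
        })
    matched = pd.DataFrame(rows)
    return pd.concat([comments.reset_index(drop=True), matched], axis=1)

# Reduced demo data: three rows taken from df and df2 in the question.
comments = pd.DataFrame(
    [[34, "loan, mortgage, payment, refinance, rate, new, time, credit, pay, services"],
     [4356, "portal, improve, website, top, covid, web, deal, competitive, take, care"],
     [3333, "communication, make, sure, process, rate, company, timely, interest, customer, know"]],
    columns=["Loan Number", "Topic_Keywords"])

topics = pd.DataFrame(
    [[0, "loan, mortgage, payment, refinance, rate, new, time, credit, pay, current", "Servicing", "Refinance"],
     [1, "portal, improve, online, top, covid, web, deal, competitive, take, care", "Servicing", "websites"],
     [2, "communication, make, sure, process, rate, company, timely, interest, customer, know", "Origination", "OnBoarding"]],
    columns=["Dominant_Topic", "Topic_Keywords", "Cate", "SubCategory"])

result = assign_best_topic(comments, topics)
print(result[["Loan Number", "Category", "subCategory", "String_match", "match_score"]])
```

The same function should work on the full df and df2 from the question, since it only relies on the Topic_Keywords, Cate, and SubCategory columns. Using sets ignores word order and duplicates, which seems to fit the "maximum words matching" requirement, though a weighted score would need a different measure.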



Read more here: https://stackoverflow.com/questions/67003238/match-the-word-frequency-and-assign-max-scores-category-and-sub-category-from-a

Content Attribution

This content was originally published by deepan1303 chakravarthi at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
