Input:
import pandas as pd
df = pd.DataFrame([[121,'Customer Comments xxxx ttttt','loan, mortgage, payment, refinance, rate, new, time, credit, pay, current'],
[34,'Customer Comments xxxx','loan, mortgage, payment, refinance, rate, new, time, credit, pay, services'],
[356,'Customer Comments xxxx','loss, make, payment, refinance, rate, new, time, credit, pay, current'],
[908,'Customer Comments aaaaa','portal, improve, online, top, covid, web, deal, competitive, take, lost'],
[4356,'Customer Comments aaassds','portal, improve, website, top, covid, web, deal, competitive, take, care'],
[3333,'Customer Comments xxxx','communication, make, sure, process, rate, company, timely, interest, customer, know'],
[33456,'Customer Comments xxxx','communication, make, sure, process, rate, company, timely, interest, customer, lot']]
, columns=['Loan Number','Commetns','Topic_Keywords'])
df2=pd.DataFrame([[0,'loan, mortgage, payment, refinance, rate, new, time, credit, pay, current','Servicing','Refinance'],
[5,'closing, survey, time, notary, company, date, title, day, close, cost','Origination','Loan closing'],
[9,'service, customer, keep, good, work, excellent, great, continue, job, company','Servicing','good service'],
[6,'loan, phone, call, process, person, email, contact, time, processor, communication','Servicing','phone call process'],
[4, 'loan, helpful, processor, officer, professional, staff, knowledgeable, hire, work, process','Servicing','Staff/Agent behaviour'],
[3, 'process, easy, nothing, refinance, entire, whole, experience, time, everything, start','Origination','OnBoarding'],
[8, 'great, experience, everything, job, overall, company, nothing, work, mortgage, everyone','Servicing','good service'],
[1, 'portal, improve, online, top, covid, web, deal, competitive, take, care','Servicing','websites'],
[2, 'communication, make, sure, process, rate, company, timely, interest, customer, know', 'Origination','OnBoarding'],
[7, 'process, anything, website, app, change, think, easy, thing, use, mobile', 'Servicing','websites']]
,columns=['Dominant_Topic','Topic_Keywords','Cate','SubCategory'])
Expected output:
outdf=pd.DataFrame([[121,'Customer Comments xxxx ttttt','loan, mortgage, payment, refinance, rate, new, time, credit, pay, current','Servicing','Refinance',10,100],
[34,'Customer Comments xxxx','loan, mortgage, payment, refinance, rate, new, time, credit, pay, services','Servicing','Refinance',9,90],
[356,'Customer Comments xxxx','loss, make, payment, refinance, rate, new, time, credit, pay, current','Servicing','Refinance',8,80],
[908,'Customer Comments aaaaa','portal, improve, online, top, covid, web, deal, competitive, take, lost','Servicing','websites',9,90],
[4356,'Customer Comments aaassds','portal, improve, website, top, covid, web, deal, competitive, take, care','Servicing','websites',10,100],
[3333,'Customer Comments xxxx','communication, make, sure, process, rate, company, timely, interest, customer, know','Origination','OnBoarding',9,90],
[33456,'Customer Comments xxxx','communication, make, sure, process, rate, company, timely, interest, customer, lot','Origination','OnBoarding',9,90]],
columns=['Loan Number','Commetns','Topic_Keywords','Category','subCategory','String_match','match_score'])
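I ran topic modeling and extracted the topic keywords for each comment. Now I want to assign each comment the category and subcategory from the second data frame (df2), picking the df2 row whose Topic_Keywords have the most words in common with the comment's Topic_Keywords, and report the match count and score as in the String_match and match_score columns above.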
Let me know if you have any questions. Please don't downvote.
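One possible way to do the matching described above is to turn each keyword string into a set and count the shared words. The following is only a minimal sketch under a few assumptions: the helper names `keyword_set` and `assign_best_topic` are made up for illustration, `match_score` is assumed to be the match count as a percentage of the comment's keywords (inferred from the 10/100, 9/90, 8/80 pattern in the expected output), and ties go to the first matching row of df2. It uses a trimmed copy of the data so that it runs on its own.

```python
import pandas as pd

def keyword_set(text):
    """Split a comma-separated keyword string into a set of stripped words."""
    return {w.strip() for w in text.split(",") if w.strip()}

def assign_best_topic(comments, topics):
    """Attach the category/subcategory of the topics row with the largest keyword overlap."""
    topic_sets = [keyword_set(t) for t in topics["Topic_Keywords"]]
    rows = []
    for kw in comments["Topic_Keywords"]:
        words = keyword_set(kw)
        overlaps = [len(words & ts) for ts in topic_sets]
        # Assumption: on a tie, the first topics row with the maximum overlap wins.
        best = max(range(len(overlaps)), key=overlaps.__getitem__)
        rows.append({
            "Category": topics["Cate"].iloc[best],
            "subCategory": topics["SubCategory"].iloc[best],
            "String_match": overlaps[best],
            # Assumption: score = share of the comment's keywords that matched, in percent.
            "match_score": round(100 * overlaps[best] / len(words)) if words else 0,
        })
    return pd.concat([comments.reset_index(drop=True), pd.DataFrame(rows)], axis=1)

# Trimmed copies of the question's data frames (three rows each).
df = pd.DataFrame(
    [[121, 'Customer Comments xxxx ttttt', 'loan, mortgage, payment, refinance, rate, new, time, credit, pay, current'],
     [356, 'Customer Comments xxxx', 'loss, make, payment, refinance, rate, new, time, credit, pay, current'],
     [908, 'Customer Comments aaaaa', 'portal, improve, online, top, covid, web, deal, competitive, take, lost']],
    columns=['Loan Number', 'Commetns', 'Topic_Keywords'])
df2 = pd.DataFrame(
    [[0, 'loan, mortgage, payment, refinance, rate, new, time, credit, pay, current', 'Servicing', 'Refinance'],
     [1, 'portal, improve, online, top, covid, web, deal, competitive, take, care', 'Servicing', 'websites'],
     [2, 'communication, make, sure, process, rate, company, timely, interest, customer, know', 'Origination', 'OnBoarding']],
    columns=['Dominant_Topic', 'Topic_Keywords', 'Cate', 'SubCategory'])

out = assign_best_topic(df, df2)
print(out[['Loan Number', 'Category', 'subCategory', 'String_match', 'match_score']])
```

Note that with plain set overlap on the full data, loan 4356 shares 9 words with the "websites" topic and loan 3333 shares all 10 with the "OnBoarding" topic, which differs from the 10 and 9 listed in the expected output, so the intended scoring rule may need to be confirmed.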
Read more here: https://stackoverflow.com/questions/67003238/match-the-word-frequency-and-assign-max-scores-category-and-sub-category-from-a
Content Attribution
This content was originally published by deepan1303 chakravarthi at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.