How to pull quotes and authors from a website in Python?

I have written the following code to pull the quotes from the webpage:

#importing python libraries

from bs4 import BeautifulSoup
import pandas as pd
pd.set_option('display.max_colwidth', 500)
import time
import requests
import random
from lxml import html

#collect first page of quotes

page = requests.get("https://www.kdnuggets.com/2017/05/42-essential-quotes-data-science-thought-leaders.html")

#create a BeautifulSoup object

soup=BeautifulSoup(page.content, 'html.parser')
soup

print(soup.prettify())

#find all quotes on the page

soup.find_all('ol')

#pull just the quotes and not the superfluous data

quote = soup.find(id='post-')
quote_list = quote.find_all('ol')
quote_list

At this point, I want to show just the text in a list, without the <li> or <ol> tags. I've tried using the .get_text() method, but I get an error saying:

ResultSet object has no attribute 'get_text'

How can I get only the text to return?
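For reference, the error arises because find_all() returns a ResultSet (a list of tags), while get_text() is defined on individual tags. A minimal sketch of calling it per tag follows; the inline sample_html is a hypothetical stand-in for the real page so the example runs without a network request, and the actual page markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in HTML; the real page structure is an assumption.
sample_html = """
<div id="post-">
  <ol>
    <li>First quote text</li>
    <li>Second quote text</li>
  </ol>
  <ol>
    <li>Third quote text</li>
  </ol>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
post = soup.find(id="post-")

# find_all() returns a ResultSet (a list of Tag objects), so call
# get_text() on each Tag rather than on the ResultSet itself.
quotes = [
    li.get_text(strip=True)
    for ol in post.find_all("ol")
    for li in ol.find_all("li")
]
print(quotes)
```

With the real page, sample_html would be replaced by page.content from the requests.get() call above.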

This is only for the first page of quotes; there is a second page that I also need to pull quotes from. I will also need to present the data from both pages in a table, with one column for the quotes and one column for the authors.
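One possible shape for the two-page table is sketched below. It assumes each list item holds a quote followed by a separator and then the author; the SEPARATOR value, the parse_quotes helper, and the sample HTML strings are all hypothetical and would need adjusting to the real markup.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical separator between quote and author; check the real page.
SEPARATOR = " -- "


def parse_quotes(html_text):
    """Return a list of {"quote", "author"} dicts from one page of HTML."""
    soup = BeautifulSoup(html_text, "html.parser")
    rows = []
    for li in soup.select("ol li"):
        text = li.get_text(" ", strip=True)
        # Split on the last separator so separators inside a quote survive.
        quote, sep, author = text.rpartition(SEPARATOR)
        if sep:
            rows.append({"quote": quote.strip(), "author": author.strip()})
        else:
            # No separator found: keep the text and leave the author empty.
            rows.append({"quote": text, "author": None})
    return rows


# Hypothetical stand-ins for the HTML of the two pages.
page_one = "<ol><li>Quote A -- Author One</li><li>Quote B -- Author Two</li></ol>"
page_two = "<ol><li>Quote C -- Author Three</li></ol>"

rows = []
for html_text in (page_one, page_two):
    rows.extend(parse_quotes(html_text))

df = pd.DataFrame(rows, columns=["quote", "author"])
print(df)
```

In practice each stand-in string would be replaced by requests.get(url).text for the corresponding page.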

Help is greatly appreciated. I'm still new to Python, I've spent 8 hours getting the code to this point, and I feel stuck and discouraged.



Read more here: https://stackoverflow.com/questions/64900011/how-to-pull-quotes-and-authors-from-a-website-in-python

Content Attribution

This content was originally published by dhalton at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
