How to correctly fetch a DataFrame from a URL in Python?

I want to automate fetching data from the USDA website, where I am only interested in a few categories. To do so, I tried the following:

import io
import requests
import pandas as pd

# Base report URL; it already carries some query parameters
url = 'https://www.marketnews.usda.gov/mnp/ls-report-retail?&repType=summary&portal=ls&category=Retail&species=BEEF&startIndex=1'

# Filters I want to apply to the report
query_list = {"Report Type":"item","species":"BEEF","portal":"ls","category":"Retail", "Regions":"National", "Grades":"ALL", "Cut": "All", "Dates_from":"2019-03-01", "Dates_to":"2021-02-01"}
req = requests.get(url, params=query_list)
# Try to parse the response body as text delimited by two or more whitespace characters
df = pd.read_csv(io.StringIO(req.text), sep=r"\s\s+", engine="python")
df.to_csv("usda_report.csv")

But I couldn't get the dataframe I expected. Here is the error I get when I run the attempt above:

ParserError: Expected 1 fields in line 117, saw 2. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.
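
One possible reason for this error is that the response body is an HTML page rather than delimited text, so `read_csv` ends up splitting markup. The sketch below is one way to check that before parsing; it is not a confirmed fix. It assumes the report is rendered as an HTML `<table>`, which I have not verified for this endpoint. The names `looks_like_html`, `extract_rows`, `fetch_rows` and `TableRowParser` are hypothetical helpers made up for illustration.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Collapse runs of whitespace inside the cell
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def looks_like_html(text, content_type=""):
    """Heuristic check: Content-Type mentions html, or the body starts like HTML."""
    head = text.lstrip()[:200].lower()
    return "html" in content_type.lower() or head.startswith(("<!doctype", "<html"))


def extract_rows(html_text):
    """Return all table rows found in html_text as lists of cell strings."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows


def fetch_rows(url, params):
    """Download the page and return its table rows (assumes an HTML response)."""
    import requests  # third-party; imported here so the helpers above stay stdlib-only

    resp = requests.get(url, params=params, timeout=30)
    resp.raise_for_status()
    if not looks_like_html(resp.text, resp.headers.get("Content-Type", "")):
        raise ValueError("Response is not HTML; a delimited-text parser may fit better")
    return extract_rows(resp.text)
```

If the first extracted row turns out to be the header (again an assumption), the result could be loaded with `pd.DataFrame(rows[1:], columns=rows[0])`. If `lxml` or `beautifulsoup4` is installed, `pd.read_html` does a similar job in one call.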

desired output

I need to pass these query parameters to select the right data: Category = "Retail"; Report Type = "Item"; Species = "Beef"; Region(s) = "National"; Dates_from = "2019-03-01"; Dates_to = "2021-02-15".

Ideally, I want to pass those parameters and get the following dataframe (head of the dataframe):

[Image in the original post: head of the desired dataframe]

update

In my desired output, I need these columns: Date, Region, Grade, Cut, Retail Items, Outlets (number of stores), Weighted Avg.

With the attempt above, I couldn't get an output dataframe like this. How should I fetch the data correctly? Can anyone suggest a way to do this in pandas?



Read more here: https://stackoverflow.com/questions/66326122/how-to-correctly-fetch-dataframe-from-url-in-python

Content Attribution

This content was originally published by Hamilton at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
