How do I solve a Python XLS format problem?

My Python code is below. It works with a valid .xls file, but my .xls files are a little different: normally I read them as HTML. However, when I change "file_excel" and "pandas.read_excel" to "file_html" and "pandas.read_html", it does not work. Can you suggest a way to overcome this problem?

I have several Excel files (.xls) in the "deneme" folder. I can concatenate them into one Excel file with the code below. But I have another batch of .xls files in another folder called "ORNEKLEM". When I run this code with that path, it returns an error message saying that they are corrupted. I used to get around this problem by using read_html instead of read_excel, but this time it does not work.

The error message I get is: Unsupported format, or corrupt file: Expected BOF record;

import glob
import pandas

# Collect one DataFrame per workbook, then concatenate once at the end
# (DataFrame.append was removed in pandas 2.0).
frames = []

for file_excel in glob.glob("/Users/ooral/Desktop/deneme/*.xls"):
    print(file_excel)
    df_file = pandas.read_excel(file_excel, sheet_name='Sheet1')
    frames.append(df_file)

df_all = pandas.concat(frames, ignore_index=True)
df_all.to_excel('/Users/ooral/Desktop/deneme/all.xlsx', index=False)
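
Since the "Expected BOF record" error generally means the file is not a real binary Excel workbook, one way to narrow this down might be to look at the first bytes of each file before choosing a reader. The sketch below is only an illustration: sniff_format and read_any are hypothetical helper names, the "html" and "xml" labels are rough guesses based on the leading markup, and the tab-separated fallback is an assumption about what such exports often contain.

```python
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # container used by binary .xls files
ZIP_MAGIC = b"PK\x03\x04"                       # .xlsx files are zip archives
UTF8_BOM = b"\xef\xbb\xbf"


def sniff_format(path):
    """Guess a file's real format from its first bytes (hypothetical helper)."""
    with open(path, "rb") as fh:
        head = fh.read(2048)
    if head.startswith(OLE2_MAGIC):
        return "xls"
    if head.startswith(ZIP_MAGIC):
        return "xlsx"
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    text = head.lstrip().lower()
    if b"<workbook" in text:
        return "xml"   # looks like an XML spreadsheet, not an HTML table
    if text.startswith(b"<"):
        return "html"  # some other markup; assumed to be an HTML table export
    return "text"      # anything else, e.g. delimited text


def read_any(path):
    """Pick a pandas reader based on the sniffed format (assumptions noted inline)."""
    import pandas  # imported here so sniff_format works without pandas installed

    kind = sniff_format(path)
    if kind in ("xls", "xlsx"):
        return pandas.read_excel(path, sheet_name="Sheet1")
    if kind == "html":
        # read_html returns a list of tables; taking the first one is an assumption
        return pandas.read_html(path)[0]
    if kind == "text":
        # assumption: tab-separated values; adjust sep after inspecting the file
        return pandas.read_csv(path, sep="\t")
    raise ValueError(f"{path}: detected {kind!r}, no reader chosen in this sketch")
```

Printing sniff_format(file_excel) for each file in the "ORNEKLEM" folder should at least show whether those files differ in kind from the ones in "deneme", which is useful to know before deciding between read_excel and read_html.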



Read more here: https://stackoverflow.com/questions/67016030/how-do-i-solve-python-xls-format-problem

Content Attribution

This content was originally published by OYTUN ORAL at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
