Python DataFrame: extract a list of unique dates from a large DatetimeIndex with a few million rows

My DataFrame has around 17 million rows. The index is a DatetimeIndex holding roughly one year of data at about one-second resolution. I want to extract the list of unique dates from it.

My code:

# sample df

df.index = DatetimeIndex(['2019-10-01 05:00:00', '2019-10-01 05:00:01',
               '2019-10-01 05:00:05', '2019-10-01 05:00:06',
               '2019-10-01 05:00:08', '2019-10-01 05:00:09',
               '2019-10-01 05:00:12', '2019-10-01 05:00:13',
               '2019-10-01 05:00:15', '2019-10-01 05:00:17',
               ...
               '2020-11-14 19:59:21', '2020-11-14 19:59:23',
               '2020-11-14 19:59:31', '2020-11-14 19:59:32',
               '2020-11-14 19:59:37', '2020-11-14 19:59:38',
               '2020-11-14 19:59:45', '2020-11-14 19:59:46',
               '2020-11-14 19:59:55', '2020-11-14 19:59:56'],
              dtype='datetime64[ns]', name='timestamp', length=17796121, freq=None)
dates = df.index.strftime('%Y-%m-%d').unique()

The code above produces the expected output, but it takes around five minutes. Is there a faster way to get the dates?
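One direction that may be worth trying, as a minimal sketch rather than a measured fix: strftime formats every one of the 17 million timestamps into a Python string before deduplicating, so it could help to deduplicate first and format afterwards. The snippet below truncates each timestamp to midnight with DatetimeIndex.normalize(), takes the unique days, and only then formats the few hundred remaining values. The small df built here is a hypothetical stand-in for the real data, and the actual speedup on the full index has not been timed.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real frame: a one-second-resolution index
# covering a little over two days (the real index has ~17 million rows).
idx = pd.date_range("2019-10-01 05:00:00", periods=200_000, freq="s", name="timestamp")
df = pd.DataFrame({"value": np.arange(len(idx))}, index=idx)

# Truncate every timestamp to midnight; this stays in datetime64 and is vectorized.
days = df.index.normalize()

# Deduplicate while still in datetime64, before any string conversion.
unique_days = days.unique()

# Format only the unique days instead of every row.
dates = unique_days.strftime("%Y-%m-%d").tolist()
print(dates)
```

If datetime values are acceptable instead of strings, unique_days can be used directly and the final strftime call skipped.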



Read more here: https://stackoverflow.com/questions/64939852/python-dataframe-extract-list-of-unique-dates-from-a-big-datetimeindex-of-few-mi

Content Attribution

This content was originally published by Mainland at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
