SUM function sometimes takes more than 30 seconds

I'm using POSTGRESQL database and have pretty simple query that calculates total revenue for selected time period (startDate and endDate input parameters).

Here is a query:

SELECT SUM(o."ConvertedTotal") as "totalRevenue"
FROM public.processedorder o
WHERE o."ReceivedDate" BETWEEN '2020-07-23T00:00:00.000Z' and '2021-06-05T00:00:00.000Z'

Here is public.processedorder table definition:

CREATE TABLE public.processedorder (
    "OrderId" int4 NOT NULL,
    "ChannelId" int4 NOT NULL,
    "ShippingId" int4 NOT NULL,
    "CountryId" int4 NOT NULL,
    "LocationId" int4 NOT NULL,
    "PackagingId" int4 NOT NULL,
    "ConvertedTotal" numeric(18,6) NOT NULL,
    "ConvertedSubtotal" numeric(18,6) NOT NULL,
    "ConvertedShippingCost" numeric(18,6) NOT NULL,
    "ConvertedShippingTax" numeric(18,6) NOT NULL,
    "ConvertedTax" numeric(18,6) NOT NULL,
    "ConvertedDiscount" numeric(18,6) NOT NULL,
    "ConversionRate" numeric(18,6) NOT NULL,
    "Currency" varchar(3) NOT NULL,
    "OriginalTotal" numeric(18,6) NOT NULL,
    "OriginalSubtotal" numeric(18,6) NOT NULL,
    "OriginalShippingCost" numeric(18,6) NOT NULL,
    "OrignalShippingTax" numeric(18,6) NOT NULL,
    "OriginalTax" numeric(18,6) NOT NULL,
    "OriginalDiscount" numeric(18,6) NOT NULL,
    "ReceivedDate" timestamp NOT NULL,
    "DispatchByDate" timestamp NOT NULL,
    "ProcessedDate" timestamp NOT NULL,
    "HoldOrCancel" bool NOT NULL,
    "CustomerHash" varchar(100) NOT NULL,
    "EmailHash" varchar(100) NOT NULL,
    "GetPostalCode" varchar(10) NOT NULL,
    "TagId" uuid NOT NULL,
    "timestamp" timestamp NOT NULL,
    "IsRMA" bool NOT NULL DEFAULT false,
    "ConversionType" int4 NOT NULL DEFAULT 0,
    "ItemWeight" numeric(18,6) NULL,
    "TotalWeight" numeric(18,6) NULL,
    "PackageWeight" numeric(18,6) NULL,
    "PackageCount" int4 NULL,
    CONSTRAINT processedorder_pkey PRIMARY KEY ("OrderId"),
    CONSTRAINT processedorder_tagid_unique UNIQUE ("TagId")
)
WITH (
    fillfactor=50,
    autovacuum_enabled=false
);
CREATE INDEX ix_processedorder_receieveddate ON public.processedorder USING btree ("ReceivedDate" DESC NULLS LAST) WITH (fillfactor='50');

Sometimes the query uses ix_processedorder_receieveddate index, sometimes not, I guess it depends on startDate and endDate params.

Here is a query plan of the query above:

Finalize Aggregate  (cost=92468.33..92468.34 rows=1 width=32) (actual time=371.130..371.131 rows=1 loops=1)
  ->  Gather  (cost=92468.11..92468.32 rows=2 width=32) (actual time=371.074..371.105 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Partial Aggregate  (cost=91468.11..91468.12 rows=1 width=32) (actual time=367.789..367.790 rows=1 loops=3)
              ->  Parallel Seq Scan on processedorder o  (cost=0.00..91121.46 rows=138661 width=7) (actual time=0.021..268.513 rows=114050 loops=3)
                    Filter: (("ReceivedDate" >= '2020-07-23 00:00:00'::timestamp without time zone) AND ("ReceivedDate" <= '2021-06-05 00:00:00'::timestamp without time zone))
                    Rows Removed by Filter: 296245

Basically the summation is over ~200,000 rows

Do you have any ideas guys, could it be because of invalid format of timestamp and it's not uses the index or what?

Thanks.



Read more here: https://stackoverflow.com/questions/67921852/sum-function-sometimes-takes-more-than-30-seconds

Content Attribution

This content was originally published by Vlad Kuzmich at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.

%d bloggers like this: