Just wondering if someone can help me with an access query that's causing me problems.
I have a few sub queries to identify a customers last order date and their second to last order date:
To identify their last order date (not necessary to have this separately I suppose):
Code:
SELECT T1.email, max(T1.order_date) AS order_date
FROM invoicedsalesoldsystem AS T1
GROUP BY T1.email;
To identify their second to last order date:
Code:
SELECT T1.email, max(T1.order_date) AS order_date
FROM invoicedsalesoldsystem AS T1
WHERE T1.order_date < (SELECT max(T2.order_date) FROM invoicedsalesoldsystem T2 WHERE T2.email=T1.email)
GROUP BY T1.email;
Finally, I have a third query which draws upon these two and does some other calculations:
Code:
SELECT DISTINCT
invoicedsalesoldsystem.email,
max(invoicedsalesoldsystem.firstname) as firstname,
max(invoicedsalesoldsystem.surname) as surname,
lookupMaxDate.order_date,
lookup2ndMaxDate.order_date,
datediff("d", lookup2ndMaxDate.order_date, lookupMaxDate.order_date) as daysvar,
min(invoicedsalesoldsystem.order_date) as firstorderdate,
datediff("d", min(invoicedsalesoldsystem.order_date), lookupMaxDate.order_date) as totaldaysvar,
round((datediff("d", min(invoicedsalesoldsystem.order_date), lookupMaxDate.order_date)/count(invoicedsalesoldsystem.order_number)),0) as frequency,
count(invoicedsalesoldsystem.order_number) as totalorders,
sum(invoicedsalesoldsystem.gbp_total) as totalrevenue,
max(invoicedsalesoldsystem.currency) as crncy
FROM (invoicedsalesoldsystem INNER JOIN lookupMaxDate ON invoicedsalesoldsystem.email = lookupMaxDate.email) LEFT JOIN lookup2ndMaxDate ON invoicedsalesoldsystem.email = lookup2ndMaxDate.email
GROUP BY
invoicedsalesoldsystem.email,
lookupMaxDate.order_date,
lookup2ndMaxDate.order_date;
This all works fine when I tested it over 100 records. However, when I test it over the full dataset (500,000+ records) it just runs indefinitely. I left it for an hour once and it still hadn't completed so I had to cancel the execution.
Can someone advise me on a way to get this to run over a large dataset within a few minutes rather than potentially several hours?
Thanks,
Grant.