I have added a query (Last5DatesForHorse) to your database.
Code:SELECT [running lines].horse, [running lines].rctrack, [running lines].RCDate, [running lines].rcRace, [running lines].[Date] FROM [running lines] WHERE ((([running lines].[date]) In ( select top 5 XX.[Date] from [running lines] as XX WHERE XX.horse = [running lines].horse Order By XX.[Date] desc )) ) GROUP BY [running lines].horse, [running lines].rctrack, [running lines].RCDate, [running lines].rcRace, [running lines].[Date] ORDER BY [running lines].horse, [running lines].[Date] DESC;
Thanks. I'll give it a test tonight. If this works, let me know if I can make a small contribution to the forum for all the help.
I tested it on the demo database and it worked quickly and just fine. I tweaked it slightly to change the sort sequence and add one other field. It worked fine again. I was thrilled. So I tested it on my full database which is about 60 thousand records and growing. It started grinding away, but then it seemed to just hang. I put it back into it's original form in case one of my tweaks caused an issue (which I doubt), but I got the same thing result. I'm not sure where to go with it from here. I may play around with it a little more using a demo database that's larger than the one I gave you but smaller than the full database to see what happens. I wouldn't think 60k records should kill it, but it does seem that way. If 60k is an issue, the eventual 500k or more will be impossible.
Last edited by wcrimi; 12-29-2014 at 08:41 PM.
Update.
I ran the query with about 30K records and it took about 30 minutes to run, but it did work.
There may be a way I can break up the full table into logical units (by Track, Race Distance, Class etc...) and run them separately. That won't be ideal, but it's at least a potential solution. Unless someone has an idea for making the query more efficient, I'm going to stop here and say thank you for the help.
I have a few ideas for manipulating my data that will hopefully improve the speed of this query, but as far as I am concerned this one can be closed. I want to thank you for all the help. My frustration must have been showing. Sorry about that.
No problem. Do you want to post the latest SQL you are using? 30 minutes is a long time, but it could be a lack of indexing (maybe), or the hardware you're using, or other load on network..... or post db with large amount of data.
I'm running it on my PC.
SELECT [running lines].rctrack, [running lines].RCDate, [running lines].rcRace, [running lines].horse, [running lines].Date, [running lines].Race
FROM [running lines]
WHERE ((([running lines].date) In (select top 5 XX.[Date] from
[running lines] as XX
WHERE XX.horse = [running lines].horse
Order By XX.[Date] desc )))
GROUP BY [running lines].rctrack, [running lines].RCDate, [running lines].rcRace, [running lines].horse, [running lines].Date, [running lines].Race
ORDER BY [running lines].rctrack, [running lines].RCDate, [running lines].rcRace, [running lines].horse, [running lines].Date DESC;
I ran your latest query on the sample database you gave before (demo).
using this
and getCode:Sub Checkspeed() Dim starttime As Date Dim endtime As Date starttime = Now Debug.Print starttime DoCmd.OpenQuery "LatestQueryByOP" endtime = Now Debug.Print endtime Debug.Print endtime - starttime & " Elapsed" End Sub
I'm running on a laptop. There is a noticeable time lag, but acceptable with 2200 records.Code:31/12/2014 11:32:12 AM 31/12/2014 11:32:18 AM 6.94444461259991E-05 Elapsed
Maybe someone else has some ideas to improve performance.
It seemed to do well up until around 10k records. There was a huge lag at 30K (I initially thought it was hung). At 30k, I think it took between 20-30 minutes. At full size just over 60k, it was about 45 minutes. I'm all ears if anyone has an idea.
I ran your latest query on the sample database you gave before (demo).
using this
and getCode:Sub Checkspeed() Dim starttime As Date Dim endtime As Date starttime = Now Debug.Print starttime DoCmd.OpenQuery "LatestQueryByOP" endtime = Now Debug.Print endtime Debug.Print endtime - starttime & " Elapsed" End Sub
I'm running on a laptop. There is a noticeable time lag, but acceptable with 2200 records.Code:31/12/2014 11:32:12 AM 31/12/2014 11:32:18 AM 6.94444461259991E-05 Elapsed
Maybe someone else has some ideas to improve performance.
wcrimi,
I was doing some googling and found a technique for improving Top N queries.
I made a copy of [running lines] table called MyRLWithID.
I added an autonumber field (ID) to the MyRLWithID table in the demo database.
I ran this query
And it was almost instantaneous.Code:SELECT * FROM ( SELECT a1.rctrack, a1.RCDate, a1.rcRace, a1.horse, a1.Date, a1.Race, COUNT(*) AS CategoryRank FROM [MyRLWithID] a1 INNER JOIN [MyRLWithID] a2 ON a1.horse = a2.horse AND a1.Id <= a2.ID GROUP BY a1.rctrack, a1.rcdate, a1.rcrace, a1.horse, a1.date, a1.race, a1.id ) AS RankingQuery WHERE CategoryRank <= 5 ORDER BY rctrack,horse, CategoryRank
Suggest you give it a try --just make sure you put in the proper table name.Code:31/12/2014 1:57:16 PM 31/12/2014 1:57:17 PM 1.15740695036948E-05 Elapsed
I've noticed that Access handles sub-selects very poorly compared to some of the other databases I've used.
One solution is to limit the number of Records the query looks at by including a semi-recent cut-off date. This would eliminate some of the data making your query faster. The downside is that you would lose some (older) data that might otherwise show up.
Another solution would be to use VBA to dump what your subquery normally finds into a temporary table and then just run your outer query on that temporary table. Worst case scenario, you would still get the huge delay when you regenerate your temporary table (which would have to be done every time new data is added to [running lines]), but any subsequent queries would be much faster.Code:SELECT [running lines].rctrack, [running lines].RCDate, [running lines].rcRace, [running lines].horse, [running lines].Date, [running lines].Race FROM [running lines] WHERE ((([running lines].date) In ( SELECT TOP 5 XX.[Date] FROM [running lines] as XX WHERE XX.horse = [running lines].horse AND XX.[Date] >= #6/1/2014# ORDER BY XX.[Date] DESC)) ) GROUP BY [running lines].rctrack, [running lines].RCDate, [running lines].rcRace, [running lines].horse, [running lines].Date, [running lines].Race ORDER BY [running lines].rctrack, [running lines].RCDate, [running lines].rcRace, [running lines].horse, [running lines].Date DESC;
I took Rawb's sql and made query "Rawb", the latest sql from wcrimi as query "LatestQueryByOP" and the ranking query "QuickerQueryJED" (adjusted as follow to get proper sorting)
I ran these with the following code using the demo database as supplied earlier.Code:SELECT * FROM ( SELECT a1.rctrack, a1.RCDate, a1.rcRace, a1.horse, a1.Date, a1.Race, COUNT(*) AS CategoryRank FROM MyRLWithID AS a1 INNER JOIN MyRLWithID AS a2 ON (a1.Id <= a2.ID) AND (a1.horse = a2.horse) GROUP BY a1.rctrack, a1.rcdate, a1.rcrace, a1.horse, a1.date, a1.race, a1.id) AS RankingQuery WHERE (((RankingQuery.[CategoryRank])<=5) ) ORDER BY RankingQuery.rctrack, RankingQuery.RCrace, RankingQuery.horse, RankingQuery.CategoryRank;
which calls the sub procedureCode:Sub testem() Dim i As Integer Dim q(2) As String q(0) = "LatestQueryByOP" q(2) = "Rawb" q(1) = "QuickerQueryJED" For i = 0 To UBound(q) Checkspeed (q(i)) Next End Sub
with these resultsCode:Sub Checkspeed(qName As String) Dim CTimer0 As MyTimer Set CTimer0 = New MyTimer Debug.Print vbCrLf & qName Debug.Print "Start " & Now CTimer0.StartTimer DoCmd.OpenQuery qName Debug.Print "End " & Now Debug.Print CTimer0.EndTimer & " ~millisecs" DoCmd.Close acQuery, qName Set CTimer0 = Nothing End Sub
Interesting numbers since the cut off in Rawb ofCode:LatestQueryByOP Start 31/12/2014 2:56:28 PM End 31/12/2014 2:56:32 PM 4453 ~millisecs QuickerQueryJED Start 31/12/2014 2:56:32 PM End 31/12/2014 2:56:32 PM 169 ~millisecs Rawb Start 31/12/2014 2:56:32 PM End 31/12/2014 2:56:37 PM 4993 ~millisecs
>= #6/1/2014#, returns 507 records and the "LatestQueryByOP" without the cutoff returns 607 records.
QuickerQueryJED also returns 607 records.
For reference the MyTimer class module is
Code:Option Compare Database Option Explicit ' 'This class never loads children, never does anything that should cause problems ' Private Declare Function apiGetTime Lib "winmm.dll" _ Alias "timeGetTime" () As Long Private lngStartTime As Long Private Sub Class_Initialize() 10 StartTimer End Sub 'THESE FUNCTIONS / SUBS ARE USED TO IMPLEMENT CLASS FUNCTIONALITY Function EndTimer() 'calculate duration by getting current time and subtracting start time 10 EndTimer = apiGetTime() - lngStartTime End Function Sub StartTimer() 'get the time when the Timer starts 10 lngStartTime = apiGetTime() End Sub
I'll review all this stuff this evening and tomorrow and let you know if I have any questions when I attempt it!! Thanks.
orange's results were really interesting. After seeing that, I went back and played around with my Query a little bit and found the problem.
The date check needs to be done in the outer query.
Updated Query:
Even with the speed improvement though, orange's solution is definitely the superior one:Code:SELECT [running lines].rctrack, [running lines].RCDate, [running lines].rcRace, [running lines].horse, [running lines].Date, [running lines].Race FROM [running lines] WHERE [running lines].date In (SELECT TOP 5 XX.[Date] FROM [running lines] as XX WHERE XX.horse = [running lines].horse AND XX.[Date] ORDER BY XX.[Date] DESC) AND [running lines].date)>=#6/1/2014# GROUP BY [running lines].rctrack, [running lines].RCDate, [running lines].rcRace, [running lines].horse, [running lines].Date, [running lines].Race ORDER BY [running lines].rctrack, [running lines].RCDate, [running lines].rcRace, [running lines].horse, [running lines].Date DESC;
Also, I just showed everyone that my computer is slower than orange'sCode:LatestQueryByOP Start 12/31/2014 5:31:39 PM End 12/31/2014 5:31:47 PM 7575 ~millisecs QuickerQueryJED Start 12/31/2014 5:31:37 PM End 12/31/2014 5:31:37 PM 334 ~millisecs Rawb Start 12/31/2014 5:31:37 PM End 12/31/2014 5:31:39 PM 2441 ~millisecs
Some of this stuff is a bit over my head. I would prefer not to use a date cutoff on this query because I actually have another query planned with a date cutoff that will be used in a different application. When it comes time to create that one, I can revisit it here. Basically, you guys are solving my next thread now. Thanks.
If I use the following (just with my database names) I should be in better shape?
SELECT *
FROM
(
SELECT a1.rctrack,
a1.RCDate,
a1.rcRace,
a1.horse,
a1.Date,
a1.Race,
COUNT(*) AS CategoryRank
FROM
[MyRLWithID] a1
INNER JOIN
[MyRLWithID] a2
ON a1.horse = a2.horse
AND a1.Id <= a2.ID
GROUP BY
a1.rctrack,
a1.rcdate,
a1.rcrace,
a1.horse,
a1.date,
a1.race,
a1.id
) AS RankingQuery
WHERE CategoryRank <= 5
ORDER BY rctrack,horse, CategoryRank
I created a new query RawbRevised based on Rawb's adjusted SQL. I also added an ID field autonumber to [running lines]. I did not change the composite Primary Key that wcrimi has set up. I copied my query to use [running lines] tables, as the other queries are doing.
My latest timings are:
wcrimi -- Once you add a field to your table, call it ID, make it autonumber. Then you should be able to useCode:LatestQueryByOP Start 31/12/2014 5:48:39 PM End 31/12/2014 5:48:45 PM 6034 ~millisecs QuickerQueryJED2 Start 31/12/2014 5:48:45 PM End 31/12/2014 5:48:45 PM 235 ~millisecs Rawb Start 31/12/2014 5:48:45 PM End 31/12/2014 5:48:52 PM 6968 ~millisecs RawbRevised Start 31/12/2014 5:48:52 PM End 31/12/2014 5:48:54 PM 1828 ~millisecs
which is my query QuickerQueryJED2.Code:SELECT * FROM ( SELECT a1.rctrack, a1.RCDate, a1.rcRace, a1.horse , a1.Date, a1.Race, COUNT(*) AS CategoryRank FROM [Running lines] AS a1 INNER JOIN [Running lines] AS a2 ON (a1.horse = a2.horse) AND (a1.Id <= a2.ID) GROUP BY a1.rctrack, a1.rcdate, a1.rcrace, a1.horse, a1.date, a1.race, a1.id) AS RankingQuery WHERE (((RankingQuery.[CategoryRank])<=5) ) ORDER BY RankingQuery.rctrack, RankingQuery.RCrace, RankingQuery.horse, RankingQuery.CategoryRank;
Good luck. It's been a learning effort for sure.