Page 1 of 2 12 LastLast
Results 1 to 15 of 18
  1. #1
    xcess is offline Novice
    Windows 7 32bit Access 2007
    Join Date
    Oct 2018
    Location
    Texas
    Posts
    9

    Query won't run on large database

    Hey all,

    I have a query I have proof-tested on a smaller database that either 1) won’t work or 2) is taking an incredible amount of time to run on the real (larger) database.

    I have two tables, one Master Table with ~1.5 million records (names, addresses, etc.) and one Names Table with ~1,500 records (just names).

    I want to know if a name from the Names Table matches a name in the Master Table, and vice-versa. If there’s a match, I want all of the information for that record in the Master Table to show up (name, address, etc.).

    The names likely won’t be exact. E.g. in the Master Table it might be listed as GRIFFEY KEN JR but in the Names Table it might be GRIFFEY KEN (missing the JR). Also, due to how the names are stored in the Master Table, there may be miscellaneous characters in front of the name, e.g. #121 GRIFFEY KEN JR %C/O.

    I have a query set up to perform this task. Due to the large # of records, I created a smaller test database and ran it, and it worked perfectly. However, when I run it on my actual database, it takes forever to run. I stop it after ~45 minutes, so I don’t know if it will actually finish.

    Is there a way to speed this up? Access didn’t show a progress bar, so I couldn’t tell if it was working or if it was going to crash.

    Thanks!

    My query is structured as show below:

    Name_Search: IIf(InStr([Names_Table]![Full Name],[Master_Table]![MAILTO])>0,[ Master_Table]![MAILTO],IIf(InStr([Master_Table]![MAILTO],[ Names_Table]![Full Name])>0,[ Master_Table]![MAILTO],0))



    Criteria
    <>0

    P.S. When I ran this query in the actual database, it used to give the error message data type mismatch, but I deleted all instances in both tables that weren’t text, so I don’t think this is the issue. I also ensured the field in each table format was set to TEXT. There are some records in the Master Table that begin with numbers, not sure if this has anything to do with it?

  2. #2
    June7's Avatar
    June7 is offline VIP
    Windows 10 Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    52,926
    You haven't provided a query statement, just one expression to calculate a field value.
    How to attach file: http://www.accessforums.net/showthread.php?t=70301 To provide db: copy, remove confidential data, run compact & repair, zip w/Windows Compression.

  3. #3
    isladogs's Avatar
    isladogs is offline MVP / VIP
    Windows 10 Access 2010 32bit
    Join Date
    Jan 2014
    Location
    Somerset, UK
    Posts
    5,977
    The obvious way to speed this up would be to have a unique primary key field that is consistent in each table!
    Do you have that?
    Colin, Access MVP, Website, email
    The more I learn, the more I know I don't know. When I don't know, I keep quiet!
    If I don't know that I don't know, I don't know whether to answer

  4. #4
    xcess is offline Novice
    Windows 7 32bit Access 2007
    Join Date
    Oct 2018
    Location
    Texas
    Posts
    9
    I will check if I have a unique primary key field consistent in each table when I work on it in the morning.

    I will also try scaling my test database to get time estimates, and try interpolating how long it should take. X minutes for 1,000 records, y minutes for 100,000 records, then approx z minutes for my 1.5 million records.

  5. #5
    Join Date
    Apr 2017
    Posts
    1,679
    Consider having back-end of your database as a SQL Server database.

  6. #6
    CJ_London is offline VIP
    Windows 10 Access 2010 32bit
    Join Date
    Mar 2015
    Posts
    11,430
    And are relevant fields indexed?

  7. #7
    orange's Avatar
    orange is online now Moderator
    Windows 10 Access 2010 32bit
    Join Date
    Sep 2009
    Location
    Ottawa, Ontario, Canada; West Palm Beach FL
    Posts
    16,726
    I don't think sql server as a back end would change things, and it doesn't seem that that is an available option. It seems the issue is with indexing as has been mentioned. Another issue in my view is the uncertainty of what exactly is in the Master Table and the Names table. Some review of things like #121 and %C/O should be analyzed. What exactly is the Master Table (purpose, contents, cleanliness..)? Same for Names? Getting the tables designed/indexed should reduce processing time.
    More samples of your data may help you get focused advice.

  8. #8
    CJ_London is offline VIP
    Windows 10 Access 2010 32bit
    Join Date
    Mar 2015
    Posts
    11,430
    Without any more data to go on, one has to assume you are using a cartesian query i.e. no joins. With 1.5m records in one, and 1,500 in the other, you are looking at 2,250,000,000 comparisons which is going to take a very long time regardless of what system you use. Even if you achieved 10,000 comparisons a second. that is still 225,000 seconds =62.5 hours.

    since you are trying to find GRIFFEY KEN in #121 GRIFFEY KEN JR %C/O., indexing won't be effective. I suggest you need to break your data into smaller chunks in indexed fields, at the very least lose the #121, then you can make use of indexing with something like this

    Name_Search: [Names_Table].[Full Name Red] Like [Master_Table].[MAILTO Red] & "*" OR [Master_Table].[MAILTO Red] Like [Names_Table].[Full Name Red] & "*"

    Criteria =True

    where Red is the reduced field

    Alternatively, perhaps there is another field such as town which you can link on to reduce the number of comparisons required. Or, since we are still in the dark about what the data really looks like, this may or may not be a worthwhile suggestion but you can perhaps create a query to determine the initials e.g. GK in each table and store these in indexed tables. Then link on these initial fields and make your full comparison then.

  9. #9
    xcess is offline Novice
    Windows 7 32bit Access 2007
    Join Date
    Oct 2018
    Location
    Texas
    Posts
    9
    Correct, I am not using joins.
    I can’t use a town or anything else to link on, as the only link between the two is the names.
    For initials, I’m intrigued, but not sure it will help due to the massive # of people I have to search.

    What my data looks like: (I'll send a screenshot of it once I get home if it's helpful)
    Table 1, ~1.5million records. A list of all property in my county, along with its homeowner, mailing address, and other misc info.
    Table 2, ~1,500 records. A list of names. I want to see if any of them own property in my county.

    So the names are the only link between the two.

    The names are sorted in the same format ([LAST NAME] [FIRST NAME] [MIDDLE NAME]) in both tables (e.g. KING MARTIN LUTHER). I deleted all records from table 1 that had bogus characters at the beginning (e.g. #121) and the info in table 2 is clean (i.e. no miscellaneous characters, just names).

    Table 1 may have extra letters (not #s or symbols) before or after the name due to several reasons: it’s a trust, a company, or a mistake at the county clerk office. E.g. 1) TRUST KING MARTIN LUTHER, TRUSTEE OF & 2) MARTIN LUTHER KING LLC (I don’t remember the exact format they use, but that’s the general idea).

    Some ways I can think of to reduce my 1.5 million records table, but I’m not sure how to implement it:

    • Delete/ignore any record with a mention of a company (LLC, INC, etc.) -> This will be substantial, maybe even a 50% reduction in records
    • Some people own multiple properties, so their names show up multiple times. Can I delete duplicates?


    I need to create a reduced field with the above parameters, likely more if y’all know of any (the initials idea, not sure how to use it?), and once that’s done use the LIKE and OR search that Ajax mentioned.

    Thoughts?

  10. #10
    orange's Avatar
    orange is online now Moderator
    Windows 10 Access 2010 32bit
    Join Date
    Sep 2009
    Location
    Ottawa, Ontario, Canada; West Palm Beach FL
    Posts
    16,726
    What would be very helpful is

    -table design info (field names, data types..) for both tables
    -sample data for each table?

    Are the properties uniquely identified?
    I wouldn't delete anything until I understood the problem and identified potential solutions.
    I would mock up an approach and test it before jumping in to the full data (1.5 MILLION RECORDS).

  11. #11
    xcess is offline Novice
    Windows 7 32bit Access 2007
    Join Date
    Oct 2018
    Location
    Texas
    Posts
    9
    Attached are two screenshots, one for each table. I made a test table with 1k records for Table 1, the 1.5 million record table. This test table contains actual data from that table, sorted from Z-A.

    The data type for all fields is Text. See last two screenshots.

    Click image for larger version. 

Name:	Table 1_Test_1K.PNG 
Views:	38 
Size:	44.3 KB 
ID:	35713Click image for larger version. 

Name:	Table 2.PNG 
Views:	37 
Size:	36.4 KB 
ID:	35714Click image for larger version. 

Name:	Table 1_Test_1K - Data Type.PNG 
Views:	37 
Size:	15.6 KB 
ID:	35715Click image for larger version. 

Name:	Table 2 - Data Type.PNG 
Views:	37 
Size:	16.0 KB 
ID:	35716

  12. #12
    isladogs's Avatar
    isladogs is offline MVP / VIP
    Windows 10 Access 2010 32bit
    Join Date
    Jan 2014
    Location
    Somerset, UK
    Posts
    5,977
    It appears that none of the records displayed will match so the screenshots don't actually help.
    Suggest showing a different screenshot which includes a couple of matching records!
    Colin, Access MVP, Website, email
    The more I learn, the more I know I don't know. When I don't know, I keep quiet!
    If I don't know that I don't know, I don't know whether to answer

  13. #13
    CJ_London is offline VIP
    Windows 10 Access 2010 32bit
    Join Date
    Mar 2015
    Posts
    11,430
    I agree with orange - don't delete any records (unless you can easily replace them) - suggest use an append query to append to another table.

    And you need to index your mailto field

    Your main table also appears to have duplicates - just with what is visible, I can see 25 and 26 are identical as are records 2-8 and 9 -10 are very close. You can get rid of the first 2 by using a group by query to append to a new table - (use first for the ID field, and group on the other two). With regards the latter you need something else, but can still de done. but if companies can be excluded it would not be an issue anyway. Either way exclude companies if not relevant

    For initials, I’m intrigued, but not sure it will help due to the massive # of people I have to search.
    it is a technique to make use of indexing so that instead of comparing one record with 1.5m you compare with a lower number. Do the maths - simplistically there are 26 x 26 two letter combinations (676). So by extracting the initials, indexing them and then linking on them, you are now comparing 1 record with 2200, a considerable improvement. So instead of 2,250,000,000 comparisons you are now down to 3,300,000 which at 10k/second is around 10 minutes. Note I suspect 10k/second is somewhat optimistic, but you get the drift

  14. #14
    xcess is offline Novice
    Windows 7 32bit Access 2007
    Join Date
    Oct 2018
    Location
    Texas
    Posts
    9
    I created a new table from my main data (1.5m) to remove duplicates. Now I need to filter out records that have certain key words, like LLC or LTD, etc. I have a list of about 5 words. What's a quick way to do this?

    Also, I am getting stuck on how to convert the names to initials. I believe I need to use the split command, but it's not a native function to Access (sort of), so I'd need to create my own function in a module? Any help with this?

    Sorry, I don't have a ton of database experience. Online research wasn't too much help either.

  15. #15
    orange's Avatar
    orange is online now Moderator
    Windows 10 Access 2010 32bit
    Join Date
    Sep 2009
    Location
    Ottawa, Ontario, Canada; West Palm Beach FL
    Posts
    16,726
    Send some sample data --real data in .txt or .csv file. And readers will show you some approaches.

Page 1 of 2 12 LastLast
Please reply to this thread with any new information or opinions.

Similar Threads

  1. Replies: 6
    Last Post: 08-17-2018, 12:04 PM
  2. Replies: 5
    Last Post: 03-01-2016, 01:01 PM
  3. Database - Property value too large
    By jenna36 in forum Database Design
    Replies: 1
    Last Post: 08-03-2015, 12:49 PM
  4. Replies: 8
    Last Post: 04-14-2014, 07:26 AM
  5. Large database query issues
    By jiimmyp in forum Access
    Replies: 10
    Last Post: 04-22-2013, 07:54 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums