Results 1 to 8 of 8
  1. #1
    xcess is offline Novice
    Windows 7 32bit Access 2007
    Join Date
    Oct 2018
    Location
    Texas
    Posts
    9

    Like Criteria with Asterisk Returning False Positives

    Hello!

    I have a query that works as expected in very small controlled instances, but on a large scale returns false positives and I'm not sure why. I believe it has to do with the use of asterisks in the criteria. This is what I need help with.

    I have two tables. Table 1 is the master records table, around 1.2 million records, of all property owners in my county. It has many different fields.
    Table 2 is a list of names I want to find matches from in table 1, around 2.5k records (I want to see if any of them own property in my county).

    The names in table 2 are separated by last name, first name, etc. into separate fields. Names in table 1 are not: the names are in one field, sometimes with extra information before, after, or between the last & first names. E.g. the name field could be SMITH JOHN & MARY or SMITH JOHN & JOE MARY. The names are always in the format of [LAST NAME] [FIRST NAME].

    My query, as seen in the screenshot, is
    LIKE "*" & [Divorce_Names].[Last Name] & "*" & [Divorce_Names].[First Name] & "*"

    I attached two tables. Table real_acct-Query_2_Search is the exported table of results, which has 26k results (way too many!). Table Divorce Names is table 2 I was describing earlier, one of my inputs. I highlighted some rows in both tables. Yellow indicates a good search return, and red & purple indicate false positives.

    Any reasons why this is returning false positives? Thanks!

    Table 1 = [real_acct-Query_1_Ignore] in my database
    Table 2 = [Divorce_Names] in my database
    real_acct-Query_2_Search - Abridged.zip


    Divorce Names.zip

    P.S. I am using the like criteria versus a different method because of the # of fields I need to search through, and was told this was an efficient method (my friend helped me set this up). This query takes 8 hours or so to run, which is fine, but I don't want it to take much longer than that unless it has to. I set it in the morning and will only do it once a month.
    Last edited by xcess; 10-31-2018 at 05:13 PM.

  2. #2
    orange's Avatar
    orange is offline Moderator
    Windows 10 Access 2010 32bit
    Join Date
    Sep 2009
    Location
    Ottawa, Ontario, Canada; West Palm Beach FL
    Posts
    16,716
    In simple, plain English, what are you trying to achieve? What is the proposed end result --report???

    Where's the database? I see only excel.

    PropertyOwners - and- MiscNames that may or not be people, orgs, associates, representatives

  3. #3
    Micron is offline Virtually Inert Person
    Windows 10 Access 2016
    Join Date
    Jun 2014
    Location
    Ontario, Canada
    Posts
    12,737
    The need for more info notwithstanding, there is only one set of data/attachments not two. The data posted doesn't shed any light on the problem - there are just highlighted rows with no explanation. Even so, it would seem that by using wild card searches you are slowing things down tremendously, as well as returning every combination of the same records that satisfies the broad criteria.

    I suspect you'd be a lot happier if you had a table for modified data. You might have to put this into a separate db due to file size just for this purpose but can't say because I have no idea where you're at size-wise on your current file. In the modified data would be extra fields to split the first and last names into last, first and co-owner name (spouse/partner/whatever). The idea would be to use this modified table joined to your other table and eliminate the LIKE situation. Obviously this would depend on the primary data being divided according to some reliable pattern.
    The more we hear silence, the more we begin to think about our value in this universe.
    Paraphrase of Professor Brian Cox.

  4. #4
    xcess is offline Novice
    Windows 7 32bit Access 2007
    Join Date
    Oct 2018
    Location
    Texas
    Posts
    9
    Q: What am I trying to achieve?
    A: I am trying to determine if any people from my list of names, table [Divorce_Names], owns property in Harris County. If they do, I want to know what that property is.

    My table 1, or [real_acct-Query_1_Ignore], in my database has information about all properties in Harris County I am interested in. This is the table I am searching through.

    Q: Where's the database:
    A: I can't upload it, as it's about 1.5GB. If my database has only table 1 in it, it's about 1.5 GB due to the # of records it contains & # of other fields.
    If it'd help, I can upload a screenshot of the table?

    Q: Only 1 set of data attached?
    A: There was two sets, but it looked like one because it was on the same line. I edited my first post so you can see both spreadsheets now. Sorry about that!

    Q: Can I divide my primary data name field into last name, first name, etc.?
    A: Possibly, would definitely need to be sent to a new DB. When I get home I'll mess around with it and see. It gets tricky because not all of the names come in the exact same format.
    *Any suggestions with this? Splitting my one field with names into multiple? It would need to have two sets of last name, first name, and middle name. I can think of how to do it in excel, something with LEFT & FIND, but not sure if that carries over to ACCESS.

    Some formats I know the name field comes in are shown below. I can send a snippet of my actual data once I get home.

    1: SMITH JOHN DOE
    2: SMITH JOHN & MARY
    3: SMITH JOHN & TRUMP IVANKA

    Sometimes there may be a random character at the front or back end of the data, like a % or some numbers, but I can probably sort the data and just edit it myself if it's not too bad.

    Thanks!

  5. #5
    CJ_London is online now VIP
    Windows 10 Access 2010 32bit
    Join Date
    Mar 2015
    Posts
    11,397
    for the benefit of responders, this ground has been covered in part in the OP's previous thread

    https://www.accessforums.net/showthread.php?t=73969

    no idea whether the OP took up the suggestions made. From Orange's comments post #2, I suspect not. Do not have time to contribute to this one

  6. #6
    Micron is offline Virtually Inert Person
    Windows 10 Access 2016
    Join Date
    Jun 2014
    Location
    Ontario, Canada
    Posts
    12,737
    Thank you Ajax.
    @xcess - it seems you haven't followed through with the suggestion in post 15 of the linked thread but elected to start a new thread instead. Or is this one a separate problem and the other one still needs to continue??

    If I were to continue, it would only be if you provided a slimmed down copy of your db with some records and zipped/posted that. Shouldn't be a problem with 1 or 2 thousand in each table. Personally I don't bother much with screen shots of data, tables, queries etc. when there's a lot involved. It puts the onus on me to recreate them in order to have something to work with and I don't get paid enough here for that . It might also help shed light on some of your comments, such as " It would need to have two sets of last name, first name, and middle name."

  7. #7
    orange's Avatar
    orange is offline Moderator
    Windows 10 Access 2010 32bit
    Join Date
    Sep 2009
    Location
    Ottawa, Ontario, Canada; West Palm Beach FL
    Posts
    16,716
    Thanks Ajax for link to original thread. It doesn't appear that xcess has acted on suggestions.

    @xcess,
    Why start a new thread without a link to the original???

    With database, every table should have a primary key. The lack of PK and index(es)(structure) is your main issue for slow performance.
    It would be helpful if you could tell us what each of the fields represents.
    If you want to identify people in your miscellaneous names list who may be owners of property(ies) in your Property table, you have to know who owns each property. To start ,you need a properly structured table tblProperty that uniquely identifies each property (in your county). If a property can have multiple owners then you'll need a table tblPropertyOwner - again with PK and appropriate index(es).

    If your data contains addresses not in Harris County, you could reject those immediately as Ajax suggested.

    Matching on name alone is not 100%. It is an approximation at best ---many John Smiths is the often noted issue.

    Work with some subset of your data to get proper table structures. Research table design, normalization; get that working; then apply a working method to your data.

    Good luck.
    Last edited by orange; 11-01-2018 at 07:22 AM. Reason: spelling

  8. #8
    xcess is offline Novice
    Windows 7 32bit Access 2007
    Join Date
    Oct 2018
    Location
    Texas
    Posts
    9
    Hi all,

    I have attached slimmed-down versions of my databases, and a small text file with how to use them. More details about them below.

    How to Use HCAD Databases.txt
    Hcad - Search.zip
    Hcad - Filter.zip

    What I need help with is:
    1) How to create a new table from my raw data, with the different homeowners (if there's more than one), and split it into last name, fist name, and middle initial. Again, I can write something in Excel for this, but not sure if that'll work or be efficient in Access. My real database will have >1 million records.
    2) How to prevent false positives from appearing. This is found in my database "Hcad - Search" table "real_acct-Query_2_Search".


    **Database explanation**
    There are two different databases. The first, "Hcad - Filter", contains the raw data & two queries to slim down the data. It 1) removes all records that contain any of the words from the "Ignore" table and 2) groups the records to eliminate duplicates. This result is exported as table "real_acct-Query_1_Ignore&Group" in the second database "Hcad - Search".

    My second database has a query to search from table "Divorce Names" for matches in the newly created "real_acct-Query_1_Ignore&Group" table. This is saved as table "real_acct-Query_2_Search".


    **Q&A**

    Q: You didn't follow through with the suggestions from the previous thread. Why start a new thread?
    A: I worked with a friend about two weeks ago and implemented several of the suggestions from the other thread. Most of what I need is working, but I get false positives when I do the final search, which is the problem.

    Also, sorry for the confusion with the two threads. I was thinking my new problem would have a quick solution, and I didn't want to confuse this new problem with the original thread. I won't do that in the future!

    Q: Why do you have addresses not in Harris County?
    A: Each record has two different slots for addresses: the property address (in Harris County) and the mailing addresss for the homeowner. Some people own property in Harris County, but live somewhere else (think AirBnB). I need to keep homeowners that live out of the county included in the list, which I have done.

    Q: Matching on name alone is approximation at best.
    A: Name is the only identifier I have. If there are many John Smiths, I will likely discredit that name, because I don't want to waste my time trying to figure out which of the 15 John Smiths is the one I want, if he's even on the list. I will make that decision after I have the searched data on a case-by-case basis.

    Q: Your tables lack primary key & index(es)(structure)
    A: I will add a primary key to my tables after sorting out the above problems. Not sure what the indexes (structure) is referring to? Just having a different table for the names after they're been separated by lastname, firstname, etc. & first homeowner, second homeowner, etc.?

Please reply to this thread with any new information or opinions.

Similar Threads

  1. Replies: 6
    Last Post: 09-26-2016, 10:36 PM
  2. Replies: 2
    Last Post: 10-20-2015, 02:01 PM
  3. Criteria - Returning more than specified
    By twildt in forum Queries
    Replies: 5
    Last Post: 05-05-2014, 11:23 AM
  4. Query criteria using IIF when false
    By james28 in forum Queries
    Replies: 10
    Last Post: 03-01-2014, 03:03 PM
  5. Replies: 7
    Last Post: 01-11-2012, 12:24 PM

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums