Page 2 of 2 FirstFirst 12
Results 16 to 18 of 18
  1. #16
    CJ_London is offline VIP
    Windows 10 Access 2010 32bit
    Join Date
    Mar 2015
    Posts
    11,411


    What's a quick way to do this?
    because the LLC/LTD is at the end of the name you cannot make use of indexing

    Having said that it should not take more than a few minutes to identify them with a criteria

    not like "*LTC" or not like "*LTD"

    alternative, use the strreverse function to populate another indexed field and then look for "CTL*" or "DTL*"

    to get the initials you do need a udf - something like

    Code:
    function initials(fullname as string) as string
    dim sArr() as string
    
        sArr-split(fullname," ")
        initials=left(sarr(1),1) & left(sarr(0),1)
    
    end function
    you have provided very little with regard the range of possible options, the above code is basic without any error checking (e.g. it assumes there is a space between first name and last name and the order is last name, first name, initial (if any)) So you may have to adapt it in some way. Clearly it will not identify Ajete in row 23

  2. #17
    xcess is offline Novice
    Windows 7 32bit Access 2007
    Join Date
    Oct 2018
    Location
    Texas
    Posts
    9
    See below for ZIP file. It contains four .csv files:
    1. Names.csv - This is Table 2, the list of names I want to search (# of records is slightly larger now, >2k)
    2. Source_Data_1K.csv - Table 1, but I selected only 1k/1.5million records in descending order (Z-A)
    3. Source_Data_100K.csv - Table 1, same as above, but 100k records
    4. Source_Data_Small_All_Fields.csv - Table 1, just a few random records/1.5million, but it contains all of the columns/fields the real data has. The above 2 tables don't, just the most important ones.

    Thanks!

    Data.zip

  3. #18
    CJ_London is offline VIP
    Windows 10 Access 2010 32bit
    Join Date
    Mar 2015
    Posts
    11,411
    Table 1, ~1.5million records. A list of all property in my county, along with its homeowner, mailing address, and other misc info.
    looking at your small_all_fields csv there are many states - so the quote is inaccurate - the 1.5m records is all states, not just your state, let alone county. So filtering out those records that are not your state will be a start.

    with regards filtering out LLC, LTD etc I see plenty more City/County/partnership/bank/trust/development/current owner/LP/Inc/Management/Holdings/PTNSH just in the first couple of hundred rows.

    from what I can see, individuals do present as lastname, firstname. However before applying the initials route, I suspect you have plenty to go at to remove records not required

    Suggest try the following to find rows which are not City/County etc

    create a table called tblIgnore with a single field call Ignore

    populate this table with the above words, plus any others you discover (when I ran it, I found words like State, Community, Foundation, Company, Investment in just your small list).

    then, assuming your table is called tblMaster (your source data) use this query. It will be worth adding an index to the Mail_State field

    Code:
    SELECT tblMaster.*
    FROM tblMaster LEFT JOIN tblIgnore ON " " & tblMaster.MAILTO & " " like "*" & replace(tblIgnore.Ignore,"_"," ") & "*"
    WHERE tblIgnore.Ignore is null AND Mail_State=[Enter your State code eg TX]
    note this query cannot be seen in the query window, only in sql, because the query window can only display simple = joins. There is a cheat if you lose it - create the left join as normal, then go into the sql window and change to the above, shown in red.

    You might want to not ignore words like LP, Inc, Co since these can be part of someone's name - and I guess you could have a name like George Banks who would be excluded because of Bank. If so, don't put them in the ignore list. However you will notice the join includes a replace to replace the underscore with a space and also a space is added to both ends of MailTo. This is so if you decided you wanted to exclude LP as a word putting _LP in the ignore list will exclude BLOCK 7 SSBB LP but not HALPEN GEORGE. Similarly, including one at the end for example CO_ will exclude OCCIDENTAL INVESTMENT CO but not COKINS JOHN H

    populate the ignore list, run the query, note other words that can be used to identify records that can be ignored and add them to the list and run the query again. Repeat until happy you have excluded as much as you can. This will then form the basis of your comparison list. Perhaps the query is fast enough on it's own, but if not use a make table on this query to create a temporary table

    Edit: Running the above code (excluding the state criteria and including all the ignore words listed above and including DISTINCT to eliminate duplicates) on your 100k file took about 5 seconds and excluded 26k records

Page 2 of 2 FirstFirst 12
Please reply to this thread with any new information or opinions.

Similar Threads

  1. Replies: 6
    Last Post: 08-17-2018, 12:04 PM
  2. Replies: 5
    Last Post: 03-01-2016, 01:01 PM
  3. Database - Property value too large
    By jenna36 in forum Database Design
    Replies: 1
    Last Post: 08-03-2015, 12:49 PM
  4. Replies: 8
    Last Post: 04-14-2014, 07:26 AM
  5. Large database query issues
    By jiimmyp in forum Access
    Replies: 10
    Last Post: 04-22-2013, 07:54 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums