Results 1 to 10 of 10
  1. #1
    johnmerlino is offline Advanced Beginner
    Windows XP Access 2007
    Join Date
    Oct 2010
    Posts
    97

    parse on the fly when running a query

    Hey all,

    I have this query:

    Code:
    SELECT possibles.fullname, uniSelectedContacts.addresses, uniSelectedContacts.cities, uniSelectedContacts.us_states_and_canada, uniSelectedContacts.zip_codes INTO PrepareForDuplicateCheck
    FROM uniSelectedContacts INNER JOIN possibles ON uniSelectedContacts.TheName = possibles.fullname;
    The problem is that uniSelectedContacts may have a middle initial and possibles may not or visa versa. Or uniSelectedContacts may have no middle initial but possibles may or visa versa. Because of this, above query ignores certain records that I want to pull. This is really the only issue that I have come across that causes a large number of records to not be pulled. All other situations like misspelled names I am not worried about.

    So I am trying to figure out a way to parse the first and last name to get middle initial and compare it with the two tables on the fly:

    Code:
    (first name) Left([txtname],InStr([txtname],",")-1)
    (last name) Mid([txtname],InStrRev([txtname],",")+1)
    Any suggestions on how to do this in Access?



    Thanks for response.

  2. #2
    ajetrumpet is offline VIP
    Windows Vista Access 2007
    Join Date
    Mar 2010
    Location
    N/A
    Posts
    2,694
    this sounds like the same problem you were posting on a few weeks ago, wasn't it John? Did the solution then not cover the middle initial issue?

    what does the data look like again? if I remember right, on the left side of the last name was a space and on the right side of the first name was a space. yes? out of curiosity, I'm wondering if you could trim a strlen() function those two combined parts of the field and then compare that length with the length of the field. If different, middle initial most likely present? (e.g. - pull entire field)

  3. #3
    johnmerlino is offline Advanced Beginner
    Windows XP Access 2007
    Join Date
    Oct 2010
    Posts
    97
    Thanks for response.

    Basically, the issue is in the following situations:

    uniSelectedContacts.TheName
    Smith,John A

    possibles.fullname
    Smith,John

    OR

    uniSelectedContacts.TheName
    Smith,John

    possibles.fullname
    Smith,John B

    OR

    uniSelectedContacts.TheName
    Smith, John Harold

    possibles.fullname
    Smith,John H

    OR

    uniSelectedContacts.TheName
    Smith, John Harold

    possibles.fullname
    Smith,John H

    OR

    uniSelectedContacts.TheName
    Smith, John H

    possibles.fullname
    Smith,John Harold

    None of these records are being pulled because middle initial is different, so this query:
    Code:
    SELECT possibles.fullname, uniSelectedContacts.addresses, uniSelectedContacts.cities, uniSelectedContacts.us_states_and_canada, uniSelectedContacts.zip_codes INTO PrepareForDuplicateCheck
    FROM uniSelectedContacts INNER JOIN possibles ON uniSelectedContacts.TheName = possibles.fullname;
    ignores above cases.

    I want to preserve INNER JOIN because I don't want to have to wait 45 minuntes for a query to happen with 800,000 records.

    So the solution is to adjust for the middle names on the fly when query runs:
    Code:
    SELECT possibles.fullname, uniSelectedContacts.addresses, uniSelectedContacts.cities, uniSelectedContacts.us_states_and_canada, uniSelectedContacts.zip_codes, possibles.[firstname] AS Expr1, possibles.[lastname] AS Expr2 INTO PrepareForDuplicateCheck
    FROM uniSelectedContacts INNER JOIN possibles ON uniSelectedContacts.TheName = possibles.fullname
    WHERE (((possibles.firstname)=Left([fullname],InStr([fullname],",")-1)) And ((possibles.lastname)=Mid([fullname],InStrRev([fullname],",")+1)));
    This gives "Enter Parameter Value possibles.firstname" message.

    Thanks for response.

  4. #4
    ajetrumpet is offline VIP
    Windows Vista Access 2007
    Join Date
    Mar 2010
    Location
    N/A
    Posts
    2,694
    i'm confused john. what exactly do you want? do you want all of the records returned that you listed in your last post?

    please post all of those examples you just did as a single recordset, and then what you would want a query output to look like. thanks.

    I think using my ESPLIT() function would help, but I won't know until I see a desired output.

  5. #5
    johnmerlino is offline Advanced Beginner
    Windows XP Access 2007
    Join Date
    Oct 2010
    Posts
    97
    Thanks for response.

    Let's say in possibles I have this:
    Smith,John
    Dough,John B
    Ward,Jane Karon
    Eggert,Samantha R

    In uniSelectedContacts I have this:
    Smith,John A 1552 1st Ave
    Dough,John 1111 2st Ave
    Ward,Jane K 2222 3st Ave
    Eggert,Samantha Rachel 3333 1st Ave

    I want to find where first and last name is identifical. However, due to variation in middle initial, if there's no middle initial in possibles, then I want to return all instances of the first and last name from uniSelectedContacts using INNER JOIN. If possibles has a middle initial(name), regardless of its middle initial or name, I want to return the record(s) where first and last name of uniSelectedContacts anmd possibles are identical to each other but also the first letter of middle initial(name) of uniSelectedContacts matches the first letter of middle initial(name) of possibles. So in the above example, this should be returned:


    Smith,John 1552 1st Ave
    Dough,John B 1111 2st Ave
    Ward,Jane Karon 2222 3st Ave
    Eggert,Samantha R 3333 1st Ave

    This query will skip all the above records:
    Code:
    SELECT possibles.fullname, uniSelectedContacts.addresses, uniSelectedContacts.cities, uniSelectedContacts.us_states_and_canada, uniSelectedContacts.zip_codes INTO PrepareForDuplicateCheck
    FROM uniSelectedContacts INNER JOIN possibles ON uniSelectedContacts.TheName = possibles.fullname;
    So I'm trying to do something like this:
    Code:
    SELECT possibles.fullname, uniSelectedContacts.addresses, uniSelectedContacts.cities, uniSelectedContacts.us_states_and_canada, uniSelectedContacts.zip_codes, possibles.[firstname] AS Expr1, possibles.[lastname] AS Expr2 INTO PrepareForDuplicateCheck
    FROM uniSelectedContacts INNER JOIN possibles ON uniSelectedContacts.TheName = possibles.fullname
    WHERE (((possibles.firstname)=Left([fullname],InStr([fullname],",")-1)) And ((possibles.lastname)=Mid([fullname],InStrRev([fullname],",")+1)));
    But it's not working.

  6. #6
    ajetrumpet is offline VIP
    Windows Vista Access 2007
    Join Date
    Mar 2010
    Location
    N/A
    Posts
    2,694
    ok john, it sounds like you're looking for a hierarchy that's going to be driven by nested conditionals. if that doesn't make sense to you, consider this set of statements ("EO" = "either or":
    Code:
    if middle name (full) in EO then
          find first/last matches WITH middle name included in output
    elseif middle initial only in EO then
          find first/last matches only WITH middle initial included in output
    else
          find first/last matches only in output
    end if
    obviously you have to use some code here, because the variables are too complex in the situation. if what I said above is correct John, the first thing i think you should realize is that your current code really doesn't do anything. it evaluates portions of a field value and compares it to another field in the same table (possibles). unless of course the '[fullname]' syntax actually refers to the 'fullname' field in the "contacts" table, which I guess it's doing huh? If that's the case, then yes, your query is good, but it doesn't cover near what it has to. there are no conditionals that check for the possibilities. that's what is missing.

    just to let you know, your first query fails and skips the records because the fullname fields in every record between your 2 tables in your example are not equal. so that's a given. your second attempt fails partially because the instr() related functions are failing on records where there is no comma present in uniSelectedContacts.fullname. the functions are returning values of '0', and thus the records in uniSelectedContacts.fullname that have no comma are being evaluated by this sql criteria:

    Code:
    WHERE (((possibles.firstname) = Left([fullname], -1)) And 
    ((possibles.lastname) = Mid([fullname], 1)));
    that might be irrelevant though, if all the records have at least one comma in them (as it seems to be this way). but even for recs WITH a comma, your last name comparison will fail when middle 'anythings' are present.

    but at any rate, again, if what I said at the beginning of the this post is true, you might simply have to call vba to cover the possibilities here. one such solution I can personally come up with might be:

    Code:
    SELECT possibles.fullname, uniSelectedContacts.addresses, 
    uniSelectedContacts.cities, uniSelectedContacts.us_states_and_canada, 
    uniSelectedContacts.zip_codes, possibles.[firstname] AS Expr1, 
    possibles.[lastname] AS Expr2 INTO PrepareForDuplicateCheck
    
    FROM uniSelectedContacts INNER JOIN possibles ON 
    uniSelectedContacts.TheName = possibles.fullname
    
    WHERE possibles.firstname = GetFirst([fullname]) And
    
    possibles.lastname = GetLast([fullname]);
    You can do it that way, OR...the other option is to simply call one function twice for every record and use the code to analyze the fullname in each of the two 'fullname' fields. If you do it that way, you can simply ignore the middle 'anything' (initial, name, etc...) with regard to comparing the 2 names (first and last).

    So, I will give you an example of using 1 procedure. I noticed that the data you posted has the first and last names separated by a comma. I will also assume that these 2 names will ALWAYS be separated by a comma. So let's take your sample data:

    Let's say in possibles I have this:
    Smith,John
    Dough,John B
    Ward,Jane Karon
    Eggert,Samantha R

    In uniSelectedContacts I have this:
    Smith,John A 1552 1st Ave
    Dough,John 1111 2st Ave
    Ward,Jane K 2222 3st Ave
    Eggert,Samantha Rachel 3333 1st Ave

    The output should be:
    Smith,John 1552 1st Ave
    Dough,John B 1111 2st Ave
    Ward,Jane Karon 2222 3st Ave
    Eggert,Samantha R 3333 1st Ave
    The common ground between the two tables is obviously the comma, as it always appears between first and last. so, why not use the split() function combined with trim() for comparisons? so in the query, you would have:
    Code:
    WHERE possibles.firstname = GetNamePart([fullname], "first") And
    possibles.lastname = GetNamePart([fullname], "last");
    So, 1 function with a varying argument in the second slot. So the function might be something like:
    Code:
    function GetNamePart(fullname as string, which as string) as string
    
    dim parts() as string 'array needed regardless of arguments
         
          parts() = split(fullname, ",")
    
    SELECT CASE which
    
       case = "first" 'code for first name extraction
               GetNamePart = trim(parts(0)) 'this is EASY.
    
       case = "last" 'this is almost as EASY
               GetNamePart = iif(instr(trim(parts(1)), " ") > 0, _
                           left(trim(parts(1)), instr(trim(parts(1)), " ") - 1), _
                           trim(parts(1)))
    
    END SELECT
    
    end function
    You need those criteria functions in the query john, but the other thing you need is a conditional SELECT statement, because from what I can see, you want either the rec out of table1 or table2, depending on which one fits what you want, right? That can't be done through a WHERE clause (easily anyhow). So write another function that returns the most comprehensive of the two 'fullname' fields. Here is what I would go with as the complete solution if I were doing this:

    Code:
    SELECT GetFinalName(possibles.fullname, uniSelectedContacts.TheName) as FullName, 
    uniSelectedContacts.addresses, 
    uniSelectedContacts.cities, uniSelectedContacts.us_states_and_canada, 
    uniSelectedContacts.zip_codes, possibles.[firstname] AS Expr1, 
    possibles.[lastname] AS Expr2 INTO PrepareForDuplicateCheck
    
    FROM uniSelectedContacts INNER JOIN possibles ON 
    uniSelectedContacts.TheName = possibles.fullname
    
    WHERE possibles.firstname = GetNamePart([fullname], "first") And
    possibles.lastname = GetNamePart([fullname], "last");
    FUNCTIONS NEEDED:
    Code:
    function GetNamePart(fullname as string, which as string) as string
    
    dim parts() as string 'array needed regardless of arguments
         
          parts() = split(fullname, ",")
    
    SELECT CASE which
    
       case = "first" 'code for first name extraction
               GetNamePart = trim(parts(0)) 'this is EASY.
    
       case = "last" 'this is almost as EASY
               GetNamePart = iif(instr(trim(parts(1)), " ") > 0, _
                           left(trim(parts(1)), instr(trim(parts(1)), " ") - 1), _
                           trim(parts(1)))
    
    END SELECT
    
    end function
    
    function GetFinalName(possible as string, contact as string) as string
    
    dim ptemp as string
    dim ctemp as string
    
    'get last part of fullname for both
    ptemp = mid(possible, instr(possible, ",") + 1)
    ctemp = mid(contact, instr(contact, ",") + 1)
    
    'are middle names or inits present?
    if instr(trim(ptemp), " ") > 0 then 'MIDDLE content present
       ptemp = mid(trim(ptemp), instr(trim(ptemp), " ") + 1)
    else
       ptemp = ""
    end if
    
       if instr(trim(ctemp), " ") > 0 then 'MIDDLE content present
          ctemp = mid(trim(ctemp), instr(trim(ctemp), " ") + 1)
       else
          ctemp = ""
       end if
    
          if ctemp = "" and ptemp = "" then
             GetFinalName = contact 'OR possible.  if no middle content, return value doesn't matter
          else
             if len(ctemp) > len(ptemp) then
                GetFinalName = contact
             else
                GetFinalName = possible
             end if
    
    end function
    I may have over complicated how to get to the end result, and I'm very sorry for the long post, but I do like to tell stories. The first function too, can be written as a multi-nested IIF() statement right in the query sql, but it's much more readable in vba code. So that part can be simplified. I did not test any of this out, just wrote it by hand, so a few syntax errors may have to be fixed if you want to try these methods. The other thing too, is that it's not advisable to run this sort of complexity on a large dataset. if it's large, you might consider splitting it before using function calls on it.

  7. #7
    johnmerlino is offline Advanced Beginner
    Windows XP Access 2007
    Join Date
    Oct 2010
    Posts
    97
    The data set is over a million records. Should I split it into sections of 250,000? And then can they all run at the same time (using an intel dual core). If they run at same time looking through 250,000, will this take under 10 minutes?

    Thanks for response.

  8. #8
    ajetrumpet is offline VIP
    Windows Vista Access 2007
    Join Date
    Mar 2010
    Location
    N/A
    Posts
    2,694
    Quote Originally Posted by johnmerlino View Post
    The data set is over a million records. Should I split it into sections of 250,000? And then can they all run at the same time (using an intel dual core). If they run at same time looking through 250,000, will this take under 10 minutes?

    Thanks for response.
    I have no idea John. 250K splits will result in 40 new select queries wouldn't it? and no, don't run them at the same time. how would you, anyway? I don't think processor really matters, unless it's quite old. The concern I think would be with the jet engine and how it would handle it.I suppose you could also write a vba loop to split the one table into 20 or 40, then run the select queries and consolidate them, then delete all the old objects created.

    with a million recs, I would guess that a process like that though, would take 5 minutes at least. you would also have to break your code several times to ensure that it doesn't run forever. when the program freezes due to long execution, corruption of the memory usage is very likely.

  9. #9
    johnmerlino is offline Advanced Beginner
    Windows XP Access 2007
    Join Date
    Oct 2010
    Posts
    97
    Thanks for response. I tried your query and it immediately gave me a popup saying "Enter Parameter Value possibles.firstname"

    Was this firstname meant to be extracted on the fly because by default possibles has a fullname field which is comprised of first last and middle name.

  10. #10
    ajetrumpet is offline VIP
    Windows Vista Access 2007
    Join Date
    Mar 2010
    Location
    N/A
    Posts
    2,694
    the criteria portions of the query sql that I posted are exact replicas of what you posted earlier John. So I would assume that your query would have given the same parameter error.

    if you look at my post and yours too, I believe they both use 'possibles.firstname' on the left side of the WHERE clause statement,, don't they?

    or maybe it's coming from the function input? yeah, i bet that's what it is. Hmmm...I don't really know how you could fix that, because you can't really put in "[firstname]", as that could reference both fields, one for both tables, couldn't it? Your bet might be to change the field names so none of them have the same names. I'd try that, and then put the [firstname] field in the function arg just like that, without the dot notation. The engine will not get confused then, because only one field will have that name, and there will be no "." in the arg either, which might be a problem. I'm not 100% that IS the problem, but it's certainly not a good practice.

Please reply to this thread with any new information or opinions.

Similar Threads

  1. Query with a running sum
    By is49460 in forum Queries
    Replies: 3
    Last Post: 09-07-2013, 11:11 PM
  2. Running an update query
    By markod in forum Queries
    Replies: 3
    Last Post: 11-11-2010, 01:24 PM
  3. Running a query from another database
    By manicamaniac in forum Access
    Replies: 1
    Last Post: 06-11-2010, 04:54 PM
  4. Running Count Query
    By monkey2003 in forum Queries
    Replies: 0
    Last Post: 09-21-2009, 12:24 PM
  5. Parse a File from a Directory and write data to table
    By galahad in forum Database Design
    Replies: 0
    Last Post: 04-23-2009, 08:38 AM

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums