Results 1 to 9 of 9
  1. #1
    cwillson is offline Novice
    Windows 7 64bit Access 2010 64bit
    Join Date
    Sep 2017
    Posts
    17

    Inner join with additional criteria

    Hi peeps,

    I'm a newbie to Access and SQL, so would greatly appreciate some help with this as I am tearing my hair out!!

    I have two tables, S1_C and S1_AC which are successfully joined on [Author_name] and [Author Names]. S1_C contains a number of duplicate [Author_name]s.

    I'm trying to produce a list of unique entries, based on the following:

    1. All records in S1_C where S1_C.[Person Attribute Type] = "ANB Identifier"
    2. All records which can be found in both, irrespective of S1_C.[Person Attribute Type]
    3. Where multiple matching records are found in S1_C, the record with the ANB Identifier trumps any others, if it exists. If not, then I will evaluate the duplicates manually.

    I have the following so far, but it's returning ~480 records because of the duplicates described in 3. I have run a similar comparison in Excel (which I'm far more familiar with!) and got ~455 records, but need this to be an automated query in Access because there is a lot of additional queries I need to write once I have the results of this!



    SELECT DISTINCT S1_C.[Person ID], S1_C.[First Name], S1_C.[Last Name], S1_AC.[Author Names], S1_C.[Person Attribute Type]
    FROM S1_C
    INNER JOIN S1_AC ON S1_C.[Author_name] = S1_AC.[Author Names]
    UNION
    SELECT DISTINCT S1_C.[Person ID], S1_C.[First Name], S1_C.[Last Name], S1_AC.[Author Names], S1_C.[Person Attribute Type]
    FROM S1_C
    WHERE S1_C.[Person Attribute Type] = "ANB Identifier";

    Please help!!


  2. #2
    June7's Avatar
    June7 is offline VIP
    Windows 7 64bit Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    53,771
    The Person Attribute Type is different but what about Person ID?

    Post example raw data and desired output.
    How to attach file: http://www.accessforums.net/showthread.php?t=70301 To provide db: copy, remove confidential data, run compact & repair, zip w/Windows Compression.

  3. #3
    cwillson is offline Novice
    Windows 7 64bit Access 2010 64bit
    Join Date
    Sep 2017
    Posts
    17
    The Person ID will differ for each entry in S1_C, yes.

    Sample data below

    S1_C Table
    [Person ID] [First Name] [Last Name] Author_Name [Person Attribute Type]
    1 Billy Bob Bob, Billy
    2 Billy Bob Bob, Billy
    3 Billy Bob Bob, Billy ANB Identifier
    4 Susan Smelly Smelly, Susan
    5 Susan Smelly Smelly, Susan
    6 Peter Piper Piper, Peter ANB Identifier
    7 Rolf Martin Martin, Rolf ANB Identifier

    S1_AC Table
    [Author Names]
    Bob, Billy
    Smelly, Susan
    Piper, Peter

    Desired Output
    [Person ID] [First Name] [Last Name] [Author Names] [Person Attribute Type]
    3 Billy Bob Bob, Billy ANB Identifier
    4 Susan Smelly Smelly, Susan
    5 Susan Smelly Smelly, Susan
    6 Peter Piper Piper, Peter
    7 Rolf Martin Martin, Rolf ANB Identifier

    Thanks

  4. #4
    orange's Avatar
    orange is offline Moderator
    Windows 10 Access 2010 32bit
    Join Date
    Sep 2009
    Location
    Ottawa, Ontario, Canada; West Palm Beach FL
    Posts
    16,870
    Please tell us why 4 and 5 are both in your desired output.

  5. #5
    cwillson is offline Novice
    Windows 7 64bit Access 2010 64bit
    Join Date
    Sep 2017
    Posts
    17
    Good question....they're not really! I need to have a think about this in more detail, to decide what other fields I can use to determine which record should take priority.

    At this stage though, I'm just keen to be able to see the output to even know how many will be duplicated without the ANB Identifier priority clearing them up.

    There is a column in the S1_AC table which contains a Person ID, however this isn't always directly correspond with the Person ID associated with [Author Names]. It feels like it might be the best thing to use as a further filter to determine the true set of distinct values...but again, I'd still need to evaluate the results from the initial query.

    Hope that makes sense!

  6. #6
    Join Date
    Apr 2017
    Posts
    1,792
    Create saved queryes, p.e. :
    qEntriesWithattributes = SELECT * FROM S1_C WHERE [Person Attribute Type] Is Not Null;
    qEntriesInBothWithoutAttributes = SELECT a.* FROM S1_C a, S1_AC b WHERE a.[Author Names]=b.[Author Names] AND a.[Author Names] Not In (SELECT [Author Names] FROM qEntriesWithAttributes)

    You can add DISTINCT clause into both of saved queries, when you may really have double entries with same ID in S1_C or double author names in S1_AC, but don't do it simply for cause of - this slows your query down considerabely. And in case you have more fields i S1_AC and they may cause double entries, then maybe add 3rd saved query qAuthors = SELECT DISTINCT [Author Names] from S1_AC and use it instead of S_1AC in qEntriesInBothWithoutAttributes.

    You query will be now
    SELECT * FROM qEntriesWithAttributes UNION SELECT * FROM qEntriesInBothWithoutAttributes

  7. #7
    June7's Avatar
    June7 is offline VIP
    Windows 7 64bit Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    53,771
    Why have names in both tables? Why doesn't S1_AC table have Person_ID?

    Names are very poor keys.
    How to attach file: http://www.accessforums.net/showthread.php?t=70301 To provide db: copy, remove confidential data, run compact & repair, zip w/Windows Compression.

  8. #8
    cwillson is offline Novice
    Windows 7 64bit Access 2010 64bit
    Join Date
    Sep 2017
    Posts
    17
    Thanks so much for this ArviLaanemets

    I'd actually managed to get much closer (753) to the number of records I have obtained as a reference using excel (750), then saw your post. Your method is soooo much more elegant than mine, I have flipping LOADS to learn about SQL yet (this only being my 2nd week playing with it!), however it still retains a number of duplicates, so the total records remain at 771.

    I'd come up with the following (please be gentle with me, I know it's probably very ugly!!!):

    S1_C ANB Flag only
    SELECT DISTINCT S1_C.[Person ID] AS Legacy_ID_contributor, S1_C.[Author_Name], S1_C.[First Name] AS Forename, S1_C.[Middle Name] AS Initials, S1_C.[Last Name] AS Surname
    FROM S1_C WHERE [Person Attribute Type] = "ANB Identifier";

    S1_C ANB Contributors (in both)
    SELECT DISTINCT IIF(S1_C.[Person Attribute Type] = "ANB Identifier" OR S1_C.[Person ID] = S1_AC.[Submitting Author Person ID], S1_C.[Person ID], S1_AC.[Submitting Author Person ID]) AS Legacy_ID_contributor, S1_C.[Author_Name], S1_C.[First Name] AS Forename, S1_C.[Middle Name] AS Initials, S1_C.[Last Name] AS Surname
    FROM S1_C INNER JOIN S1_AC ON S1_C.[Author_name] = S1_AC.[Author Names];

    Full List
    SELECT DISTINCT * FROM [S1_C ANB Flag only] UNION SELECT DISTINCT * FROM [S1_C ANB Contributors (in both)]

    I think the 3 additional records in my queries are because the 750 excel records excluded the following:

    1 record for (blank) Author Names
    1 record where there are two records in S1_C with the same [Author Name] & "ANB Identifier" but differing [Person ID]
    1 record where the IIF statement in the 2nd query created a duplicate (I think the [Submitting Author Person ID] had been set to the Person ID of a duplicate record in S1_C which didn't have "ANB Identifier", but another record did have this flag)

    I'd love to find a way to account for these anomalies within the queries, but to be honest, it might just be simpler to delete the additional records manually!!

    Thanks again for your help. Really appreciate it.

  9. #9
    cwillson is offline Novice
    Windows 7 64bit Access 2010 64bit
    Join Date
    Sep 2017
    Posts
    17
    Quote Originally Posted by June7 View Post
    Why have names in both tables? Why doesn't S1_AC table have Person_ID?

    Names are very poor keys.
    Yup, I completely agree! Trouble is, this work is part of a complex, multi-asset data consolidation/migration and the source data simply is how it is....

Please reply to this thread with any new information or opinions.

Similar Threads

  1. Replies: 4
    Last Post: 08-01-2016, 06:41 AM
  2. Join fields in 2 tables and adding a criteria
    By elenaluiza in forum Access
    Replies: 2
    Last Post: 11-09-2015, 03:25 PM
  3. Replies: 1
    Last Post: 07-24-2014, 08:35 PM
  4. Changing join variables on different criteria
    By ltcarlisle in forum Queries
    Replies: 1
    Last Post: 07-01-2014, 11:56 AM
  5. RC Notation and additional criteria
    By mkc80 in forum Access
    Replies: 3
    Last Post: 11-05-2012, 03:30 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums