Results 1 to 9 of 9
  1. #1
    taholmes160 is offline Advanced Beginner
    Windows 7 64bit Access 2016
    Join Date
    Nov 2018
    Location
    Lester, Ohio
    Posts
    67

    Using a query to find and eliminate duplicate (or near duplicate) entries in a table.

    in the attached db, i have a table, TblCommodities, that contains 11,000+ commodities that might be shipped by rail. each of them has a unique commodityID attached, which as it should be, and the commoditiesIDs are in TblConsignees and TblShippers indicating which commodities, which companies ship or receive.

    The problem is that some commodities are identical except for differences in capitalization, or other extremely minor differences. What I would like to do is to eliminate one member of each pair, AND to update the shippers or receivers containing that commodity ID to the ID of the remaining commodity.

    Consider the following example:

    From TblCommodities:

    CommodityID Commodity
    14 68 rolls
    15 72 rolls
    16 81 rolls
    17 93 rls
    18 96 rolls
    19 abrasive grains
    20 abrasives
    21 AC
    22 Acadian)
    23 Accessories
    24 accy


    From a find duplicate query in access:

    Commodity CommodityID
    abrasives 20
    Abrasives 3067
    AC 3072
    AC 21
    Accessories 3073
    Accessories 23
    acetic acid 25
    Acetic acid 3078
    acid 26
    Acid 3085



    Note that abrasives has ID 20 and Abrasives has an ID of 3067 -- I'd like to eliminate one of the (lets say 20 for the sake of argument and update all records having commodityID of 20 to CommodityID of 3067


    From TblConsignees - these are the companies that receive commodityID 20 - abrasives
    ID ConCompany CommodityID
    52829 Carborundum Co 20
    52830 Behr Manning Corp 20
    29376 Mn Mining & Mfg Co Main Plt 20
    51129 Stan-Blast Abrasives Div US Minerals 20
    19016 Herfine Realty/ Elwood Shotblasting 20
    19018 Washington Mills 20

    and from TblShippers - these are the companies that SHIP commodityID 20 - abrasives

    ID ShCompany CommodityID
    52831 Green Diamond Products 20
    52832 Pacific Abrasives Supply 20
    52828 Cdn Carborundum 20
    29377 National Metal Abrasives 20
    19017 Washington Mills Abrasives Coo 20
    19015 Norton Co 20
    19014 Harrison Abrasive Co 20
    19012 Behr Manning Corp. 20
    19011 Barton Mines Corp. 20
    19013 Barton Mines Corp. 20

    There are somewhere north of 2000 near duplicate commodities -- if at all possible, I'd like this purge and update process to be automatedSLRG_Waybills - to send.zip

  2. #2
    ssanfu is offline Master of Nothing
    Windows 7 32bit Access 2010 32bit
    Join Date
    Sep 2010
    Location
    Anchorage, Alaska, USA
    Posts
    9,664
    You really should fix some things before moving on:

    Use only letters and numbers (exception is the underscore) for object names.
    Do not use spaces, punctuation or special characters in object names.
    Do not begin an object name with a number.
    Do not use Look up FIELDS, multi-Value fields or Calculated fields in tables.


    PK fields (if numeric) should be Long Integers <<<----
    See Microsoft Access Tables: Primary Key Tips and Techniques


    **** USE A COPY OF YOUR DB TO TEST THIS CODE****

    I say again **** USE A COPY OF YOUR DB TO TEST THIS CODE****

    Here is the code I came up with. Create a standard module ("Module1"??). Paste the code below into the module.
    Click anywhere in the code and press the F5 key.
    Code:
    Option Compare Database   '<<-- should be at the top of EVERY module
    Option Explicit           '<<-- should be at the top of EVERY module
    
    
    Public Sub RemoveDuplicates()
        Dim d As DAO.Database
        Dim r As DAO.Recordset
        Dim sSQL As String
        Dim Commodity2Keep As Long
        Dim Commodity2Delete As Long
        Dim StTime As Date
        Dim EndTime As Date
        Dim knt As Long
    
        StTime = Now()
    
        Set d = CurrentDb
    
        Set r = d.OpenRecordset("FindDuplicatesForTblCommodities")
        If Not r.BOF And Not r.EOF Then
            r.MoveLast
            r.MoveFirst
    
            knt = 0
            Do
                knt = knt + 1
                'get first Commodity ID
                Commodity2Delete = r!CommodityID
                r.MoveNext
                'get second Commodity ID
                Commodity2Keep = r!CommodityID
    
                'update TblConsignees
                sSQL = "UPDATE TblConsignees"
                sSQL = sSQL & " SET TblConsignees.CommodityID = " & Commodity2Keep
                sSQL = sSQL & " WHERE TblConsignees.CommodityID = " & Commodity2Delete
                '            Debug.Print sSQL
                d.Execute sSQL, dbFailOnError
    
                'update TblShippers
                sSQL = "UPDATE TblShippers"
                sSQL = sSQL & " SET TblShippers.CommodityID = " & Commodity2Keep
                sSQL = sSQL & " WHERE TblShippers.CommodityID = " & Commodity2Delete
                '            Debug.Print sSQL
                d.Execute sSQL, dbFailOnError
    
                'delete from  TblCommodities
                sSQL = "Delete *"
                sSQL = sSQL & " FROM TblCommodities"
                sSQL = sSQL & " WHERE TblCommodities.CommodityID = " & Commodity2Delete
                '            Debug.Print sSQL
                d.Execute sSQL, dbFailOnError
    
                r.MoveNext
            Loop Until r.EOF
        End If
    
    
        'clean up
        r.Close
        Set r = Nothing
        Set d = Nothing
    
        EndTime = Now()
    
        MsgBox "Done!" & "  Elapsed time = " & DateDiff("s", StTime, EndTime) & " seconds for " & knt & " loops"
    End Sub

  3. #3
    taholmes160 is offline Advanced Beginner
    Windows 7 64bit Access 2016
    Join Date
    Nov 2018
    Location
    Lester, Ohio
    Posts
    67
    Hey Steve:

    Thanks a bunch -- It will be saturday before I can mess with this, but it looks great

    TIM

  4. #4
    taholmes160 is offline Advanced Beginner
    Windows 7 64bit Access 2016
    Join Date
    Nov 2018
    Location
    Lester, Ohio
    Posts
    67
    HEy Steve -- its running right now, I'll let you know how it does, thanks a bunch --

    while its running, I had another question for you.

    You mentioned that primary keys should be long int values, does that exclude the using of autonumber fields? I always set my primary keys as autonumber and leave it go, what different should I be doing?

    thanks
    TIM

  5. #5
    taholmes160 is offline Advanced Beginner
    Windows 7 64bit Access 2016
    Join Date
    Nov 2018
    Location
    Lester, Ohio
    Posts
    67
    Hey Steve:

    Thanks so much -- it worked great -- I really appreciate it a lot

    TIM

  6. #6
    ssanfu is offline Master of Nothing
    Windows 7 32bit Access 2010 32bit
    Join Date
    Sep 2010
    Location
    Anchorage, Alaska, USA
    Posts
    9,664
    Glad the code worked for you....





    Quote Originally Posted by taholmes160 View Post
    You mentioned that primary keys should be long int values, does that exclude the using of autonumber fields? I always set my primary keys as autonumber and leave it go, what different should I be doing?
    "autonumbers" ARE Long Intgers.


    I tried to set RI on your tables but the naming of fields was inconsistent and some of the field types were not correct.
    Table name
    PK field
    PK field type
    Comment
    Locomotives LocoID - good Integer default value of 0
    TblCommodities CommodityID - good Double
    TblConsignees ID - poor Double Not set as PK
    TblShippers ID - poor Double Not set as PK
    TblRoads RoadID Long Integer Not set as PK

    I always set the PK field type as autonumber (even if I don't think I'll need a PK field).
    And "ID" is really not a good name for a field. Can you imagine having all fields with a PK field of "ID"?? How confusing would that be?


    Compare these two images of tables in your dB :

    How you set your field names & relationships
    Click image for larger version. 

Name:	Relationship1.png 
Views:	11 
Size:	50.1 KB 
ID:	36574


    How I would set the field names & relationships
    Click image for larger version. 

Name:	Relationship2.png 
Views:	11 
Size:	52.1 KB 
ID:	36575

    I use "_PK" and "_FK" suffixes (some don't like the underscore), "ID" means (for me) the field is an autonumber type and the PK field has the same root name as the table. The FK field has the same name as the related PK field (but with FK)

    In the "WaybillTable", which other fields are FK fields?

    Also note that I removed the spaces, "#" and "/" in the field names. And the FK fields are at the top of the field names.



    I realize the above is off topic a little from the original question, but.......... you asked.
    See Microsoft Access Tables: Primary Key Tips and Techniques

  7. #7
    taholmes160 is offline Advanced Beginner
    Windows 10 Access 2016
    Join Date
    Nov 2018
    Location
    Lester, Ohio
    Posts
    67
    Ive cleaned things up a lot, and re-worked my data (I think its properly normalized now). Ive also started a new thread about it -- here

    Thanks for all the help

    TIM

  8. #8
    ssanfu is offline Master of Nothing
    Windows 7 32bit Access 2010 32bit
    Join Date
    Sep 2010
    Location
    Anchorage, Alaska, USA
    Posts
    9,664
    I looked at the dB in your other thread.

    Why would you ever use a Double type as a PK field type??? Or an Integer???

    IMHO, you still have issues with your dB design.
    I'm backing out. Good luck with your project......

  9. #9
    taholmes160 is offline Advanced Beginner
    Windows 10 Access 2016
    Join Date
    Nov 2018
    Location
    Lester, Ohio
    Posts
    67
    Hey Steve:

    First off -- thanks for all the help -- I really do appreciate it a lot

    Secondly, Im sorry, In changing a bunch of stuff, I missed fixing those PK values -- Just plain slipped past me -- you are totally right, and I'll be fixing them shortly, I didnt mean to overlook it, but I got side tracked and just plain missed doing it.

    Thanks again for all your help -- I really do appreciate it a lot

    TIM

Please reply to this thread with any new information or opinions.

Similar Threads

  1. Replies: 6
    Last Post: 01-08-2018, 12:39 PM
  2. Replies: 7
    Last Post: 04-11-2015, 10:19 AM
  3. Replies: 4
    Last Post: 04-21-2014, 11:44 AM
  4. Replies: 4
    Last Post: 11-17-2012, 03:07 PM
  5. Replies: 1
    Last Post: 12-09-2011, 07:34 AM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums