Page 1 of 2 12 LastLast
Results 1 to 15 of 18
  1. #1
    NSearch is offline Advanced Beginner
    Windows 10 Access 2007
    Join Date
    Mar 2019
    Posts
    62

    Remove Duplicate Records

    Hi All,

    Again, thank you in advance for help with this.

    I am trying to find the fastest way to delete duplicate records from a very large table (rows, not columns).

    Table Sample Attached: table_sample.zip

    I would like to keep 1 record, but delete all records that have duplicates (for every field).

    Thanks!

  2. #2
    June7's Avatar
    June7 is online now VIP
    Windows 10 Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    52,902
    Build an aggregate query and save to a new table.

    SELECT DISTINCT Sheet1.* INTO test FROM Sheet1;
    How to attach file: http://www.accessforums.net/showthread.php?t=70301 To provide db: copy, remove confidential data, run compact & repair, zip w/Windows Compression.

  3. #3
    NSearch is offline Advanced Beginner
    Windows 10 Access 2007
    Join Date
    Mar 2019
    Posts
    62
    Hi June,

    Thanks for the message, but I am actually looking to remove the duplicate lines, not create a new table.

    If all fields of a record match all fields from another record, I would like to delete 1 instance of the record.

    If all fields of a records match all fields from 10 other records, I would like to delete 9 instances of the record.

    I hope that makes sense.

    I added an AutoNumber (ID). Please see attached: table_sample_v2.zip

    Thanks for the help!

  4. #4
    isladogs's Avatar
    isladogs is offline MVP / VIP
    Windows 10 Access 2010 32bit
    Join Date
    Jan 2014
    Location
    Somerset, UK
    Posts
    5,974
    I haven't looked at your example

    The fastest way is:
    1. Rename your table e.g. with OLD at the end of the name
    2. Copy the table design & save with the original table name
    3. Add a unique composite index to the fields which are getting duplicated.
    4. Export your data to the new table. Due to the indexes, it will only allow one of each duplicate record
    5. Check your new table & if all is OK delete the OLD table

    If you want more control over which duplicate record is retained, there is another way which I prefer ...but it involves more work
    Colin, Access MVP, Website, email
    The more I learn, the more I know I don't know. When I don't know, I keep quiet!
    If I don't know that I don't know, I don't know whether to answer

  5. #5
    NSearch is offline Advanced Beginner
    Windows 10 Access 2007
    Join Date
    Mar 2019
    Posts
    62
    Thanks isladogs for the reply.

    This isn't going to be a one time occurrence -- it will be a query used in table maintenance, possibly run up to 10+ times a day.

  6. #6
    JoeM is offline VIP
    Windows 7 32bit Access 2007
    Join Date
    Jun 2012
    Posts
    3,904
    I would do this in the following manner:
    1. Make sure that your table has a Unique Autonumber field in it
    2. Create an Aggregate Query, that groups on ALL fields except for the Autonumber field (and have it return the MIN value of the Autonumber field)
    3. Create a Delete Query that deletes all records from your table where the Autonumber field is not found in the Aggregate Query you created in step 2

    Once you have set this up, you only need to run the Delete Query each time (one step).

  7. #7
    isladogs's Avatar
    isladogs is offline MVP / VIP
    Windows 10 Access 2010 32bit
    Join Date
    Jan 2014
    Location
    Somerset, UK
    Posts
    5,974
    Unless I'm misunderstanding by not reading this properly, once you've setup composite indexes in your table and removed existing duplicates as described, it will be impossible for new duplicates to occur in the future.
    Colin, Access MVP, Website, email
    The more I learn, the more I know I don't know. When I don't know, I keep quiet!
    If I don't know that I don't know, I don't know whether to answer

  8. #8
    NSearch is offline Advanced Beginner
    Windows 10 Access 2007
    Join Date
    Mar 2019
    Posts
    62
    This is what I came up with, but it takes a very long time to run the query. Does anyone have any suggestions for a faster way to accomplish this.

    Code:
    DELETE *
    FROM tracking_number_history
    WHERE id NOT IN (SELECT min_id 
                      FROM   (SELECT Min(id) AS min_id, 
                                     client_code, 
                                     carrier_code, 
                                     tracking_number, 
                                     invoice_number, 
                                     account_number, 
                                     freight_charges, 
                                     invoice_date, 
                                     transaction_date, 
                                     pickup_record_number, 
                                     receiver_name, 
                                     receiver_company_name, 
                                     receiver_address_line_1, 
                                     receiver_address_line_2, 
                                     receiver_city, 
                                     receiver_state, 
                                     receiver_postal, 
                                     receiver_country_territory 
                              FROM   tracking_number_history 
                              GROUP  BY client_code, 
                                        carrier_code, 
                                        tracking_number, 
                                        invoice_number, 
                                        account_number, 
                                        freight_charges, 
                                        invoice_date, 
                                        transaction_date, 
                                        pickup_record_number, 
                                        receiver_name, 
                                        receiver_company_name, 
                                        receiver_address_line_1, 
                                        receiver_address_line_2, 
                                        receiver_city, 
                                        receiver_state, 
                                        receiver_postal, 
                                        receiver_country_territory));

  9. #9
    JoeM is offline VIP
    Windows 7 32bit Access 2007
    Join Date
    Jun 2012
    Posts
    3,904
    If you go with Colin's approach, one you have gone through those initial steps, it will not allow you to add duplicate records going forward (those will be rejected).
    So there wouldn't be any need to run any code or processes at all.

  10. #10
    NSearch is offline Advanced Beginner
    Windows 10 Access 2007
    Join Date
    Mar 2019
    Posts
    62
    Quote Originally Posted by isladogs View Post
    Unless I'm misunderstanding by not reading this properly, once you've setup composite indexes in your table and removed existing duplicates as described, it will be impossible for new duplicates to occur in the future.
    isladogs,

    Once I have the composite index setup for the table and I use an insert statement from VBA (excel), will the duplicate key be ignored, while inserting all other non-duplicate rows OR will the query fail because a key violation occurred?

    What I would like is for it to ignore the duplicate records and proceed with inserting all non-duplicated records.

    Thanks!

  11. #11
    JoeM is offline VIP
    Windows 7 32bit Access 2007
    Join Date
    Jun 2012
    Posts
    3,904
    Once I have the composite index setup for the table and I use an insert statement from VBA (excel), will the duplicate key be ignored, while inserting all other non-duplicate rows OR will the query fail because a key violation occurred?

    What I would like is for it to ignore the duplicate records and proceed with inserting all non-duplicated records.
    When you import the Excel file manually, it gives you a warning that some records will fail because of a key violation, and it asks you if you want to proceed anyway. If you click "Yes", it imports the new unique records and drops the duplicates. Depending on how you are writing your VBA, it should behave in the same manner. There is an easy way to find out - test it and see it works!

  12. #12
    isladogs's Avatar
    isladogs is offline MVP / VIP
    Windows 10 Access 2010 32bit
    Join Date
    Jan 2014
    Location
    Somerset, UK
    Posts
    5,974
    I think Joe has already answered your question about future imports.
    If there are duplicates in the import, the indexes will only allow the first record to be imported.
    The important thing is to design the composite index correctly to ensure this works properly.

    Once you know that is true by testing it, you can change from running an append query to executing an equivalent sql statement.
    That means you will no longer see the messages about 'X records were not imported. Do you want to continue?'
    Colin, Access MVP, Website, email
    The more I learn, the more I know I don't know. When I don't know, I keep quiet!
    If I don't know that I don't know, I don't know whether to answer

  13. #13
    NSearch is offline Advanced Beginner
    Windows 10 Access 2007
    Join Date
    Mar 2019
    Posts
    62
    I setup a composite index on the following 10 (max allowed) fields:

    Click image for larger version. 

Name:	index.jpg 
Views:	20 
Size:	124.3 KB 
ID:	38631

    ..and duplicates are still being added to the table.

    Any ideas?

    Thanks

  14. #14
    isladogs's Avatar
    isladogs is offline MVP / VIP
    Windows 10 Access 2010 32bit
    Join Date
    Jan 2014
    Location
    Somerset, UK
    Posts
    5,974
    Which fields now contain duplicates?,
    Colin, Access MVP, Website, email
    The more I learn, the more I know I don't know. When I don't know, I keep quiet!
    If I don't know that I don't know, I don't know whether to answer

  15. #15
    NSearch is offline Advanced Beginner
    Windows 10 Access 2007
    Join Date
    Mar 2019
    Posts
    62
    Quote Originally Posted by isladogs View Post
    Which fields now contain duplicates?,
    client_code,
    carrier_code,
    tracking_number,
    invoice_number,
    account_number,
    freight_charges,
    invoice_date,
    transaction_date,
    pickup_record_number,
    receiver_name,
    receiver_company_name,
    receiver_address_line_1,
    receiver_address_line_2,
    receiver_city,
    receiver_state,
    receiver_postal,
    receiver_country_territory

Page 1 of 2 12 LastLast
Please reply to this thread with any new information or opinions.

Similar Threads

  1. Replies: 2
    Last Post: 10-16-2015, 10:15 PM
  2. remove duplicate
    By abraham30 in forum Access
    Replies: 5
    Last Post: 04-23-2015, 04:01 AM
  3. Replies: 3
    Last Post: 03-22-2014, 04:32 PM
  4. Remove duplicate but keep the latest record
    By Compufreak in forum SQL Server
    Replies: 2
    Last Post: 06-12-2013, 11:55 AM
  5. Remove Duplicate
    By rgparpaccessforums in forum Access
    Replies: 3
    Last Post: 04-24-2011, 07:24 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums