Results 1 to 4 of 4
  1. #1
    TerraEarth is offline Advanced Beginner
    Windows 10 Access 2016
    Join Date
    Mar 2018
    Posts
    86

    Best practices for cleaning import data?

    I was wondering, in your esteemed experiences, what is preferable when it comes to cleaning data imports: After or Before the import?

    For example, I import a series of excel files from various users and use those to update live table data in the Access Database via a SQL INNER JOIN/UPDATE query. However, a lot of this data is not suitable for flowing into the database for many various reasons. For example, statuses might have been left blank, or improper codes used. It is difficult for me to control validation for these different sources across users so my only option at the moment is to clean the data before the import.



    Do you guys think it is better to clean the data pre import or post? Cleaning it post would involve querying as well as updating the data (probably with sql or VBA) whereas cleaning it pre-import would involve using vba/R/python/etc. to tear things up and set things straight before funneling it in. Thoughts?

  2. #2
    ranman256's Avatar
    ranman256 is offline VIP
    Windows Vista Access 2010 32bit
    Join Date
    Apr 2014
    Location
    Kentucky
    Posts
    9,550
    I cleaned before import.
    I had user select the XL file, then using VB code I had to open the file,
    remove the 1st blank row so the headers would be 1st,
    then some fields were both numeric and string so I had to convert the column to hard string.
    Then it would import correctly.

  3. #3
    pbaldy's Avatar
    pbaldy is offline Who is John Galt?
    Windows XP Access 2007
    Join Date
    Feb 2010
    Location
    Nevada, USA
    Posts
    22,652
    You can also import into a local table, clean/manipulate the data, then import into the "real" table.
    Paul (wino moderator)
    MS Access MVP 2007-2019
    www.BaldyWeb.com

  4. #4
    JoeM is offline VIP
    Windows 7 32bit Access 2007
    Join Date
    Jun 2012
    Posts
    3,904
    Alternatively, here is a different option.

    One option I have used is to import the data into a "temporary" holding table, which is structured very similar to the final desintaion table I am importing it to.
    Then, you can create a Query on this "temporary" table, adding criteria to only select the records that you want to import to your final table.
    Then, change this query to an Append Query to write those records to the final table.

    If you then delete all the data out of the Temporary table afterwards, then you can re-use this process over and over (once you have set up the temporary table and query the first time, you should have to do that again - simply re-use it).

    Just note that if you delete a lot of data out of Access tables, it is important to regularly "Compact and Repair" the database to purge those deleted records, and keep the database size down (even though you no longer see the records in the table after your delete them, they really aren't completely deleted from the database until they are purged like that).


    EDIT: Just saw Paul's reply. So I just described in detail what he is recommending (I am just a slow typist, and he posted that while I was trying to type this all up!).

Please reply to this thread with any new information or opinions.

Similar Threads

  1. Best Practices for Regular Ongoing CSV Import
    By WallyJ in forum Import/Export Data
    Replies: 3
    Last Post: 03-29-2017, 03:40 PM
  2. Help data cleaning
    By baronqueefington in forum Queries
    Replies: 1
    Last Post: 01-05-2016, 04:20 PM
  3. Replies: 8
    Last Post: 10-13-2014, 12:19 PM
  4. Cleaning Data
    By Sck in forum Queries
    Replies: 1
    Last Post: 07-22-2010, 12:43 AM
  5. Cleaning Up Data - Need Help
    By NeedHelp in forum Access
    Replies: 2
    Last Post: 06-05-2010, 10:06 AM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums