Results 1 to 11 of 11
  1. #1
    SteveFlash is offline Advanced Beginner
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jan 2013
    Posts
    39

    Talking Advice for a begineer

    Hi! I am working on a database hobby project and I am very much a beginner. I really need some advice because I feel like I am spinning my wheels.




    I have two large sources of data. One is about 750,000 rows of user inputted data, the other is about 2.5 million rows of standardized error free data with a 10 digit ID code.


    Every of the 750,000 rows of data that is user inputted can be find within the other sheet but it often has human errors so I need to match a lot of inexact data. I need to link them all up the best I can and tag all of the user inputted data with the 10 digit code. That way I do not have to relay on the user inputted data with errors and everything is standardized. The two sheets share the following columns Fname, Lname, Address and City.


    Up until this point I have been trying to match all of the data through Excel but the biggest problem is how large the data is. You can't have sheets any bigger then a million rows so it becomes near impossible at one point. My plan was always to clean the data in Excel and then load it into Access where it could become user friendly.


    Can anyone tell me if I am headed down the right path? Can I use Access to match all of the data easier then Excel? Am I on the right path?


    Please any advice at all would be so helpful this thing is driving me crazy.

    EDIT: Ha speaking of human errors, I spelled
    beginner wrong.

  2. #2
    stmoong is offline Competent Performer
    Windows XP Access 2007
    Join Date
    Sep 2011
    Location
    Singapore
    Posts
    108
    You could import the raw data into Access. Then, do the cleanup and updates from there.

  3. #3
    June7's Avatar
    June7 is offline VIP
    Windows XP Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    53,630
    A query in Access can probably help with this. Be aware Access has a 2gb size limit.
    How to attach file: http://www.accessforums.net/showthread.php?t=70301 To provide db: copy, remove confidential data, run compact & repair, zip w/Windows Compression.

  4. #4
    SteveFlash is offline Advanced Beginner
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jan 2013
    Posts
    39
    Ok. Can you just send me in the direction of what to Google?

    I know how to import data in to Access and work around a but but does Access really have a Fuzzy Logic of sorts like Excel?

  5. #5
    stmoong is offline Competent Performer
    Windows XP Access 2007
    Join Date
    Sep 2011
    Location
    Singapore
    Posts
    108
    A couple of ways. One way is to make your own using regular expression (Regex) which allows you to do pattern matching. This can be done from Access or Excel.

    The other way I could think of using Excel, is to write some VBA code which will read in the raw data files, create new workbook for processing, split the million of rows into many "system input" worksheets, and do the matching from the "human input" worksheet(s) with the "system input" worksheets. The "human input" worksheet(s) could also have more than one worksheet. Perhaps, this may not be the cleanest approach.

    Then try this add-in for Excel from Microsoft.
    http://blogs.msdn.com/b/business_int...for-excel.aspx

    More discussion about this subject over here: http://social.msdn.microsoft.com/For...b-fb0d45713e1d

    Another way is using SoundEx described here:
    http://allenbrowne.com/vba-Soundex.html

    You could probably test it out using a small subset of data to see which way offer the best way to match.

  6. #6
    June7's Avatar
    June7 is offline VIP
    Windows XP Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    53,630
    Start with Find Unmatched query. Access has wildcards but won't help you in running a simple Find Unmatched query. VBA code would be required to do anything more complicated than Find Unmatched query could perform.
    How to attach file: http://www.accessforums.net/showthread.php?t=70301 To provide db: copy, remove confidential data, run compact & repair, zip w/Windows Compression.

  7. #7
    SteveFlash is offline Advanced Beginner
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jan 2013
    Posts
    39
    Whoa thanks for the tips. See I would be able to match all of it in Excel but there is just WAY to much data. I mean for me to match the Names I needs to sort like 20 sheets in two ABC order by Lname and then re-sort everything by City to search via address and so on and so forth.

    If anyone is curious this is kinda what some of the cleaned data looks like: http://i.imgur.com/pezUn.jpg

    That first outlined block on the far left is the user inputted data with errors. The next four outlined blocks of data are matches which were copied from the clean table with no errors. Right now I am having a tough time deciding when a match is "good". I need to create some kinda formula to run down the first outlined block of data and decide which one of the four match cells are the best.

  8. #8
    June7's Avatar
    June7 is offline VIP
    Windows XP Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    53,630
    Google: Access VBA fuzzy matching

    Review http://allenbrowne.com/vba-Soundex.html
    How to attach file: http://www.accessforums.net/showthread.php?t=70301 To provide db: copy, remove confidential data, run compact & repair, zip w/Windows Compression.

  9. #9
    SteveFlash is offline Advanced Beginner
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jan 2013
    Posts
    39
    Thank you June7!!! I think that is just what I need.

  10. #10
    SteveFlash is offline Advanced Beginner
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jan 2013
    Posts
    39
    When I try to use the SoundEx VBA it doesn't seem to work. When I try to compile it I get an error window that says "sub or function not defined". Any thoughts?

  11. #11
    June7's Avatar
    June7 is offline VIP
    Windows XP Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    53,630
    Would have to know more about how you implemented the function code. If you strictly followed Allen's instructions, should work. His stuff usually does. However, I have never used this one.

    If you want to provide your effort for analysis, follow instructions at bottom of my post.
    How to attach file: http://www.accessforums.net/showthread.php?t=70301 To provide db: copy, remove confidential data, run compact & repair, zip w/Windows Compression.

Please reply to this thread with any new information or opinions.

Similar Threads

  1. New-and need help as well as advice
    By bjd1020 in forum Access
    Replies: 8
    Last Post: 11-16-2012, 02:25 PM
  2. Need advice about dashboards
    By riaarora in forum Access
    Replies: 7
    Last Post: 06-28-2012, 01:38 PM
  3. Need some advice and help
    By winterh in forum Access
    Replies: 9
    Last Post: 04-18-2012, 06:41 AM
  4. Form advice
    By rvangend in forum Forms
    Replies: 4
    Last Post: 01-17-2011, 01:59 PM
  5. Need advice on what I have so far
    By rumplestiltskin in forum Database Design
    Replies: 2
    Last Post: 05-25-2006, 12:48 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums