  1. #1
    alvinsmode is offline Novice
    Windows 10 Access 2016
    Join Date
    Nov 2017
    Posts
    4

    Data Mining - text frequency?

    Hi everyone,



    I'm working with a data set of over a million rows (purchase history). What is the simplest way to find patterns in it, such as which words or combinations of words appear most often? Any advice would be greatly appreciated. Thank you!

  2. #2
    June7's Avatar
    June7 is offline VIP
    Windows 10 Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    53,644
    Pattern of what? Vendor names, product descriptions? You could do a GROUP BY aggregate query on the data.

    Do you have a defined set of values you want to search for? Build a table of those values and use a DCount() expression in a query.
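
    As a rough illustration of both suggestions, here is a minimal Python sketch that uses an in-memory SQLite database as a stand-in for Access. The table name Purchases, the column Description, the sample rows, and both helper functions are made up for the example; the real schema is not given in this thread. The second helper mimics the idea of counting matches per search value, which in Access could be done with DCount().

```python
import sqlite3

# Hypothetical table, column, and rows; replace with the real purchase data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchases (Description TEXT)")
conn.executemany(
    "INSERT INTO Purchases (Description) VALUES (?)",
    [("A4 Paper",), ("A4 Paper",), ("A4 Paper",),
     ("Stapler",), ("Stapler",), ("Blue Marker",)],
)


def description_counts(conn):
    """GROUP BY aggregate: each distinct description with its row count."""
    return conn.execute(
        "SELECT Description, COUNT(*) AS N "
        "FROM Purchases GROUP BY Description "
        "ORDER BY N DESC, Description"
    ).fetchall()


def keyword_counts(conn, keywords):
    """Count rows whose description contains each search value.

    SQLite's LIKE is case-insensitive for ASCII letters by default.
    """
    result = {}
    for kw in keywords:
        row = conn.execute(
            "SELECT COUNT(*) FROM Purchases WHERE Description LIKE ?",
            (f"%{kw}%",),
        ).fetchone()
        result[kw] = row[0]
    return result


print(description_counts(conn))
print(keyword_counts(conn, ["paper", "marker", "pen"]))
```

    Note that the plain GROUP BY only groups descriptions that are spelled identically, so it works best once the text has been cleaned up.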
    How to attach file: http://www.accessforums.net/showthread.php?t=70301 To provide db: copy, remove confidential data, run compact & repair, zip w/Windows Compression.

  3. #3
    alvinsmode is offline Novice
    Windows 10 Access 2016
    Join Date
    Nov 2017
    Posts
    4
    The data was entered by many different employees, so we often see something like "Blue Pen 12pc", "Pen 12pc Blue", or even "12piece Pen Blu" for exactly the same product. I'm trying to find out whether there is an easy way to get the text frequency, or the "N-grams", so I can identify the common product descriptions regardless of how clean the descriptions are.
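
    For what it's worth, the kind of counting described above could be sketched in Python roughly like this. The function names and the tokenizing rule (lowercase, split on anything that is not a letter or digit) are assumptions made for the example, not an established method; only the three sample descriptions come from this post.

```python
import re
from collections import Counter


def tokens(description):
    """Lowercase a description and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", description.lower())


def word_counts(descriptions):
    """Frequency of each single word across all descriptions."""
    counts = Counter()
    for d in descriptions:
        counts.update(tokens(d))
    return counts


def bigram_counts(descriptions):
    """Frequency of each pair of adjacent words (2-grams)."""
    counts = Counter()
    for d in descriptions:
        words = tokens(d)
        counts.update(zip(words, words[1:]))
    return counts


def order_free_key(description):
    """Sorted unique words, so that word order no longer matters."""
    return " ".join(sorted(set(tokens(description))))


sample = ["Blue Pen 12pc", "Pen 12pc Blue", "12piece Pen Blu"]
print(word_counts(sample).most_common(3))
print(bigram_counts(sample).most_common(1))
print(Counter(order_free_key(d) for d in sample))
```

    With this key, "Blue Pen 12pc" and "Pen 12pc Blue" collapse into the same group, but "12piece Pen Blu" does not. Abbreviations and misspellings would still need a synonym table or some fuzzy matching, which this sketch does not attempt.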

  4. #4
    June7's Avatar
    June7 is offline VIP
    Windows 10 Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    53,644
    You've already asked this question: https://www.accessforums.net/showthread.php?t=69205

  5. #5
    orange's Avatar
    orange is online now Moderator
    Windows 10 Access 2010 32bit
    Join Date
    Sep 2009
    Location
    Ottawa, Ontario, Canada; West Palm Beach FL
    Posts
    16,850
    Alvin,
    What exactly have you done other than posting the question(s) in the forum? Google/Bing?
    Have you identified what you're going to do with the data and the related tables/files when you get it cleansed?
