Results 1 to 9 of 9
  1. #1
    julia.ranauro is offline Novice
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jun 2013
    Posts
    4

    Combining Rows with Duplicate Values


    Hi all,

    I am posting as someone completely unfamiliar with Microsoft Access so I apologize if I word something incorrectly or am unaware of the capabilities of the program.

    I am working with an Excel file of raw data aggregated from an annual customer relationship survey that has been sent out since 2010. The file has approximately 20,000 rows, meaning the survey has been taken around 20,000 times. Unfortunately, each time a customer takes the survey, it is included in the raw data as a separate entry. Therefore the file has numerous duplicate email addresses, corresponding with unique data for each time the survey was taken. Another issue regarding the data, is that in the first year the survey was sent out, the distribution mechanism "broke" and the survey was sent out multiple times (and completed multiple times) in the same year by the same customer, so the surveys are not necessarily uniformly distributed, if that makes sense. I have been interested in isolating the common respondents (those who have taken the survey across multiple years, albeit not necessarily consecutive years). Up to this point, these respondents have been isolated manually using a pivot table, however I am now looking to enter the raw data into SPSS (a statistical analysis package), in order to view the drivers of these common respondents.

    Therefore, I would like to be able to isolate these common respondents and the data corresponding with their surveys from the raw data in a separate worksheet or file. I have tried various formulas to do so in Excel to no avail. Is there anyway to accomplish this in Access or would a more complex database be needed?

    Ideally the final product would have the common respondents' information from multiple surveys in one row and would be able to be filtered by feedback date (found in the raw data), so that hypothetically one could select a month and be able to tell how many customers considered to be common respondents completed a survey within that month.
    Last edited by julia.ranauro; 06-18-2013 at 09:14 AM.

  2. #2
    Join Date
    May 2013
    Location
    Dallas TX
    Posts
    1,742
    Yes, access can support this easily.
    In essence, you will import the data into a temporary table, and then append the data from the temporary table onto the permanent tables of a database with the structure that you want. As part of that process, you would identify any duplicate responders and connect all their surveys to a single copy of their name record, then eliminate the unneeded name records.

    How is the spreadsheet structured? Is every column the answers to the same question, or have the questions changed over time?

  3. #3
    julia.ranauro is offline Novice
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jun 2013
    Posts
    4
    Thank you very much for your reply! I'm not sure if this makes things simpler but no the questions have not changed over time. If it is not too much trouble, would you be able to post the steps needed for me to do this?

  4. #4
    Join Date
    May 2013
    Location
    Dallas TX
    Posts
    1,742
    Can you post the heading lines of the excel spreadsheet? You can fake the data, it's just easier to speak rationally when I can see what I am talking about.

  5. #5
    julia.ranauro is offline Novice
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jun 2013
    Posts
    4
    Participant Name Participant Email Company Invitation Date Reminder1 Date Reminder2 Date Feedback Date Primary Addr Country Name Primary Phone Number Recommend score Survey Month # of Surveys Taken Common Respondent Fiscal Year Common Respondent According to FY
    John Doe johndoe@email.com ACME Inc. 3/24/2010 3/25/2010 US 555-353-5555 10 201006 2 Yes FY1 FY1 & FY3
    Sally Smith ssmith@email.com Dunder Mifflin 5/14/2011 5/14/2011 US 555-555-5555 7 201108 1 No FY2 N/A
    John Doe johndoe@email.com 3/10/2012 3/11/2012 US 201206 2 Yes FY3

    I apologize if this format is not the best. Let me know if you need more information than what I have provided. There are over 200 columns in the actual Excel file, but for the purpose of simplicity I included only the most important here.

  6. #6
    Join Date
    May 2013
    Location
    Dallas TX
    Posts
    1,742

    High Level Description, Part One

    Okay, but you've cut out the fields that actually have the survey results, right? Give me a couple of examples of the Q's and A's, please.
    Code:
    Participant Name 
    Participant Email 
    Company 
    Invitation Date 
    Reminder1 Date 
    Reminder2 Date 
    Feedback Date 
    Primary Addr Country Name 
    Primary Phone Number 
    ----------((5 invisible fields))
    Recommend score 
    ----------((7 invisible fields))
    Survey Month 
    ----------((9 invisible fields))
    # of Surveys Taken 
    Common Respondent 
    Fiscal Year 
    Common Respondent According to FY
    All right, here's an ultra-high-level description of the process:
    Before you start, you'll group the data columns into meaningful groups -
    * data about the participant, name, company, country
    * data about a particular survey itself (like dates, survey month),
    * data that is answers to the questions, and
    * data that represents analysis of other data (# of surveys taken, common resp).
    You'll also have to decide if you have enough respondents from the same company and/or address that you want to have a company-level record. You know your data, I don't.

    Then, you'll begin the import process.
    First, you'll do some setups and import your excel data into a temp table.
    Next, you'll massage the Participant data to identify and eliminate duplicates, without losing the ability to connect the deduped data back to the individual rows of data.
    Then, you'll move the survey questions/answers off into a table of their own, retaining the links back to their unique row.
    Finally, you'll verify that you can replicate the analysis information from the actual data. If so, you'll be ready to delete all the redundancies and put the database into action.

    Okay, here's the high-level description of importing the data, down to breaking out the participant data -
    1) Before starting, create a play version of your excel workbook and change the column name in the top of each column to what you really want each field to be named. Short, simple, distinct, no spaces in the name. Add a Rowkey field in column A, with = row() as the value.
    If you have more than 255 columns of data, there will be special work to get the last few. Let me know and I'll describe that too.

    2) From Access, in a new blank "play" database, import the worksheet into a working table
    Code:
    tblTempSurvey
       RowKey 
       PartName
       PartEMail
       PartCompany
       (etc)
    3) Create an empty table with this structure. It should include every field that has to do with the participant, but nothing that has to do with individual surveys or summary or analysis:
    Code:
    tblTempParticipant
       TempPartID    (autokey)
       RowKey
       PartName
       PartEmail
       PartCompany
       PrimaryAddr
       PrimaryPhone
       (etc)
    Run an SQL INSERT query similar to this code to copy the participant data into that table
    Code:
    INSERT INTO tblTempParticipant
    FIELDS (RowKey, PartName, PartEmail, PartCompany, ... other fields...)
    (SELECT TTS.RowKey, TTS.PartName, TTS.PartEmail, TTS.PartCompany, ... other fields...)
    FROM tblTempSurvey AS TTS)
    4) Next, You'll do analysis on that table. For instance, see how many data problems you have with people changing email addresses, changing companies, two different names using the same email address, and so on. Here's a sample query for the same person with two different email addresses.
    Code:
    SELECT T1.PartName, T1.PartEmail, T2.PartName, T2.PartEmail
    FROM tblTempParticipant AS T1, tblTempParticipant AS T2
    WHERE T1.PartName = T2.PartName 
    AND T1.PartEmail <> T2.PartEmail;
    Do the analysis first, to see if you have BIG problems. If you've only got onesies and twosies, then you can proceed. On the other hand, if you have LOTS of weird stuff regarding email multiple email addresses and people switching companies, then the design will need some tweaking before you go to the trouble of importing.

    5) If all is well, then we'll end up creating a little translation table with the from and to TempPartIDs for the survey.

  7. #7
    Join Date
    May 2013
    Location
    Dallas TX
    Posts
    1,742
    This is a rough cut on SQL to create a record for each row with the lowest of the appropriate duplicate Participant records. You could join on as many of the participant fields as you wanted.

    Code:
    INSERT INTO tblPartOldNew
    FIELDS (RowKey, NewPartID)
      (SELECT T1.RowKey, MIN(T2.PartID)
      FROM tblTempParticipant AS T1 , tblTempParticipant AS T2 
      WHERE T1.PartName = T2.PartName 
      AND T1.PartEmail = T2.PartEmail
      AND T1.PartCompany = T2.PartCompany,
      GROUP By T1.RowKey);

  8. #8
    julia.ranauro is offline Novice
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jun 2013
    Posts
    4
    Wow this is great, thank you so much for your help. I have not had the time yet to test this process out but as soon as I do I will comment again. I did cut out some of the survey questions. Most are simply rankings, on a scale of 1-10, as to how likely they are to recommend the product to someone, how satisfied they are with a certain service, with 10 indicating the most likely or most satisfied etc. Here are a few of the open-ended questions and sample answers:

    Q: Is there anything you feel [Our Company] does particularly well?
    A: Support services are great.
    A: The product itself never disappoints

    Q: Is there an area you feel that [Our Company] can improve upon?
    A: Timeliness
    A: Ensuring the consumer receives the appropriate documentation

    Hopefully this helps!

  9. #9
    Join Date
    May 2013
    Location
    Dallas TX
    Posts
    1,742

    Structure for Questions and Answers

    Okay, the reason I asked is that the standard for storing survey questions and answers looks something like this:
    Code:
     tblSurvey
    SurveyID        AutoKey
    ParticipantID   Foreign Key to tblParticipant
    (all those dates and survey info)
    
    tblQuestions
    QuestID     Autokey
    QuestText   Text
    
    tblAnswers
    AnswerID   Autokey
    SurveyID   Foreign Key to tblSurvey 
    QuestID    Foreign Key to tblQuestion
    AnswerText (the answer itself)
    This structure makes analyzing the answers, and adapting to changes in the questions, much easier. Some people will duplicate the participantID on tblAnswer, or make other slight alterations of convenience.

    Basically, you just do exactly the same kind of insert for each column independently.
    Code:
    INSERT INTO tblAnswers
    FIELDS (SurveyID, QuestID, AnswerText)
      (SELECT T1.RowKey, 1, T1.Q1Answer
      FROM tblTempParticipant AS T1);
    
    INSERT INTO tblAnswers
    FIELDS (SurveyID, QuestID, AnswerText)
      (SELECT T1.RowKey, 2, T1.Q2Answer
      FROM tblTempParticipant AS T1);
    
    INSERT INTO tblAnswers
    FIELDS (SurveyID, QuestID, AnswerText)
      (SELECT T1.RowKey, 3, T1.Q3Answer
      FROM tblTempParticipant AS T1);
    And you'll put the matching questions into tblQuestion:
    Code:
    INSERT INTO tblQuestions
    FIELDS (QuestID, QuestionText)
    VALUES  (1, "Is there anything you feel [Our Company] does particularly well?");
    
    INSERT INTO tblQuestions
    FIELDS (QuestID, QuestionText)
    VALUES  (2, "Is there an area you feel that [Our Company] can improve upon?");

Please reply to this thread with any new information or opinions.

Similar Threads

  1. combining records(rows)
    By pradeep.sands in forum Queries
    Replies: 4
    Last Post: 06-13-2013, 12:19 PM
  2. Replies: 5
    Last Post: 01-29-2013, 03:38 PM
  3. combining rows.. again
    By jerryb in forum Access
    Replies: 16
    Last Post: 05-03-2012, 08:34 PM
  4. Combining rows
    By Bing in forum Queries
    Replies: 1
    Last Post: 06-09-2011, 12:54 PM
  5. Combining columns into rows
    By steeveepee33 in forum Queries
    Replies: 5
    Last Post: 04-30-2009, 09:18 PM

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums