Page 2 of 3
Results 16 to 30 of 34
  1. #16
    johnseito is offline Competent Performer
    Windows 7 64bit Access 2010 64bit
    Join Date
    Aug 2013
    Posts
    419


    Quote Originally Posted by John_G View Post
    Hi -

    Referring to your post #12 - ("more like this!")

    That makes life a whole lot simpler! A million rows sounds like a lot, but isn't really in the overall scheme of things. One crosstab query could generate your diagram as above for the whole dataset. (Not as pretty of course, and how long it would take is another issue entirely).

    But with one ID - Event per row, it's easy to come up with counts for various analyses and decisions.

    John

    Do you mean generating the diagram I had in a pivot table? I presume it will take very long if it is done in a query.
    Actually, it is not just one ID and one Event; one ID can have many events (two, three, or more).

  2. #17
    John_G is offline VIP
    Windows 7 32bit Access 2010 32bit
    Join Date
    Oct 2011
    Location
    Ottawa, ON (area)
    Posts
    2,615
    What data is in each row of the input (source) table?

    Is it ONE ID and ONLY ONE event per row, or is it ONE ID and MORE THAN ONE event? You have given different answers in different places.
    If you have MORE THAN ONE event in each row, then a lot of extra work (= processing time) may be required to reformat the data first.

    Which is it?

    You have described the process you would need reasonably well in the previous post, and it is more or less what I would do to select a sample, given the percentage required.

    It would help if you could post a screenshot of the table description, showing what the fields are, and we can go from there.

    John

  3. #18
    johnseito is offline Competent Performer
    Windows 7 64bit Access 2010 64bit
    Join Date
    Aug 2013
    Posts
    419
    What data is in each row of the input (source) table?
    Hi John,

    Nice to see your message again. We are only focusing on two columns:
    one is ID and the other is Event. The other columns are in the table and can be
    anything, but I don't think they will affect the calculation of the final result.


    Is it ONE ID and ONLY ONE event per row, or is it ONE ID and MORE THAN ONE event?


    We are only focusing on two columns, so it should be ONE ID and ONLY ONE event per row.

    and it is more or less what I would do to select a sample, given the percentage required.


    Yes, you are correct, but it also has to meet all the rules.

    Let me know if any other clarification would help! Thanks! :-)



  4. #19
    johnseito is offline Competent Performer
    Windows 7 64bit Access 2010 64bit
    Join Date
    Aug 2013
    Posts
    419
    then a lot of extra work (= processing time) may be required to reformat the data first.
    The Event is in a single column and the ID is in a single column,
    so that is one event per row.

    I believe that even with one event per row it would still take a lot of processing time, because you could have duplicated events
    in different rows for the same ID.

    For example, suppose you have a total of ten rows:
    in row 2 you have ID A with event 1a, in row 5 you have ID A with event 1a,
    and in row 10 you have ID A with event 1a.

    So out of 10 rows, ID A has the same event appearing three times.
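    The duplicate situation described above can be sketched like this (the row data is hypothetical, invented only to mirror the ten-row example):

```python
# Hypothetical ten-row table mirroring the example above: ID "A"
# carries event "1a" in rows 2, 5, and 10 (1-based positions).
rows = [
    ("B", "2a"), ("A", "1a"), ("B", "3a"), ("C", "1a"), ("A", "1a"),
    ("C", "2a"), ("D", "4a"), ("B", "1a"), ("D", "2a"), ("A", "1a"),
]

# Collapsing duplicate ID-event combinations shows how much repetition
# a sampler has to deal with before weighting the selection.
unique_pairs = set(rows)
print(len(rows))          # 10
print(len(unique_pairs))  # 8 -- ("A", "1a") occurred three times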

  5. #20
    johnseito is offline Competent Performer
    Windows 7 64bit Access 2010 64bit
    Join Date
    Aug 2013
    Posts
    419
    Actually, it is not just one ID and one Event; one ID can have many events (two, three, or more).
    Let me make it clear: what I am saying here is about different rows, not columns. One ID can have many events, including duplicates, in different rows. We are only focusing on two columns.

  6. #21
    johnseito is offline Competent Performer
    Windows 7 64bit Access 2010 64bit
    Join Date
    Aug 2013
    Posts
    419
    I will give you a screenshot of the table as an example. I'll do it later today. Thanks for your patience. :-)

  7. #22
    John_G is offline VIP
    Windows 7 32bit Access 2010 32bit
    Join Date
    Oct 2011
    Location
    Ottawa, ON (area)
    Posts
    2,615
    Hi -

    Glad we got that all straightened out. Access queries will do a lot of what you need, including duplicate removal if the query is set up properly, and if the tables are properly indexed they can be quite efficient time-wise.

    The difficulty you will encounter will be in meeting the criteria of All ID and All Events being represented. The ID is easy enough - either it is in the source table or it isn't, but ensuring that you get at least one of each event is probably not easy.
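    One way to tackle the "at least one of each event" requirement (a rough sketch of one possible strategy, not necessarily what John_G has in mind, shown in Python with invented data for brevity): seed the sample with one random row per event, then top it up to the requested quota.

```python
import math
import random

def sample_with_event_coverage(rows, pct, seed=0):
    """Sample ceil(pct% of rows), guaranteeing every event appears.

    rows is a list of (id, event) tuples; pct is a percentage such
    as 21.24. Assumes the quota is at least the number of distinct
    events; otherwise the sample overshoots the quota slightly.
    """
    rng = random.Random(seed)
    quota = math.ceil(len(rows) * pct / 100)

    # Pass 1: pick one random row index for each distinct event.
    by_event = {}
    for i, (_id, event) in enumerate(rows):
        by_event.setdefault(event, []).append(i)
    chosen = {rng.choice(indexes) for indexes in by_event.values()}

    # Pass 2: top up to the quota from the remaining rows at random.
    rest = [i for i in range(len(rows)) if i not in chosen]
    rng.shuffle(rest)
    chosen.update(rest[: max(0, quota - len(chosen))])
    return [rows[i] for i in sorted(chosen)]

# Invented demo data: 20 rows, 4 distinct events, with duplicates.
demo = [("A", "e1"), ("A", "e1"), ("B", "e2"), ("C", "e3"), ("A", "e4")] * 4
picked = sample_with_event_coverage(demo, 30.0)
print(len(picked))                     # 6 == ceil(20 * 30 / 100)
print(sorted({e for _, e in picked}))  # ['e1', 'e2', 'e3', 'e4']
```

    Pass 1 guarantees coverage; pass 2 restores randomness for the bulk of the selection. Covering every ID as well would need a similar first pass over IDs.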

    How many different IDs and different events do you have? You might be able to do something with that.

    John

  8. #23
    orange's Avatar
    orange is offline Moderator
    Windows XP Access 2003
    Join Date
    Sep 2009
    Location
    Ottawa, Ontario, Canada; West Palm Beach FL
    Posts
    16,870
    johnseito,

    See if this helps or "fits" your situation. I'm finding your posts deal with intangibles and/or theory -- nothing concrete.
    If you can show us in real terms what you're trying to do, then more focused responses are possible.

    The link I'm suggesting deals with Random Top N from a group.

    Hope the link is helpful.

  9. #24
    johnseito is offline Competent Performer
    Windows 7 64bit Access 2010 64bit
    Join Date
    Aug 2013
    Posts
    419
    Hi Orange,

    Thanks for answering.

    How do you know that this is not concrete? I provided examples. I will look at your link, but it covers Top N %; my problem is more than that, because the rules need to be met.

  10. #25
    johnseito is offline Competent Performer
    Windows 7 64bit Access 2010 64bit
    Join Date
    Aug 2013
    Posts
    419
    Orange

    I thought I had provided well-detailed and concrete examples. If you think somewhere I didn't, let me know and I will be as clear, detailed, and concrete as possible.

    I think your link on the randomized Top N is good, but the rule is to select a randomized Top N weighted for every ID and its associated events.

    The end goal is: if a user enters a percentage, in this case 21.24%, then that 21.24% of the entire pool from the table (rounded up to an exact number of rows) must include every ID and every event, weighted according to how many rows each ID and event has. All IDs and all events must be in the final pool when the export is given to the user.

    Again, thanks for the link; it is helpful and I am going to look into it further. I think it provides a partial solution, with more work needed, and there may be obstacles, but that is just my opinion.

    Thanks. :-)
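    The "exact number, rounding up" part of the requirement is just ceiling arithmetic; a small sketch using the 21.24% figure (the row totals are made up, and Decimal avoids floating-point surprises right at the boundary):

```python
import math
from decimal import Decimal

def export_size(total_rows, pct):
    """Rows to export for a given percentage, rounded up to a whole row."""
    return math.ceil(Decimal(total_rows) * Decimal(str(pct)) / 100)

print(export_size(419, 21.24))        # 89 (88.9956 rounded up)
print(export_size(1_000_000, 21.24))  # 212400 (exact, no rounding needed)
```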

  11. #26
    johnseito is offline Competent Performer
    Windows 7 64bit Access 2010 64bit
    Join Date
    Aug 2013
    Posts
    419
    I made this thread to see how difficult it would be to do this in Access, and how efficient (speed in minutes) it would be. I believe a query by itself can't do this.

    I haven't implemented this in Access at all, but I would like to. I think it will be difficult and there will be challenges as I build it, as I am not an expert in Access, although I wish I were. :-)

    Not sure what you guys think about how efficient and how difficult it would be to do this in Access. Is it even possible? My concern is that you may test with one example, say a pool of 500 rows from a table, and it works fine; but when the program is given 5,000 or 50,000, it may not work, even though it worked perfectly for the tested 500-row example.

  12. #27
    John_G is offline VIP
    Windows 7 32bit Access 2010 32bit
    Join Date
    Oct 2011
    Location
    Ottawa, ON (area)
    Posts
    2,615
    Hi -

    Some more thought on what you want to do. First, I think Access will do what you want to do. Queries are actually quite efficient if written properly. I did a quick test on a table I had of almost 600,000 records - to do a count of the number of occurrences of each value in a field took about 5 seconds, and that was over a network. So it's not bad.
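    The count described above is a plain GROUP BY aggregate; a stand-in sketch using SQLite in place of Access (the table and field names here are invented):

```python
import sqlite3

# In-memory SQLite stand-in for the Access table; the table and field
# names are invented for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tbl_data (ID TEXT, Event TEXT)")
con.executemany(
    "INSERT INTO tbl_data VALUES (?, ?)",
    [("A", "e1"), ("A", "e1"), ("A", "e2"), ("B", "e1"), ("B", "e3")],
)

# Count the occurrences of each value in a field -- the same shape of
# query as the five-second count over 600,000 records.
counts = con.execute(
    "SELECT ID, COUNT(*) FROM tbl_data GROUP BY ID ORDER BY ID"
).fetchall()
print(counts)  # [('A', 3), ('B', 2)]
```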

    A question about your sampling requirement. In the data tables you want to sample, is every ID associated with every possible event at least once, or will some events be associated with only a few ID's? In terms of how to solve your problem, it's not a major issue, but the answer to that question will help clarify your criteria.

    Above, in post #19, you stated that you can't have duplicates in the sample, i.e. no duplicate ID - event combinations. Did I understand that correctly? If that is correct, it will make a difference in how to proceed.

    John

  13. #28
    johnseito is offline Competent Performer
    Windows 7 64bit Access 2010 64bit
    Join Date
    Aug 2013
    Posts
    419
    is every ID associated with every possible event at least once, or will some events be associated with only a few ID's?
    No, not every ID will have every possible event (referring to all unique IDs). For example, if there are 5 unique events, some IDs will have only 3 of those events,
    other IDs might have all the events, and some might have just 2.

    Above, in post #19, you stated that you can't have duplicates


    It looks like I said you could have duplicate events per ID, but in different rows of the table.
    You can also have the same event for one ID that other IDs have too.

    i.e. no duplicate ID - event combinations.


    It can have duplicate ID - event combinations. An ID can appear many times in the table in different rows,
    and the events for that ID can also appear many times (same or different events) in different rows.

  14. #29
    John_G is offline VIP
    Windows 7 32bit Access 2010 32bit
    Join Date
    Oct 2011
    Location
    Ottawa, ON (area)
    Posts
    2,615
    Hi -

    It's a quiet Sunday afternoon, so I did some experimenting with this.

    I made a simple table, with three fields:

    ID - Text Type
    Issue - Numeric Integer
    Sample_Data - Numeric Single
    Created Indexes on ID and Issue.

    For the test, I assumed 26 different IDs (A-Z) and 50 different events (1-50).
    Using Visual Basic (VBA) I populated this table with 1,000,000 rows, using the random number function to generate the ID, Issue and Sample_Data values.

    Then using queries and VBA, I created a sample data set in another table, using 32.257% as the selection percentage, and adhering to the requirements of all ID's, and all events as you outlined in post #15. But I added to that by taking 32.257% of all the records in each combination of ID + Issue. That automatically looks after the weighting requirement.

    The time required to create the 32.257% sample data set? About 1 minute, 15 seconds!

    My conclusion is that Access will meet your requirements, better than I expected it might.
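    The per-group approach described above, taking the sample percentage within each ID + Issue combination, can be sketched roughly as follows (a Python stand-in for the queries-plus-VBA version; the data is generated, not the actual test table):

```python
import math
import random
from collections import defaultdict

def stratified_sample(rows, pct, seed=0):
    """Take ceil(pct%) of the rows within every (ID, Issue) group.

    Sampling per group means every ID-Issue combination present in
    the source appears in the sample, and larger groups contribute
    proportionally more rows, so the weighting requirement falls out
    automatically.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[(row[0], row[1])].append(row)  # key on (ID, Issue)

    sample = []
    for members in groups.values():
        k = math.ceil(len(members) * pct / 100)
        sample.extend(rng.sample(members, k))
    return sample

# Generated miniature of the test: 26 IDs (A-Z), issues 1-50, plus a
# third data field, over 10,000 rows.
rows = [(chr(65 + i % 26), 1 + i % 50, i * 0.5) for i in range(10_000)]
sample = stratified_sample(rows, 32.257)

source_groups = {(r[0], r[1]) for r in rows}
sample_groups = {(r[0], r[1]) for r in sample}
print(sample_groups == source_groups)  # True: every combination survives
```

    Because every group rounds up, the overall sample comes out slightly above the requested percentage; that bias shrinks as the groups get larger.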

    HTH

    John

  15. #30
    johnseito is offline Competent Performer
    Windows 7 64bit Access 2010 64bit
    Join Date
    Aug 2013
    Posts
    419
    Cool, that is pretty good.

    However, could I see the program and test other percentages too? Thanks!!
