Results 1 to 13 of 13
  1. #1
    EvanRosenlieb is offline Advanced Beginner
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jun 2011
    Location
    Fort Collins, CO
    Posts
    37

    How to bin a continuous data set in SQL

    Okay, so I have a large dataset of almost 700,000 records. The field of interest is a continuous variable that ranges form about 3.3 to 5.6, a very high resolution (i.e. high decimal precision). What I would like to do is write a query that will create a column that will take the value in this column, determine which one of forty-six bins (with a bin width of .05) that it belongs in, and simply displays a bin number in the column. So, for each of the 70,000 records, in addition to the original continuous value, it will also have a value in the new field somewhere between one and forty-six describing what bin it falls in.



    Due to the high number of records and bins I cannot thing of any manual way of doing this that would not take a very long time. Does anybody have suggestions? Thank you very much!

    --Evan

  2. #2
    pbaldy's Avatar
    pbaldy is offline Who is John Galt?
    Windows XP Access 2007
    Join Date
    Feb 2010
    Location
    Nevada, USA
    Posts
    22,641
    One way would be to create a table with 3 fields: your numeric value (1-46) and the upper and lower ranges for that each value. Then you can use any number of methods to get the numeric value for any given number. The most efficient for the query you're describing would likely be a non-equi join.
    Paul (wino moderator)
    MS Access MVP 2007-2019
    www.BaldyWeb.com

  3. #3
    EvanRosenlieb is offline Advanced Beginner
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jun 2011
    Location
    Fort Collins, CO
    Posts
    37
    Thanks very much for the response! I have never heard of a non-equi join before, and a quick google search did not yield any links that seem especially instructive. Do you perhaps have a link that has a more detailed step by step way of using a non-equi join?
    Thanks,

    --Evan

  4. #4
    pbaldy's Avatar
    pbaldy is offline Who is John Galt?
    Windows XP Access 2007
    Join Date
    Feb 2010
    Location
    Nevada, USA
    Posts
    22,641
    Can't think of one offhand. It needs to be done in SQL view because the designer can't represent it. Basically where a regular join might look like

    SELECT...
    FROM TableA INNER JOIN TableB ON TableA.FieldName = TableB.FieldName

    note the "=", which is a normal join. The non-equi join for your situation would look like:

    FROM TableA INNER JOIN TableB ON TableA.FieldName >= TableB.LowerValue AND TableA.FieldName <= TableB.UpperValue

    TableB being the new table I described. As you might infer from the join, it is going to find the record the TableA value falls between, enabling you to return the 1-46 field in the SELECT clause.
    Paul (wino moderator)
    MS Access MVP 2007-2019
    www.BaldyWeb.com

  5. #5
    EvanRosenlieb is offline Advanced Beginner
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jun 2011
    Location
    Fort Collins, CO
    Posts
    37
    Thanks a lot for your reply!

    I did this and it worked pretty well, except that it only worked for the top bin and the bottom bin. The query returned all the records that belonged in the highest bin with the correct bin ID, and all the records that belonged in the lowest bin with the correct bin ID, but no records at all from any of the 44 bins in between. This is very perplexing to me, do you have any idea what might cause this?

    --Evan

    --Evan

  6. #6
    pbaldy's Avatar
    pbaldy is offline Who is John Galt?
    Windows XP Access 2007
    Join Date
    Feb 2010
    Location
    Nevada, USA
    Posts
    22,641
    No, as long as the values are appropriate it should do them all. Can you post the db here, or a representative sample?
    Paul (wino moderator)
    MS Access MVP 2007-2019
    www.BaldyWeb.com

  7. #7
    EvanRosenlieb is offline Advanced Beginner
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jun 2011
    Location
    Fort Collins, CO
    Posts
    37
    Certainly,

    However, for whatever reason, the attachments utility in the forum is not working for me, so I had to upload it to an outside site, and for that I apologize. I have used it before and I haven't had any problems with it.

    I pared down the DB to just the relevant parts: there is a table with the Continuous variable, a table with the bins, and the attempted query. Hopefully this is self explanatory.

    Thanks!

    --Evan

  8. #8
    pbaldy's Avatar
    pbaldy is offline Who is John Galt?
    Windows XP Access 2007
    Join Date
    Feb 2010
    Location
    Nevada, USA
    Posts
    22,641
    I don't like going to download sites. Can you email it to

    pbaldy
    gmail
    com
    Paul (wino moderator)
    MS Access MVP 2007-2019
    www.BaldyWeb.com

  9. #9
    EvanRosenlieb is offline Advanced Beginner
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jun 2011
    Location
    Fort Collins, CO
    Posts
    37
    That's quite understandable. In order to make it e-mail attachment friendly, I deleted most of the records of the continuous variable, whereas it used to have almost 700,000, I just used the first 50,000. In doing so, I got rid of all the records that fell in the lowest bin, so now my query attempt only turns up the records that belong in the highest bin and nothing else.

    --Evan

  10. #10
    pbaldy's Avatar
    pbaldy is offline Who is John Galt?
    Windows XP Access 2007
    Join Date
    Feb 2010
    Location
    Nevada, USA
    Posts
    22,641
    All of your bins except the first and last have the same value in the lower and upper fields. Only an exact match would fall in these bins. For example the first bin is 3.3 to 3.35. The next bin is 3.35 to 3.35, the third is 3.4 to 3.4. Presumably the second should be 3.35 to 3.4.

    Further, there should be no exact matches between fields. If the first bin upper is 3.35, the second bin lower should be 3.36, or 3.3500000001, or whatever is appropriate to your data. The point is, if a value in the other table was exactly 3.35, it would match both bins and cause you a problem.
    Paul (wino moderator)
    MS Access MVP 2007-2019
    www.BaldyWeb.com

  11. #11
    EvanRosenlieb is offline Advanced Beginner
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jun 2011
    Location
    Fort Collins, CO
    Posts
    37
    Boy do I feel stupid! I can't believe that I somehow didn't catch that. Sometimes a fresh pair of eyes helps I guess.

    Anyway, thank you so much for your help. In regards to:

    Further, there should be no exact matches between fields. If the first bin upper is 3.35, the second bin lower should be 3.36, or 3.3500000001, or whatever is appropriate to your data. The point is, if a value in the other table was exactly 3.35, it would match both bins and cause you a problem.
    I was planning on having the SQL code say
    Code:
    FROM LogFluxMu INNER JOIN Fluxes ON (LogFluxMu.[Log(Flux)] <= Fluxes.[Upper]) AND (LogFluxMu.[Log(Flux)] > Fluxes.[Lower]);
    So that the lower limit is closed and the upper limit is open. This should work, yes?

    Thank you so much!

    --Evan

  12. #12
    pbaldy's Avatar
    pbaldy is offline Who is John Galt?
    Windows XP Access 2007
    Join Date
    Feb 2010
    Location
    Nevada, USA
    Posts
    22,641
    Now it's my turn to say Duh! Yes, that change should work for you.
    Paul (wino moderator)
    MS Access MVP 2007-2019
    www.BaldyWeb.com

  13. #13
    EvanRosenlieb is offline Advanced Beginner
    Windows 7 64bit Access 2010 64bit
    Join Date
    Jun 2011
    Location
    Fort Collins, CO
    Posts
    37
    Great! Thread Solved!

Please reply to this thread with any new information or opinions.

Similar Threads

  1. Replies: 3
    Last Post: 03-09-2012, 09:00 AM
  2. Non-continuous form
    By alyon in forum Access
    Replies: 6
    Last Post: 07-19-2011, 12:25 PM
  3. Replies: 6
    Last Post: 07-21-2010, 05:25 PM
  4. Replies: 2
    Last Post: 10-16-2009, 02:47 PM
  5. Continuous Form
    By duckie10 in forum Access
    Replies: 13
    Last Post: 06-09-2009, 11:15 AM

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums