This is a technique to select a random percentage or random TOP N from a Group.
The post that prompted this entry was
How do I select random rows by a specific column using Rnd() function?
I've been asked to create a sample of 50% from a dataset weighted by a category. For example, in the following dataset, I would need to randomly pick 3 EE IDs in Payband A, 1 in Payband B, and 5 in Payband C. I know how to randomly pick 50% of the entire dataset using this SQL:
SELECT TOP 50 PERCENT * from myTable ORDER BY rnd(EE ID), but I don't know how to make the query run the 50% sample by the field Payband. I could run the above SQL for each payband, but I have 62 categories, and I really don't want to do that. I'm an HR person and not a developer/programmer, so any help would be appreciated. Thanks!
Payband |
EE ID |
A |
12345 |
A |
12346 |
A |
12347 |
A |
12348 |
A |
12349 |
A |
12350 |
A |
12351 |
B |
12352 |
B |
12353 |
C |
12354 |
C |
12355 |
C |
12356 |
C |
12357 |
C |
12358 |
C |
12359 |
C |
12360 |
C |
12361 |
C |
12362 |
C |
12363 |
|
|
The response showing Select Random 50% of records grouped by Payband follows:
I mocked up your data and used a technique that was discussed in a different forum for Selecting Top N from a Group.
Take you original table and add a field with Number datatype to hold a Random number
I added a field xtrafield to the Table Payband.
Then, Populate the xtrafield with a Random number
Code:
'---------------------------------------------------------------------------------------
' Procedure : PayBandStuff
' Author : Jack
' Date : 20/02/2014
' Purpose : To put a random number in a field (xtrafield) in a table
'
'This is a technique when dealing with random TOP N records per Group
'
'In the sql you Select the Top N or Percent
' Group By the grouping field and
' Order By The Random field (xtrafield in this example)
'---------------------------------------------------------------------------------------
'
Sub PayBandStuff()
'populate xtrafield with random number
Dim sql As String
Dim rs As DAO.Recordset
Dim i As Integer
10 On Error GoTo PayBandStuff_Error
20 Randomize 'must use this to getnew random numbers
30 Set rs = CurrentDb.OpenRecordset("payBand")
40 Do While Not rs.EOF
50 rs.Edit
60 rs!xtrafield = Rnd() * 300 'just a number
70 rs.Update
80 rs.MoveNext
90 Loop
100 On Error GoTo 0
110 Exit Sub
PayBandStuff_Error:
120 MsgBox "Error " & Err.number & " (" & Err.Description & ") in procedure PayBandStuff of Module AWF_Related"
End Sub
So now my table values are:
Code:
Payband |
EEID |
XtraField |
A |
12345 |
267.7998046875 |
A |
12346 |
99.4330291748047 |
A |
12347 |
32.2006149291992 |
A |
12348 |
282.146148681641 |
A |
12349 |
193.814682006836 |
A |
12350 |
255.543670654297 |
A |
12351 |
136.402862548828 |
B |
12352 |
126.936866760254 |
B |
12353 |
262.250152587891 |
C |
12354 |
116.814460754395 |
C |
12355 |
127.859062194824 |
C |
12356 |
1.76260471343994 |
C |
12357 |
265.189910888672 |
C |
12358 |
100.744163513184 |
C |
12359 |
272.822875976563 |
C |
12360 |
251.612380981445 |
C |
12361 |
236.416961669922 |
C |
12362 |
199.733917236328 |
C |
12363 |
12.8409328460693 |
Then use a select query to get the TOP 50 Percent of each Payband Grouped By Payband and Ordered By the RandomField
Code:
SELECT TOP 50 PERCENT Payband.payband, Payband.EEID, Payband.XtraField
FROM Payband
GROUP BY Payband.payband, Payband.EEID, Payband.XtraField
ORDER BY Payband.XtraField
Then select only the fields you want in the order you want them.
Code:
SELECT payband, EEID
FROM (SELECT TOP 50 PERCENT Payband.payband, Payband.EEID, Payband.XtraField
FROM Payband
GROUP BY Payband.payband, Payband.EEID, Payband.XtraField
ORDER BY Payband.XtraField) AS [%$##@_Alias]
ORDER BY Payband;
Giving this result:
Code:
payband |
EEID |
A |
12349 |
A |
12351 |
A |
12346 |
A |
12347 |
B |
12352 |
C |
12355 |
C |
12354 |
C |
12358 |
C |
12363 |
C |
12356 |
Nice technique, right.