I am doing a data cleansing project to eliminate duplicate serial numbers for an item of materiel in a registry. For one item there are over 600,000 records and 23,000 duplicates. For this reason I am making my first foray into Access.<br>
Many fields will be in the report, but the key fields are serial number (SN), unique record identifier (UI), and registration date (REG-DT). For this request for help I will call the table TEST.<br>
I used the simple Find Duplicates Query wizard to identify the dupes, however, for audit reasons I cannot just delete them. I need to produce a report (or new table) that indicates for each pair, which record to keep and which to delete (a new field in the table called ACTION). Usually the record to keep is the oldest of the pair.<br>
Sample report:<br>
SN UI REG-DT ACTION<table style="width: 494px" class="wysiwyg_dashes" width="494"><tbody><tr valign="top" class="wysiwyg_dashes_tr"><td class="wysiwyg_dashes_td">54613648</td><td class="wysiwyg_dashes_td">D27777245500-90154613648</td><td class="wysiwyg_dashes_td">10/19/2007</td><td class="wysiwyg_dashes_td">KEEP</td></tr><tr valign="top" class="wysiwyg_dashes_tr"><td class="wysiwyg_dashes_td">54613648</td><td class="wysiwyg_dashes_td">D14850A325633754613648</td><td class="wysiwyg_dashes_td">8/30/2012</td><td class="wysiwyg_dashes_td">DELETE</td></tr><tr valign="top" class="wysiwyg_dashes_tr"><td class="wysiwyg_dashes_td">54613652</td><td class="wysiwyg_dashes_td">D27777245500-90154613652</td><td class="wysiwyg_dashes_td">10/19/2007</td><td class="wysiwyg_dashes_td">KEEP</td></tr><tr valign="top" class="wysiwyg_dashes_tr"><td class="wysiwyg_dashes_td">54613652</td><td class="wysiwyg_dashes_td">D14850A325633754613652</td><td class="wysiwyg_dashes_td">8/30/2012</td><td class="wysiwyg_dashes_td">DELETE</td></tr></tbody></table><br>
The terms I think I need to use are SELECT, HAVING Count(*)>1, CASE, INSERT, GROUP BY and ORDER BY.<br>
I have no idea about syntax. Need help!<br>
<br>
SELECT TEST.SN, TEST.UI, TEST.REG-DT<br>
FROM TEST<br>
WHERE TEST.SN HAVING Count(*)>1<br>
CASE<br>
WHEN REG-DT = min_date THEN INSERT TEST.ACTION = KEEP<br>
WHEN REG-DT > min_date THEN INSERT TEST.ACTION = DELETE<br>
GROUP BY TEST.SN<br>
ORDER BY TEST.REG-DT.