Hi again everyone,
This is the big one. The most complex part of the project I was driving towards.
I don't even know if this is possible - but maybe one of you out there can tell me (and hopefully show me how!).
I have uploaded the dataset, again for simplicity and ease of explanation.
The dbase has a table called Attribute, which contains records where there is more than one identical Constituent ID (i.e. there are multiple records) and one or more missing dates (the 'Constituent Specific Attributes Contact Preferences Date' field).
The point of this exercise is to populate the missing dates with correct information.
I need to write a query that:
- sorts the file by 'Constituent ID' and 'Constituent Specific Attributes Contact Preferences Import ID' order
- using this sort order, populate the 'Constituent Specific Attributes Contact Preferences Date' field where it is Null with the information following this logic:
- (a) where there is a date(s) populated and the 'Constituent Specific Attributes Contact Preferences Import ID' is a lower number (i.e. appears earlier in the sequence) compared to the 'Constituent Specific Attributes Contact Preferences Import ID' of the record with the Null date, find the most recent date within that sequence and populate a date a month after that date, and carry on populating the records with Null dates, continuing with a monthly sequence, until all the records for that constituent have been populated
- (b) I would need it to also do the inverse, i.e. populate dates where the Import ID of the dated record is higher than the Import ID of the Null record with sequential monthly dates, basing the first date upon the oldest date (but this would be impossible, because if there are 5 missing dates, it would need to start applying a monthly date in the 1st missing date field which is 5 months previous to the currently closest existing date, but if there are 3 missing dates, it would need to start applying a monthly date in the 1st missing date field which is 3 months previous to the currently closest existing date, to make sure the dates for the entire constituent run sequentially and there are no duplicates)
Of course, if there are existing dates before AND after the missing dates, these will have to be ignored because I don;t know how we could create a rule for these.
I don;t know if I have explained myself very well, because what I need to do is quite complex. I hope you can understand the basis of what I am trying to achieve.
A example of how the results would be for Constituent ID 6034 (where existing dates are populated before the missing dates) and 6211 (where exisiting dates are populated after the missing dates):
Constituent ID Constituent Specific Attributes Contact Preferences Import ID Constituent Specific Attributes Contact Preferences Description Constituent Specific Attributes Contact Preferences Date
6034
00001-528-0000029121
Email opt out
19/02/2015
6034 00001-528-0000029122 Telephone opt out
19/02/2015
6034 00001-528-0000029123 Postal opt out
19/02/2015
6034 00001-528-0000029583 Email opt out 19/03/2015
6034 00001-528-0000029583 Postal opt out 19/04/2015 6034 00001-528-0000029583 Telephone opt out 19/05/2015 6034 00001-528-0000029583 SMS opt out 19/06/2015 6211 00001-528-0000028921 Email opt in 28/12/2016 6211 00001-528-0000028922 Telephone opt in 28/01/2017 6211 00001-528-0000537728 Email opt out 28/02/2017 6211 00001-528-0000677398 Email Research Round-Up Newsletter Opt In 20/04/2018
Please Note: the fields populated in red are the data I need to have populated.
As always, any help would be really appreciated. This should be the last question for this mini project.
Thanks,
Antonio