Experts:
I need some basic recommendations (**without going into the weeds**) on a very general/fundamental WRT "data transformation".
Allow me to provide some background first (please note that the info below is generic):
- I am working on a project that requires to migrate legacy data (various organizations with various sources) into a new (target) system.
- The various organizations (I should say "subsidiaries") have similar data.
- Organization_A (Org_A) has a field "Category Code". Org_A stores [Category Code] as *text* such as "Transportation Services" or "Utility Services".
- Organization_B (Org_B) also has a field "Category Code". However, right, wrong, or indifferent, they store their values also as "text" but they use coded values such as "001" or "002" instead.
- Organization_C (Org_B) also has a field "Category Code". And in this hypothetical example, they store their values as "numbers" such as "1" or 2".
- ... where "Transportation Services" AND "001" AND "1" (Org_A, Org_B, Org_C, respectively) will be translated into "SomethingElse_ABC" in the target system.
So, ultimately, I have the following in the 3 sources tables (Org_A | Org_B | Org_C):
Code:- Transportation Services | 001 | 1 - Utility Services | 002 | 2
Next, lookup tables:
- For the data value transformation from "legacy system" to "target system", we are going to use **lookup tables**.
- So, for Org_A and Org B, I could create a field [SourceValue] with data type = "Text" and include 4 records:
1. Transportation Services
2. Utility Services
3. 001
4. 002
Then, as part of the data transformation, we could have a 2nd column with
However, given that Org_C stores their values as a "number", I cannot enter "1" and "2" (as 5th or 6th record) in the lookup table. So, at this moment, I am wondering if we end up with two (2) lookup tables where legacy data from Org_A and Org_B are added to "LookupTable_CategoryCode_Text" while Org_C's data should be added to "LookupTable_CategoryCode_Number".Code:[SourceValue] [TargetValue] 1. Transportation Services SomethingElse_ABC 2. Utility Services SomethingElse_XYZ 3. 001 SomethingElse_ABC 4. 002 SomethingElse_XYZ
So, again, I am curious about some *general recommendations* how to best handle *alike* fields but w/ different data types. Please do keep in mind the above is a dummy representation of the scenario. Also, keep in mind that we cannot change the legacy systems... ultimately, they are what they are! What I do have some input over is the process for transforming the legacy data into the new system.
So, here my question: which of these three solutions appears to be best?
1. Either use multiple lookup tables (for alike fields) but suffix them with "_text" or "_number"?
2. Use a Function (module) where I could use CASE statements and convert "Transportation Services" OR "001" OR "1" to be converted into "SomethingElse_ABC"? If so, can I mix and match text & numbers in the same function?
3. Or, upon receipt of the legacy data, should I convert Org_C's '1' (numeric) into "1" (text) and then expand the lookup table (text) to the following:
Thank you in advance,Code:[SourceValue] [TargetValue] 1. Transportation Services SomethingElse_ABC 2. Utility Services SomethingElse_XYZ 3. 001 SomethingElse_ABC 4. 002 SomethingElse_XYZ 5. 1 SomethingElse_ABC 6. 2 SomethingElse_XYZ
EEH