Results 1 to 4 of 4
  1. #1
    GAtkins is offline Novice
    Windows 7 64bit Access 2010 32bit
    Join Date
    Nov 2013
    Posts
    11

    Question On Normalization Benefit Versus Effort Given A Possibly Unique Situation

    Forum members,



    Suppose I want to design a database in which the data will be imported from Excel every day or two. The data will come from a portfolio (investment) accounting/management system that contains multiple clients and each portfolio will contain multiple securities, each of various asset classes (cash, fixed income, equity), security types (agency bonds, corporate bonds, common stock), industries (materials, capital goods, consumer non-durable), and sector types (health care, industrials, information technology).

    Multiple transactions will occur in multiple portfolios every day among multiple securities. Consider that a draft ERD for this database might look like this.

    Click image for larger version. 

Name:	ERD Diagram.jpg 
Views:	19 
Size:	92.6 KB 
ID:	14749


    My question is two-fold:

    1. With the exception of “ClientRep” in the Clients table (because there can be a 1:M relationship between ClientRep and Clients, does this look like a normalized database? I also recognize and concede that the "TotalCost" attribute in the "PositionsOwned" entity is a calculated value (UnitCost x SecQuantity) and should (could) be removed and calculated dynamically. If this is not a good ERD, how might the normalization be improved?

    2. Given that this data is maintained in a separate portfolio accounting system with many transactions among many clients per day, and that a table or multiple tables will be exported from this accounting system into Excel and that these Excel tables will then be imported into the Access database every day or two (not appended, but entirely replaced), at what point does the normalization process result in diminishing returns?

    The way I have it structured now is with two entities, Clients with its respective attributes, and another entity called Master that contains every attribute not already in Clients and some that are (there is obviously much data duplication in this structure). I have this set-up working on several queries now that provide much needed information.

    I want an Access database with this information because the portfolio accounting system will not do some of the calculations and data display that I would like to see as a portfolio manager. I do not need to see from this database transactional information over time, but rather point in time data that is current, but updated frequently.

    So, is the ERD sound and stable and would I gain more from this normalization than the effort would require given the dynamic nature of the underlying data?

    Thanks in advance for any ideas/comments/suggestions.

    Glenn

  2. #2
    Dal Jeanis is offline VIP
    Windows 7 32bit Access 2010 32bit
    Join Date
    May 2013
    Location
    Dallas TX
    Posts
    1,742
    1) There isn't really a link between the client and the security. The positionsOwned is the junction table for that.

    2) I'm not worried about the TotalCost field, I'm more worried about YieldOnCost. Isn't that something that you'll be repeatedly calculating based on today's prices? If not, what does it mean?

    3) I wonder if Industry Type and Sector Type are truly independent properties. If you say they are, then I'll believe it. If not, then maybe there's some normalization to be done there.

    4) I'm assuming that the fields on the Client table are targets rather than fields calculated from the sum of securities owned. Is cash held in the account treated as a type of security, or does it show up on the client record instead?

  3. #3
    GAtkins is offline Novice
    Windows 7 64bit Access 2010 32bit
    Join Date
    Nov 2013
    Posts
    11
    1) Yes, correct, I was trying to design and use the positionsOwned table as a junction table.

    2) YieldOnCost is a static value based on a yield to maturity calculation at the time the bond is purchased and does not change over time, so it is unique to the position at the time it is acquired. Even if it were not, YieldOnCost and YldToMat (which does change over time and with the market value of the security) are beyond the ability of Access to calculate directly. It might be possible with a function call to Excel from VBA but I am not sure.

    3) Sector is more broadly defined and drills down to industry, so in this example they are independent.

    4) CashPct, BondPct, and StockPct are indeed asset allocation targets that are unique to the client, in this example, PortCode. MinCash is a fixed dollar amount that each client has requested remain in cash or money market assets, and thus not invested into stocks or bonds. It can change for each client over time and can be zero. Cash is also treated as a type of security, but it is different than MinCash.

    I'm still wondering if the gains from a high and correct degree of normalization will outweigh the effort to do it, especially since the data changes daily and this particular database is not for data entry, but rather data analysis, aggregation, summary, and crosstab type analysis.

    In short, I'm trying to figure out if the effort of normalization is worth it in this case and if it is something that can be recreated easily and often, and automated to the degree possible. If it is, then I will certainly do it. As it is now, I download three spreadsheets from the portfolio accounting software to Excel, manually combine two of these into one table, and then import both into Access, delete the existing tables, rename the just uploaded new tables, recreate on relationship and all the queries still function. If there are huge gains to be had from the perfect database and the process of reuploading tables from Excel to Access remains easy and quick once designed, then I am all for it. Just trying to get a sense of the cost/benefit/effort tradeoff in this specific situation and also the proper ERD to make it happen. Again, not arguing at all with a properly designed database, just trying to determine the best course in this example.

    Thanks for your comments. I appreciate them and any others you may have to offer.

    Edited to add that I have about two weeks experience with Access and databases, but extensive experience with Excel proper and Excel VBA.

    Glenn



    Quote Originally Posted by Dal Jeanis View Post
    1) There isn't really a link between the client and the security. The positionsOwned is the junction table for that.

    2) I'm not worried about the TotalCost field, I'm more worried about YieldOnCost. Isn't that something that you'll be repeatedly calculating based on today's prices? If not, what does it mean?


    3) I wonder if Industry Type and Sector Type are truly independent properties. If you say they are, then I'll believe it. If not, then maybe there's some normalization to be done there.


    4) I'm assuming that the fields on the Client table are targets rather than fields calculated from the sum of securities owned. Is cash held in the account treated as a type of security, or does it show up on the client record instead?

  4. #4
    Dal Jeanis is offline VIP
    Windows 7 32bit Access 2010 32bit
    Join Date
    May 2013
    Location
    Dallas TX
    Posts
    1,742
    In short, I'm trying to figure out if the effort of normalization is worth it in this case and if it is something that can be recreated easily and often, and automated to the degree possible. If it is, then I will certainly do it.
    Normalization is about value and ease of use. A well-designed database in the real world will often have some denormalized features intended to improve performance as the database is intended to be used. Data may be duplicated alongside other data that it is commonly used with, or dependent data may be left in place due to the fact that it arrives that way.

    Industry/Sector is a great example. If you knew the industry, could you uniquely determine the sector? Or are there particular industries that span sectors? Assuming you COULD uniquely determine sector from industry, then sector would redundant in that layout, and a fully normalized database would have industry on the security record, and a separate table to find sector from industry.

    But, who cares?

    Even if Sector is fully dependent on Industry, and therefore redundant, you should probably leave it as is if this is the form the data comes to you. Writing a routine to eliminate the redundant data just doesn't have any benefit for your application, and you don't need the disk space, so what's the point?

    Don't reorganize the data (beyond the format you've shown) unless you have a technical reason.

    manually combine two of these into one table, and then import both into Access,
    Okay, stop. It's a tossup whether it's easier, as a process, to (A) manually merge two spreadsheets and upload only two to Access, or instead to (B) upload three spreadsheets to Access and programmatically merge two of them to one table.

    Personally, I'd tend to program the latter, since once programmed it's a single click, as long as you always put the three downloaded spreadsheets in the same place.
    delete the existing tables, rename the just uploaded new tables, recreate on relationship
    This should be two near-trivial chunks of VBA, one which deletes the old data from the real tables and the second of which inserts the worktable data into the real tables. The relationship is either unaffected, or can be recreated programmatically.

    two weeks experience
    If you've used Excel VBA, then the biggest learning curve for Access is going to be learning what Access will do for you automatically, so you don't end up trying to push the river. My architecture advice is to avoid macros, and program directly in VBA. This is not because macros are intrinsically bad, it's because I know how to read VBA code and how to find what VBA is doing, and macros are not nearly so transparent to me.

Please reply to this thread with any new information or opinions.

Similar Threads

  1. Query Help, Possibly DLookup?
    By UTLee in forum Access
    Replies: 36
    Last Post: 12-10-2013, 05:17 PM
  2. Replies: 11
    Last Post: 10-24-2012, 03:17 PM
  3. Another unique count question
    By atran in forum Reports
    Replies: 2
    Last Post: 05-24-2012, 10:18 AM
  4. Will MS Access benefit me??
    By brandon-castleton in forum Access
    Replies: 5
    Last Post: 01-18-2012, 12:09 PM
  5. Newb question (but possibly complex)
    By MavisCruet in forum Database Design
    Replies: 3
    Last Post: 11-29-2011, 07:16 AM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums