Results 1 to 8 of 8
  1. #1
    ironfelix717 is offline Competent Performer
    Windows 10 Access 2016
    Join Date
    Oct 2018
    Posts
    150

    Design advice for record versioning

    Hi,

    My current schema in my application uses record versioning. Every element of the system may have multiple versions associated with it's element ID. The reason for this schema is
    because the DB will be time conscious... If i re-open a quote from 5 months ago, I would like to see it exactly as it was 5 months ago...

    When an element is being created for the first time, its version is set to 1. When an element is being updated, a new record is inserted into the table with the max version plus 1.

    Problems with this Schema:

    *note: this is an entirely unbound application

    1.) The growth rate of the system just exponentially increased.

    2.) The development time has increased greatly (must always query for max version to get latest data)

    3.) The cost to query mass amount of records simply to retrieve the latest data can great.

    4.) Where one thing is versioned, everything must be versioned... A versioned schema only makes sense when almost every orphan is versioned. This goes back to exponential growth.


    The only two solutions I can think of are:


    1.) Manage redundant saves (do not write new version if no changes are present)



    2.) Create a History Table for each master table. When a new version is created, transfer previous to history table - this way when needing to query all the current versions, there's far less data.

    3.) Re-think about how and when versions are needed. For example:

    The current methodology is: *note: put in forum quotes only for formatting purpose

    Lets say I have a quote. I add a customer ID to the quote. This pulls all of the customer info into the quote.

    When the quote is saved (which is also versioned) it saves the customer ID and version of that customer ID.

    Reopening the quote 5 months from now will load the same customer data that existed 5 months ago even if the latest customer version changed. -- good (will warn if out of date)
    A possible alternative is:
    When a user is modifying/viewing a Customer ID in the Customers module and they click save, this does NOT create a new version. It modifies the current version.

    However.... when a users adds a Customer ID to the quote, that version is remember by the quote but also a NEW version is created in the customer table.

    So, the process repeats.

    The difference between these two methods are that versions are created on the fly only when needed versus creating a new version every single time a customerID is saved.
    I have not entirely thought this second approach through yet. It may be less efficient in the long run.



    Ultimately, I am inquiring on the best approach to design this schema. The scope is not massive for this project, but the data could be if I do not be careful.

    Thanks for your help.
    Regards

  2. #2
    CJ_London is online now VIP
    Windows 10 Access 2010 32bit
    Join Date
    Mar 2015
    Posts
    11,397
    note: this is an entirely unbound application
    if you mean unbound forms, then you are not taking advantage of the benefits of binding so you will will significantly increase your development time

    The development time has increased greatly (must always query for max version to get latest data)
    I believe this was the subject of a recent thread of yours. Using a separate query to get the max and linking back should be pretty much instantaneous. But will take time if you are getting every single 'max' record and then filtering is a form.

    Where one thing is versioned, everything must be versioned... A versioned schema only makes sense when almost every orphan is versioned. This goes back to exponential growth.
    use dates rather than version numbers. However not sure why it is leading to exponential growth - implies a non normalised database

    Manage redundant saves (do not write new version if no changes are present)
    not required if your forms are bound
    Create a History Table for each master table. When a new version is created, transfer previous to history table - this way when needing to query all the current versions, there's far less data.
    better on change or new to insert to history table and update 'current' table

    In the case of a invoice where you don't want anything to change once the invoice has been approved is to store that data with the invoice e.g. product code/description/price, tax rates, maybe other fields. In other cases you might reference a history table for say addresses.

    Don't think you can says there is one way that is 'best' to apply to all situations. In the case of the invoice in the previous paragraph, there is a legal requirement to be able to reproduce an invoice for a period of time - 6 years? but this could also be achieved by saving a pdf. However if you are required to regularly analysis data over a long period of time, repeat the relevant data in the final document tables. It goes against normalisation rules but has a significant improvement in performance. It is a matter of balance - performance v rigidly sticking to normalisation rules

  3. #3
    ironfelix717 is offline Competent Performer
    Windows 10 Access 2016
    Join Date
    Oct 2018
    Posts
    150
    Ajax:

    if you mean unbound forms, then you are not taking advantage of the benefits of binding so you will will significantly increase your development time
    This is obvious. My comment on development time was not in regards to bound or unbound design, it was in regards to a versioned schema.

    However not sure why it is leading to exponential growth - implies a non normalised database
    Exponential growth will be a phenomenon that occurs with a versioned schema... Inserting records every time a save happens, instead of updating a record.
    When you spread that schema over a complex application, things begin to grow. Its just a necessary phenomenon that i am trying to tame. If a company generates 1000 quotes per year,
    On average, there are 25 versions per quote. My data just increased by a factor of 25. That is exponential.

    but this could also be achieved by saving a pdf. However if you are required to regularly analysis data over a long period of time, repeat the relevant data in the final document tables
    The data within the quote may be critical and used elsewhere and a pdf was never produced.

  4. #4
    CJ_London is online now VIP
    Windows 10 Access 2010 32bit
    Join Date
    Mar 2015
    Posts
    11,397
    Inserting records every time a save happens, instead of updating a record.
    seems you are fixed in the way you want to do things - only change I can suggest is only insert a record if there is a change to the data rather than when the user clicks the save button.

    You might want to google 'audit schema' as an alternative way - basically you save the record as current but have a table of changes on a field by field basis

    PK...TableName...FieldName..NewValue...ChangeDate. ..

    so you can wind back if necessary

  5. #5
    ironfelix717 is offline Competent Performer
    Windows 10 Access 2016
    Join Date
    Oct 2018
    Posts
    150
    Ajax,

    I will consider the audit schema by field method.

    Thanks

  6. #6
    CJ_London is online now VIP
    Windows 10 Access 2010 32bit
    Join Date
    Mar 2015
    Posts
    11,397
    be aware the process to rebuild an historic view can be relatively slow if there are lots of changes and from a db size perspective don't forget all fields with the exception of newvalue will need to be indexed. data size becomes important with indexing so you may want a metadata table so you are indexing by a number rather than a table/field name.

    long numbers require 4 bytes

    text requires 2 bytes plus two for each character - so a 10 char table name will require 22 bytes - 5.5 times bigger.

    This affects performance because of the way the index algorithm works - very simplistically it reads 4096 bytes from the disk (the slowest part of the operation) so with an index length of 4 bytes that is 1024 index rows returned, for 22 bytes just 186 index rows. Someone once said that with modern indexing algorithms any index value can be found within 8 disk reads - clearly the longer the index value, the more records the more the performance will be towards the higher end of that range. So potentially up to 8 times slower when compared against relatively small amounts of test data.

  7. #7
    ironfelix717 is offline Competent Performer
    Windows 10 Access 2016
    Join Date
    Oct 2018
    Posts
    150
    Ajax,

    My need to do roll-backs is not within the scope of a full system restore. I don't wish to see the whole DB at previous points in time. If that were my intention, I think a snapshot restore or something would be possible in SQL server. But this schema is only required to maintain time accurate data... For example, a customer ID with information. If the customer ID is updated later and user re-opens a quote from 1 year ago, if the customer ID changed information (updated record version), this can leave out a lot of information that is otherwise valuable to the user ("This doesn't make sense. The customer is from a different country than this product was quoted to."). So you can see this small, but significant feature has some big merit in keeping child information time accurate.

    My current plan as of now, is for each element (ex: Customers) have a master table (latest version) and a history table (previous versions). I rarely will need to mass-query previous versions of an element because its now obsolete, but I will need to query the current version table quite often. When a query does need to be ran on the history table, it will be a very well defined select statement returning only 1 or a few records (mostly only ever 1). So, i think the performance will be OK. Downside to this design is that i now have double the tables to manage. If i need to add a field to one table, i must do to the history table.

    But over time, I suspect i can gain performance by developing a clean-up manager in the app, which deletes all versions of elements that are NOT members of related parent elements (wasted space). This will keep the data smaller... Yes, deleting data is never good, but this is a consideration in the future if somehow the DB grows to affect performance.

    Versions will be stored as a Number (integer) type. A DateModified field is present also. I prefer to have a version field, instead of solely using the DateModified field because I am not sure of the DB engine performance for querying numbers versus dates. I assume numbers would be faster (max version on any element is probably like 50 before its just no longer updated again). But also, numbers are really easy to use in code, rather than dates.


    Thanks

  8. #8
    CJ_London is online now VIP
    Windows 10 Access 2010 32bit
    Join Date
    Mar 2015
    Posts
    11,397
    I assume numbers would be faster
    dates are numbers as well - assuming you are using TSql (you mention sql server) if you use a field of type of smalldatetime, that is also only 4 bytes. You might find this link useful https://docs.microsoft.com/en-us/sql...ql-server-2017.

    good luck with your project


Please reply to this thread with any new information or opinions.

Similar Threads

  1. Seeking Design Advice
    By DSProductionz in forum Database Design
    Replies: 1
    Last Post: 09-15-2019, 02:35 PM
  2. Advice On Form Design
    By archie1_za in forum Forms
    Replies: 4
    Last Post: 12-28-2017, 07:01 PM
  3. Advice on Database Design
    By JoeM in forum Database Design
    Replies: 1
    Last Post: 08-15-2013, 08:50 AM
  4. Need some design advice
    By messingerjc in forum Database Design
    Replies: 3
    Last Post: 06-18-2012, 06:16 PM
  5. design advice
    By Sarge, USMC in forum Database Design
    Replies: 4
    Last Post: 10-06-2010, 07:53 AM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums