Page 1 of 2 12 LastLast
Results 1 to 15 of 23
  1. #1
    cazeek is offline Novice
    Windows 10 Access 2016
    Join Date
    Sep 2018
    Posts
    11

    General time series/date-based database design questions

    Hello,

    I am about to embark on a pretty large overhaul of the way I manage my data, and I'd like to get some help on the design. I generate a lot of financial and analytics data through various imports, Excel spreadsheets, etc. All of the data is dated, and I'd like to develop a system where everything is "linked" by date, and I can ultimately look up any given data point by referencing a date and a unique code (assigned to the relevant series). My general game plan would be the following.


    1. Create Access DBs for each broad category of data. Each DB uses the date as the Primary Key. Some examples..
      • Prices DB: Each day, I have 10 fields worth of price data for each asset. I created one Access DB w/ 62 tables (one for each asset), each with the same 10 fields/structure.
      • Models DB: Each day, I have about 50 fields of analytics data for each model. Each asset has several models. I'm thinking to create one Access DB w/ a table for all of the models (several hundred)- all with the same fields/structure.
      • Other DBs: Same idea.. dated, but different data



    2. Create some kind of connection b/w everything, where I can create a report, retrieve data for specified dates, or other wise interact b/w data items on any of the tables. They all share the "Date" field as a primary key, but everything would be dispersed across many tables and databases.


    The goal is to be able to retrieve something like Prices.EURUSD.Close for 10/10/18, which knows it will look in the "Prices" DB, the "EURUSD" table, and retrieve the "Close" value for 10/10/18. Or Models.EURUSD_M.Factor1 would know to look in the "Models" DB, "EURUSD_M" table, "Factor1" value for 10/10/18. Ideally, I would be able to perform further operations and create reports based on them.. like [Prices.EURUSD.Close] x [Models.EURUSD_M.Factor1] for 10/1/18 - 10/31/18.

    More advanced, I'd like to be able to "tag" each table with some sort of "parent" identifier. For example, the "EURUSD" would be a parent that has "links" to each table that falls under the "EURUSD" category. This would include the "EURUSD" table in the "Prices" DB, 5 tables in the "Models" DB, etc. I'm not sure if this type of "tagging" / hierarchical structure exists for databases at all, but I'm just throwing it all out there. Given that desire, I would also consider creating a separate DB for each asset that has all of the tables (prices, models, etc.) within it. That would give me 62 Access DBs (rather than the handful I described above).

    Is there a proper/efficient way to go about this through Access? My other thought is to just create all of these databases and then create some mapping/import system in Excel and run reports from there, but I feel like there should be a better way to manage.

    Thanks very much for the help!

  2. #2
    cazeek is offline Novice
    Windows 10 Access 2016
    Join Date
    Sep 2018
    Posts
    11
    Did a little more thinking- to better explain why I want to "tag", let me add the following. I will want to generate reports where I can enter the asset name, and it will automatically pull the relevant fields from the appropriate DBs/tables WITHOUT having to manually create a report for each asset. So if I want to pull the closing price, factor1, x, y, z for 10/10/18 for three different assets, I can just enter the assets codes, and the report will generate. I'm thinking, rather than "tagging" the databases on creation, I can have some sort of definitional mapping table with the asset code as the primary key. In that table, it will list the table names for each database.

    The idea would then be:
    1. I enter the asset name (e.g. EURUSD)
    2. I enter the data series I'm looking for (e.g. Close Price, Factor1)
    3. I enter the dates I want the data for
    4. The query looks to the Mapping table, determines the table names for the "FutPx" and "Models" DBs based on the EURUSD
    5. The query retrieves the Close Price and Factor1 values for the specified dates from the tables determined through (3) and generates the appropriate reports


    Hope my objectives are somewhat clear.. it's tough when I don't have the appropriate lingo down

  3. #3
    Minty is offline VIP
    Windows 10 Access 2016
    Join Date
    Sep 2017
    Location
    UK - Wiltshire
    Posts
    3,157
    Quote Originally Posted by cazeek View Post

      • Prices DB: Each day, I have 10 fields worth of price data for each asset. I created one Access DB w/ 62 tables (one for each asset), each with the same 10 fields/structure.


    This sounds completely un-normalised. You should have a single table with an additional field to store the asset type.
    That way when asset 63 arrives you simply add it as data, not a new table that would require a reworking of all your queries.

    Building queries to accommodate 63 tables with the same structure will be a nightmare.

    It might be an idea to post up a picture of your tables from the relationship window to give us a better "picture" of where you are at the moment.
    DLookup Syntax and others http://access.mvps.org/access/general/gen0018.htm
    Please use the star below the post to say thanks if we have helped !
    ↓↓ It's down here ↓↓

  4. #4
    cazeek is offline Novice
    Windows 10 Access 2016
    Join Date
    Sep 2018
    Posts
    11
    Thanks for the help Minty. So you are suggesting I just create one table per "type" of record I store? So if I have 62 assets that I need 10 fields of dated price data, I should throw that all in one table with additional fields for "Date" and "Asset" and a separate primary key? Would this significantly slowdown queries/retrievals of data? Currently, I have over 5,000 records/dates for each asset, so the table would be about 350k records if combined.

    Relationships pic attached- those first few links were just me playing around, trying to link the "TRADE_DATE" field.
    Click image for larger version. 

Name:	Screen Shot 01-14-19 at 03.45 PM.jpg 
Views:	26 
Size:	185.3 KB 
ID:	36932

  5. #5
    Micron is online now Very Inert Person
    Windows 10 Access 2016
    Join Date
    Jun 2014
    Location
    Ontario, Canada
    Posts
    13,421
    I see a nightmare evolving. "Yikes!" "Each DB uses the date as the Primary Key" NO and NO!

    I'm with Minty - you have gotten off on the wrong foot - totally. You need to study relational database concepts; especially normalization. To put it in simple terms, a table is an ENTITY. A table field is an ATTRIBUTE. The difficult part is deciding where to draw the line(s) between when an attribute might be better off being treated as an entity. The primary driver of those decisions is the nature of the business, which is where your expertise comes in. There is a saying for this - "normalize until it hurts; de-normalize until it works". An example of your deeply flawed approach is that each "asset" is not a table; it would be tblAssets. The only time you'd add a field is if you overlooked an attribute of the assets, or if a new attribute was created. This would mean adding one field to tblAssets, not one field to 63 tables, which would only be the beginning.

    When a novice says they're diving in to making a financial database I figure if they only knew what they don't know, they'd have a reason to be concerned. I cannot over stress the importance of doing your homework, because Access is easy to do poorly. You have to forget what you know about spreadsheets as that knowledge will only interfere in your success.

    Hopefully I'm not coming across too strong in trying to save you a lot of grief - at least more so than realizing you have done that work for naught. I suggest you bookmark these links so that you can review their content at the appropriate time. Better to understand normalization, apply it to your concept, then post back here for feedback. Believe it or not, pencil and paper are important design tools in this endeavor. If you can't make the relations on paper, don't bother designing tables.
    Normalization is paramount. Diagramming maybe not so much for some people.

    Normalization Parts I, II, III, IV, and V
    http://rogersaccessblog.blogspot.ca/...on-part-i.html
    and/or
    http://holowczak.com/database-normalization/

    Entity-Relationship Diagramming: Part I, II, III and IV
    http://rogersaccessblog.blogspot.ca/...ng-part-i.html

    How do I Create an Application in Microsoft Access?
    http://rogersaccessblog.blogspot.ca/...cation-in.html

    Important for success:
    One source about how to name things - http://access.mvps.org/access/general/gen0012.htm
    What not to use in names - http://allenbrowne.com/AppIssueBadWord.html
    About Auto Numbers
    - http://www.utteraccess.com/wiki/Autonumbers
    - http://access.mvps.org/access/general/gen0025.htm

    The evils of lookup fields - http://access.mvps.org/access/lookupfields.htm
    Table and PK design tips - http://www.fmsinc.com/free/newtips/primarykey.asp
    About calculated table fields - http://allenbrowne.com/casu-14.html

    Aside from these links, do your own searching and find what speaks to you or augments your understanding of what you've already read on a subject. Also, go to the forums page of this site and see what you can find under tutorials. Here's only one good link https://www.accessforums.net/showthread.php?t=64285
    The more we hear silence, the more we begin to think about our value in this universe.
    Paraphrase of Professor Brian Cox.

  6. #6
    cazeek is offline Novice
    Windows 10 Access 2016
    Join Date
    Sep 2018
    Posts
    11
    Micron- thanks so much for all of the resources. I have no pride or ego about this haha- I've been running around blind trying to figure out what's what, so this will be very helpful.

    Before I dive in- I totally understand the benefit of consolidating to one entity (table) that you and Misty are suggesting. I have two initial concerns.

    1. First, the query/data access time- I will be probably using SQL commands to retrieve data back into excel from the DBs. Will having a 350k+ record table dramatically slow down these retrievals?
    2. I need to adjust the prior records for each "asset" independently. My current process is automated (through Excel VBA) to dump the whole single-asset table into excel, delete the records that are to be updated, and then add the new records. This works, b/c I'm using the fact that the records are "in order", and I'm deleting/re-adding the last X records. I'm assuming this is not the right way to go about it, and it will be more difficult in a consolidated table. Is there a proper way to approach this? My initial thought would be to basically go through a similar process where I use SQL to filter/sort the table and then proceed the same way (if that would work).


    Thank you again

  7. #7
    cazeek is offline Novice
    Windows 10 Access 2016
    Join Date
    Sep 2018
    Posts
    11
    Just went through the Roger's blog.. very helpful and easy to understand- thank you again. This is what I'm thinking based on that and your comments. There will be some more tables as well. The "Models" table would really have a lot of data- guessing around 1-1.5 million records and the "Prices" would be 350k as stated in the prior post.

    Click image for larger version. 

Name:	Screen Shot 01-14-19 at 07.15 PM.PNG 
Views:	25 
Size:	26.7 KB 
ID:	36942

  8. #8
    Micron is online now Very Inert Person
    Windows 10 Access 2016
    Join Date
    Jun 2014
    Location
    Ontario, Canada
    Posts
    13,421
    1) query speed depends on many factors: indexes and the use of or lack thereof; too many calculated fields, too many aggregate functions, sub queries, complicate expressions, too many IIF functions, basing forms on large tables (I almost always base a form on a query, and in some cases it is the only possible way), slow network, Windows swap files too small etc etc. See http://allenbrowne.com/QueryPerfIssue.html for starters. 350k records is not that much, really.

    2)
    and it will be more difficult in a consolidated table
    Updating (not adjusting) can be selective based on criteria. I don't see why it would be difficult unless you've structured your table design poorly. Normalization is to a db what the foundation is to a house. Do it poorly and you're on shaky ground so to speak. As for order of records, forget it. Records in a table are like a bucket of marbles with no guaranteed order. The only "proof" I've ever encountered about this analogy is when I open a recordset on a table and the first record isn't what the table shows, so I believe the statement. Employing a sort order on a table is OK for understanding data during development but is best not used to control presentation afterwards. Users should not be allowed to interact with tables, so organizing data in them is pointless.

    Regarding your present process, constantly deleting and appending (rather than updating) contributes to db bloat. Then there is the catastrophe you might have if a delete or append operation fails part way through (research Transactions down the road). Speaking of bloat, it's also advisable to NOT store images or files in a db even though you can. That's another topic.

  9. #9
    Micron is online now Very Inert Person
    Windows 10 Access 2016
    Join Date
    Jun 2014
    Location
    Ontario, Canada
    Posts
    13,421
    Look again at my coloured text in post 5, then go to the links and visit Autonumbers. I think you're getting ahead of yourself and us just a bit.

    To get useful feedback on proposed table structures would require us to know something about the business at hand. You have a couple of short paragraphs above that give clues, but what you wrote is mostly what you think you want out of your db. One has to know what needs to go into it first, and of that I have almost no clues. A concise but clear (no jargon) from the ground level up about the business or major processes the db needs to support would help. I'm not a financial guy, so I might not be able to contribute to design elements as much as others here and would defer to them rather than give you bad advice if I can help it.

  10. #10
    cazeek is offline Novice
    Windows 10 Access 2016
    Join Date
    Sep 2018
    Posts
    11
    Again, thank you so much for your help- I know I'm prob a pain in the ass w/ my ignorance here, but I feel like I'm getting on the right track now... I hope!

    Bloat/Updates
    In terms of updating- understood, and I may have noticed the "bloat" issue in my process thus far. I sometimes have to compact/repair to shrink down the DBs a bit, so I suspect that's from this delete/add instead of update nonsense I've been doing.

    Primary Keys
    Based on the colored text and one of your links, it seems I'm better off just leaving the PK as autonumber for efficiency later on.

    Better Explanation (hope this helps a little bit)
    I have data from several sources / of several structures (same sets of fields)
    - "Prices" have 12 fields for each date for each asset. There are over 5k dates (records) per asset and 62 assets. So the asset and date are not unique, but each combination of asset + date is unique.
    - "Models" have 50-100 fields for each date for each model. There are 1-8 models for each asset (about 200 total) and 5k dates (records) per model. Each model + date is unique.
    - "Model Definitions" will contain about 30-50 common fields for each model. There are about 200 models or so, so it will contain about 200 records. Each model is unique.
    - "Asset Definitions" is like my "Customer" table (per most of the tutorials). It contains basic info about the asset. It will probably have under 15 fields and 62 records. Each asset is unique.
    - "Analytics" is just a set of dated records with each field being a data series. The fields of each record are only related to each other by date. I will probably have several tables of kind. The amount of records will be equal to the amount of dates I have (so over 5,000). Each date is unique.

    I'm building, because I want to design a new system to integrate everything. Ultimately, I intend on building some sort of Excel plug-in that has a GUI to browse my data and import, but I want this to be designed well enough to make all interactions with the data easy enough going forward. There are consistent interactions across data sets that I may want to build reports out of / calculate manually. For example, I may want to multiply a price from the "Prices" table by a value in the "Models" table every day. The "link" b/w the two would be the date. There are more complex interactions I'd want to create as well, but that's the general idea.

    My Plan

    My new plan / diagram is below. I switched all to autonumber indexing, and I'm thinking to use multi-field indexing where required to facilitate faster queries. So for "Prices", I'll create an index that that sorts by Asset and Date. For "Models", I'll create an index by "Model" and "Date". "Analytics" by Date. To interact with the data, mastering the SQL queries seems to be the most efficient (per one of your posts). So to be most efficient: optimal normalization, integer primary key, smallest field sizes possible, indexing.

    Click image for larger version. 

Name:	Screen Shot 01-14-19 at 09.33 PM.PNG 
Views:	25 
Size:	25.8 KB 
ID:	36944

  11. #11
    Micron is online now Very Inert Person
    Windows 10 Access 2016
    Join Date
    Jun 2014
    Location
    Ontario, Canada
    Posts
    13,421
    Regret to say that I learned almost nothing of the nature of the business that the db is supposed to support from that post. That was all about what you have in the way of data, and "models" and such words are jargon to me. Models of appliances? Plastic car and airplane model kits? An abstract concept such as data models? Assets are stock portfolios? Equipment belonging to the company? NSA secret agents? The reveal needs to be more like

    We make 6 different models of 7 Widget types in 2 plants that we sell to customers. I need to keep track of production over 2 shifts in each plant and what machine made a batch of widgets. Also need to track stock counts and pricing along with discounts to customers. Imagine this story goes on a lot farther about invoicing, PO's - whatever. How many tables do you see for that scenario? Probably at least 12.

    It sounds like your tables are not correct yet (based on fields for dates and having 100 fields) and the like.

    You images indicate you might be grasping some aspects of relationships between tables but without understanding the business, it's impossible to comment on the entity/attribute relationships. One thing that I find unhelpful in these situations is when the OP makes this all about "shoes" for example, and when you make suggestions they don't work because surprise - it's not about shoes. Hopefully you don't mask the real thing if you're going to explain it further.

  12. #12
    cazeek is offline Novice
    Windows 10 Access 2016
    Join Date
    Sep 2018
    Posts
    11
    Hahahaha- very sorry, let me give this another try after I wipe the shame off of my face.

    I develop analytics of the financial markets (stocks, commodities, etc.) both for sale (as data/PDF reports to traders/investors) and to use in my own trading models (where I import the data and systematically trade using quantitative models). The analytics are derived from a vast amount of data from a wide variety of sources (economic data, oil prices, stock prices, USDA agricultural data, etc.). All of the data is dated (either daily or monthly). I create mathematical models based off the data and generate new analytics/data from those models (we'll call this "Product Data").

    The purpose of my current undertaking with Access is to organize all of this data (oil prices, Product Data, etc.) that I currently import/manipulate/export through Excel, VBA, and other programs. Once it is organized, I am going to develop some type of plug-in for Excel or a standalone application to retrieve/chart/display the data in a variety of ways. This plug-in/app would be for my use (not for employees/customers or anything like that). My Excel "vision" would be some sort of "Data Browser" type window where I have my data organized (the Product Data, the raw market data, etc.). Through the browser, I would select the series I'm interested in "importing" and the dates I want the data for, click "Import", and the data would be imported into Excel.

    Separately, it would be useful to generate reports directly through Access where possible. I may want to multiply fields from two tables together for each date available for example (e.g. oil price x dataproduct1 would be a field called "OilDP1" in some other report/table).

    In terms of the terminology:
    tblPrice ("Prices"): This includes the market prices for different assets (e.g. gold, oil, S&P 500 Index, Euro/US Dollar exchange rate). Each day I have a variety of prices that I will populate this table with for the new date.
    tblModels ("Models"): This includes analytics that I develop through quantitative models (the "Product Data"). For each asset, I have about 75 fields or so of data I generate each day that I will populate this table with for the new date.
    tblAnalytics ("Analytics"): This is just a broad set of dated data. Each series has a daily reading, and I need a way to store and refer to all of the data for each date. For example, the US 10-Year Treasury yield would be one field. Another would be the oil price. I have over 1,000 fields worth of this type of data that I need to populate, so I would probably have to split it up in some organized way. There are several other tables that I'd also need with the same concept as this.. just "columns" of dated data that I'll want to retrieve/interact with later on.
    tblAssetDef ("Asset Definitions"): This describes each asset. So "Gold" would be an asset, and the fields would be "ASSET_CODE: GC, FULLNAME: Gold, SECTOR: Precious Metals, ...". I would use this when I created a report for gold to display some of those properties.
    tblModelDef ("Model Definitions"): This describes each quantitative model from which the Product Data is derived. Each model has a code, so for a random gold model, the fields would be: "MODEL_CODE: GCE_R_M, ASSET_CODE: GC, SHORTNAME: Main, CALIBRATION_DATE: 10/10/2018,..."


    I hope this was clearer.. long story short, the business involves constant interaction with the data, both inputting and retrieving. Right now I live in Excel and VBA, and I use Access as just large data repositories, but I need to simplify how I interact with all of the data I develop and use to generate reports and so on. Thank you once again!

  13. #13
    Join Date
    Apr 2017
    Posts
    1,792
    Quote Originally Posted by cazeek View Post
    First, the query/data access time- I will be probably using SQL commands to retrieve data back into excel from the DBs. Will having a 350k+ record table dramatically slow down these retrievals?
    Not really - in case you index your data properly. But having your data cluttered between tens or hundreds tables will make more complex queries slower for sure. The effect from indexing will be reduced practically to nothing, you have to connect a lot of tables practically in every query, there may be issues with number of linked tables, or with query string length. And whenever you add a new asset, you have to redesign (or at least to check the design) all your saved queries, reports etc.

    When having doubt about database speed (because of size of your tables), you can consider using e.g. SQL Server database as back-end. SQL Server database can be bigger, and it is more efficient with big data. And you can use scheduled Job's there.

    When having database split to front- and back-end (either using Access or some other database back-end), you have to decide where to design saved queries/views - in front-end or in back-end. Usually saved query in front-end is faster (when all needed data is linked to front-end, then there will be no additional data transmission between front- and back-end). But it also means, that whenever you edit such queries, all users must update their front-ends! The choice is yours!
    Quote Originally Posted by cazeek View Post
    I need to adjust the prior records for each "asset" independently. My current process is automated (through Excel VBA) to dump the whole single-asset table into excel, delete the records that are to be updated, and then add the new records. This works, b/c I'm using the fact that the records are "in order", and I'm deleting/re-adding the last X records. I'm assuming this is not the right way to go about it, and it will be more difficult in a consolidated table. Is there a proper way to approach this?
    What happens when someone decides to sort the excel table e.g. by Sum's in descending order? You must have some better criteria in your Excel tables! Entry date, orderable entry identifier of some kind, whatever else!
    When you have to update table, then you have options:
    1. Delete all records meant to update (delete query with some WHERE clause), and then add new records which match this WHERE clause (NB! Don't use autonumeric Primary Key for your table when this option is chosen);
    2. Optionally delete all records in table not having match in records meant to update the table - or to set some record status field to "deleted" status (Delete or Update query). Update all records in your table which match with records meant to update the table with data from query (Update query). Add records in query meant to update the table which don't have match in your table to your table (Insert query);
    3. (Btw., the only option when you can't find any good way to identify the update scope!) Delete all records from your table, and insert new data. When this option is selected, then:
    a) Don't use autonumeric Primary Key for your table when this option is chosen;
    b) When possible, use SQL Server database as back-end, create a Job which will be run procedure(s) to update data, and schedule this Job to be run at non-worktime (e.g. at nighttime).

  14. #14
    Micron is online now Very Inert Person
    Windows 10 Access 2016
    Join Date
    Jun 2014
    Location
    Ontario, Canada
    Posts
    13,421
    Not being well versed in financials or the analytics thereof, I will have to read your treatise more than once to develop even a modicum of understanding. In the meantime, I still get the impression that you are still inclined to think in terms of fields (columns) rather than records.

    If for example, commodities are precious metals, lumber and the like, tblCommodities should list those by name. tblStocks (or Funds, or Bonds, whatever) likely would be on their own and might have partner tables for prices (tblComPrice) that I suppose are captured x number of times per day. Thus each new stock/bond/commodity/fund etc that comes in to play becomes a new row in the respective table, not a new field. Changing the count of price captures in a day means that for a given date, there would be x additional records than before, not another field for that additional capture. I think this point was raised at least back as far as post 3 but I'm not sure it has registered yet. Sorry if that is incorrect because as I've said for me the subject isn't one of my strong points.

    Speaking generally about the "high hanging fruit" (plug ins) I do envision a role for Excel here as IMHO, Access graphing sucks. In addition Excel is the tool for doing a lot of number crunching. At some point you'll have to decide if you're going to link to Access or import/export between Access and Excel in order to work with the data. However, you're not there yet, and getting this data quickly and accurately will greatly depend on how you construct the tables.

    EDIT - forgot to mention that using a back end data such as sql server might be a better option for you, otherwise you might find yourself having to split data into something like decades down the road. You could still use Access as the front end (fe) but might not be at that point yet.

  15. #15
    cazeek is offline Novice
    Windows 10 Access 2016
    Join Date
    Sep 2018
    Posts
    11
    Thank you both for the help. I have started looking into SQL Server as well now, but let me give one more shot w/ the image below to help understand the data. I color-coded the headers so that you can see what I am proposing "linking". I think the table I labeled "tblOtherData" is the one that you believe I'm doing incorrectly (thinking in terms of "columns"). I'm not sure how to best handle that one- as I have over 1,000 series that will have a new data point each day that I need to store. My plan is to create a field for each of those series and the date and add 1 record each day (well as I'd have to have this spread across several tables, 1 record/table each day).

    Click image for larger version. 

Name:	Screen Shot 01-16-19 at 02.03 AM.PNG 
Views:	21 
Size:	87.7 KB 
ID:	36968

Page 1 of 2 12 LastLast
Please reply to this thread with any new information or opinions.

Similar Threads

  1. Replies: 9
    Last Post: 06-23-2018, 10:25 PM
  2. Replies: 3
    Last Post: 04-13-2017, 09:47 AM
  3. Time-series database for monthly loan balances
    By rlmax in forum Database Design
    Replies: 6
    Last Post: 03-08-2017, 09:15 AM
  4. Replies: 7
    Last Post: 12-06-2016, 09:06 AM
  5. Replies: 4
    Last Post: 10-16-2016, 06:22 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums