Results 1 to 13 of 13
  1. #1
    frandorm is offline Novice
    Windows 10 Office 365
    Join Date
    Sep 2021
    Posts
    5

    Managing Access file size

    I have an excel file containing 16 sheets with an average of 372,000 rows and 26 columns, totalling 5.96 million cells. This file size is 697.324 KM on my hard drive.
    I transferred all these sheets in Access into a single table with 5.6 million rows and same number of columns. The access file with same data takes 1,756,352 KB of space in my hard drive, even after doing a compact/repair run.
    Whys the Access file is considerably much larger that the excel file for containing exactly the same data ? Where is the effectiveness of Access to handle large data ?

    Furthermore, for many years, after having installed newer versions of Access, I still face a very restrictive capacity size limit of 2 Gb. When Microsoft will decide to increase that limit ?

    I also noted that when the file size nears 1.9 Gb (below 2 Gb), I get often error messages which have nothing to do with the real problem. This is very misleading and does not help finding the cause of the problem. After months of practice handling large files, I realised that nearing the 2 Gb capacity limit causes the application to malfunction and send wrong error messages. Is there a chance that Microsoft will address this problem?

    In other words, is there a way to exceed this 2 Gb limit to get rid of the problems associated to file size ? I read several advises saying that I can create several Access files with links to a main file. That is probably valid when dealing with several tables, but not when dealing with one single large table.

    Any advice you could provide will be most welcomed.

  2. #2
    Minty is offline VIP
    Windows 10 Office 365
    Join Date
    Sep 2017
    Location
    UK - Wiltshire
    Posts
    3,157
    ~6 million rows with 26 columns is a fairly substantial amount of data.
    What is the data type of each column? Are they all text? If they are fixed text width then ensure that access hasn't set them all to 255 characters length. Reduce the field size where appropriate.

    Once a Access file approaches 2Gb it get very hard to do anything with it, as the size restriction also affects temporary storage when thigs are being moved around in memory.
    You could consider moving the data into SQL server express which is free, and doesn't have the same size restrictions.
    DLookup Syntax and others http://access.mvps.org/access/general/gen0018.htm
    Please use the star below the post to say thanks if we have helped !
    ↓↓ It's down here ↓↓

  3. #3
    ranman256's Avatar
    ranman256 is offline VIP
    Windows Vista Access 2010 32bit
    Join Date
    Apr 2014
    Location
    Kentucky
    Posts
    9,550
    use external database files (split db) and compact regularly.

  4. #4
    Micron is offline Very Inert Person
    Windows 10 Access 2016
    Join Date
    Jun 2014
    Location
    Ontario, Canada
    Posts
    13,423
    Where is the effectiveness of Access to handle large data ?
    Perhaps the biggest advantage is the ability to have multiple concurrent uses whereas Excel limit is one. Second, a single table made from an Excel dump will likely result in one of the most inefficient db designs you can create as it will no doubt totally lack any resemblance of normalization. Access will surely consume far more space when replicating a spreadsheet design (wide, many fields) vs tall (many rows but in related tables). I would recommend you study db normalization and when you understand it, attempt to replicate your data with a properly normalized design - or stick with Excel.

    As for approaching the size limit, one could look at mySql or Sql Server or similar as a back end option, but they are less forgiving when it comes to poor design.
    The more we hear silence, the more we begin to think about our value in this universe.
    Paraphrase of Professor Brian Cox.

  5. #5
    frandorm is offline Novice
    Windows 10 Office 365
    Join Date
    Sep 2021
    Posts
    5
    Quote Originally Posted by ranman256 View Post
    use external database files (split db) and compact regularly.
    Well.. This is precisely what I wanted to avoid. You mean that I have to split a large single file into several independent sections from the same file? Say for example my large file contains airline traffic data for 16 years. If I create a separate access database for each of these 16 years, how could I create a 17th file linked to each of the 16 other file to query them in order to have in a query the results for annual traffic trends in a single query ? Most of my queries are for many parameters to analyse for the whole 16 year period.
    Thanks anyway for your comment but would need more information about that based on my comments in this message.

  6. #6
    frandorm is offline Novice
    Windows 10 Office 365
    Join Date
    Sep 2021
    Posts
    5
    Thanks for your useful suggestions. I will look at the things you suggest.

  7. #7
    isladogs's Avatar
    isladogs is offline Access MVP / VIP
    Windows 10 Access 2010 32bit
    Join Date
    Jan 2014
    Location
    Somerset, UK
    Posts
    6,204
    Quote Originally Posted by frandorm View Post
    Well.. This is precisely what I wanted to avoid. You mean that I have to split a large single file into several independent sections from the same file? Say for example my large file contains airline traffic data for 16 years. If I create a separate access database for each of these 16 years, how could I create a 17th file linked to each of the 16 other file to query them in order to have in a query the results for annual traffic trends in a single query ? Most of my queries are for many parameters to analyse for the whole 16 year period.
    Thanks anyway for your comment but would need more information about that based on my comments in this message.
    No that's not what splitting means.
    All the tables are moved to a separate backend file. Normally that is ONE backend file for all data tables.
    The remaining database objects are in a frontend file which should then be MUCH smaller

    However as Micron & Minty mentioned, much of the file size is likely due to having a spreadsheet design for your table(s). That is inefficient in a database
    Normalising your data will make it work far better in Access and may well reduce file size.
    If you are using attachment fields, those will use up a lot of space. There are better methods.

    A SQL Server backend database can be up to 10GB with the free Express version and at least 1TB with paid versions
    Colin Riddington, Access MVP, Website, email
    The more I learn, the more I know I don't know. When I know I don't know, I keep quiet!

  8. #8
    frandorm is offline Novice
    Windows 10 Office 365
    Join Date
    Sep 2021
    Posts
    5
    Quote Originally Posted by isladogs View Post
    No that's not what splitting means.
    All the tables are moved to a separate backend file. Normally that is ONE backend file for all data tables.
    The remaining database objects are in a frontend file which should then be MUCH smaller

    However as Micron & Minty mentioned, much of the file size is likely due to having a spreadsheet design for your table(s). That is inefficient in a database
    Normalising your data will make it work far better in Access and may well reduce file size.
    If you are using attachment fields, those will use up a lot of space. There are better methods.

    A SQL Server backend database can be up to 10GB with the free Express version and at least 1TB with paid versions
    So if I understand, I should first design a blank table in Access defining the nature of each column (text, integer, dopuble etc.) and define the correct length to eliminate unecessary space for each column ? And then import the data from excel into that framework ?
    This sound interesting if it is the case and I will try to follow that approach.
    Thanks

  9. #9
    Join Date
    Apr 2017
    Posts
    1,792
    Usual way to read Excel data into Access database:

    1. When you use Access back-end, link your Excel table(s) into it. When you use SQL Server database as back-end, you can't link the tables, so you have to write a procedure to read data when needed and store it temporarily (you have a table for Excel data, read info into it, use it to update your database tables, and then clear all data from it). When you need Excel data read continuously, link them also into front-end (from back-end), when not, then you can make all data manipulation directly in back-end, using queries/procedures created for temporary use there;

    2. Design table(s) where imported data will be stored. The main rule will be that entering any information repeatedly must be avoided as much as possible, and using smallest amount of information! E.g. You have person or firm names in your Excel table columns, every one of them many times. Instead reading them directly into your Access tables, you must design additional tables (registers), one for every different information type, where all unique names are listed, and given an unique ID (an easiest way to give such ID is to use an autonumeric field as ID). Now, when importing Excel data, instead of such names you store their matching ID's into your Access tables. Or you have some specific info about certain information entity (e.g. supplier) stored in several columns in your Excel table(s). You create a table where all unique suppliers and their data will be listed, and an unique ID for every supplier will be added, update all suppliers info in this table with info read from linked table(s), and when importing rest of data, only supplier ID is added instead of bunch of same values repeatedly in many records;

    3. Write procedure(s) which process data in linked Excel tables, and update Access tables with them. With simplest conversion, a single query may do, but usually you have to write a VBA procedure, which constructs needed querystrings and runs them, to step-wise process and import the data;

    4. When the first-time import was successfull, then in case this import was one-time project, you have to delete linked excel tables, and any queries/procedures written for this purpose specifically. When this import must be repeated in future, you have to write an event for this (OnOpen event for database/form, OnClick event for some button. In case you'll use SQL Server DB as back-end, a procedure and a job to initiate a scheduled run for this procedure must be created - in case you have full SQL Server license of-course).

    Btw, principles designed in p.2 will work when you design Excel apps too! Whenever I design such Excel files, they have several sheets with registry tables. The main info entry is always a single table, with Value-List combos used in columns which have according registry tables, which ensures no typing errors are stored. And whenever some specific information is entered only by specific user, he/she will have a separate workbook for it, and when those data must be accessible in other apps, they are read in using e.g. ODBC queries.
    Last edited by ArviLaanemets; 09-29-2021 at 11:13 PM.

  10. #10
    frandorm is offline Novice
    Windows 10 Office 365
    Join Date
    Sep 2021
    Posts
    5
    Quote Originally Posted by Minty View Post
    ~6 million rows with 26 columns is a fairly substantial amount of data.
    What is the data type of each column? Are they all text? If they are fixed text width then ensure that access hasn't set them all to 255 characters length. Reduce the field size where appropriate.

    Once a Access file approaches 2Gb it get very hard to do anything with it, as the size restriction also affects temporary storage when thigs are being moved around in memory.
    You could consider moving the data into SQL server express which is free, and doesn't have the same size restrictions.
    Thanks for the link you sent to me.

  11. #11
    Minty is offline VIP
    Windows 10 Office 365
    Join Date
    Sep 2017
    Location
    UK - Wiltshire
    Posts
    3,157
    It might be worth showing us some sample data from the spreadsheet, so we could determine if it's stored efficiently, or has been suggested could be normalised to reduce the amount of data stored.
    DLookup Syntax and others http://access.mvps.org/access/general/gen0018.htm
    Please use the star below the post to say thanks if we have helped !
    ↓↓ It's down here ↓↓

  12. #12
    CJ_London is online now VIP
    Windows 10 Access 2010 32bit
    Join Date
    Mar 2015
    Posts
    11,936
    So if I understand, I should first design a blank table in Access defining the nature of each column (text, integer, dopuble etc.) and define the correct length to eliminate unecessary space for each column ? And then import the data from excel into that framework ?
    This sound interesting if it is the case and I will try to follow that approach.
    unfortunately you do not understand. At least, not the bit that matters.

    No idea what airline traffic data implies in terms of data, but using customer invoices as an example - They will consist of customer details, the invoice header and the line detail including things like product name. In excel these would all be on one line - and customer details and invoice header would be repeated for each line detail. You have 1000 invoices with on average 5 line per invoice - so excel contains 5000 lines.

    In a database, each of these would be in separate tables and joined together in a query - there would be 5000 line detail records, 1000 invoice header records and maybe a few 10's or 100's of customer records. In addition you would probably have a product table which perhaps only contains 50 products. So you don't need 'a' table, you need 4,5 or more.

    There are overheads, indexing for example, but indexing is what will make a database considerably faster than excel, particularly for large datasets.

    One of the performance killers for a database is indexing large text fields which matters when joining the tables together in a query. Lets say we are taking about a customer. As a minimum the table should contain a numeric primary key and the customer name.

    The numeric key takes 4 bytes, the customer name 10 bytes plus the number of characters (note this is based on the actual data, restricting a text field to say 10 characters has no impact on file size when the data is stored). So lets say your customers name is 'The ABC Co' and you don't bother with a primary key. That's 10 characters which will take up 20 bytes on the disk. Double it because it is indexed - then all the other tables that link to this table will also require 40 bytes for each record. So just looking at 100 customers with 1000 invoice headers, that will take up 44000 bytes.

    Using a numeric primary key takes 4 bytes, plus another 4 for the index, plus the name, another 20 bytes, a total of 28 bytes but all the related tables will only need 8 bytes. So 100 customer with 1000 invoice headers would require a total of 10,800 bytes - around a quarter of the number required for a text index.

  13. #13
    CJ_London is online now VIP
    Windows 10 Access 2010 32bit
    Join Date
    Mar 2015
    Posts
    11,936
    Just an extension of my previous post - forgot to include the equivalent for a single table. For customer, no PK field, 20 bytes plus 20 bytes for the index x 5000 lines = =200,000 bytes. And if no index, slow performance and 100,000 bytes, still over twice as much as the worst of the two previous scenarios.

    Excel stores data differently because it is unstructured and data is not 'typed'. databases strongly type data in a strongly defined structure which has an overhead but allows for better performance particularly with large datasets.

    And to be clear the 2Gb limit applies to the Jet or ACE database which comes free with access, not access itself which is a RAD tool. As others have said, Access will connect to any odbc application such as sql server, mysql etc. which have much larger capacities to jet/ace as well as a wider range of functionality (and more effort required to maintain).

Please reply to this thread with any new information or opinions.

Similar Threads

  1. Replies: 12
    Last Post: 10-21-2018, 05:49 PM
  2. MS-Access increasingly large file size
    By Herbie in forum Access
    Replies: 4
    Last Post: 01-14-2014, 12:08 PM
  3. Replies: 5
    Last Post: 04-01-2012, 12:50 PM
  4. Access 2007 file size vs Excel
    By andrewalms in forum Access
    Replies: 4
    Last Post: 02-02-2010, 02:32 PM
  5. Replies: 2
    Last Post: 01-14-2008, 12:15 PM

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums