I will attempt to explain this as best I can, but it is difficult because the problem is so vague. I am not as much looking for a drop-dead answer as I believe there is none, but more brainstorming on troubleshooting paths. Here is the scenario:
1. The back-end is stored on a Windows Server 2012 on a gigabit network with 2 Dell managed switches. It is an .accdb file designed in Access 2010.
2. Approximately 15 users are in the system all day. All of the users are local and use the front-end to manipulate the data. They are using Access 2010 runtime.
3. Client machines are all in good working order with gigabit Intel NICs.
The problem:
In the DB is a table (called Loads) that contains about 30 fields and 65k records. All data is manipulated in the front-end via native Access queries (meaning I don't have any proprietary SQL code in my VBA). Mostly by forms bound to Select queries. There are many other tables, but the Loads table is the main scope of this problem.
Randomly, out of what seems like nowhere, users will be unable to open any forms or reports. They will get an "unrecognized database format" error on the front-end. Some users can continue to do things for a while, but eventually they will all get this error. When I try to open the back-end, I get the message that the database is in an inconsistent format. I have to get everyone to close the front-end and run a compact/repair on the back end. Sometimes it repairs fine and I can have people get back in, other times it gets weird as follows:
Other times, I run the repair operation and I get a table that pops up called "MSysCompactError." Now, I know what this table is and why it exists, but the results in it are strange. I attached a screenshot for reference because it's too difficult to explain clearly.
Now the times when I get this compact error table, I open the Loads table and it shows a certain number of records that have been deleted. This number always agrees with the number of deleted records mentioned in the error table. However, the record isn't really deleted, it's just had all of its fields nulled out with the exception of one or two that are full of ## characters. It gets weirder.
Not only are records randomly deleted, but other records are randomly duplicated. I'm not just talking about the ID field being reused for another record, I'm talking like someone copied the row, clicked a new row, and pasted the entire record, unique ID and all. All of the info is exactly the same. I switch to design view and the primary key designation is gone. It's like Access is violating its own rules and then nuking the primary key. I have to remove the duplicate records and then reset the PK. The PK is not an AutoNumber, it is a Long Integer that I increment using the DMax() function. I only invoke this function at the instant before saving the record, so the chance of a conflict is minimal, and it has worked this way for over 2 years so I do not think it is related to this problem.
The number of records deleted and number of records duplicated are always the same. However, they do not seem to be related in any way. They have different FKs, and are not necessarily adjacent to one another in the recordset (in fact they almost never are). The affected records are always within the past month or so of being created, and that is approximately the window of activity for editing records (anything older is frequently accessed but infrequently edited). That tells me that it is not 100% random, and that someone is "touching" these records somehow and making them go crazy.
Obviously the above is evidence of corruption, but from where? This process has been working fine the majority of the past couple of years. It first occurred about a month ago out of the blue (I had not made any changes to the front or back ends in quite a while). At the time, I just kept repairing it and after the 5th or 6th time, the problem stopped occuring. It was fine for several weeks until it occurred again yesterday morning. I repaired it twice and it behaved the rest of the day. It happened again twice in one hour today, and now is going on 3 hours without a problem. I also cannot find any consistency with the time of day that it happens.
We are not having any known connectivity problems. We also use Quickbooks and it freaks out at the least hint of a network interruption, which it has not been doing. The server is extremely robust and as I said the workstation NICs are in good shape. I have checked all active ports on the switch and none are showing any packets with errors. The front-end file is excluded from virus scanning on the workstations, and the back-end file is excluded on the server.
I would appreciate any input as to what I can test or change next to work towards a solution. This problem is so vague and random I am pulling out what is left of my hair to even come up with a possible fix. I have already created a blank database and imported the front and back ends into fresh files. That seemed to have no effect at all. Some other things I plan on trying:
1. Isolate the Loads table into a different back-end file.
2. Create a quick and dirty transaction log to identify if one workstation is the culprit. This seems like a complex undertaking and adding a lot of overhead, but it may be my only hope.
Any other suggestions, whether on the network or specific to Access? Has anyone come across crazy problems of a similar nature?
Thanks in advance for help and suggestions. I can't really post my file to this thread as it is a 7MB front end and 20MB back end and anything I remove could be the source of the problem. I can perhaps attach specific tables, queries, or forms if anything would possibly help. But as I said, these have all been functioning perfectly for months without any changes.