I'm designing a database for our neuroscience lab. Currently, our lab and every other lab in the department keep our data in Excel spreadsheets, which is ridiculous considering the massive amount of data we work with. I'm proficient with SQL but clueless about actually designing an efficient database without redundant information.
I hope to get some help from the pros here.
As always, we're concerned about data integrity so how we capture and manage data must be as efficient and fool-proof as possible.
I hope a logistical explanation will help:
One Experiment has multiple Researchers who are assigned multiple Animals that are ran on multiple (behavior) Tests over multiple Days. As is the case with experiments, Animals are grouped into different Conditions with some animals getting one condition and other animals getting another based on the design. Not every experiment uses each test and the researchers change each semester.
Thus far, I've created tables for Experiment, Researcher, two of our 20 behavioral tests, Animals, and a table for animal information such as sex, mother ID, father ID, birth date, etc. I'm clueless about how I'd efficiently track elements unique to the experiment design such as drug, dose, and vial color codes.
EXPERIMENT
Each Experiment also uses a different design, something I'd also like to track. We use different Drugs and Dosages. Sometimes one Experiment uses two or more Drugs in two or more different Dosages.
We color code the drug vials so the researcher remains blind to what they give the animals. For instance, 50 mg of caffeine may be red, while 100 mg is green, and saline is blue. 50 mg of caffeine + estrogen may be orange, while 100 mg caffeine + estrogen is yellow, and saline + estrogen is black.
When researchers enter their drug data, they only enter color codes which we (archaically) translate by hand into the drug given and by association the condition the animal was in. It would be great to keep track of the drugs, dosage, and vial code colors used in a particular experiment.
ANIMALS
Each animal is weighed on every test day and sometimes multiple tests are given in one day so I assume it'd be useless to have that data in two different Test spreadsheets. On test day, we also record the amount of food and water an animal consumes. Again, to track that info in two different spreadsheets would be bad, right?
In my limited knowledge of database design, I know that Experiment is the table with the primary key but I'm clueless as to which tables to create and how relate the other tables efficiently.
The true end goal is to, of course, streamline data entry so that the forms step the researchers through the process with minimal chance for error. In the end, I'd first like to create a locked form that only I can enter data about the experimental conditions such as which drugs we use, etc.
But I'd also like to create an experiment-specific form that display only the parameters of that experiment for data entry by our researchers.
EXAMPLE
For example, Experiment 0001 has Jane running Animals 101 and 102 (both in Condition 1 with a red vial) on Test 1 on Monday and Test 2 on Tuesday. Joe runs Animals 201 and 202 (in Condition 2 with a blue vial) on Test 2 on Monday and Test 1 on Tuesday. Again, note that we have at least 20 behavioral tests, each with their own particular data collection paradigms, and typically 4 or more conditions per experiment.
With respect to data entry, Jane would type in an experiment number which would open a form with a combo box that allows her to select her name from a list of only those researchers working on that experiment, as well as the ability to enter data only for those behavioral tests specific to it (say with appearing buttons that say Enter Data for X Test or Enter Data for Y Test).
She would click Enter Data for X Test and be taken to a form in which she has to input the animal ID, the color of the vial given that day, and data for that test day. She would have to do that with multiple days of data for X Test, hopefully without reentering the animal ID.
I apologize for the length of this post and I'm certain these design logistics aren't tough to translate into Access, but I'm a rookie at database design. It's amazing how many scientists love and work with data but have no idea how to manage it efficiently, leading to all sorts of problems. I don't want to be one of those scientists.
Thanks in advance for any upper-level, pro help you can provide because I'm definitely at a loss.