  1. #1
    EvanRosenlieb (Advanced Beginner), Windows 7 64bit, Access 2010 64bit
    Join Date: Jun 2011 | Location: Fort Collins, CO | Posts: 37

    Simple Programming Question

    I have a form that uses a recordset to add data to a table. I would like to improve it by adding a text box so that the person using the form can choose which table they would like to enter the data into, as this database has multiple tables with parallel structures. I can't quite figure out how to make it work.



    This is what I have currently:

    Code:
    Set WorkingDB = OpenDatabase("C:\Documents and Settings\Von Fischer Lab\My Documents\WorkingDB.accdb")
    Dim rsSurvivorsEntry As Recordset
    Set rsSurvivorsEntry = WorkingDB.OpenRecordset("TableOne", dbOpenDynaset)
    I have added a "WhatTable" text box to the form, and I would like the following code

    Code:
    Set WorkingDB = OpenDatabase("C:\Documents and Settings\Von Fischer Lab\My Documents\WorkingDB.accdb")
    Dim rsSurvivorsEntry As Recordset
    Dim Me.WhatTable As String
    Set rsSurvivorsEntry = WorkingDB.OpenRecordset("Me.WhatTable", dbOpenDynaset)
    to tell the recordset to use whatever table name is written in the "WhatTable" text box. Instead, I get an error message that says "Cannot find Me.WhatTable Table or Query".

    What do I need to change to make this work? As always, your help is greatly appreciated.

    --Evan

  2. #2
    CraigDolphin (Competent Performer), Windows XP, Access 2000
    Join Date: Apr 2009 | Location: Custer, WA, USA | Posts: 125
    You would likely need to remove the quotes around Me.WhatTable in this line:
    Code:
    Set rsSurvivorsEntry = WorkingDB.OpenRecordset("Me.WhatTable", dbOpenDynaset)
    to make it look like:
    Code:
    Set rsSurvivorsEntry = WorkingDB.OpenRecordset(Me.WhatTable, dbOpenDynaset)
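    In case it helps, here's a rough sketch of how the whole routine might look with that change. This is purely illustrative: it assumes DAO, that the code lives in the form's module (so Me refers to the form), and that WhatTable holds the exact name of an existing table; the path is the one from your post.
    Code:
    Dim WorkingDB As DAO.Database
    Dim rsSurvivorsEntry As DAO.Recordset

    ' Open the back-end database (path taken from the original post)
    Set WorkingDB = OpenDatabase("C:\Documents and Settings\Von Fischer Lab\My Documents\WorkingDB.accdb")

    ' Guard against an empty textbox before opening the recordset
    If Len(Nz(Me.WhatTable, "")) = 0 Then
        MsgBox "Please enter a table name first."
    Else
        ' No quotes: this passes the textbox's value as the table name
        Set rsSurvivorsEntry = WorkingDB.OpenRecordset(Me.WhatTable, dbOpenDynaset)
    End If
    If the textbox might contain a name that isn't actually a table, you may also want to add error handling around the OpenRecordset call.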
    I will not touch the issue of normalization in detail, but if you have multiple tables sharing the same structure (i.e., the same named fields, etc.) then there is something very rotten in the state of Denmark. Or with your table structure, as the case may be. I wish you luck in the future.

  3. #3
    EvanRosenlieb (Advanced Beginner), Windows 7 64bit, Access 2010 64bit
    Join Date: Jun 2011 | Location: Fort Collins, CO | Posts: 37
    Craig Dolphin:
    That was the trick, thanks a bunch

    as to
    I will not touch the issue of normalization in detail, but if you have multiple tables sharing the same structure (i.e., the same named fields, etc.) then there is something very rotten in the state of Denmark. Or with your table structure, as the case may be. I wish you luck in the future.
    Although I am a novice, I am aware of the concept and importance of normalization. I am organizing a database for a decade-long (well, eleven years now) academic study, and all eleven years of the data have thus far been stored in Excel sheets. Every year, the same sort of data has been collected (in general, measurements on the size, health, and reproduction of a certain type of plant), but unfortunately not exactly the same sort of data has been collected every year. For example, early on in the study they tried to take the total stem length for plants, then to save time they switched to taking the average length of five stems and the total stem number to later calculate an approximate total stem length, then for a couple of years they lost funding and only measured the longest stem. After they received funding again they returned to the average-of-five-stems method, but in addition to counting the number of flowers, they also started counting the number of fruit. This would make it very difficult to take all this data and turn it into a single normalized table, as the data aren't truly parallel.

    However, this last time the professor secured funding it was for at least ten years, and because of that she believes she can keep the structure parallel for the rest of the study. So, from here on out, it would be possible to make a normalized table, but then all of the queries and forms I have written for the current structure would need to be rewritten (which wouldn't be the most horrible thing in the world) and the first half of the database would be in a different form than the second half.

    In this situation, would you recommend using a normalized table for the rest of the study? The professor likes the idea of keeping the data in separate tables for (conceptual, not functional) simplicity's sake, but if I convinced her that switching to a normalized format was truly superior she would be amenable to changing her mind.

  4. #4
    CraigDolphin (Competent Performer), Windows XP, Access 2000
    Join Date: Apr 2009 | Location: Custer, WA, USA | Posts: 125
    Hi Evan,

    I am a biologist myself, so I understand the issues you're mentioning. And the fact is that it is perfectly possible to normalize even such data as you have outlined. The problem is that you're still trapped in the 'excel'/flatfile mindset which all of us biologists are thoroughly indoctrinated with as we go through college.

    Since I do not know enough about your experimental design to suggest a full table structure, let me at least show you something to demonstrate how you could have handled this.

    First rule of changing from flatfile thinking: tables should (generally) be narrow but tall. That is, they have few columns and many records/rows. This contrasts with spreadsheets, which are typically composed of many columns and few records/rows.

    Second rule: think very carefully about whether you need to add a field, or whether you really should be adding a new table. Attribute information in the table should depend solely on the entity or relationship represented by the pk field of the table.

    So, let's imagine that you are sampling plants repeatedly over time (each sampling event being an 'observation'). Let's say you have several different treatments too: say, a concentration of fertilizer or some such. In this scenario, a control has a concentration of 0. And let's say there are several plants in each treatment. For the sake of this example, I'm assuming that these plants were all planted in the same year and live for the duration of the experiment. (That might be true if you're working on trees, but not so likely if you're working on grasses.) I am also assuming that each plant undergoes one fertilization concentration treatment type only. And I will also add provision for testing multiple species of plant.

    So, off the top of my head, I can identify four entities: individual plants, individual observations, plant species, and treatment types. Each entity requires a dedicated table. You might also wish to track which personnel were involved for various parts of the experiment. So, that's another entity and another table. I am going to assume only one person was involved at each observation.

    tblPlants
    PlantID (auto, pk)
    PlantNameCode (text) (human readable identifier for specific plant)
    PlantingTreatmentDate (date/time)
    PlantSpeciesID (fk)
    TreatmentID (fk)

    tblPlantSpecies
    PlantSpeciesID (auto, pk)
    PlantGenus (text)
    PlantSpecies (text)

    tblTreatments
    TreatmentID (auto, pk)
    TreatmentName (text)
    FertilizerConcentration_gPerSqFt (single)

    tblObservations
    ObservationID (auto, pk)
    ObservationDate (date/time)
    PlantID (fk)
    PersonnelID (fk)

    tblPersonnel
    PersonnelID (auto, pk)
    FirstName (text)
    LastName (text)


    Ok. So, that's great and all, but what about the stuff you want to measure? Flatfile thinking would take this structure and add fields/columns to the observations table for each type of measurement you make over the life of the experiment. That would certainly be feasible, but if that thought crossed your mind then find yourself a mild electrical current and apply it to a non-critical part of your anatomy. That kind of structure makes working with the data very difficult after it is entered into the database. Any time you change your protocol, you would need to add new fields. When you add new fields to your structure, you will also need to change any forms or queries or reports or VBA code that depend (even in part) on the new fields. Why be so hard on yourself, or on the future self who inherits your project after you leave in a huff? (...Because Access is so 'stupid' and you don't get paid enough to wade through another round of revamping the database because the good professor ended up changing his method again because of unforeseen circumstances [e.g., funding got cut off because ....?])

    Let's step back a bit. In each sampling event (observation) you measure 'stuff'/metrics, and arrive at a value. Each value is interpreted against some system of measurement. So, that's what we need to store. For each value that you record, we need to know:
    Which observation is this value related to?
    What metric is this value related to?
    What unit of measurement was this value measured in?

    So, we have two more entities we need to track: Metrics, and Units
    tblMetrics
    MetricID (auto, pk)
    MetricName (text)

    tblUnits
    UnitID (auto,pk)
    UnitName (text)

    So, in this model, one metric might be 'Average Stem Length', which would be a record/row in the Metrics table. This might be measured in 'millimeters', which would be a record in the Units table. 'Individual Stem Length' is a different metric, and is another row in the Metrics table. 'Flower Count' is yet another metric, as is 'Fruit Count'. The units for these could be 'Count', or 'Count Per Stem', or 'Fruit Per Plant', or whatever it is that you are counting. Whatever you use, it would be a row in the Units table.

    And finally, we need a table to store our values in.

    tblObservationResults
    ObservationResultID (auto, pk)
    ObservationID (fk)
    MetricID (fk)
    ResultValue (single)
    UnitID (fk)
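
    (As a purely illustrative aside: in practice you would normally build these tables in the table designer, but the results table and its foreign keys could also be created from VBA with Jet DDL, along these lines. The constraint names here are made up, and the other tables are assumed to exist already with the primary keys sketched above.)
    Code:
    ' Illustrative only: create the results table with its foreign keys
    Dim strDDL As String
    strDDL = "CREATE TABLE tblObservationResults (" & _
             "ObservationResultID AUTOINCREMENT CONSTRAINT pkObsResult PRIMARY KEY, " & _
             "ObservationID LONG CONSTRAINT fkObservation REFERENCES tblObservations (ObservationID), " & _
             "MetricID LONG CONSTRAINT fkMetric REFERENCES tblMetrics (MetricID), " & _
             "ResultValue SINGLE, " & _
             "UnitID LONG CONSTRAINT fkUnit REFERENCES tblUnits (UnitID))"
    CurrentDb.Execute strDDL, dbFailOnError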


    And that should cover the bases for the hypothetical I described. There's a lot that could be refined, by-the-by. I haven't really considered whether you need to track specific methods of obtaining a particular metric, e.g., using calipers to obtain a linear measurement versus using string and a ruler versus optical scanning and digital analysis, etc. Nor would this work perfectly if you have more than one person conducting each observation (hint: you'd need a 'junction' table between observations and personnel instead of the column in tblObservations, something like the sketch below).
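
    If you ever did need to record several people per observation, that junction table would just be one more narrow table, something like:
    tblObservationPersonnel
    ObservationPersonnelID (auto, pk)
    ObservationID (fk)
    PersonnelID (fk)
    and the PersonnelID field would then come out of tblObservations.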

    Now, in my biologist's mind I can see the good professor looking at this structure, blanching, and wondering how in the name of Darwin this data could possibly be analyzed?

    And the answer is that it's actually much simpler than he suspects. Any time that you would want to view the data in a traditional flat-file format, you can combine all of the needed elements using queries. And you can convert the different metrics into columns of summary data using crosstab queries. You will need to understand how to use the query builder, and what the difference between outer and inner joins is, but once the query is set up, it will never need to be changed.

    But... says the good professor... I want the data to show treatment results on one axis of my graph versus the number of months since treatment began: where do I enter that? You don't need to. You calculate it, as you need it, using the DateDiff function (which you can find out more about in Help, or in handy references like http://www.techonthenet.com/access/functions/index.php), by having Access return a calculated field in your query which evaluates the number of months between the planting/treatment date value in tblPlants and the observation date value in tblObservations.
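
    For example (illustrative only, using the field names sketched above and a made-up query name), that calculated field could be generated by DateDiff in VBA-built SQL:
    Code:
    ' Illustrative only: months elapsed between treatment and each observation
    Dim strSQL As String
    strSQL = "SELECT O.ObservationID, P.PlantNameCode, " & _
             "DateDiff('m', P.PlantingTreatmentDate, O.ObservationDate) AS MonthsSinceTreatment " & _
             "FROM tblPlants AS P INNER JOIN tblObservations AS O ON P.PlantID = O.PlantID"
    CurrentDb.CreateQueryDef "qryMonthsSinceTreatment", strSQL   ' saved query name is made up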

    Hah, cries the professor! But that would still leave me with a column for 'Average Stem Length' and another column for 'Individual Stem Length', as these are two different metrics. Yep. Which would put you back to where you began, but without the messed-up table structure. But a crosstab query generates summary data for each column. That is, you will be required to specify a summary operation to perform if there is more than one value for each metric type in each observation. These operations could be average (mean), standard deviation, variance, count, max, min, first, last, etc. If you elect to use the 'average' operation and have one 'Average Stem Length' value for some observations, you will get the 'average' of the single value in that column for that observation. If you have several individual stem lengths for some other observations, then you will get the average of those values in that column for those observations. If you want to obtain the standard deviation for each of these averages, it's as simple as either re-running the query using the different operator, or creating a copy of the query and having it return StDev instead of average. Or count. Or max, or whatever.
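
    As a rough sketch (query name invented, and assuming the table structure above), the SQL behind such a crosstab might look like this, whether you type it in SQL view or let the query builder generate it:
    Code:
    ' Illustrative only: one column per metric, averaging ResultValue within each observation
    Dim strSQL As String
    strSQL = "TRANSFORM Avg(R.ResultValue) " & _
             "SELECT O.ObservationID, O.ObservationDate " & _
             "FROM (tblObservations AS O INNER JOIN tblObservationResults AS R " & _
             "ON O.ObservationID = R.ObservationID) " & _
             "INNER JOIN tblMetrics AS M ON R.MetricID = M.MetricID " & _
             "GROUP BY O.ObservationID, O.ObservationDate " & _
             "PIVOT M.MetricName"
    CurrentDb.CreateQueryDef "qryObservationCrosstab", strSQL
    Swapping Avg for StDev, Count, Max and so on just means changing the aggregate in the TRANSFORM clause.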

    Hmm, says the dubious professor, we used to measure the stems in mm when the plants were small but now we use cm because they're really big. How do you deal with that? Well, how would you LIKE to deal with that? You could export the results as described and have the crosstab query show a column for each permutation of metric and units, if you like. Or, if the units are always the same for each metric in any given observation event, then just show the units used as a column in the output and do all the converting yourself in Excel. Or, you could use the database to convert measurements into a single default unit of measurement at the point of data entry (i.e., input your value and units in cm and have Access convert the value into mm using a custom VBA function). Or you could store the units as-is, then create an intermediate query that converts the stored units (using a custom VBA function) into the desired default unit, and then build a crosstab query that references this intermediate query for export to Excel.
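
    A custom conversion function of that sort might look something like this (the function name and the unit spellings are placeholders; match them to whatever actually ends up in tblUnits):
    Code:
    ' Placeholder sketch: convert a stored length into millimeters,
    ' given the unit name it was recorded in.
    Public Function ToMillimeters(ByVal dblValue As Double, ByVal strUnit As String) As Double
        Select Case LCase(strUnit)
            Case "mm", "millimeters"
                ToMillimeters = dblValue
            Case "cm", "centimeters"
                ToMillimeters = dblValue * 10
            Case "m", "meters"
                ToMillimeters = dblValue * 1000
            Case Else
                ' Unknown unit: pass the value through so the problem shows up in the output
                ToMillimeters = dblValue
        End Select
    End Function
    You could then call it from the data-entry form's code, or as a calculated field in the intermediate query, e.g. LengthInMM: ToMillimeters([ResultValue], [UnitName]).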

    It all depends on what you feel comfortable with and how much time you like spending playing with numbers in spreadsheets.

    Best of all, you can predict the SQL required to generate the values for your analysis because the field names in the tables will never change, and the result values are always in the same field regardless of which metric you are talking about. This means that you can use VBA to dynamically write the SQL for queries using user input, and to do things like export the results to Excel.
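
    A rough sketch of that idea (the saved query name, the metric prompt and the output path are all made up for the example):
    Code:
    ' Illustrative only: rebuild a saved query's SQL from user input, then export it to Excel
    Dim strMetric As String
    Dim strSQL As String

    strMetric = InputBox("Which metric do you want to export?")   ' e.g. Average Stem Length

    strSQL = "SELECT O.ObservationDate, P.PlantNameCode, R.ResultValue " & _
             "FROM ((tblPlants AS P INNER JOIN tblObservations AS O ON P.PlantID = O.PlantID) " & _
             "INNER JOIN tblObservationResults AS R ON O.ObservationID = R.ObservationID) " & _
             "INNER JOIN tblMetrics AS M ON R.MetricID = M.MetricID " & _
             "WHERE M.MetricName = '" & Replace(strMetric, "'", "''") & "'"

    ' Assumes a saved query named qryExport already exists to hold the SQL
    CurrentDb.QueryDefs("qryExport").SQL = strSQL
    DoCmd.TransferSpreadsheet acExport, acSpreadsheetTypeExcel8, _
        "qryExport", "C:\Temp\MetricResults.xls", True
    The same approach works for the crosstab SQL above; only the string you build changes.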

    There's a learning curve to working in Access versus Excel. The good news is that you can start by managing your data in Access, and then export it as a flatfile to Excel whenever you want to analyze it. But the more you learn about how much quicker it is to automate parts of your analysis in Access, the more you'll want to free up your time by doing more in Access. Key to that, however, is the right table structure. Get that wrong, and working in Access can be a real PITA.

  5. #5
    EvanRosenlieb (Advanced Beginner), Windows 7 64bit, Access 2010 64bit
    Join Date: Jun 2011 | Location: Fort Collins, CO | Posts: 37
    Craig,

    Thanks so much for the long reply! That was a very detailed explanation.

    First of all, as it turns out I was actually an economics major in college, but certainly the same mindset about spreadsheets applies, if not more so.

    I was previously vaguely aware of most of those concepts (but not in such depth), but there are a couple of issues that kept the database from assuming more of that sort of structure.

    I have been working on two projects at this job, and the other one I have been working on since the beginning of the project. This means I have been able to help shape the dataflow into the database instead of trying to shape the database to how the data was already stored. Although it is an ecological study and a little messier than the theoretical experiment you mentioned earlier, the same sorts of concepts apply. I can have a PlotsKey table, a ChamberKey table, and a SoilCoreKey table, and store all the data obtained from the soil cores and the gas chambers in separate tables, linked by the appropriate IDs, and everything is accessed/calculated in queries...

    The project I was talking about in the original post definitely does not have a typical "controlled experiment" sort of structure, and I am having a little bit of a hard time wrapping my head around how these sorts of concepts apply to it. It is a demography study, so basically all the data are census data about plants. It is sampled once a year, and the main bits of information are each plant's location (given in x and y coordinates from a line, with additional information given in notes columns) and its "class" (seedling, non-fruiting adult, fruiting adult, or dead). Unlike the other project I am working on, there really isn't a whole lot that needs to be calculated from the data. The big question that the study is trying to answer is basically what percent of this particular plant makes it to fruiting adulthood in the wild, and how long the plants live on average, in order to give an estimate of the genetic turnover. The other data are used for similarly simple things, such as what the average number of fruits produced per plant is, etc. In short, there's not really any "Plot X versus Y" data coming out.

    Previously, all the data from each year was just stored in its own Excel file. There was no form of ID at all. Basically, whenever the professor wanted to see which plants had died and which survived, she had to suture together the Excel files from two adjacent years based on their coordinates and make educated guesses (unfortunately the plants do indeed move from time to time, as some stems can die and other stems can pop up, up to about a foot away). The main reason I was enlisted was basically to create an ID system for the plants. This accomplishes her two main goals by:

    -Allowing her to print off ID numbers on the field datasheet, so that the field researchers can go looking for plant number 12, find it, take the relevant data, and enter it and have it already be associated with all of the data from previous years, and

    -Facilitating year-by-year survivorship calculations.

    Anyway, because of the sheer amount of data involved (I have no idea how she used to work with it) and the fact that I was working part-time on her project, it has taken a long time simply to correctly ID each of the plants and link it to the rest of the data, which was more or less simply imported from each year's respective Excel sheet. So in a sense, I have only completed the first (but most important for the project's needs) step of normalization. This certainly doesn't mean that the database couldn't be served by the data being broken down further; however, I'm not sure if I can quite conceptualize how I should begin breaking it down further, and I'm not sure if the resources are going to want to be spent on that process in any case.

    Thanks so much again for your help,

    --Evan

  6. #6
    CraigDolphin (Competent Performer), Windows XP, Access 2000
    Join Date: Apr 2009 | Location: Custer, WA, USA | Posts: 125
    Without a better understanding of the sampling protocol it's hard for me to make helpful suggestions about table structure. For example, I'm not sure what the relationship might be between 'plots' and 'plants', or 'soil cores' and plants, or 'chambers' and plants, or whether these labels should really refer to different kinds of entities, or are simply different flavors of one entity. And from what you describe, it sounds to me as if 'stem' might need to be an entity that relates to a parent entity 'plants', since it sounds like you're tracking position at the stem level.

    You have at least come up with a unique identifier at the plant level, and that's something.

    What it boils down to is that your data model/table structure should model the real world system as accurately as possible.

    One plant can have many stems. There's a relationship between these two entities. Do you need to track individual stems? Well, if you collect information at that level of detail and want to input it as raw data or report on your results at the stem level, then yes. However, if you only collect summary data (e.g., 80% of this plant's stems are dead) and only want to report on your results at the plant level, then you could get away without tracking individual stems. There is sometimes a tradeoff between what data normalization would suggest and ease of data entry, etc. But in my experience I've never yet regretted normalizing, whereas I have regretted not normalizing on many occasions.

    Getting to the bottom of this data model would require a lot of back and forth typing on the forum and it's a bit hard for me to justify that amount of time to my boss. If you are interested in talking it over more I'd be willing to give you my phone details via PM. Perhaps if you sent me an example of your data as it stands, and we discussed the general methodology/work flow of your study, I might be able to offer a more tailored suggestion as to what table structure is most appropriate and maybe how to proceed with converting your existing data to fit the new structure.
