Hey, Hans here. New to the forum. Thank you for welcoming me in advance
I'm relatively new to Access, way more familiar with Excel. I've run quite a bit of Access tutorials over the past weeks, but I'm starting to question if I will find a solution to my challenge this way. So I want to present my thoughts here in the hope someone can help or show me where to go look for my solutions.
I'm looking at the following.
I have data for about 8,000 'locations'. They are not physical locations, but let's call it that for simplification; as I'm interested in the 'distance' between locations, which is also not a physical distance, but a distance nonetheless. What I want to create is a database where I can record and store the distance between any unique combination of locations. So that is (8,000 * 8,000 - 8,000) / 2 = 31,996,000 distances.
When I started with this I thought it was going to be easy. Create a matrix with the names of locations on a vertical header and a horizontal header and voila: I can start recording my distances. Let's simplify it to 8 instead of 8,000 locations and it would look like this (where x’s are non-unique or distance to itself which both are irrelevant):
--A B C D E F G H
A x
B x x
C x x x
D x x x x
E x x x x x
F x x x x x x
G x x x x x x x
H x x x x x x x x
Right?
But then I figured that would not be useful, even in Excel, because pivots are only used to present rather than record data (right? Though it seems intuitive in cases like this…). And I learned that pivots in Access have been discontinued since Access 2013 (why?!?), so I haven’t even taken the effort to start to try to figure out if it could have been done in there with that function.
So am I looking at making a normalized table with 30+ million rows now? Is that my only option? In the simplified (8 locs) example that would look like this:
Loc1 Loc2 Distance
A A X
B A
C A
D A
E A
F A
G A
H A
A B
B B X
C B
D B
E B
F B
G B
A C
etc etc
It feels tedious to me.
You can also see why I find this by not writing out the full table like I did with the pivot. It’s too much work.
If such table is my only option, is there some automatic or systematic way of creating such table? Having to copy the 8,000 locations 8,000 times (for each location that is) to create all my rows and then another column with 8000 * 8000 rows with each location against that seems crazy and inefficient. There must be a better way…
A second reason it feels tedious to me, after I successfully create it, is that I will have to look up through the list to find the combination of locations I want to edit, every time I want to record or edit a distance. This as opposed to having a pivot table where looking up the distance I want to record feels (at least) easier to me. But then again, I don’t believe you can even edit your values/data in an Excel pivot table, so I don’t know how feasible that idea was in the first place (I am wondering if that was possible with Access pivots, but its discontinuation makes the question irrelevant now I guess). I won’t be recording all 30+ million distances (not all distances actually exist, or a great portion of it can be set to an arbitrary set number, likely 0 or a maximum, but depends). It will always be a manual way of entering a distance and there is no way of knowing beforehand which distances will interest me. Well, it is reasonable to suspect the distances within a class (the locations are classified/structured some way) will interest me more than between classes, but it is not exclusively the case.
Finally, some additional, more general questions.
Is it good conduct to use several tables in Access? I have additional information about every location. Let’s say, an ID, what country and continent it is in (again, it isn’t physical locations, but there is some sort of classification/structure going on), alternative names, population. I’m just making things up now, but there is much more to say about these ‘locations’ and maybe there will be some additional information I will realize I want to store too while creating the database. If I would record the information in Excel I would simply have another column for any additional information. But if I’m going to make that huge table with all combinations of locations, it doesn’t seem right to repeat all that information as well.
So is that solved by having a separate table with the distances and one with other information and then create relationships? Should I split up other information as well besides a table for the distances and one for the rest of the info about the locations? I guess you can sense that I’m a bit new to Access from these questions. I suspect the answer to these additional questions is basically ‘yes’…
If I have 30+ million rows for my distances, is it still possible to represent this data from my database in Excel with a pivot table? Excel has a limit of just over 1 million rows… I’ve also heard about an Excel add-in called PowerPivot, should I look into that?
Because it has become such a long post, a summary by the most important questions:
- How should I set up a database with distances between 8,000 locations (30+ million distances)?
- If I need a normalized table with 30+ million rows, how can I make this table in an efficient way (without having to copy-paste all loc names/ID's thousands and thousands of times)?
- After creating such table, how can I efficiently edit the distances given that distances need to be recorded manually?
- Am I right in thinking that creating multiple tables with relationships in between is good conduct in Access?
- Can I still present my 30+ million distances in a pivot table in Excel? Or can I use such large database in Excel at all in meaningful ways?
Best wishes, and thanks so much for taking the effort to read through my post.