Triple nested queries

**niculaegeorge** · 07-24-2011, 11:55 AM

Hy! I'm a newbie in access and I'm trying to do a small linguistic analysis and I'm having problems with append and aggregate sub-query.
In a nutshell, I'm trying to hit tree rabbits with one stone.
I'm trying to make one query which is first an aggregate query with an append sub-query inside it. But this last append sub-query has another aggregate query inside it (triple nested queries! yeaks!)
Basically I have one database with one preexistent table that has just 4 fields (columns) (it is normalized). Never changing, always the same.

Code:

word|nrbooks|freq|distrib
Home|3|542|1626
  Table|2|230|460
  Garden|1|712|712
  Chair|2|380|760

You read the table like this: the word 'home' is found in 3 books with a TOTAL freq of 542 hits in all 3 books, and the distrib field is a simple product of these two previous fields 3*542=1626. And so on. I need the last field 'distrib' so that for example the word garden, which has the greatest freq doesn't show up in my descending sorting as the first, because it is only found in one book, so when I do a descending sorting by "distrib" field, my table will look like this:

Code:

word|nrbooks|freq|distrib
Home|3|542|1626
  Chair|2|380|760
  Garden|1|712|712
  Table|2|230|460

So, you've guessed it right, 'distrib' comes from how much that particular word is well distributed in books.
Step 1:

I import new data (a new book) from a text file which has only 1 field. It looks like this:
Grass
Chair
Fork
Knife
Grass
Spoon
Grass

Note that the word 'chair' is already in my database, and that the word 'grass' repeats 3 times.
Step 2: I aggregate the data from this one column data text file, and the duplicate entries show up in a new field called 'freq' like this:

Code:

Word|freq
  Chair|1
  Fork |1
  Grass|3
  Knife|1
  Spoon|1

Step 3: I append this data to my database table.

But wait! I think I need to fill the second column(field) 'nrbooks' with value 1, so that when I do a aggregate query I sum up the entire 'nrbooks' field. I hope I'm doing it right. Please tell me if I'm wrong.

Code:

word|nrbooks|freq|distrib
Home|3|542|1626
  Chair|2|380|760
  Garden|1|712|712
  Table|2|230|460
  Chair|1|1|1
  Fork|1|1|1
  Grass|1|3|3
  Knife|1|1|1
  Spoon|1|1|1

Step 4:

As you have noticed the entry 'chair' is repeated, so I have to make another aggregate query that gives me my final table like this:

Code:

word|nrbooks|freq|distrib
Home|3|542|1626
  Chair|3|381|1143
  Garden|1|712|712
  Table|2|230|460
  Grass|1|3|3
  Fork|1|1|1
Knife|1|1|1
  Spoon|1|1|1

Note the entry 'chair' now appears in 3 books! It's freq changed from 380 to 381 (it was found only once in this last book) and the distribution value ('distrib') took a huge leap from 760 to 1143(3*381)
Why do I really need such a complicated triple nested aggregate-append-aggregate query?

Well, if I do it by hand, one query at a time, with each new imported data, my final aggregate query should be a new make table query into another new table in the database, since the source and destination can't be the same.

Then at the next import my destination will be this newly created table, then again I make the final aggregate to another make table so I constantly have to change my destination, and have 2 queries for each of these two tables.
So I have to bounce back and forth between 2 tables in my database.

But if I properly create this triple nested queries then I have the source table (imported one Colum entry text file) and my destination table will constantly be my preexistent table, each time and every time with only one constant never-changing query!
My query so far looks like this:
For simplicity
0 means my preexisting table in my database
9 is the table which contains the imported data (which at each import get's overwritten)

Code:

INSERT INTO 0 ( word, nrbooks, freq, distrib )
  SELECT [0].word, sum([0].nrbooks) AS nrbooks, sum([0].freq) AS freq, [0]!nrbooks*[0]!freq AS distrib
  FROM (SELECT [9].Field1 AS word, 1 AS nrbooks, Count([9]![Field1]) AS freq FROM 9 GROUP BY [9].Field1)  AS 0
  GROUP BY [0].word, [0].nrbooks, [0].freq;

This query works well but it doesn't do the last step 4 aggregate query. It's just a simple append query with a aggregate sub-query. I want this whole query INSIDE ANOTHER AGGREGATE QUERY! Can it be done? Or I approach this problem in a wrong way? Can it be done easier in another way? How?
Please, can someone give me a helping hand?

**June7** · 07-24-2011, 02:14 PM

I am not sure if can have nested Appends and Updates. You are faced with both appending new words and updating summary data for existing words. Whenever I have seen this situation, two SQL actions are need.

I built a table of sample raw data and by nested Select Aggregate queries produced the final output you describe:

SELECT WordCountOfWord.Word, WordCountOfWord.CountOfWord, WordCountOfBooks.CountOfBookSeq
FROM
(SELECT WordBookCount.Word, Count(WordBookCount.BookSeq) AS CountOfBookSeq
FROM
(SELECT Table1.Word, Table1.BookID
FROM Table1
GROUP BY Table1.Word, Table1.BookID) As WordBookCount
GROUP BY WordBookCount.Word) As WordCountOfBooks INNER JOIN (SELECT Table1.Word, Count(Table1.Word) AS CountOfWord
FROM Table1
GROUP BY Table1.Word) As WordCountOfWord ON WordCountOfBooks.Word = WordCountOfWord.Word;

My sample table had fields for Word and BookID. Any reason you cannot maintain table of raw data as I describe - too many records? If you can, I recommend you import and append new data with the words and an ID for the book.

**niculaegeorge** · 07-25-2011, 02:12 AM

First of all you guessed it right. I cannot maintain table of raw data as you describe because there will be too many records in my raw data table. It is imperative that the imported data to be overwritten, for reasons of space, and archiving data.
I figure out that I can't hit tree rabbits with one stone, but at least I can hit 2 of them with ease.
Here is a diagram of what I am doing:

After I delete 'main table' I rename 'main table2' to 'main table' so that I won't have to change my query code with each import.
To incorporate all the queries into one, my "one stone" query will have to interrogate both my raw data table and my main table, so I think it can't append and aggregate to my main table since it is used for reference as a second source.
I didn't saw it the first time, but that's kind of impossible since, just like I said previously, It can't have the same source and destination table. I will have to stick to a minimum of 2 SQL actions.
Using your code I come up with the idea for the best closest solution to my goal:

So the process starts over with each 'n' book.
I'll work on this, to see where it leads me.

**June7** · 07-25-2011, 03:16 AM

I didn't try to follow your schematics in detail, but here is process I could envision, is it in line with what you have in mind?

1. Import new book data to a temp table

2. Run aggregate query on the temp table, if new data has multiple books, use my suggested query

3. Use that query as basis for Append and Update to main table, 2 separate SQL actions

4. Delete the temp table

**niculaegeorge** · 07-25-2011, 04:25 AM

I don't follow what you mean by "…aggregate query on the temp table, if new data has multiple books, use my suggested…".

How can my new data have multiple books? My new data doesn't have anything except one simple column/field with most likely duplicate entries. It's raw data, that's just what it is.

I can't use your suggested querry, because it implies that I always APPEND to my 'raw data table' and I DON'T WANT THAT. (that will make the 'raw data table' the backbone of my database, and the query just a table result to show me what I want to see). That's not what I want. I want my 'main table' TO BE my MAIN TABLE! To be the only one backbone of my database. The 'raw data table' is just like a BUFFER of data!

Ok. Let me explain what I want to achive:
Let me explain it in detail.

I import my book data(raw data) ALWAYS to the same table (let's call it 'raw data table'). It is clear that this table has only one field, which may contain duplicates
Then do ONE QUERY (just one, that's what I want) to CREATE(make a new table). This table contains data from both my 'raw data table' and my 'main table'. So My one query interrogates 2 tables, the result being this NEW TABLE let's call it 'final table'
That's all what I want to do.

But there's a catch, when I import a new book, (raw data) I want to "INSERT" it it to this new table called 'final table'. Since I want to save my query, I don't want to change the sql code to replace the instances 'main table' to 'final table' so that it interrogates the tables 'raw data table' and 'final table', I WILL delete the 'main table' (it's old data now) and rename 'final table' to 'main table', because the 'final table' is the new version.
Now you will say that I need an update query, but I disagree

, because there are also new words coming in in the 'raw data table' that don't exist in the 'main table' or 'final table' whatever.
So basically, let me rephrase:

What do I mean by the "insert it in to this new table" Well, let's forget about distrib field, let's consider my 'main table' having only 3 fields: word/nrbooks/freq. I import the txt file (raw data) into a the 'raw data table' then by making a grouping by 'word' field (and duplicates get's removed and a new field called freq tells me how many duplicates there were - I think this procedure it's called aggregate -I'm a newbe, don't get mad) Then this table, get's appended to my main table, in the corresponding field1 and 3(word and freq) and field2 get's filled with 1.
Then in 'main table' if there are duplicates in this field1(word)then it will sum up their corresponding 'nrbooks' fields, and 'freq' fields. (is it correct to name this final procedure another aggregate?)
I like to arrange the code by command lines and indentation subrutines. It's an old C programming bad habit:

Code:

SELECT main.word, main.nrbooks, main.freq
FROM
  (
  SELECT [9].field1 as word, count([9].field1) as freq
  FROM [9]
  GROUP BY [9].field1
  )
AS piv
INNER JOIN
  (
  SELECT [0].word, [0].nrbooks, [0].freq
  FROM [0]
  GROUP by [0].word,  [0].nrbooks, [0].freq
  )
AS main
ON piv.word = main.word

It dosen't work, still in progress, but if you could help me with this code I will appreciate to use my terminology,
for example
SELECT main table.word, main table.nrbooks
FROM main table
so that I would understand the code better (I'm a beginner, please have patience

)

**niculaegeorge** · 07-25-2011, 07:04 AM

I've solved my problem.

I figured it out. Not on my on of course, your help had a some contribution.

I tweaked the code, and I come up with this final working complex much desired querry:

Code:

SELECT join.word, Sum(join.nrbooks) AS nrbooks, Sum(join.freq) AS freq, Sum([join]![nrbooks])*Sum([join]![freq]) AS distrib INTO 0n
FROM (select piv.word, piv.nrbooks, piv.freq
   From
     (
     SELECT [9].field1 as word, 1 as nrbooks, count([9].field1) as freq
     FROM [9]
     GROUP by [9].field1
     )
  AS piv
  UNION
  SELECT main.word, main.nrbooks, main.freq
  FROM
     (
     SELECT [0].word, [0].nrbooks, [0].freq
     FROM [0]
     GROUP BY [0].word,  [0].nrbooks, [0].freq
     )
  AS main
  )
  AS join
  GROUP BY join.word;

This is what I wanted.

Basicaly i have
table 1, called piv (created from '9' table-raw data table)
Table 2, called main (just a copy from the existent one called '0')
table 3 which is a union created from these previous 2
and final table 0n (the final aggregate query)

Thanks for your support and quick reply.

Triple nested queries

Thread Tools

Triple nested queries

working on a solution

i'm a begginer, have patience with me

Huray! I did it!!!

Similar Threads

Nested IIf Statement

Triple primary key

Nested data

Triple State checkbox scroll order

Nested Joins

Tags for this Thread

Posting Permissions