Friday, April 8, 2011

Having difficulty combining JET SQL queries

Warning: Here be beginner SQL! Be gentle...

I have two queries that independently give me what I want from the relevant tables in a reasonably timely fashion, but when I try to combine the two in a (fugly) union, things quickly fall to bits and the query either gives me duplicate records, takes an inordinately long time to run, or refuses to run at all quoting various syntax errors at me.

Note: I had to create a 'dummy' table (tblAllDates) with a single field containing dates from 1 Jan 2008 as I need the query to return a single record from each day, and there are days in both tables that have no data. This is the only way I could figure to do this, no doubt there is a smarter way...

Here are the queries:

SELECT tblAllDates.date, SUM(tblvolumedata.STT)
FROM tblvolumedata RIGHT JOIN tblAllDates ON tblvolumedata.date=tblAllDates.date
GROUP BY tblAllDates.date;

SELECT tblAllDates.date, SUM(NZ(tblTimesheetData.batching)+NZ(tblTimesheetData.categorisation)+NZ(tblTimesheetData.CDT)+NZ(tblTimesheetData.CSI)+NZ(tblTimesheetData.destruction)+NZ(tblTimesheetData.extraction)+NZ(tblTimesheetData.indexing)+NZ(tblTimesheetData.mail)+NZ(tblTimesheetData.newlodgement)+NZ(tblTimesheetData.recordedDeliveries)+NZ(tblTimesheetData.retrieval)+NZ(tblTimesheetData.scanning)) AS VA
FROM tblTimesheetData RIGHT JOIN tblAllDates ON tblTimesheetData.date=tblAllDates.date
GROUP BY tblAllDates.date;
The best result I have managed is the following:
SELECT tblAllDates.date, 0 AS STT, SUM(NZ(tblTimesheetData.batching)+NZ(tblTimesheetData.categorisation)+NZ(tblTimesheetData.CDT)+NZ(tblTimesheetData.CSI)+NZ(tblTimesheetData.destruction)+NZ(tblTimesheetData.extraction)+NZ(tblTimesheetData.indexing)+NZ(tblTimesheetData.mail)+NZ(tblTimesheetData.newlodgement)+NZ(tblTimesheetData.recordedDeliveries)+NZ(tblTimesheetData.retrieval)+NZ(tblTimesheetData.scanning)) AS VA
FROM tblTimesheetData RIGHT JOIN tblAllDates ON tblTimesheetData.date=tblAllDates.date
GROUP BY tblAllDates.date
UNION SELECT tblAllDates.date, SUM(tblvolumedata.STT) AS STT, 0  AS VA
FROM tblvolumedata RIGHT JOIN tblAllDates ON tblvolumedata.date=tblAllDates.date
GROUP BY tblAllDates.date;

This gives me the VA and STT data I want, but in two records where I have data from both in a single day, like this:

date      STT   VA
28/07/2008  0  54020
28/07/2008  33812 0
29/07/2008  0  53890
29/07/2008  33289 0
30/07/2008  0  51780
30/07/2008  30456 0
31/07/2008  0  52790
31/07/2008  31305 0

What I'm after is the STT and VA data in single row per day. How might this be achieved, and how far am I away from a query that could be considered optimal? (don't laugh, I only seek to learn!)

From stackoverflow
  • The table of dates is the best way.

    Combine the joins in there FROM clause. Something like this....

    SELECT d.date,
      a.value,
      b.value
    FROM tableOfDates d
      RIGHT JOIN firstTable a
        ON d.date = a.date
      RIGHT JOIN secondTable b
        ON d.date = b.date
    
    Lunatik : Thanks, this works with a simple value (e.g. "a.scanning"), but returns "You tried to execute a query that does not include the specified expression 'date' as part of an aggregate function" whenever I try to use one of the SUM expressions.
    Lunatik : Sorry, should have mentioned that this works only when I change both joins to LEFT JOIN, which I don't think will give me the result I need. Using RIGHT JOIN gives a "Join expression not supported" error. Have I run into a JET SQL limitation?
  • Turn the SQL into views and join them on the dates.

    Lunatik : Thanks, but it seems that CREATE VIEW can only be accessed via ADO, which might help when I go to stick a front end on the database, but for now I'm just using Access's inbuilt SQL window.
  • You could put all of that into one query like so

    SELECT 
    dates.date, 
    SUM(volume.STT) AS STT,
    SUM(NZ(timesheet.batching)+NZ(timesheet.categorisation)+NZ(timesheet.CDT)+NZ(timesheet.CSI)+NZ(timesheet.destruction)+NZ(timesheet.extraction)+NZ(timesheet.indexing)+NZ(timesheet.mail)+NZ(timesheet.newlodgement)+NZ(timesheet.recordedDeliveries)+NZ(timesheet.retrieval)+NZ(timesheet.scanning)) AS VA
    FROM 
    tblAllDates dates 
    LEFT JOIN tblvolumedata volume
    ON dates.date = volume.date
    LEFT JOIN tblTimesheetData timesheet
    ON 
    dates.date timesheet.date
    GROUP BY dates.date;
    

    I've put the dates table first in the FROM clause and then LEFT JOINed the two other tables.

    The jet database can be funny with more than one join in a query, so you may need to wrap one of the joins in parentheses (I believe this is referred to as Bill's SQL!) - I would recommend LEFT JOINing the tables in the query builder and then taking the SQL code view and modifying that to add in the SUMs, GROUP BY, etc.

    EDIT:

    Ensure that the date field in each table is indexed as you're joining each table on this field.

    EDIT 2:

    How about this -

    SELECT date, 
    Sum(STT), 
    Sum(VA)
    FROM 
    (SELECT dates.date, 0 AS STT, SUM(NZ(tblTimesheetData.batching)+NZ(tblTimesheetData.categorisation)+NZ(tblTimesheetData.CDT)+NZ(tblTimesheetData.CSI)+NZ(tblTimesheetData.destruction)+NZ(tblTimesheetData.extraction)+NZ(tblTimesheetData.indexing)+NZ(tblTimesheetData.mail)+NZ(tblTimesheetData.newlodgement)+NZ(tblTimesheetData.recordedDeliveries)+NZ(tblTimesheetData.retrieval)+NZ(tblTimesheetData.scanning)) AS VA
    FROM tblTimesheetData RIGHT JOIN dates ON tblTimesheetData.date=dates.date
    GROUP BY dates.date
    UNION SELECT dates.date, SUM(tblvolumedata.STT) AS STT, 0  AS VA
    FROM tblvolumedata RIGHT JOIN dates ON tblvolumedata.date=dates.date
    GROUP BY dates.date
    )
    GROUP BY date;
    

    Interestingly, When I ran my first statement against some test data, the figures for STT and VA had all been multiplied by 4, compared to the second statement. Very strange behaviour and certainly not what I expected.

    Lunatik : Thanks, a bit of parentheses got this going. However, I think I had concocted something similar to this before and this query is one of the ones that seems to bring Access to it's knees.
    Russ Cam : Have you indexed the date field on each of the 3 tables?
    Lunatik : I hadn't, but have now and unfortunately it isn't speeding things up, I'm still having to CTRL+BREAK out of the query
    Russ Cam : How many dates are you running it for? Have you tried running it and restricting to a date range and seeing how long that takes?
    Russ Cam : I would estimate that the longest amount of time is being spent aggregating values in each table and performing the NZ function on each value.
    Lunatik : I need all the dates as the data will used elsewhere. The separate queries only take a second, and the 'nearly' query I have only takes a few seconds for all records. Even restricting the date range to a few days takes >10 seconds with this query. Might have to find another way to get this data!
    Lunatik : That's Numberwang! Brilliant, this gives the right result in only a few seconds. Cheers for all your help.
    David-W-Fenton : In regard to the commment about "Bill's SQL," I've never understood why people want to write their SQL by hand. Use the QBE and drag and drop to create your joins. Access will take care of writing the joins with appropriate parentheses -- that's what the QBE grid is for!
    Lunatik : Hand-cranking SQL makes you feel more of a man ;)

0 comments:

Post a Comment