The following query returns strange results for me:
SELECT
`Statistics`.`StatisticID`,
COUNT(`Votes`.`StatisticID`) AS `Score`,
COUNT(`Views`.`StatisticID`) AS `Views`,
COUNT(`Comments`.`StatisticID`) AS `Comments`
FROM `Statistics`
LEFT JOIN `Votes` ON `Votes`.`StatisticID` = `Statistics`.`StatisticID`
LEFT JOIN `Views` ON `Views`.`StatisticID` = `Statistics`.`StatisticID`
LEFT JOIN `Comments` ON `Comments`.`StatisticID` = `Statistics`.`StatisticID`
GROUP BY `Statistics`.`StatisticID`
LIMIT 0, 10
I'm querying this on a table structure like the following:
(only data relevant to Statistics.StatisticID = 8
)
Votes
StatisticID
8
Views
StatisticID
8
8
Comments
StatisticID
8
8
8
8
8
Now if I run this query I get the following result set:
StatisticID Score Views Comments
8 5 5 5
I knwo where the 5 comes from - the number of Comments - and this works if I take the comments statement out. Can anyone debug this as this is out of my reach (I'm relatively new with SQL).
Thanks, Ross
-
When joining like this, you will duplicate the data as many times as you find mathing rows in the other tables. This is fine, if you only have 1 corresponding row in each table.
Run this query without the grouping, and you will get an idea on why you get the same result on all counts. I would however have guessed that you'll get 10 as the count for every field (1*2*5) If you want to solve this, you need to call a subselect for every count.
SELECT s.`StatisticID`, (SELECT COUNT(*) FROM Votes WHERE Votes.StatisticID = s.StatisticID) AS Score, (SELECT COUNT(*) FROM Views WHERE Views.StatisticID = s.StatisticID) AS Views, (SELECT COUNT(*) FROM Comments WHERE Comments.StatisticID = s.StatisticID) AS Comments, FROM `Statistics` s LIMIT 0, 10
There are certain performance-issues with this if the outher result is big. You might optimize it a little by joining one of the tables however I'm uncertain if the queryparser would be smart enough to only run 1 single time for each grouped item. Hopefully it would. Otherwise you could split it into different queries.
Kieveli : I wouldn't worry about performance issues - this sub-query solution will execute perfectly, and not perform any spurious actions!jishi : I was more thinking if the outer result would have like 1000 rows, then it would need to run 3000 sub-queries aswell. In that case it would probably be faster to join one table at the time, and call the query 3 times.benlumley : agreed on the subquery thing - if there are a lot of outer results. But we do have limit 0, 10 in it at the moment i suppose. Also, you've selected from votes twice, third suybquery should be from commentsjishi : Hm, I did change that, somehow my edit was reverted, or it was a glitch in the update.Ross : Thanks, I wouldn't have been aware of the perf issues if you hadn't pointed that out to me :) -
Assuming you have an id field or similar on the votes/views/comments:
SELECT `Statistics`.`StatisticID`, COUNT(DISTINCT `Votes`.`VoteID`) AS `Score`, COUNT(DISTINCT `Views`.`ViewID`) AS `Views`, COUNT(DISTINCT `Comments`.`CommentID`) AS `Comments` FROM `Statistics` LEFT JOIN `Votes` ON `Votes`.`StatisticID` = `Statistics`.`StatisticID` LEFT JOIN `Views` ON `Views`.`StatisticID` = `Statistics`.`StatisticID` LEFT JOIN `Comments` ON `Comments`.`StatisticID` = `Statistics`.`StatisticID` GROUP BY `Statistics`.`StatisticID` LIMIT 0, 10
Not tested it, but think it should work. (We have to use the different field because the statisticID will always be the same within a given group ...)
jishi : Actually, this is a better solution. Didn't know that you could use DISTINCT within a count, just tested it and it works.
0 comments:
Post a Comment