1. First get your counts, something like
select some_key,count(*) from table1 group by some_key;
2. and stick these values into an intermediate table (yes, not required but easier to explain)
create table kount as select some_key,count(*) as value from table1 group by some_key;
3. now decide what kinds of "buckets" you want for your histogram. Be sure that you cover your min and max values from the previous count query. So in my case I'll make buckets that represent orders of magnitude. You create a table with these buckets and then populate a min and max value for each bucket thusly.
create table bins (min_value int, max_value int);
insert bins values(0,10),(10,100),(100,1000),(1000,10000),(10000,100000);
Our resultant buckets look like this:
+-----------+-----------+
| min_value | max_value |
+-----------+-----------+
| 0 | 10 |
| 10 | 100 |
| 100 | 1000 |
| 1000 | 10000 |
| 10000 | 100000 |
+-----------+-----------+
4. Now we do a join and get the counts for each bucket.
select b.min_value,b.max_value,count(*) from bins b
left outer join kounts a on a.value between b.min_value and b.max_value
group by b.min_value
order by b.min_value
And our histogram values look like the following and can be put into your favorite charting product and made into a pretty graph.
+-----------+-----------+----------+
| min_value | max_value | count(*) |
+-----------+-----------+----------+
| 0 | 10 | 3189118 |
| 10 | 100 | 239142 |
| 100 | 1000 | 9004 |
| 1000 | 10000 | 208 |
| 10000 | 100000 | 4 |
+-----------+-----------+----------+
I hope this is useful to some new sql user. Enjoy.
No comments:
Post a Comment