Sep 19, 2016 - Among the numerous new charts available on the new Excel 2016 is the Box and Whisker Chart. This chart was originally created by John. Feb 10, 2016 - What to do with Excel 2016's new chart styles: Treemap, Sunburst, and Box & Whisker. These new chart styles make it easier to compare data.
Instructor In Excel 2016, Microsoft introduced a number of new chart types. Among them are Histogram, Pareto, and Box and Whisker. These three are interrelated. On the insert tab in the ribbon, you'll find them amongst these chart icons on the icon that's got four columns, right here.
Insert statistic chart. Now you don't see the word Pareto at first, but you do see Histogram, but just to the right we see Pareto, and below this, Box and Whisker. So they're all found on this icon. You might also encounter these if you use recommended charts. Let's first talk about Histogram. On the screen is some data, prepared in Excel 2013.
Here is an arbitrary list, every ten thousand. I want to calculate how many of these entries here are up to and including 20,000. I can certainly see that one ahead of time. How many are from 20,000 up to 30,000?
This has been done over here in column D, but as I zoom in a bit and double click on one of the formulas here, in column D, it's a frequency formula. It also happens to be an array formula. Now, they're somewhat complex.
I don't want to go into the details. What I'm suggesting is, we could have created, we can create a Histogram in earlier versions of Excel, but it's not very straight forward, and in order to do it, we need to write these frequency functions and use this as the starting point for creating a Histogram. Now, if we simply focus on the data in column A and we're using Excel 2016, we might first of all discover this under recommended charts. What does that look like? Click it, and we see the word Histogram.
There it is, click okay, and we've got a chart. Now, not everything about this chart might be ideal. I'm going to zoom back a little bit so we can see just a portion of the data and the chart. You'll notice down below, the numbers here. These don't quite fit the way we might want them to. Now zoom back even farther so we can make the chart taller.
That might help, and it does to some extent, but now we see some unusual gaps here. Wouldn't it make sense if each column represented a range of about 10,000? So let's double click one of the numbers below. That activates the format access dialog box on the right.
Now, see the option called Bin width. It says 11,000. That's close to what I might want. I'm going to put in 10,000 and enter. Watch the chart change. Every cluster now represents a 10,000 range, but the numbers begin somewhat strangely and they don't match up with 20,000 30,000 et cetera.
I think most people would want that. And by the way, when you make these wider, sometimes you'll see that the data's slanted. That's easier to read, but there's still something a little bit off about this.
This begins at 19,890. That happens to be the very first entry. If we change that to 20,000 and right away you might be thinking, and I would agree, we don't want to be manipulating the data to change the chart, but my point is, in order to make this be more readable, you might want to do that. I'm going to copy this data up here, I might need it later, but then I'm going to change this to be 20,000. And then as I press enter, watch the chart. Those bin ranges seem to be sensible. So, maybe this is a flaw, or maybe it's just something you can work around, but do remember, when you double click those numbers below the chart, you can, off to the right here, and you might have to come back and click this Axis Options here, but using Axis Options, you can adjust the width of the bin range, or possibly another approach would be to change the number of bins.
So it might be something to experiment with. Now, the next sheet over is called Pareto.
This is based on a concept sometimes called the 80-20 rule. When we're looking at data, we might want to show as we look at the data for example in columns A and B. In looking at the sales here, it may look as if only a few of the states are accounting for most of the sales. So let's represent this in a chart. In order to create a chart before Excel 2016, you needed to come up with this calculation like in column C here. So we have to do that up and down the column for this to make sense because we want to see this being represented in the chart as well.
So if you have gone to the work of doing that and you have the data sorted in descending order, we can click within the data here, go to the insert tab. We might be talking about the idea of working with the data before we come to Excel 2016. We've got a choice here. Under Clustered Column, it looks pretty good. So we could possibly be using that.
![Excel Excel](/uploads/1/2/5/5/125511054/274719523.png)
I'm going to change the size of this and leave it around. But starting in Excel 2016, we need only record this data. This list happens to be alphabetized, which we might not want to keep, but because it is surrounded by empty cells, we don't even need to highlight it. So click within it, insert.
If we know where to find it this way, by way of the icon for Histogram, there's Pareto, there's our chart. Notice that the data isn't sorted and yet we still get the kind of chart we need here. Let me close the dialog box to the right. And even if we were to sort this in ascending order or descending order, watch what happens. I'm going to sort this data in ascending order in Column G, chart doesn't change at all, descending order, chart doesn't change at all, or make it alphabetical. With the other data over here, in other words this where we're working outside of that chart type, I'll sort it this way. The chart looks inappropriate, so I'll undo that.
So again the basic idea is to show that in some situations, just a few of the entries are comprising a larger percentage of the total here, and that all comes out, if you were doing it manually by the formulas that are here, over in Column C. Now, Box and Whisker is a term you might not have heard about. This gives us some information based on the data that we might be collecting, for example, over in columns A, B, and C.
We've got some transaction numbers, data for different states here. Looks like it's in no particular order, doesn't need to be either. We've got sales entries here, and I've got some statistics over here, and we don't necessarily need these, but once we see the chart here, it's going to make some sense to help us read what the chart really means. So this data goes down, as I double click the bottom edge of this cell, goes down to row 601. So let's click within the data, insert. We might go to recommended charts first, but no real options there that make sense. Here's the icon again, Histogram, Box and Whisker, and there it is.
Now, we don't really need to make these line up with the data, but why not. Example here.
So what is this telling us? We don't see icons on the screen. I'm going to point to the top of the bar here that says 95,957. Well, you can see up above, that's based on analysis of the data.
That's the highest value found for the California entries. I've indicated over in column J the max here, the function called max, which I've used manually here. I'm saying, what's the highest value found for California? 95,597, it corresponds with the top line that we see here in each of these cases. So as we just glance at the chart itself, what can we say about the boxes, the size of them, and so on. It looks as if the range of sales within California is greater than it is in New York.
That's one thing we can look at. We could possibly have shown the standard deviation, but what else are we seeing on the chart? The low entry down below, the low horizontal bar is referred to as the bottom line over in column J, represents the minimum value, and we see what that is in each case. The X that we see within here, now that's a bit tricky to see. What I might suggest is, double click one of the boxes. That activates the dialog box on the right.
Go to the field color bucket and perhaps change the current color to be a lighter color. Maybe this will work a bit better. There we are, close that dialog box. The Xes that we see there represent, and we can see over in column J, I've indicated that that's the average in each case.
So the average California entry is just above 60,000 and the average Texas entry is 54,000 and so on. Now, the horizontal line that you see in each case, notice that it's not bisecting the box exactly. In fact, in Florida, it's particularly not in the middle. The horizontal line that we've seen represents the median. So for example, looking at California here, as we point to the line too, we see the value 62,513. I've confirmed that up above in cell F3, where we're simply taking the median, but just for the California data. So although this isn't truly needed, I would suggest that many times, if you were to display this chart, you might want to line it up the way I've done it here.
This confirms some of the numbers that we see. Now, the box itself represents data from the second and third quartile of the data. The first quarter is below that, fourth quarter is above it.
The size of the box itself represents percentiles, we're talking about data from the 25th through the 75th percentile, is represented in the box. Everything above that, above the 75th percentile is above the box.
Everything in the lower 25 percent is below the box in each case. I didn't use the actual statistics above, but that does explain the size of the box.
So it gives us a picture, it gives us some sense of what's going on with this data. I wouldn't necessarily say it's that dramatic, using these four states, but certainly some information pops out here, and if you do have the statistical background, I think you can see how valuable this chart is in giving us a broad picture of what's going on over in columns A, B, and C for those 600 rows. So, we've seen three different chart types.
We've just seen Box and Whisker, and prior to that we saw Pareto, and prior to that we saw Histogram. These chart types only available starting in Excel 2016.