Adding an Average Line to a Chart Using Defined Names

Tuesday, September 4, 2007

Please Note: This technique appears not to work in Excel 2007 (SP 1). For some reason, not explained by Microsoft as far as I can tell, some (but not all) named ranges are no longer allowed in the Series() function. This appears to be yet another problem with the new charting engine. So, this post only applies to Excel 2003 and earlier versions.

I was perusing Yahoo! Answers looking for interesting questions and I ran across one that asked how you can add an line that represents the average of a data series to a chart of that series. The most obvious way to do this would be to use a helper column with the Average() function in each cell. It would be the same value all the way down the column, and then you could just add that series to the line chart. Of course, the problem is that you have to remember to copy the Average() function down every time you add a new data item. There has to be a better way.

Of course there is, and this post will show my solution. It involves using defined names (AKA range names). Defined names are a powerful tool, not because they allow you to give a name to a static range (or cell) on your worksheet, but because defined names are really formulas. In Excel 2003, go to Insert—> Name—> Define to access the Define Name dialog box (in Excel 2007 Define Name is located on the Formulas ribbon). Notice the box at the bottom that is labeled “Refers to:” In that box, the default entry is something like =Sheet1!$A$1 (it will be the address of the active cell or range). It is the equals sign that is the giveaway. These defined names are actually formulas.

The first thing that we want to do is to create a defined name that references our original data, and that will also expand as we add new data. In other words, we want to create a dynamic range name. To do this, we need to create a reference using the Offset() function and the Count() function. The goal of these two functions will be to find the last data point in the column. The Count() function will tell us how many numbers are in the column, and the Offset function will create a reference to that last data point. In my sample worksheet, I have “Random Data” as the label in A1, and then 23 numbers below that. So, my data is going to be in column A. Now, go to Insert—> Name—> Define and then enter the name ChartData at the top. Now, click in the “Refers to:” box at the bottom and enter the following formula:

=Sheet1!$A$2:Offset($A$2,Count(Sheet1!$A:$A)-1,0)

That will create a dynamic reference from A2 to the last data point. Make sure that you use absolute references (the dollar signs) in the formula. Note that I used Count(A:A) to get a count of the numbers in column A and then subtract 1. I am subtracting 1 because otherwise the Offset function would reference one row after the data ends.

Now, create a line chart of the original data (I just generated some random numbers for my sample worksheet. Click on the line for the data series and then look in the formula bar. You should see the Series() function that Excel uses to define the chart. The formula will look something like this:

=SERIES

The first reference is the name of the data series, and the second is the range that holds the data. We want to replace the data range with our dynamic range name so that the formula looks like this:

=SERIES

You can do that directly in the formula bar by editing the Series() function. We now have a dynamic chart. If you add new data to the series, the ChartData dynamic range will automatically expand and the chart will expand to include the new data as well.

We are now halfway done. At this point, we want to add another series to the chart that represents the average of all of the data points. It will be a straight line all of the way across the chart, and it will move up and down as the average changes because of the new data that will be added.

Again, we will use a defined name. However, in this case we need to create an array function because we need to show the average at each data point. In other words, the new series will have exactly as many data points as the original data series, but each point will be the average. To do this, I need to create an array such as {0.49;0.49;0.49;0.49}. So, I need the average to repeat the same number of times as there are data points. There are probably other ways to do this, but here is what I used:

=IsNumber(ChartData)*Average(ChartData)

Create a defined name called Avg and set it to that formula. Note that the IsNumber() function evaluates each data point in the original series to see if it is a number or not. If so, then it returns TRUE. I know that every data point is a number, so I’m using this to generate an array of TRUE with the same number of values as the original data. Now, when I multiply that array of Trues by the average, Excel will convert TRUE to its numeric representation (which is a 1), and create an array that contains the average repeated N times (such as the example above).

At this point, all that remains is to add the Avg series to the chart. To do that, just right click in the chart and choose Source Data. Go to the Series tab and click the Add button. In the Values box, type:

=Sheet1!Avg

and the click the OK button. You should now have a horizontal line representing the average in your chart. The picture below shows an example:

An example workbook is available.

Posted by .(JavaScript must be enabled to view this email address) on 09/04 at 01:54 AM

Permalink Tell-a-Friend

Comments

How do you get this to work in excel 2007?


Posted by Mike Smith on 02/04 at 08:50 PM | #

Very interesting question, Mike. I hadn’t tried it in Excel 2007 until now, and I find that it doesn’t work. At least, I haven’t been able to get it to work right away. It seems that Excel 2007 doesn’t like defined names in the Series() function for some reason. Interestingly, the charts created in Excel 2003 still work perfectly if you open the file in Excel 2007.

BTW, I just realized the links to the example workbook weren’t working (new blogging platform), so I fixed those. Download a copy if you are interested in seeing it work.

I’ll look into this and post back if I figure out the problem. Thanks for letting me know.


Posted by admin on 02/04 at 09:50 PM | #

Another proof that Office 2007 should be the step forward but as far as I see - it’s backward


Posted by VLadimir on 03/10 at 07:36 AM | #

Hi Mike,
If you do find a way of putting it together in Excel 2007, please let us know or post something about it….
I tried achieving it in 2007 version with no luck so far.

_________________________
http://www.microsoft-office-excel.com


Posted by Jack Gopher on 04/16 at 09:39 AM | #

Jack,

Still no joy on this one. However, it does seem that at least some defined names can work in the Series() function in Excel 2007.

Jon Peltier recently had a post (for Excel 2003) titled Dynamic Ranges to Find and Plot Desired Columns that worked for me just perfectly in Excel 2007. The formulas in those defined names are different than the one presented here, but they work. I’m still confused as to what the problem is.


Posted by Tim on 04/16 at 01:07 PM | #

How might I set this up using rows vs. columns of data?

Also, I need the “average” number to be linked to data in a single cell. That is, my “average” is actually an array formula for taking the average excluding zeroes. I have that formula figured out… I just need to know how to plot that single cell’s worth of data as a horizontal line on my graph. The number, of course, will change as I add more data, so I need the line on my graph to change, as well.

Thanks!


Posted by Alea on 05/01 at 03:21 PM | #

Alea,

If you want to link the average line to the value that is calculated in a single cell, that doesn’t change the formula for the defined name much at all. Also, having your data in rows instead of columns mainly just changes the Offset() function in the ChartData name. The Offset() function is defined as:

Offset(Reference, Rows, Columns, Height, Width)

where Reference is the starting cell from which you will offset. Height and Width are optional and we don’t need them. So, your ChartData name using data in row 1 might look like this:

=Sheet1!$B$1:Offset(Sheet1!$B$1, 0, Count($1:$1)-1)

So, just move the Count() function to the Column argument instead of the Row() argument.

Your formula for the Avg name would be:

=IsNumber(ChartData)*$A$5

assuming that you are calculating the average in cell A5. You can use any formula that suits your needs to calculate the average in A5.


Posted by Tim on 05/02 at 12:53 AM | #

You can make this work in Excel 2007 if you avoid names that start with the word “Chart”. For example, choose names so that the series formula changes from

=SERIES(Sheet1!$A$1,,Sheet1!ChartData,1)

to

=SERIES(Sheet1!$A$1,,Sheet1!ChtData,1)


Posted by Jon Peltier on 06/05 at 11:46 AM | #

On the Excel 2007 SP1 page (http://support.microsoft.com/kb/936982)

In the “List of issues that the service pack fixes”
on row 124:
The Chart Data Series value field does not retain the value set by using a reference to a Named range on another sheet. The workbook name is changed to “[0].” No errors are displayed. The chart simply is not updated when you add new data.

However, I am still experiencing this behavior even is my names do not contain “chart”.

Any ideas?
Do you think this is still a live issue?

Thanks again Tim & Jon!


Posted by Red J on 08/14 at 02:17 PM | #

The information on Microsoft’s page “Excel 97-2003 features that are not supported in Office Excel 2007” (http://office.microsoft.com/en-us/excel/CH100648071033.aspx) might explain why it’s not working in 2007. It may have to do with named ranges, or it may be related to the full column references.
Pat

(gleaned from the aforementioned page)
Full row or column references
97-03
Full row and full column references that are used in a workbook can include data in cells that are located within the row and column limit of Excel 97-2003.   

2007
Full row and full column references automatically take into account the cells in the larger grid size of Excel 2007. This means that the reference =A:A, which refers to cells A1:A65536 in earlier versions of Excel, refers to cells A1:A1048576 in the Excel 2007 file format…

Defined names in formulas
97-03
Specific names that use a combination of letters and numbers (such as USA1, FOO100, and MGR4) can be defined and used in formulas in Excel 97-2003 because they do not conflict with cell references.

2007
With a new limit of 16,384 columns, the columns in Excel 2007 extend to column XFD. This means that specific names (such as USA1, FOO100, and MGR4) that you could define in earlier versions of Excel will conflict with valid cell references in Excel 2007…


Posted by Pat Kujawa on 02/02 at 10:59 AM | #

Pat, thanks for the reference. I need to revisit this as soon as I get a chance. Jon was exactly right about names starting with “Chart,” but there are also other hinky things going on with defined names in Excel 2007.


Posted by Tim on 02/02 at 11:13 AM | #

Tim,
Have you looked at moving averages with this method?  For example, if I wanted to add new data over time, but only plot the average of the most recent X number of data points. 

Thanks,
Russ


Posted by Russ Wood on 02/20 at 02:14 PM | #

Russ,

No, I haven’t tried it but there is no reason to think that it wouldn’t work. You just have to define the name so that it has a dynamic beginning point in addition to the dynamic end point.

For example, you can change the ChartData name to:

=OFFSET(‘Chart with Average’!$A$2,COUNT(‘Chart with Average’!$A:$A)-11,0):OFFSET(‘Chart with Average’!$A$2,COUNT(‘Chart with Average’!$A:$A)-1,0)

That will plot only the last 10 data points. That isn’t exactly what you want, but the technique is the same.


Posted by Tim on 02/20 at 04:41 PM | #

Thanks.  This helps and gives me another idea.


Posted by Russ Wood on 02/23 at 06:58 AM | #

Commenting is not available in this channel entry.