Office Labs Chart Advisor for Excel

Monday, August 25, 2008

Microsoft’s Office Labs has released an interesting new add-in for Excel 2007 (only!) called Chart Advisor. The purpose of this add-in is to help you to decide on the best chart type to represent your data. It is supposed to be a kind of charting expert system. I suspect that Jon Peltier will rip it apart, and that should be fun. Still, it seems to be an interesting idea so I thought I would provide an overview here. Note: I have updated this post at the bottom to reflect a recent (28 August 2008) update to the add-in.

The general concept of Chart Advisor is that you select some data in the worksheet, click the Chart Advisor button on the Insert tab, and then it analyzes the data and provides a range of potential charts sorted by desirability. To test it out I tried to create a chart of student grades across three exams (student names followed by three columns of exam grades). My thought was that a clustered column chart would be most appropriate. Instead, the add-in suggested an XY Scatter chart as its first choice. That is weird, especially since it ignored the text in the first column (student names). Something like this should recognize text in the first column and suggest a chart type with a category axis. Here is a screenshot of the Chart Advisor in action (click for a larger image).

chart-advisor-1

Chart Scoring

The XY Scatter chart was given a "Recommended Chart Score" of 63%. The second favorite was a clustered bar chart (score 55%), which should have been the first choice. This chart scoring is interesting, particularly because it will tell you why the chart scored as it did when you hover the mouse over the choice at the top. For example, here is the breakdown for the XY chart (the new version, see bottom of post, now shows many more rules and points than this graphic):

chart-advisor-score

Notice that it shows how the score was determined, and the points associated with each rule. That total is then multiplied by a "popularity" score. I imagine (not really sure, though) that Microsoft has some kind of research results that show how often each type of chart is used, and that is how the popularity multiplier is determined. Very interesting, but I think that the rules seem somewhat crude. Of course, assumptions have to be made in order to have a mathematically tractable model of the chart creation process. I’m sure these will be tweaked over time if this tool becomes popular.

My original choice of a clustered column chart got a score of only 22%, mainly because the category descriptions are more than 16 characters in length. That earned it a -15 on that rule, whereas the clustered bar chart got a +15. That makes sense and I can understand the logic. The difference in the two scores (55 vs 22) is due to the 30-point difference in that rule multiplied by the 1.10 popularity score (33 point difference = 30 × 1.10).

Other Charts

If you don’t approve of the top-ranked chart, you can easily explore the others. Just move the mouse over the other choices at the top of the dialog box and the pictured chart will appear. There are more choices available in the gallery, which can be expanded by clicking the scroll bar at the right.

Once you find one that you like you select it by clicking the chart in the gallery, and then you can further refine your chart. For example, in the Modify Chart pane you can turn individual data series on or off. However, there is no way to change any of the chart formatting. For example, you cannot turn off the "eye candy" (3D and shadow effects), or change the colors for any of the series. Still, these things can be modified after the chart is inserted into the worksheet by pressing the Insert Chart button.

Chart Data Filtering

One of the most interesting features of the Chart Advisor is its ability to filter the chart data (much like you can in a Pivot Table). You can filter based on any of the data series (even the category labels) that are in the chart, and you can apply multiple filters. For example, I could filter based on student names and/or any of the exam scores. Note that the add-in also ignores hidden columns, so they don’t have to be manually disabled.

Interestingly, when you insert the filtered chart into the worksheet it becomes de-linked from the data in the worksheet. That is, it doesn’t reference the original data. Since the actual data (not ranges) is stored within the chart, changing the original data does not affect the chart. I don’t like this, but I suppose that there was no way around it.

Unfortunately, the de-linking happens even if you haven’t applied any filters to the data. That shouldn’t happen and needs to be fixed in order for the Chart Advisor to be truly useful. I can understand de-linking with filtered data, but not if you are charting the entire data set.

My Overall Impressions

I really like the concept of the Chart Advisor, and I think that a lot of Excel users will find it to be useful in helping them to select an appropriate chart type. I see a lot of people using incorrect chart types (especially using Line charts when an XY Scatter would be much more appropriate) and this tool should help with that problem. I also like how easy it is to use. It is a little bit like the Chart Wizard from previous versions of Excel. Finally, the ability to filter the data and include or exclude a data series is very cool.

However, I dislike the fact that the charts that are created are not linked to the data in the worksheet. You can get around this problem by re-linking the data after you insert the chart. Just right-click the chart and choose Select Data, and then edit the data ranges for each series.

Overall, I would say that this is a good first attempt, but it could use some additional work. It is completely free, and you don’t even need to provide any personal data. All that you need is a copy of Excel 2007, and this add-in is certainly worth a download if you have that.

If you don’t like it, you can always uninstall it through the Add-ins manager in Excel Options. Note that it is a COM add-in, so you will have to select COM Add-ins in the manage list before getting to the dialog box that allows you to disable or remove it.

Updated Version Now Available

Office Labs has released an updated version of the Chart Advisor (the link is the same as before). This update has fixed problems that people were having with installation. However, they also have changed several other things. The “eye-candy” in the charts seems to have been turned off or, at least, toned down. Also, I originally lost the ability to see the reasoning behind the chart scoring (as shown above), but now it is back and more of the rules and points are now exposed. The chart scores that I am now getting for this same data set are now much higher than before, but I don’t know why. Also, it seems to ignore more of the data by default. For example, it is now ignoring my Exam 3 data in the XY Scatter chart (still top-ranked), and it ignores the exam scores for Exam 2 and 3 in the “clustered” column chart. So, the clustered column chart isn’t really clustered since it left out two of my three numeric series. Finally, there is now a brief message that notes that this tool is a prototype and it isn’t tuned for performance. I’m not sure, but I think that I liked the original version better.

Posted by .(JavaScript must be enabled to view this email address) on 08/25 at 02:59 PM

Permalink Tell-a-Friend

Comments

Tim -

Thanks for the heads up. I downloaded and tried installing the new utility, but ran into issues with PIAs. This indicates to me that the utility was not written in anything as simple and reliable as VBA, but in some .Net language that requires these PIAs in its deployment.

I’m hoping to get a second chance to test it, but for now my review is incomplete:

<a >Microsoft Chart Advisor</a>


Posted by Jon Peltier on 08/25 at 08:14 PM | #

Jon,

You may not even want to post an empty link after you try it. I think you’ll be very frustrated by some of its suggestions. The third-ranked chart for my data was a bubble chart? What the…

That said, you are an expert. I suspect that an awful lot of Excel users will find it very useful.

Here is a link to Jon’s post showing the error that he got.


Posted by Tim on 08/25 at 09:17 PM | #

With any luck, they will read comments of the first sets of users and make adjustments.

What’s the popularity multiplier? Does it give you a factor of 3.14159 for pi charts? Seriously, is it based on how many people use a particular chart type, irrespective of the effectiveness of the chart type?


Posted by Jon Peltier on 08/26 at 09:18 AM | #

I don’t know much about the popularity score, but it appears to be some kind of frequency of usage index with 1.00 being average. For example, the bubble chart has a popularity score of 0.85.

I swear I saw somewhere on their site that they had data from Microsoft Research on chart type usage. I can’t find that exact statement anymore, but in The Origins of Chart Advisor they say:

“We’ve looked at hundreds of spreadsheets, but given the vast capabilities of Excel, we’ve only scratched the surface of how people organize their data.  Our rules are XML based to make them easier to update – so this project can evolve over time and get even better.  We’ve put out version ONE to get your opinion.  So the question is, how are we doing?”


Posted by Tim on 08/26 at 10:59 AM | #

Hi Tim.  Thanks for all the feedback—and yes, we are listening! 

Our goal in Office Labs is to quickly turn ideas into code in order to prove (or disprove) the idea.  I hope Chart Advisor is “good enough” that you can imagine it in its final state, but give us feedback far earlier.

The tool tips aren’t gone.  However, we do see an intermittent problem where they just don’t always show up.  We’re looking into it. In the meantime, try it again on the same dataset, or another.

We’ve wrestled over the popularity multiplier. The fact is that we’ve often received feedback that people expect a certain chart at the top of the list, even if the data seems to fit another chart a little better.  This is just a quick way for us to tweak individual charts without going through and re-applying a new weight to every rule.

(I hit the max length for posts on your site.  To be continued in the next message)


Dave Lee
Chart Advisor from Office Labs
Developer


Posted by Dave Lee on 08/28 at 08:53 PM | #

(continued from previous post)

Regarding de-linking of the charts—you may have noticed that we create pivot tables behind the scenes, in order to summarize your data before creating a chart (we also do this for filtering, and to clean up header rows).  In this case and several others, we create intermediate data.  Because we don’t want to introduce any privacy issues, we chose to always delete this intermediate data when the form closes, causing the de-linked charts.  Should this idea make its way into the product, the architecture would be different and I am certain that the linking would be preserved.

Scoring rules have been tremendously interesting and challenging to work on. This is really the heart of our prototype, and what we are most interested in.  Is it possible to apply a generic set of rules to essentially free-from entry of data and get good recommendations at least most of the time?  I believe it is, but we aren’t there yet smile

Thanks again for all your feedback, and don’t hold back the punches.  We want it all.

Dave Lee
Chart Advisor from Office Labs
Developer


Posted by Dave Lee on 08/28 at 08:58 PM | #

Dave, thanks for posting. I hope that I didn’t come across as too harsh because I really think the Chart Advisor is a great idea (and I said so several times). I realize it is a prototype, and that’s why I posted some criticisms. I hope that they will improve the product.

As for the popularity multiplier, I think that was a great idea. The charts that are most commonly used probably should receive a bump. I think it is the rules themselves that need tweaking. Also, the Chart Advisor should default to using the entire selected data set. As it is, it ignores several of my series and Excel has never before worked that way. Maybe you should change from the automatically using the entire range around the active cell, to only using the selected range(s)?

I’d be very interested in knowing more about the rules. Are you really only using three, or do the three shown just have the most impact on the choice? It would be great if you would reveal the rules so that we could help.

Thanks again. I’m looking forward to further updates!


Posted by Tim on 08/28 at 09:39 PM | #

I noticed a few things about how the data was parsed. It is at odds with Excel’s typical behavior, so that those of us who understand the behavior and arrange the data accordingly may have some surprises.

The intermediate pivot table creation accounts for a few things I’ve noticed, so I’ll have that in mind as I continue my assessment.


Posted by Jon Peltier on 08/29 at 04:47 AM | #

No worries.  You weren’t too harsh at all.  I was just try to express that it is a prototype and we expect things to not be perfect, so blunt feedback and gut reactions are great.  Details are even better!

We are trying to be smarter about which columns to include because in many many datasets, including all of the value series create horrible charts.  On the other hand, we are a little to tight with our current algorithm—excluding columns which you would expect to be there. 

Our rules files is around 5000 lines.  It is called ChartAdvisor.Rules.xml, and it is installed to your AppData directory (you’ll need to search since the location depends upon your OS and configuration).  We show you only the scoring rules which evaluated to true.  We also have rules for data mapping (figuring out which Excel series should be mapped to chart axis), and formatting rules, which allow things like turning on the gummy styles only when there are a few bars on a column chart, or moving legends in to the slices on a pie chart when the smallest slice is > 5%.

btw, I wouldn’t spend time trying to change the rules file—the syntax is fairly rigid and modifications by hand can cause it to be easily broken, so we overwrite the file when we start up.

Thanks

Dave Lee
Chart Advisor from Microsoft Office Labs
Developer


Posted by Dave Lee on 08/29 at 09:17 AM | #

Thanks Dave, I figured that it had to be a whole lot more complex than just three rules. For those who like to hack into these things, the XML configuration files for Chart Advisor can be found by entering the following into the address bar of Windows Explorer:

%AppData%\Chart Advisor from Office Labs

Preferably, you will have an editor that “understands” XML files. I use Notepad++, which is a free and very powerful text editor.

I took a quick look, and I wouldn’t feel comfortable modifying the file at this time. However, it looks like you could change the descriptions of the charts, which might be a fun prank to play on a colleague. grin

If you are going to mess with it, make sure to create a backup of the file first. Also, take a look at my follow-up post.

I looked at the bottom of the file (as it is today), and noticed what appears to be a commented out rule for setting the height of the chart:

<Rule Element=“CHART.ChartHeight                     “>CHART.ChartWidth * 0.618</Rule>

That seems like it would set the proportions of the chart to 1.618:1. Somebody is familiar with the Golden Ratio (AKA, the divine proportion from the Fibonacci series). grin


Posted by Tim on 08/29 at 11:37 AM | #

Tim - A good first assessment of the rules, especially for someone new to XML.

Dave - Blunt feedback and gut reactions, we’ve got aplenty. As long as you remember not to take anything personally. Prototypes are fun to work with, because partly they work great, partly they misbehave in unexpected ways, and partly you learn about the thought processes of the other programmers. I have lots of “aha” moments, when I finally figure out “that’s why they did it that way”.


Posted by Jon Peltier on 08/30 at 08:03 AM | #

Commenting is not available in this channel entry.