Friday, March 27, 2009

Using Google Spreadsheets with iGoogle to create a shareable dashboard for your startup

Google Analytics is excellent for tracking visits and can even be used for more advanced things like tracking conversions and events, but what if you want to track application-specific metrics which don't happen in a user's browser? For instance our Twitter meme aggregation site gets tweets using the Twitter API and stores them in our database, and the amount of tweets is something that I am interested in keeping an eye on. Here's what our dashboard looks like, and in this post I'll show you how you can effortlessly create your own.

Not only do the charts look pretty neat, but you can drag and drop them to rearrange them on the screen and even drag around inside each time series graph to see different spans of time. So where are these graphs coming from? A cronjob executes a simple Python script every quarter hour. The script queries our system for the load average, asks MySQL for the number of tweets and other data I want to chart. Each piece of data gets inserted into a separate Google Spreadsheet using the Google Spreadsheets API.

I can visit each of these spreadsheets and create charting gadgets (insert > gadget). Here I only used the time series gadget, which is like the one used on Google Finance, so you can easily zoom in to different periods of time and mouse over to see the absolute values. Clicking on the upper right corner of a gadget you'll find the option "add gadget to iGoogle". After you have added a few gadgets to your iGoogle, it may make sense to combine them into an iGoogle tab. Then on a tab's menu you'll find an option to "share this tab", and now any member of your team can view the data.

That's a summary of how this is done, but for the sake of completeness I will now mention some problems I encountered and share some code snippets so you won't necessarily have to go digging through the API.

First of all, one perhaps strange decision I made was to use a separate spreadsheet for each thing I track. One reason is that different data might be polled at different intervals, so the data is easier to handle this way as there won't be any blank fields for missing data. The more real reason however is a caveat I found, which is that each spreadsheet can contain no more than 200000 cells. There are around 35000 quarter hours in a year, so holding all data in the same spreadsheet would hit the limit in just a few months. I might also mention that I don't yet know how to get a charting gadget to automatically extend to cover all rows as the spreadsheet grows. If you know how to do that, please let me know and I'll be sure to mention the solution here.

Now some snippets. These use the Python library gdata-1.3.0. Logging in to Google Spreadsheets:

gd_client = gdata.spreadsheet.service.SpreadsheetsService() = 'YOUR GOOGLE ACCOUNT EMAIL'
gd_client.password = 'YOUR GOOGLE ACCOUNT PASSWORD'
gd_client.source = 'exampleCo-exampleApp-1'

Below is the snippet for inserting one row to a certain spreadsheet. Each spreadsheet can consist of several worksheets. Not sure if it's always true, but experimentally I found that the first sheet seems to be called 'od6'. The spreadsheet_id you can find by creating a spreadsheet, then looking at your address bar and copy the base64 looking string in the "key" GET argument. The value here is any piece of data that you want to track, for example the number for your current load average.

def newValue(spreadsheet_id, value):
worksheet_id = 'od6'
dict = {'datetime':time.strftime("%m/%d/%Y %H:%M:%S"), 'value':value}
gd_client.InsertRow(dict, spreadsheet_id, worksheet_id)

Finally here is my complete list of things I import in the beginning of my script. It contains many unnecessary includes for these particular snippets, but I thought it might still prove helpful if you are trying to get the code to work.

from xml.etree import ElementTree
except ImportError:
from elementtree import ElementTree
import gdata.spreadsheet.service
import gdata.service
import atom.service
import gdata.spreadsheet
import atom
import getpass
import string
import time
import urllib
import os
import re

Thanks for reading and please comment on Hacker News (preferred) or here in the comments. I'm interested in becoming a better blogger too, so any suggestions on improving my writing would be appreciated as well.