Accessing Google Analytics From Django

Whew! It's been a while since my last post, huh? So much for my "...vow to keep up with this blog" and the bold statement that "everyone else can do it, so damnit I can too" from the inaugural post. But, in my defense, I have been really busy. Like, actually busy, not playing-videogames-on-the-couch-in-my-underwear busy. I started working full-time at Rukkus, built a pretty sweet injector for smartly managing Django template variables in external JavaScript files, and otherwise kept myself fully booked. Anyway, today I'm going to talk about how to query the Google Analytics API from a Django app, a task I was handed recently and hit a few major pain points while tackling. So here goes.


In the age of APIs, we've come to expect instant access to data on demand. When the service in question is Google's, that expectation is even stronger. Yet querying the Google Analytics API has a few gotchas that can easily turn into major timesucks if not handled properly. The official tutorial is, by and large, a good start, but it skims over some vital details.

Configuring the app

1. Create an API project

To start the process, go to https://code.google.com/apis/console/ and create a project if you don't already have one. Note that the Google account under which you create the project should be the same one you used to sign up for Google Analytics.


2. Turn on the Analytics API



3. Go to the "Credentials" tab and under OAuth, click "Create new Client ID"

And this is where some of the trickiness comes into play. If you are building a web app which will be making requests on behalf of a logged-in user, this step is trivial: choose Web Application. However, if, like me, you are building a backend service which queries the API as part of a cron job or some other server-side task, you must select Installed Application, with Other as the application type. More on this later.


4. Download the JSON and save it as client_secrets.json
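
Before moving on, it can be worth checking that the downloaded file really describes an installed application. As far as I know, the console wraps the credentials in a top-level "installed" key for installed applications and a "web" key for web applications. The sketch below relies on that assumption; the helper name client_type and the sample values are made up for illustration.

```python
import json


def client_type(secrets_text):
    """Return the single top-level key of a client secrets document."""
    data = json.loads(secrets_text)
    if len(data) != 1:
        raise ValueError("expected exactly one top-level key")
    return next(iter(data))


# Made-up example content; a real file holds your own client id and secret.
SAMPLE = json.dumps({
    "installed": {
        "client_id": "example-id.apps.googleusercontent.com",
        "client_secret": "not-a-real-secret",
        "redirect_uris": ["urn:ietf:wg:oauth:2.0:oob"],
    }
})

print(client_type(SAMPLE))
```

In practice you would pass in the contents of client_secrets.json and complain loudly if the result is not "installed".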

Authentication

By and large, the authentication step follows what is specified in the official tutorial with a few key exceptions. But heck, we're feeling pretty bad-ass, so let's just take it from the top.


1. Install the Google API Python Bindings

pip install --upgrade google-api-python-client


2. Import some packages

import httplib2  
from apiclient.discovery import build  
from oauth2client.client import flow_from_clientsecrets  
from oauth2client.file import Storage  
from oauth2client.tools import run_flow  

..like a boss


3. Create a container class with some class-level fields

class GoogleAnalytics(object):  
    CLIENT_SECRETS = "client_secrets.json"
    FLOW = flow_from_clientsecrets(CLIENT_SECRETS, scope='https://www.googleapis.com/auth/analytics.readonly')
    TOKEN_FILE_NAME = 'analytics.dat'
    VIEW_ID = '/* Your Profile View Id */'

Some important things to note here:

  1. Your VIEW_ID can be found in your analytics console's admin page.
  2. CLIENT_SECRETS should be the absolute path of the JSON file downloaded earlier.
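
The absolute path matters because a cron job or other server-side task may well run with a different working directory than your development shell. One way to avoid hard-coding it is to build the path relative to the module that defines the class. This is only a sketch: path_next_to is a hypothetical helper and the example path is invented.

```python
import os


def path_next_to(module_file, filename):
    """Build an absolute path to `filename` in the same directory as `module_file`."""
    return os.path.join(os.path.dirname(os.path.abspath(module_file)), filename)


# Inside the module that defines GoogleAnalytics, this would typically be
# called as path_next_to(__file__, "client_secrets.json").
print(path_next_to("/srv/app/analytics/ga.py", "client_secrets.json"))
```

The same trick could be applied to TOKEN_FILE_NAME so that the stored token does not end up wherever the process happened to start.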

4. Add the boilerplate authentication code

# These are methods of the GoogleAnalytics class defined above

def __init__(self):
    self.service = None

def _authenticate(self):
    # Retrieve existing credentials from the token file, if any
    storage = Storage(self.TOKEN_FILE_NAME)
    credentials = storage.get()

    # No stored token, or it is no longer valid: run the OAuth flow
    if credentials is None or credentials.invalid:
        credentials = run_flow(self.FLOW, storage, self.GAFlags())

    return credentials

def _create_service(self):
    # 1. Create an http object
    http = httplib2.Http()

    # 2. Authorize the http object
    credentials = self._authenticate()
    http = credentials.authorize(http)

    # 3. Build the Analytics Service Object with the authorized http object
    return build('analytics', 'v3', http=http)

def start_service(self):
    """
    Start the service which may be used to query the Google Analytics API.

    :return: this GoogleAnalytics object, with the service set
    """
    if not self.service:
        self.service = self._create_service()
    return self

Most of this is ripped wholesale from the official tutorial. However, if you were to run it right now, you would get an error saying that self.GAFlags isn't defined, which is indeed the case. So let's add it, and then I'll explain why we need it.


5. Add a GAFlags class

# Nested inside GoogleAnalytics so that self.GAFlags resolves
class GAFlags():
    """
    Total hack. If you want to see why, examine apiclient.sample_tools. The Python bindings, as well as the documentation, kind of forgot to mention this... which is a big deal. But actually though.
    """
    noauth_local_webserver = True
    logging_level = "DEBUG"

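What this flags class allows us to do is run the authentication flow in noauth_local_webserver mode. Basically, this means that we don't want run_flow to start a local webserver for authentication, because we're already running one. Instead, it prints an authorization URL to the console and waits for you to paste in the verification code.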
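
If a hand-rolled flags class feels too hacky, a plain argparse.Namespace carrying the same two attributes should work just as well, since, as far as I can tell, run_flow only reads attributes off whatever flags object it is given. This is a sketch under that assumption, and build_flags is a hypothetical name.

```python
import argparse
import logging


def build_flags(logging_level="DEBUG"):
    """Return a flags object requesting the console-based (no local webserver) flow."""
    # logging_level must name a level constant on the logging module, e.g. "DEBUG".
    if not isinstance(getattr(logging, logging_level, None), int):
        raise ValueError("unknown logging level: %s" % logging_level)
    return argparse.Namespace(
        noauth_local_webserver=True,
        logging_level=logging_level,
    )


flags = build_flags()
print(flags.noauth_local_webserver, flags.logging_level)
```

The object returned here would be passed as the third argument to run_flow in place of an instance of the flags class.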

Making Queries

Now that you've successfully authenticated, it's on to the fun stuff: actually accessing your GA data. My generic query method looks as follows:

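def query(self, **kwargs):
    # Every query needs the view to read from, given as "ga:<view id>"
    kwargs.update({
        'ids': "ga:{id}".format(id=self.VIEW_ID),
    })

    # get() only builds the request; execute() sends it and returns the response
    result = self.service.data().ga().get(**kwargs).execute()
    return result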
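
The API rejects a query that lacks its required parameters, which for v3 are, as far as I know, ids, start_date, end_date, and metrics. Here is a minimal sketch of checking for them before any network call is made. Note that build_query_params is a hypothetical helper, not part of the bindings, and "12345" is a placeholder view id.

```python
REQUIRED = ("start_date", "end_date", "metrics")


def build_query_params(view_id, **kwargs):
    """Return query kwargs with 'ids' filled in, or raise if a required one is missing."""
    missing = [name for name in REQUIRED if name not in kwargs]
    if missing:
        raise ValueError("missing query parameters: " + ", ".join(missing))
    params = dict(kwargs)
    params["ids"] = "ga:{id}".format(id=view_id)
    return params


# "12345" is a placeholder view id.
params = build_query_params(
    "12345",
    start_date="2015-03-01",
    end_date="2015-03-02",
    metrics="ga:sessions",
)
print(params["ids"])
```

Failing early like this gives a clearer error than waiting for the HTTP response to complain.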

So to get - for instance - the number of sessions, organic searches, and page views in the past day, we would run:

import datetime

today = datetime.datetime.now().date()
yesterday = today - datetime.timedelta(days=1)

GoogleAnalytics().start_service().query(
    start_date=str(yesterday),
    end_date=str(today),
    metrics='ga:organicSearches,ga:sessions,ga:pageviews',
)
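
Once the request has been executed, the v3 response is a dictionary in which, to my knowledge, columnHeaders lists the column names and rows holds the values as strings. A small sketch of turning that into something friendlier follows. Both rows_to_dicts and the sample numbers are invented for illustration.

```python
def rows_to_dicts(response):
    """Pair each row of a v3-style response with its column names."""
    names = [header["name"] for header in response.get("columnHeaders", [])]
    # "rows" may be absent when the query matched no data, so default to empty.
    return [dict(zip(names, row)) for row in response.get("rows", [])]


# Hand-written sample shaped like a v3 response; the numbers are invented.
SAMPLE_RESPONSE = {
    "columnHeaders": [
        {"name": "ga:sessions"},
        {"name": "ga:pageviews"},
    ],
    "rows": [["120", "340"]],
}

print(rows_to_dicts(SAMPLE_RESPONSE))
```

The values stay as strings here; convert them with int() or float() as needed before plotting.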

Ahhh! Feels good to finally be writing again. After reading this post, hopefully you now know how to create an API project on Google, authenticate using OAuth, and query your site's GA data.

(Image: you after plotting GA session information in a time series graph. Nerd!)