Read Google docs spreadsheet using Python


One of the many useful and productive feature for me personally is to programmatically read a Google docs spreadsheet in Python. This is very useful to automate a task to insert some data in MySQL/NOSQL/ElasticSearch, which would have otherwise required to build a custom user interface for data input.

Reading is actually quite simple. For example, you have a spreadsheet like this:

First, you need to grab the file id. In the image below, file id is 19Y_Oi5_riecwonPbtxN4sfDntZO62s_vJbXoogFFp9o

Make sure to give the read permission to anyone having the link, otherwise our python program won’t be able to read the file.

First, we make a simple GET request on the export url of the spreadhseet using the requests module

headers={}
headers["User-Agent"]= "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0"
headers["DNT"]= "1"
headers["Accept"] = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
headers["Accept-Encoding"] = "deflate"
headers["Accept-Language"]= "en-US,en;q=0.5"
lines = []

file_id="19Y_Oi5_riecwonPbtxN4sfDntZO62s_vJbXoogFFp9o"
url = "https://docs.google.com/spreadsheets/d/{0}/export?format=csv".format(file_id)

r = requests.get(url)

Once we have the response, it is easy to read it using the csv module

sio = io.StringIO( r.text, newline=None)
reader = csv.reader(sio, dialect=csv.excel)

for row in reader:
    # Do something with each row

For example, to read the data as a dictionary, we can do something like this:

for row in reader:
    if rownum == 0:
        for col in row:
            data[col] = ''
            cols.append(col)

    else:
        i = 0
        for col in row:
            data[cols[i]] = col
            i = i +1

        print data
    rownum = rownum + 1

This will print the following output on console:

{'Col C': '3', 'Col B': '2', 'Col A': '1', 'Col D': '4'}
{'Col C': '2', 'Col B': '3', 'Col A': '4', 'Col D': '1'}
{'Col C': '2', 'Col B': '4', 'Col A': '3', 'Col D': '1'}
{'Col C': '3', 'Col B': '4', 'Col A': '2', 'Col D': '1'}

Incase, we are using unicode characters with our file. We can make use of unicodecsv module. It is a drop in replacement of the csv module.

import unicodecsv as csv
reader = csv.reader(
        urllib2.urlopen(url),
        encoding='utf-8'
    )

for row in reader:

The complete program is given in this gist and as well as below. You just need to replace the file_id parameter to start using it. Feel free to fork it :)