Read Google docs spreadsheet using Python
One of the many useful and productive feature for me personally is to programmatically read a Google docs spreadsheet in Python. This is very useful to automate a task to insert some data in MySQL/NOSQL/ElasticSearch, which would have otherwise required to build a custom user interface for data input.
Reading is actually quite simple. For example, you have a spreadsheet like this:
First, you need to grab the file id. In the image below, file id is
19Y_Oi5_riecwonPbtxN4sfDntZO62s_vJbXoogFFp9o
Make sure to give the read permission to anyone having the link, otherwise our python program won’t be able to read the file.
First, we make a simple GET
request on the export url of the spreadhseet using the requests module
headers={}
headers["User-Agent"]= "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0"
headers["DNT"]= "1"
headers["Accept"] = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
headers["Accept-Encoding"] = "deflate"
headers["Accept-Language"]= "en-US,en;q=0.5"
lines = []
file_id="19Y_Oi5_riecwonPbtxN4sfDntZO62s_vJbXoogFFp9o"
url = "https://docs.google.com/spreadsheets/d/{0}/export?format=csv".format(file_id)
r = requests.get(url)
Once we have the response, it is easy to read it using the csv module
sio = io.StringIO( r.text, newline=None)
reader = csv.reader(sio, dialect=csv.excel)
for row in reader:
# Do something with each row
For example, to read the data as a dictionary, we can do something like this:
for row in reader:
if rownum == 0:
for col in row:
data[col] = ''
cols.append(col)
else:
i = 0
for col in row:
data[cols[i]] = col
i = i +1
print data
rownum = rownum + 1
This will print the following output on console:
{'Col C': '3', 'Col B': '2', 'Col A': '1', 'Col D': '4'}
{'Col C': '2', 'Col B': '3', 'Col A': '4', 'Col D': '1'}
{'Col C': '2', 'Col B': '4', 'Col A': '3', 'Col D': '1'}
{'Col C': '3', 'Col B': '4', 'Col A': '2', 'Col D': '1'}
Incase, we are using unicode characters with our file. We can make use of unicodecsv module. It is a drop in replacement of the csv
module.
import unicodecsv as csv
reader = csv.reader(
urllib2.urlopen(url),
encoding='utf-8'
)
for row in reader:
The complete program is given in this gist and as well as below. You just need to replace the file_id
parameter to start using it. Feel free to fork it :)