Google Search Console Python API - Download Search Analytics

Published 2022-02-05

Learn how to authorise and use the Google Search Console API from Python. Example Python code downloads Search Console data to a SQLite file.

The search_analytics.py Python script downloads Google Search Console data using the web API. The data are augmented with rows representing the anonymised queries Google omits, and secondary search results are identified. The augmented data are stored in a SQLite database file so you can perform offline analysis, e.g. with R. You may archive data over time, since Google only keeps the last 16 months for you.

Note: you are responsible for ensuring that storing data for longer periods is in line with your organisation's data privacy policy.

Download Google Search Console Data

What is the Google Search Console API?

The Google Search Console Search Analytics API is a web service provided by Google. It gives you programmatic access to the search performance data the Google search engine gathers for your website.

This data includes daily impressions, clicks and search engine result position statistics. The data are aggregated by various dimensions including country, search device and search type.

How do I use the Google API from Python?

The API is available as a web service from many programming languages. You can use the Google Search Console API from Python as we show below.

From Python we suggest you use the google-api-python-client library. It can be installed using the pip / pip3 package installer.

The script below provides an example of how to use the Search Console API. Google provides further samples.
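As a simplified sketch of how the API is called (illustrative names and values, not the script's exact code), the Search Analytics endpoint takes a JSON request body describing the date range and dimensions, which google-api-python-client then posts for you:

```python
# Build a Search Analytics API request body. The property URI, dates and
# dimensions below are placeholders for your own values.
def search_analytics_request(start_date, end_date, dimensions):
    """Return a request body dict for searchanalytics.query()."""
    return {
        'startDate': start_date,        # YYYY-MM-DD
        'endDate': end_date,            # YYYY-MM-DD
        'dimensions': dimensions,       # e.g. date, country, device, query, page
        'rowLimit': 25000,              # maximum rows per response page
    }

body = search_analytics_request('2019-01-01', '2019-12-31',
                                ['date', 'country', 'device', 'query', 'page'])

# With OAuth credentials already obtained (see the installation section):
# from googleapiclient.discovery import build
# service = build('searchconsole', 'v1', credentials=credentials)
# response = service.searchanalytics().query(
#     siteUrl='sc-domain:example.com', body=body).execute()
```

The API call itself is left commented out so the sketch is self-contained; it requires the OAuth setup described later on this page.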

Python Script Features

If you are involved in search engine optimisation (SEO) you will most likely have looked into the tools Google provides to analyse the search performance of your site.

Google offers the Google Search Console web app as a way to navigate the metrics they capture for your site. You can query data for specific time periods, by url, country, device and so on.

However, performing more bespoke queries and data mining of your search performance requires that you download your data. That is the process the script performs.

search_analytics.py has the following features:

  • Automate the bulk download of daily search performance results. This is less error prone than manually downloading data using the web app.
  • Incremental download. Start from where you left off, useful during an initial download of data, and for monthly / weekly batch downloads.
  • Store data in a single SQLite file. The file can be queried directly using SQL. You can export the data from the file as CSV for import into tools like Microsoft Excel. You may use the file directly in tools that support SQLite natively for example R.
  • Infers which results are secondary query results automatically. This removes extra work required to make sense of the results shown by the web app.
  • Infers the anonymised rows that Google removes from the results they give you. These rows are included in the aggregates used in the graphs, but not in the breakdown rows of the table below the graphs. The anonymous query text is still missing, but the script can recover the search position for you.
  • SQLite storage is used by default, but because the script uses Python's SQLAlchemy library it can write to a number of RDBMS backends. Select them with the --engine command line argument.
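To illustrate the omitted-row inference in the list above, here is a simplified sketch (not the script's actual code): whatever clicks and impressions Google reports in the aggregate but leaves out of the breakdown rows are attributed to a synthetic *OMITTED* row.

```python
# Sketch of anonymised-row inference: subtract the breakdown totals from
# the aggregate totals; any remainder belongs to queries Google omitted.
def infer_omitted_row(aggregate, breakdown_rows):
    clicks = aggregate['clicks'] - sum(r['clicks'] for r in breakdown_rows)
    impressions = aggregate['impressions'] - sum(
        r['impressions'] for r in breakdown_rows)
    if impressions <= 0:
        return None  # nothing was omitted for this aggregate
    return {
        'url': '*OMITTED*',
        'query': '*OMITTED*',
        'clicks': clicks,
        'impressions': impressions,
        'ctr': clicks / impressions,
    }

aggregate = {'clicks': 10, 'impressions': 120}
breakdown = [{'clicks': 6, 'impressions': 70},
             {'clicks': 2, 'impressions': 30}]
omitted = infer_omitted_row(aggregate, breakdown)
# omitted accounts for the 2 clicks and 20 impressions missing from the breakdown
```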

Making an offline copy of the data provides the following benefits:

  • Archive data for periods longer than the data is stored by Google.
  • Reduce metered use of Google APIs by downloading the data once.
  • Analyse offline data while you fly to your next client meeting.

Installation

search_analytics.py is free to download and runs under Python 3 on Windows, MacOS or Linux. It is part of the tool chest on GitHub. You may clone the source git repository using:

$ git clone https://github.com/adamlatchem/toolchest.git

You could also download a zipped snapshot of the repository. Once you have the files locally, install the dependencies using:

$ cd toolchest/searchanalytics
$ pip3 install -r pipDependencies --user

How do I Get an API Key?

The final stage is to set up OAuth access to the Google Search Console API as follows:

  1. Create a new project in the Google APIs console at https://code.google.com/apis/console. You should do this from the same account that has access to the Google Search Console data you want to download.
  2. How do I find my Google API key? Create an OAuth 2.0 client ID for the project at https://console.developers.google.com/apis/credentials; the client ID and client secret take the place of an API key.
  3. Create your own client secrets file by copying the template provided:
    $ cd toolchest
    $ cp client_secrets.template.json client_secrets.json
  4. Enter your OAuth client ID and client secret into the client_secrets.json file.
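The exact fields are shown by the client_secrets.template.json file in the repository; a typical installed-application client_secrets.json has the following shape (the client_id and client_secret values here are placeholders for your own):

```json
{
  "installed": {
    "client_id": "YOUR_CLIENT_ID.apps.googleusercontent.com",
    "client_secret": "YOUR_CLIENT_SECRET",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "redirect_uris": ["http://localhost"]
  }
}
```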

How To Run the Script

You can run the script directly from the command line. Later you can automate the process by scheduling the script to run from Windows Task Scheduler, Linux cron or macOS launchd.

The script has the following required arguments:

  • property_uri: The Google Search Console URI, e.g. sc-domain:example.com or https://www.example.com/, whose data should be downloaded
  • start_date: The first date to request data from as YYYY-MM-DD
  • end_date: The last date to request data to as YYYY-MM-DD

A typical invocation might be:

$ toolchest/searchanalytics/search_analytics.py sc-domain:example.com 2019-01-01 2019-12-31

This will create a file called search_analytics.sqlite3 containing all search data between the given dates.
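Once the file exists you can query it from Python with the standard library sqlite3 module. The sketch below builds a miniature breakdown table in memory so it is self-contained; point connect() at search_analytics.sqlite3 instead to query real data (column names follow the schema section below).

```python
import sqlite3

# Use 'search_analytics.sqlite3' here to query the downloaded file
conn = sqlite3.connect(':memory:')
conn.execute("""CREATE TABLE breakdown (
    date TEXT, query TEXT, clicks INTEGER, impressions INTEGER)""")
conn.executemany(
    'INSERT INTO breakdown VALUES (?, ?, ?, ?)',
    [('2019-09-01', 'example keyword', 3, 40),
     ('2019-09-02', 'another keyword', 1, 25)])

# Total clicks and impressions per query, most-clicked first
rows = conn.execute("""
    SELECT query, SUM(clicks), SUM(impressions)
    FROM breakdown
    GROUP BY query
    ORDER BY SUM(clicks) DESC""").fetchall()
```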

The optional --engine argument, when supplied, provides the SQLAlchemy database engine to use during the download. By default this is sqlite:///./search_analytics.sqlite3. See the SQLAlchemy database URLs documentation for further options you might like to supply.

The optional -v parameter enables verbose output from SQLAlchemy and is useful when debugging SQL issues.
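Scheduling, mentioned above, can then be as simple as a crontab entry on Linux. The entry below is hypothetical: the paths and dates are placeholders, and because the download is incremental a wide date range simply continues from where the last run left off.

```shell
# m h dom mon dow  command
# Hypothetical: at 03:00 on the 1st of each month, top up the archive
0 3 1 * * /usr/bin/python3 /home/user/toolchest/searchanalytics/search_analytics.py sc-domain:example.com 2019-01-01 2020-12-31
```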

SQLite Data Schema

The data downloaded by the script are held in the following database tables.

The aggregate table contains verbatim data from Google and will match values you see in the graphs shown by the Google Search Console web app.

The aggregate Table
Column Name: Description
search_type: image / video / web
date: The date of the search analytic
device: DESKTOP / MOBILE / TABLET
country: The ISO country code of the search analytic
clicks: The sum of clicks on breakdown rows included in the aggregate analytic
impressions: The sum of impressions from the breakdown rows included in the aggregate analytic
ctr: The click-through rate of the breakdown rows included in the aggregate analytic
average_position: The average SERP position of the breakdown rows included in the aggregate analytic

The breakdown table includes both verbatim data from Google Search Console and additional rows used to ensure the breakdown totals match the values in the aggregate table. In addition, each row is augmented with a flag indicating whether it represents a secondary search result.

The additional rows can be identified by the text "*OMITTED*" in the url and query columns.

The breakdown Table
Column Name: Description
search_type: image / video / web
date: The date of the query
device: DESKTOP / MOBILE / TABLET
country: The country code where the query originated
url: The resultant (canonical) URL presented by Google for the query
query: The keywords entered by the Google Search user
secondary_result: true if this row was not the first result for the query to appear on the SERP
clicks: The number of clicks on this query result
impressions: The number of impressions for this query result
ctr: The click-through rate of this query result
average_position: The average SERP position of this query result

The search_appearance table contains verbatim data from Google and will contain the values you see in the search appearance table shown by the Google Search Console web app.

The search_appearance Table
Column Name: Description
search_type: image / video / web
date: The date of the search analytic
appearance: WEBLITE
clicks: The sum of clicks on queries included in this analytic
impressions: The sum of impressions of queries included in this analytic
ctr: The click-through rate = clicks / impressions
average_position: The average SERP position of queries included in this analytic

Example SQL Queries

Show urls that have been returned as secondary search results:

SELECT DISTINCT url
FROM breakdown
WHERE secondary_result = 1
ORDER BY url

Show top 5 countries by total clicks across all devices, search types and dates:

SELECT country, SUM(clicks)
FROM aggregate
GROUP BY country
ORDER BY SUM(clicks) DESC
LIMIT 5

Show all keywords, including secondary results, surfaced by Google Search users on a given day, from a desktop device looking for web results:

SELECT DISTINCT query
FROM breakdown
WHERE date(date) = '2019-09-01'
AND search_type = 'web'
AND device = 'DESKTOP'
ORDER BY query
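As a further example, assuming the schema above, compute the site-wide daily click-through rate from the aggregate table (multiplying by 1.0 forces floating-point division in SQLite):

```sql
SELECT date(date) AS day,
       SUM(clicks) * 1.0 / SUM(impressions) AS ctr
FROM aggregate
GROUP BY day
ORDER BY day
```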