The Centre for Investigative Journalism

Data Investigations with Python

Harness the power of Python and the versatile data analysis library Pandas to supercharge your journalism and work with large data sources in an efficient and reproducible way. This hands-on course uses Google Colab to write, run and manage your code.

Participants will be introduced to the many uses of Python for investigative data-driven research, with a range of opportunities to put the techniques into practice using examples and exercises from real-life data and investigations.

The final session will guide participants through planning and setting up their own Python projects on which to begin implementing the techniques learned throughout the course. It is strongly recommended that participants have identified analytical projects they would like to apply Python to prior to joining the course.

Participants will also have the opportunity to submit their projects for review and feedback one month after the final session. During this month, it is expected that participants will put aside between 4 and 10 hours to work on their projects before submitting them for feedback.


N.B. Python is a great language for writing web scrapers; however, this course will focus on Python’s application to data analysis. If your main objective in learning Python is scraping, then our dedicated Web Scraping for Journalists courses would be more suitable.

Technical Requirements

You will need the following for this course:

  • A Google Colab account.
  • Zoom app. The trainers often need participants to share their screen in order to solve problems or demonstrate techniques. If you are on a work computer, or another device on which Zoom screen sharing is disabled, please consider getting the restriction lifted for the duration of the course, as blocked screen sharing makes it much harder to solve problems and learn from them. Rest assured, nobody will be forced to share their screen against their will.
  • Camera and audio

This course will be hosted on Zoom. To find out more about how we use Zoom, please check out our Zoom InfoSec page.

Final Project
Following the final session, participants will have four weeks to continue working on their own Python project. Projects submitted by 22 November will receive feedback and guidance on further steps by 29 November.

Project Submission: 22 November 2024
Project Feedback Received: 29 November 2024

21 October 2024 – Session 1 - Getting started with Python and the set-up

Find out what Python can do for your data journalism and learn some of the basics of the technologies we will be using for the course.

22 October 2024 – Session 2 - Working with Data in Python

Learn how to read data from a spreadsheet, sort it and filter records.
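
As a rough illustration of the kind of task this session covers (not taken from the course materials), the sketch below reads tabular data with Pandas, sorts it and filters it. The dataset, column names and threshold are all invented for the example, and the data is read from an in-memory CSV string so that the snippet runs in a fresh Google Colab notebook without any file upload.

```python
import io

import pandas as pd

# Hypothetical data standing in for a spreadsheet exported as CSV.
csv_text = """company,country,payment_gbp
Acme Ltd,UK,12000
Borealis SA,France,4500
Cobalt Inc,UK,30000
Delta GmbH,Germany,800
"""

# Read the data into a DataFrame (pd.read_csv also accepts a file path).
payments = pd.read_csv(io.StringIO(csv_text))

# Sort so that the largest payments appear first.
largest_first = payments.sort_values("payment_gbp", ascending=False)

# Filter: keep only UK records above an (arbitrary) threshold.
uk_large = payments[
    (payments["country"] == "UK") & (payments["payment_gbp"] > 10000)
]

print(largest_first)
print(uk_large)
```

For an Excel workbook, `pd.read_excel` plays the same role as `pd.read_csv` here.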

23 October 2024 – Session 3 - Analysing Data in Python

Summarise and aggregate data using groupby functions and pivot tables in Pandas.
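
A minimal sketch of what summarising with groupby and pivot tables can look like in Pandas, again using made-up data rather than anything from the course itself. The column names (`party`, `year`, `amount`) are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical donation records; names and figures are invented.
donations = pd.DataFrame(
    {
        "party": ["A", "A", "B", "B", "B"],
        "year": [2022, 2023, 2022, 2023, 2023],
        "amount": [100, 250, 50, 75, 25],
    }
)

# groupby: total amount per party.
totals_by_party = donations.groupby("party")["amount"].sum()

# pivot_table: one row per party, one column per year, summed amounts.
by_party_and_year = donations.pivot_table(
    index="party", columns="year", values="amount", aggfunc="sum"
)

print(totals_by_party)
print(by_party_and_year)
```

The groupby result answers "how much in total per party?", while the pivot table breaks the same totals down by year in a spreadsheet-like layout.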

24 October 2024 – Session 4 - Transforming and Cleaning Data with Python

A closer look at how to handle different types of files in Python and how to export data so that it can be used in other programs. Introduction to string manipulation and the basics of data cleaning in Python.
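
To give a flavour of basic cleaning and export, here is a small sketch under stated assumptions: the messy names and amounts are invented, and the cleaning rules (trim whitespace, collapse repeated spaces, title-case names, strip thousands separators) are just one plausible choice, not the course's prescribed method. The export is written to an in-memory buffer; passing a filename to `to_csv` would write a file instead.

```python
import io

import pandas as pd

# Hypothetical messy data: inconsistent spacing, casing and number formats.
raw = pd.DataFrame(
    {
        "name": ["  acme LTD ", "Borealis  sa", "ACME ltd"],
        "amount": ["1,200", "450", "3,000"],
    }
)

cleaned = raw.copy()

# String manipulation: trim, collapse repeated whitespace, title-case.
cleaned["name"] = (
    cleaned["name"]
    .str.strip()
    .str.replace(r"\s+", " ", regex=True)
    .str.title()
)

# Remove thousands separators and convert the text to integers.
cleaned["amount"] = (
    cleaned["amount"].str.replace(",", "", regex=False).astype(int)
)

# Export as CSV text so it can be opened in other tools.
buffer = io.StringIO()
cleaned.to_csv(buffer, index=False)
csv_output = buffer.getvalue()

print(csv_output)
```

After cleaning, the two spellings of the same company share one name, so they would be counted together in any later grouping.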

25 October 2024 – Session 5 - Starting a Python Project

Planning your own Python project and starting work on the Google Colab notebook that will be your final project.

Sam Leon

Sam Leon is co-founder of Data Desk, an investigative consultancy focused on climate and the commodities industry. He previously worked at Global Witness where he ran their digital investigations unit.

Booking Form

  • 21 October 2024 10.00–12.00 BST (UK time)
  • 22 October 2024 10.00–12.00 BST (UK time)
  • 23 October 2024 10.00–12.00 BST (UK time)
  • 24 October 2024 10.00–12.00 BST (UK time)
  • 25 October 2024 10.00–12.00 BST (UK time)
Location: Online
Goldsmiths students (Full Time)
Other students (Full Time)
Small Media/Education/NonProfit Organisations (<10 staff)
Large Media/Education/NonProfit Organisations (10+ staff)
Other Organisations

In line with our non-profit mission, our pricing operates on a sliding scale, ensuring large organisations pay more to subsidise places for smaller newsrooms, freelancers and students.

*Student places for this course are capped, due to limited capacity. Anyone registering as a student will be asked for a photo/scan of their student ID ahead of the course.

**Employed individuals who cannot have their employers pay for the course are entitled to the freelancer rate. Note that we are a small charity and rely on your honesty so please do not register as a freelancer if your employer is reimbursing you for the course.

We have a strict policy of No Refund and No Transfer of bookings.