The Centre for Investigative Journalism

Data Journalism: Putting Python into practice — hands-on coding for journalists, by journalists

Hands-on. Beginner [B]-Intermediate [I]. See each session for the details.

This three-part series, taught by journalists for journalists, introduces participants to the power of Python. We'll help you take your first steps on the road to code, from the very basics, to working with giant spreadsheets, to extracting structured data from the web.

No prior coding experience is required and participants will receive workbooks to continue practicing in their own time. A laptop with read/write or admin permissions is required for all sessions.


Python/Pandas 1: An introduction to (or a refresher on) Python: first steps on the road to code [B]

This session introduces the building blocks of Python (variables, lists, dictionaries, calculations, how to handle errors and the all-important for loop). No prior coding knowledge or specialist software is required. Participants will leave with an understanding of how code works and a workbook they can use as a reference and practice tool after the course.
This session is suitable for beginners or as a refresher for intermediates.
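To give a flavour of what session 1 covers, here is a minimal sketch of those building blocks in action. All the names and values below are illustrative examples, not course materials:

```python
headline = "Council spending rises"        # a variable holding text
amounts = [1200, 950, 3100]                # a list of numbers
story = {"title": headline, "year": 2025}  # a dictionary of key/value pairs

# The all-important for loop: add up every amount in the list
total = 0
for amount in amounts:
    total += amount

average = total / len(amounts)             # a simple calculation

# Handling errors: this conversion fails, and we recover gracefully
try:
    value = int("not a number")            # raises a ValueError...
except ValueError:
    value = None                           # ...which we catch and handle

print(total, average, story["title"])
```

A few lines like these are enough to see how variables, loops and error handling fit together before moving on to real datasets.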

Python/Pandas 2: Taming gi-normous spreadsheets with the Pandas library [B/I]

Ever had Excel crash because the dataset you imported was just too big for it to ingest? This module shows you how to handle massive spreadsheets (over 1 million rows) using Python’s powerful Pandas library. No prior coding experience is needed (although total beginners are encouraged to attend session 1). Participants will learn how to load, sort and filter large datasets efficiently and will receive a copy of the workbook which they can reference when interrogating large datasets in future.
If taken individually, this session is suitable for beginners / intermediates.
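The load/sort/filter workflow from session 2 looks roughly like this. This sketch uses a small in-memory table so it is self-contained; in practice you would load a file with `pd.read_csv()`, and the same calls scale to million-row datasets. The column names and figures are hypothetical:

```python
import pandas as pd

# In a real session you would load a file, e.g.:
# df = pd.read_csv("spending.csv")
df = pd.DataFrame({
    "department": ["Health", "Transport", "Health", "Education"],
    "amount": [250_000, 90_000, 410_000, 120_000],
})

# Sort by spend, largest first
df_sorted = df.sort_values("amount", ascending=False)

# Filter to rows for a single department
health = df[df["department"] == "Health"]

print(df_sorted.head())
print(health["amount"].sum())
```

Sorting, filtering and summing like this is the core of interrogating a dataset far too large for Excel.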

Python/Pandas 3: Scraping data from the web: how Python can automate data collection [I]

Ever found the data you need available online, right there in front of your very eyes, but with no way of downloading it? Scraping to the rescue! In this session we demonstrate how Python can help you gather structured data spread across multiple web pages. Basic knowledge of Python or participation in the earlier sessions is highly recommended. Participants will leave with an understanding of how a web scraper works and a workbook which they can refer back to.
This session is suitable for intermediates, or beginners who have attended sessions 1 & 2.
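The core idea of session 3, looping over pages and extracting structured records from each, can be sketched as follows. A real scraper would fetch each page over the network (for example with the `requests` library); here the "pages" are hypothetical HTML strings so the example runs on its own, using only Python's built-in HTML parser:

```python
from html.parser import HTMLParser

# Hypothetical pages standing in for downloaded web pages
PAGES = [
    "<ul><li>Alice, 34</li><li>Bob, 29</li></ul>",
    "<ul><li>Carol, 41</li></ul>",
]

class RowCollector(HTMLParser):
    """Collects the text of every <li> element on a page."""
    def __init__(self):
        super().__init__()
        self.in_li = False
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.in_li = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_li = False

    def handle_data(self, data):
        if self.in_li:
            self.rows.append(data.strip())

records = []
for page in PAGES:                  # loop over every page...
    parser = RowCollector()
    parser.feed(page)
    for row in parser.rows:         # ...turning each row into structured data
        name, age = row.split(", ")
        records.append({"name": name, "age": int(age)})

print(records)
```

The pattern is always the same: fetch a page, pull out the repeating elements, convert them to rows, and move on to the next page until the whole dataset is assembled.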

Technical Requirements

Participants will require a laptop with read/write or admin permissions.

26 June 2025

11:30–12:30
Python/Pandas 1: An introduction to (or a refresher on) Python: first steps on the road to code [B]
14:40–15:40
Python/Pandas 2: Taming gi-normous spreadsheets with the Pandas library [B/I]
15:50–16:50
Python/Pandas 3: Scraping data from the web: how Python can automate data collection [I]

Anna Leach

Anna Leach is a Visual Projects editor at the Guardian, where she creates visual and interactive stories ranging from the ownership of England's water companies to an analysis of what folklore can tell us about conspiracy theories.

Pamela Duncan

Pamela Duncan is the editor of the Guardian’s Data Project team, an occasional award-winning journalist (#humblebrag) and a self-confessed data nerd. She can usually be found at her desk poring over spreadsheets and using her coding skills (usually a combination of scraping, regex and Python/pandas) to build and analyse datasets to produce high-quality and exclusive data stories.
  • 26 June 2025 11.30–16.50
Location: TBC
Course
All levels
Data