The Centre for Investigative Journalism

Data Journalism: Scraping 1-2. Hands-on. [B+I]

Hands-on. Beginner-Intermediate.

Scraping 1
You may have heard of scraping; you may even have done a little yourself. These two sessions set out to demystify the process of getting data from websites where there is no download or export button and copying and pasting is getting you nowhere. In the first session we will look at what goes on behind the scenes when you scrape, and discuss some of the legal and ethical issues that arise. We will do some basic scraping with ready-made tools that you already have on your desktop or can get for free.

Scraping 2
In the second session we will go into some of the more subtle tricks of scraping, examine common problems that arise, and write some scraping code in R/RStudio that simplifies the repetitive elements of scraping webpages. There is no need for prior knowledge of R – we will be using a cloud version of an R package, so you only need to bring your own laptop, with Chrome or Firefox installed.

Technical Requirements

Own laptop, with Chrome or Firefox installed.

28 June 2023

Scraping 1
Scraping 2

Jonathan Stoneman

Jonathan Stoneman is a freelance trainer specialising in data journalism. He has been working with data since 2010. Before that he worked at the BBC – as a reporter, producer, editor of output in Macedonian and Croatian, and finally as head of training at BBC World Service.
  • 28 June 2023 15.10–17.30
Location: PSH 326