The Centre for Investigative Journalism

Web Scraping for Journalists

These hands-on sessions introduce the basic techniques you need to start scraping data for investigations:

  • investigation ideas: how to spot opportunities to use scraping and automation in investigations
  • scraping basics: finding structure in HTML and URLs; what’s possible with programming
  • simple scraping jobs: how to write a basic scraper in five minutes
  • data journalism tools: the challenges of scraping hundreds of webpages, dozens of documents, or the invisible contents of databases


Technical Requirements

Own laptop required

5 July 2019 – #CIJSummer 2019/DAY 2

Web Scraping for Journalists 1
Web Scraping for Journalists 2

Paul Bradshaw

Professor Paul Bradshaw is an online journalist and blogger who leads the MA in Data Journalism at Birmingham City University. He runs the Online Journalism Blog (OJB) and co-founded Help Me Investigate, an investigative journalism website funded by Channel 4 and Screen WM.
  • 5 July 2019 09.00–11.15
Location: Room 326 - PSH Building - Goldsmiths, University of London