The Centre for Investigative Journalism

Web Scraping for Journalists

These hands-on sessions introduce the basic techniques you need to start scraping data for investigations:

  • investigation ideas: how to spot opportunities to use scraping and automation in investigations
  • scraping basics: finding structure in HTML and URLs; what’s possible with programming
  • simple scraping jobs: how to write a basic scraper in five minutes
  • data journalism tools: the challenges of scraping hundreds of webpages, dozens of documents, or the invisible contents of databases.
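To give a flavour of what a basic scraper can look like, here is a minimal sketch in Python. It is an illustration only, not material from the sessions: it uses the standard library's html.parser module, and the sample HTML, the /reports/2019/NN URL pattern, and the names LinkCollector and scrape_links are all hypothetical. It shows the idea of finding structure in HTML (every link sits in an <a> tag) and in URLs (the report pages follow a predictable pattern).

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (href, text) pairs
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape_links(html):
    """Return a list of (href, text) pairs found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page fragment, standing in for HTML fetched from a real site.
SAMPLE_HTML = """
<ul class="reports">
  <li><a href="/reports/2019/01">January report</a></li>
  <li><a href="/reports/2019/02">February report</a></li>
</ul>
"""

links = scrape_links(SAMPLE_HTML)
for href, text in links:
    print(href, text)
```

In a real job the HTML string would come from a live page rather than a sample, for example via urllib.request.urlopen, and the same function could then be run over each page whose URL follows the pattern.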

 

Technical Requirements

Own laptop required

5 July 2019 – #CIJSummer 2019/DAY 2

09:00–10:00
Web Scraping for Journalists 1
10:15–11:15
Web Scraping for Journalists 2

Paul Bradshaw

Professor Paul Bradshaw is a data journalist and author who leads the MA in Data Journalism at Birmingham City University. He publishes the Online Journalism Blog (OJB) and co-founded Help Me Investigate, an investigative journalism website funded by Channel 4 and Screen WM.
Date and time: 5 July 2019, 09:00–11:15
Location: Room 326 - PSH Building - Goldsmiths, University of London
Course
Beginner
Data