#CIJSummer

Investigative Journalism Summer Conference
Where new tools meet traditional craft
04 – 06 July 2019

Class information 2019

This page will be updated continuously, as talks and hands-on workshops get confirmed.

You can book for individual days as well as for all three days. On Thursday 4 July and Friday 5 July we will focus on practical skills, while Saturday 6 July will feature keynote talks, networking, discussions and the #CIJSummer drinks reception.

Please note that some classes form mini-courses and are best attended as a whole.

The sessions marked [Rec] will be recorded.

Some data journalism workshops take place in computer labs where computers will be provided, but most will require you to bring your own laptop. Please see the Technical Requirements page for full details. The Data Concierge will run every day to help you install software if you have difficulty doing so at home. Please make sure you have all the software installed before coming to the classes!

All data journalism workshops are practical, hands-on classes designed to teach participants the software and data analysis techniques used by journalists in the newsroom.

Refreshments are served several times a day (but not all the time) throughout the course, with a speakers and delegates lunch and drinks party held on Saturday.

Keynote/Networking Day Saturday 6 July 

Tickets are available for individual days, including Saturday keynote talks only. See the Book Now page for more information.

09:30 – Welcome from James Harkin, Director of the CIJ

09:40 – Gavin MacFadyen Memorial Lecture

11:00 – TBC

12:00 – Speakers and Delegates Lunch.

13:00 – BreakOut Talks: TBC

14:20 – TBC

15:40 – TBC

17:00 – TBC

18:00 – Drinks reception

Computer Security Advice Clinic
Thu 4 – Sat 6 July: Getting Hands-on, Installing the Tools for Digital Self-Defence

Visit the security zone in the Atrium with your laptop and learn how to set up tools to browse anonymously, chat and email with encryption, and prevent data loss from theft or confiscation of laptops and storage media. This will include Tor Browser, PGP email encryption and OTR chat.

The security tools we will be using are all free of charge and will work on Windows, Mac and Linux laptops. They will not work on iPads or Android tablets. Please bring a laptop that you are able and allowed to install software on, and contact us beforehand with any specific questions.

Talks and Mini-Courses

Thursday 4 July – Friday 5 July

(In alphabetical order. Excluding Data Journalism, see below.)

Accessing Information Under FOIA – 1
Jenna Corderoy 
This session will outline the basics of the Freedom of Information Act (FOIA), and how you can apply it to your research, campaigns and investigations. We’ll go through the type of information that can be accessed from government bodies and how you can draft effective requests to get the most out of the Act. We will also look at how to make requests for information under the lesser-known Environmental Information Regulations. Towards the end of the session, we will demonstrate WhatDoTheyKnowPro, a new FOIA toolkit for journalists developed by MySociety.

Accessing Information Under FOIA – 2
Jenna Corderoy 
This session will go through the FOIA appeals process, and teach you how to argue your case when government bodies are doing whatever they can to prevent a disclosure. At the end of the session, we will look at how to send freedom of information requests around the world. To finish, we’ll discuss and work through some of the FOIA challenges that you have encountered.

Crossborder journalism – Everybody can learn it
Brigitte Alfter
If you’re impressed by large crossborder stories like CumEx Files, Panama Papers or Malta Files, don’t be daunted. It is a method everybody can learn. In this workshop Brigitte Alfter will give an introduction to the basic considerations of crossborder collaborative journalism, the levels of intensity and the process from idea to publication and beyond. Brigitte is the author of Crossborder Collaborative Journalism: A Step-By-Step-Guide.

Covert Filming
Paul Samrai
This session on covert filming has evolved over the years into a state-of-the-art technical workshop looking at methods to acquire evidence for public interest investigations. It is taught by a leading undercover technician and an experienced television reporter who discuss the process and ethics of going undercover and look at the latest high-quality equipment.

Don’t be Numbed by Numbers
Jonathan Stoneman
What do you do when faced with a really big dataset for the first time? Using examples, Jonathan Stoneman will discuss approaches that help reduce a daunting mountain of data to a manageable mass.

Although this is not a hands-on session it will be possible to download the demo data and follow along.

Holding a mirror to the National Health Service
Shaun Lintern
Tips, tricks and ethical insights into how to properly investigate the UK NHS, from the journalist who helped expose the Mid Staffordshire scandal. Learn how to work with whistleblowers and bereaved families while respecting the commitment and dedication of NHS staff. Hear the shocking truth about the safety of the NHS and what role journalists can play in making it safer.

How to Get the Most Out of Companies House
Martin Tomkinson and Robert Miller
Any UK-based investigative journalist or aspiring journalist should have a working knowledge of Companies House.
Companies House is the central registry for all UK-registered limited companies and PLCs and contains a wealth of useful information for those who know how to use the site. The aim of this class is to show how to get the most information from the official website, as well as highlighting what information can’t be found there. The class will give ample time for questions and queries and is an absolute must for anybody who does not feel confident in using this vital tool for investigators.
Class handout: Companies House.

Introduction to Data Journalism: How to get the most out of Data Tracks.
This session will provide a chance to find out what data journalism classes are on offer and which tools are best for which tasks. Our data trainers will advise you on the best data pathway and explain how you can improve your journalism with data analysis.

Libel and Privacy Laws
Justin Walford
In this session you will learn about libel and privacy and hear how recent cases have affected the law. This class is for anyone who wants to update their legal knowledge and find out how they are affected by recent legal developments.

SCIENCE: 101 on science reporting
Kevin McConway. Moderated by Wendy Grossman
Where to find stories and how to read research papers and university and journal press releases.  

SCIENCE: Reporting on academic misconduct and the business of science
Holly Else, Éanna Kelly and Hannah Devlin. Moderated by Emma Stoye

SCIENCE: Digging out research discoveries and science scoops
Joshua Howgego, Crispin Dowler and Julian Sturdy. Moderated by Wendy Grossman

Story-Based Inquiry 1: Hypothesise Your Story
Luuk Sengers and Mark Lee Hunter
Investigation has a dirty name with editors, who think it’s about slowly rummaging through piles of garbage till you find (or don’t find) a jewel. Too often, they’re right. This session will show you how to choose a subject and define your investigation as a story from the start, using hypotheses. The method helps you figure out what to look for, how to look for it and how to sell it to your boss and the public.

Story-Based Inquiry 2: Creative Techniques – Create the Timeline and Scenarise the Story
Mark Lee Hunter and Luuk Sengers
In this session we map the plot of a story – a sequence of events that must have occurred, which we can subsequently verify and enrich. Simultaneously, we create scenes, with characters whose actions and conflicts define the content and meaning of the story. These events lead to the sources you need.

Story-Based Inquiry 3: From Source Mapping to the MasterFile
Luuk Sengers and Mark Lee Hunter
This session begins with an alternative to the timeline – a map of the actors in your story and the sources they hold. Now that we’ve shown you where to acquire information assets, we’ll show you how to optimise them. We’ll create a simple but effective database in which you collect the results of your investigation. This ‘MasterFile’ makes it easier to structure your story – the hardest part of composition. It’s a way to write while you research, instead of first researching and then writing. It’s also a way to build resources for a long, successful career.

Story-Based Inquiry 4: Craft the Story
Mark Lee Hunter and Luuk Sengers
This session shows you how to compose a story that hits hard and fast, and builds to a powerful conclusion. The core of this method is continuous composition and referencing – an approach that saves both you and your colleagues time and anguish. We turn the ‘MasterFile’ into a narrative structure based on a chronology or a sequence of themes and characters. We apply techniques for controlling rhythm, the element that keeps your audience reading, listening or watching. We finish with quality control – reducing the risk of mistakes that can cause damage to others and your own reputation.

Understanding Company Accounts 1-4
Raj Bairoliya
This course taught by a journalist-friendly forensic accountant will show you how to understand company accounts and get beyond the corporate PR spin. The emphasis will be on teaching practical skills rather than a series of lectures. The objective of this course is to ensure that all participants feel comfortable with a set of accounts and know where and how to look for relevant information.
The only prerequisites for this course are numeracy and an interest in financial matters as the theory will be taught in the first class and applied to real-life examples in the following sessions.
You must attend all the classes in this strand to benefit from it fully.
It will cover the following topics: motivation to massage earnings; profit and loss account; balance sheet; funds flow statement; notes. It will finish with putting it all together: an interactive session building up a sample set of accounts or working through case study questions.
The participants are actively encouraged to ask questions throughout.

Data Journalism (CAR)

All class descriptions are listed in alphabetical order.
Note: (B) signifies beginner, (I) intermediate and (A) advanced level.
Courses with numbers (e.g. Excel 1, Excel 2, Excel 3…) should be taken in sequence. You do not need your own laptop for these classes as they take place in computer labs; however, you can use your own laptop if you prefer.
The number of places is limited and allocated on a first-come, first-served basis.
———————————————–
Creating data visualisations and interactives with Flourish
Daan Louter
In this session you’re going to learn how to make data visualisations and interactives using Flourish. Flourish is a tool that allows journalists and non-coders to create high-end data visualisations and interactives, and it is free for newsrooms. Available templates include animated maps, network diagrams, Sankey charts, quizzes, and more. This workshop is led by Daan Louter, Head of Newsrooms at Flourish, who previously worked in The Guardian Visuals team. He will introduce the Flourish platform and take participants through the process of creating and distributing data-driven stories. No previous experience of data visualisation or coding is required.


Data Cleaning with Pandas 1 – (I), Hands-on*
Karrie Kehoe and Max Harlow
Data cleaning can feel more like data penance, but Pandas can ease your pain, allowing you to clean and structure your data with minimal hassle. Jupyter Notebook’s interactive environment helps you keep track of your changes and allows you to explore your data.
Participants can expect to learn how to clean large, complicated datasets quickly and how to explore data too large for Excel using the browser-based Jupyter Notebook.
Participants should have previous experience of coding at a basic level or more.
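As a rough illustration of the kind of cleaning this class describes, here is a minimal sketch using Pandas. The sample data, column names and the clean() helper are invented for this example and are not taken from the class materials.

```python
import io

import pandas as pd

# Hypothetical messy export: stray whitespace, inconsistent capitalisation,
# thousands separators in numbers, a missing value and a duplicated row.
RAW_CSV = """name,council,spend
 Alice Smith ,camden,"1,200"
BOB JONES,Camden,950
 Alice Smith ,camden,"1,200"
carol white,ISLINGTON,n/a
"""


def clean(raw_csv: str) -> pd.DataFrame:
    """Return a tidied copy of the hypothetical spending data."""
    # Read everything as text so nothing is silently reinterpreted.
    df = pd.read_csv(io.StringIO(raw_csv), dtype=str)
    # Trim whitespace and normalise capitalisation in the text columns.
    df["name"] = df["name"].str.strip().str.title()
    df["council"] = df["council"].str.strip().str.title()
    # Remove thousands separators, then convert to numbers;
    # missing or unparseable values become NaN.
    df["spend"] = pd.to_numeric(
        df["spend"].str.replace(",", "", regex=False), errors="coerce"
    )
    # Drop rows that are exact duplicates once the values are normalised.
    return df.drop_duplicates().reset_index(drop=True)


cleaned = clean(RAW_CSV)
print(cleaned)
```

In a Jupyter Notebook each of these steps could sit in its own cell, which makes it easy to inspect the data after every change.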

Data Wrangling with Pandas 2 – (I), Hands-on*
Karrie Kehoe and Max Harlow
Your data is squeaky clean and ready to go – time to dig deep and start hunting for those elusive leads. Pandas allows you to quickly and easily perform statistical analysis on your data helping you to mine for stories and look for outliers.
Participants can expect to learn programmatic methods to analyse large datasets and to visualise their results within Jupyter Notebook.
Participants should have previous experience of coding at a basic level or more.
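One possible way to look for outliers with Pandas is sketched below. The payment figures are made up, and the interquartile-range rule is only one of several common approaches; the class may well use different techniques.

```python
import pandas as pd

# Hypothetical figures: payments to six suppliers.
payments = pd.DataFrame(
    {
        "supplier": ["A", "B", "C", "D", "E", "F"],
        "amount": [100, 110, 95, 105, 98, 900],
    }
)

# Quick statistical summary of the column (count, mean, quartiles and so on).
print(payments["amount"].describe())

# Interquartile-range rule: flag values more than 1.5 x IQR
# below the first quartile or above the third quartile.
q1 = payments["amount"].quantile(0.25)
q3 = payments["amount"].quantile(0.75)
iqr = q3 - q1
is_outlier = (payments["amount"] < q1 - 1.5 * iqr) | (
    payments["amount"] > q3 + 1.5 * iqr
)
outliers = payments[is_outlier]
print(outliers)
```

A flagged row is a lead to check, not a finding: it may be a data entry error as easily as a story.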

*Why Python? 

Python makes it easy to replicate your analysis at a later stage and reduces the threat of human error that many face in Excel. It’s also shareable within teams and allows you to document and explain your work within the notebook so you can come back to it later and easily pick up from where you left off.
Python imposes no fixed upper limit on data size: you can use it on a CSV with 10 rows or a billion. Eventually the limitation becomes the amount of RAM on your machine, at which point you need to switch to a server.

Dealing with Large Datasets
Jonathan Stoneman
What do you do when faced with a really big dataset for the first time? Using examples, Jonathan Stoneman will discuss approaches that help reduce a daunting mountain of data to a manageable mass.
Although this is not a hands-on session it will be possible to download the demo data and follow along.

Excel 1: The Power of Data Analysis for Stories (B), Hands-on
Crina Boroş and Jonathan Stoneman
Data is everywhere and spreadsheets can help reporters to find story ideas in the data. This course introduces data analysis using Microsoft Excel. Participants will learn basic calculations, rates, ratios and analytic tools that generate story ideas.
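The rates and ratios mentioned above are simple arithmetic. The class teaches them in Excel, but the same calculations can be sketched in Python. The function names and figures below are invented for illustration.

```python
def rate_per_100k(count: int, population: int) -> float:
    """Incidents per 100,000 people, so areas of different size can be compared."""
    return count * 100_000 / population


def percent_change(old: float, new: float) -> float:
    """Percentage change from an old value to a new one."""
    return (new - old) * 100 / old


# Hypothetical figures: (incident count, population) for two areas.
areas = {"Town A": (120, 80_000), "Town B": (300, 400_000)}
rates = {name: rate_per_100k(count, pop) for name, (count, pop) in areas.items()}

# Town B has more incidents in total, but Town A has the higher rate.
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} per 100,000")

print(f"Change from 200 to 250: {percent_change(200, 250):.1f}%")
```

Comparing rates rather than raw counts is often what turns a table of numbers into a story idea.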

Excel 2: Finding Patterns in the Data (B), Hands-on
Crina Boroş and Jonathan Stoneman
The second spreadsheet course covers built-in analytical tools, such as sorting, filtering and chart creation, tools that help reporters quickly find great stories within databases.

Excel 3: Summarising Your Data for the Big Picture (B), Hands-on
Crina Boroş and Jonathan Stoneman
To complete your spreadsheet toolkit, learn how to make pivot tables that will summarise trends in your data.

Finding Needles in Haystacks with Fuzzy Matching, Hands-On
Max Harlow
Fuzzy matching is a process for linking up names that are similar, but not quite the same. It has become an increasingly important part of data-led investigations as a way to identify connections between public figures, key people, and companies that are relevant to a story. This class will cover how fuzzy matching typically fits into the investigative process, with some story examples. We will show you how to run some of the different types of fuzzy match on some real datasets, including the pros and cons of each.
Own laptop required. Install Python 3 (https://www.python.org/downloads). On Macs open the Terminal (inside Applications, then Utilities) and run: pip3 install csvmatch. On Windows, also install Cygwin (https://cygwin.com/install.html), then open Cygwin and run: pip3 install csvmatch.
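To give a flavour of the idea, here is a minimal sketch of one type of fuzzy match using only Python's standard library (difflib). This is not how csvmatch works internally; the names, the 0.85 threshold and the helper functions are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def normalise(name: str) -> str:
    """Lowercase, drop full stops and commas, and collapse repeated spaces."""
    stripped = name.lower().replace(".", "").replace(",", "")
    return " ".join(stripped.split())


def similarity(a: str, b: str) -> float:
    """Similarity score between 0 and 1 for two normalised names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def fuzzy_match(left, right, threshold=0.85):
    """For each name in left, keep its best match in right if it scores high enough."""
    matches = []
    for name in left:
        best = max(right, key=lambda candidate: similarity(name, candidate))
        score = similarity(name, best)
        if score >= threshold:
            matches.append((name, best, round(score, 2)))
    return matches


# Hypothetical lists: political donors and company directors.
donors = ["Jon Smith", "ACME Holdings Ltd.", "Maria Garcia"]
directors = ["John Smith", "Acme Holdings Ltd", "Peter Jones"]

matches = fuzzy_match(donors, directors)
for donor, director, score in matches:
    print(donor, "~", director, score)
```

A lower threshold finds more candidate links but also more false positives, so every match still needs checking by hand before publication.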

Googlesheets 1 (B) Hands-on
Pamela Duncan
Data journalism introduction: overview of the seven building blocks behind data stories
The basics: use Googlesheets to carry out basic calculations and percentage increases

Googlesheets 2 (B) Hands-on
Pamela Duncan
Finding your top line: sorting and filtering in Googlesheets/Excel
Handy/fun tools: (split, concatenate, currency conversion, translate)

Googlesheets 3 (B) Hands-on
Pamela Duncan
Quick-smart data summary/analysis using pivot tables
Merging datasets (VLookUps)
Basic scraping using Google’s Import tools

Graph Databases 1. (I) Hands-on
Leila Haddou, Max Harlow
In data journalism, we tend to use relational databases – data in table form – such as Excel or SQL to do our analysis and find stories. Graph databases are different, but are incredibly useful for finding connections or patterns within our data that would be difficult, if not impossible, to spot using a relational database. This session will provide a hands-on introduction to the graph database Neo4j, show examples of its use in investigative stories including the Panama Papers, and demonstrate how to build a graph database of political donations and match them with corporate data to see at a glance the networks involved.

Own laptop required. For graph databases 1 and 2: Install Neo4j (https://neo4j.com/download).

Graph Databases 2. (I) Hands-on
Leila Haddou, Max Harlow
In part two, you will learn to analyse your newly built graph database using Cypher, Neo4j’s query language. It is advisable to have completed part one to get the most out of this session.

Own laptop required. For graph databases 1 and 2: Install Neo4j (https://neo4j.com/download).

How Can Code Help Your Journalism?
Leila Haddou, Max Harlow
This talk is an introductory primer to understanding how code is used in the newsroom, showing recent story examples, explaining the fundamental concepts in programming and demystifying the jargon.
You will learn how code is used by reporters to find stories and aid investigations, and gain a basic understanding of how computer programs are structured.
We will also provide a guide to the most common programming languages to help you identify which would suit your needs if you decide to pursue learning yourself.

No computers required

Introduction to Data Visualisation
Sophie Warnes
If you’ve ever wondered how to make well-designed charts, this session will explain exactly how. Sophie will take you on a brief tour of the history of data visualisation, cover the principles of how data is encoded into visual cues, and end with an opportunity to make your own data visualisation using the Highcharts JavaScript library.

R – 1: Introduction to R (B), Hands-on**
Caelainn Barr and Karrie Kehoe
In the first class, R-1, you’ll be shown the basics and get familiar with R and RStudio, import data and learn some functions for getting to grips with your dataset including sorting and filtering. This class assumes no prior experience with R.

R – 2: Data Wrangling and Statistics in R (A), Hands-on**
Caelainn Barr and Karrie Kehoe
In R-2 you’ll get down to some data wrangling and learn how to join datasets and carry out calculations in R that will allow you to identify trends in the data for storytelling. You’ll also learn statistical functions in R and how to use ggplot2 for basic visual analysis.

R – 3: Scraping and APIs in R (A), Hands-on**
Caelainn Barr and Karrie Kehoe
In the third and final class, R-3, you’ll use R to scrape, clean and structure data from webpages and APIs. You’ll also learn how to use R to convert, join and split difficult data files.

**If you are a complete beginner, these sessions will work best if you come to classes 1 to 3 as we will be building on knowledge and datasets from class to class. However, if you have experience in R you are free to join classes 2 and/or 3.

SQL for Journalists – 1, Hands-on***
Crina Boroş
What to do when Excel is not enough to crunch your data and hardcore coding is not your style? SQL is like Excel, but on steroids! This is the first of three workshops and will introduce you to the lingua franca of programming and a popular relational database. You’ll see what SQL does: create a database, import a spreadsheet, and learn about the main ‘select statements’.
Note: Familiarity with Excel is recommended for those wishing to attend.
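The three steps named above (create a database, import a spreadsheet, run a select statement) can be sketched as follows. The class itself uses Microsoft SQL Server; this sketch swaps in SQLite via Python's built-in sqlite3 module so that it runs anywhere, and the table, columns and figures are invented.

```python
import csv
import io
import sqlite3

# Hypothetical spreadsheet, saved as CSV.
CSV_TEXT = """council,year,spend
Camden,2018,1200
Camden,2019,1500
Islington,2018,900
Islington,2019,800
"""

# 1. Create a database (in memory here) and an empty table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spending (council TEXT, year INTEGER, spend INTEGER)")

# 2. Import the spreadsheet rows into the table.
reader = csv.DictReader(io.StringIO(CSV_TEXT))
rows = [(r["council"], int(r["year"]), int(r["spend"])) for r in reader]
conn.executemany("INSERT INTO spending VALUES (?, ?, ?)", rows)

# 3. A basic select statement: choose columns, filter rows, sort the result.
query = """
SELECT council, year, spend
FROM spending
WHERE spend > 1000
ORDER BY spend DESC
"""
big_spends = conn.execute(query).fetchall()
print(big_spends)
```

The SELECT, FROM, WHERE and ORDER BY keywords behave the same way in SQL Server, although some other details of syntax differ between database systems.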

SQL for Journalists – 2, Hands-on***
Crina Boroş
You’ll learn about the power of the Golden Query through the introduction of functions, filters and analysing data using code for reporting. You’ll also start joining tables.
Note: Familiarity with SQL ‘select statements’ is necessary, and with Excel recommended for those wishing to attend.

SQL for Journalists – 3, Hands-on***
Crina Boroş
Building on SQL 1 and 2, you’ll make tables talk to each other, clean dirty data and update tables.
Note: Familiarity with SQL ‘select statements’ is necessary, and with Excel recommended for those wishing to attend.
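As a rough sketch of what cleaning dirty data, updating tables and making tables talk to each other can look like, the example below again uses SQLite through Python's sqlite3 module rather than the SQL Server setup used in class. The tables and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE donations (donor TEXT, amount INTEGER);
    CREATE TABLE companies (name TEXT, sector TEXT);
    INSERT INTO donations VALUES
        (' acme ltd', 5000), ('Beta PLC', 2000), ('ACME LTD', 1000);
    INSERT INTO companies VALUES
        ('ACME LTD', 'Construction'), ('BETA PLC', 'Finance');
    """
)

# Clean dirty data in place: trim stray spaces and standardise capitalisation
# so that donor names line up with the company register.
conn.execute("UPDATE donations SET donor = UPPER(TRIM(donor))")

# Join the two tables on the cleaned name, then total donations per sector.
query = """
SELECT c.sector, SUM(d.amount) AS total
FROM donations AS d
JOIN companies AS c ON c.name = d.donor
GROUP BY c.sector
ORDER BY total DESC
"""
sector_totals = conn.execute(query).fetchall()
print(sector_totals)
```

Without the UPDATE step only one of the three donations would join to a company, which is why cleaning usually comes before joining.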

*** Software requirements: SQL
The classes will take place in a computer lab, but if you prefer to use your own laptop you will need:
Microsoft SQL Server Manager; Excel 2010 or newer; Notepad (classic, retro, free one for .txt)

Web Scraping for Journalists 1+2
Paul Bradshaw
In these hands-on sessions you will be introduced to some of the basic techniques to get started on scraping data for investigations:

– investigation ideas: how to spot opportunities to use scraping and automation in investigations

– scraping basics: finding structure in HTML and URLs; what’s possible with programming

– simple scraping jobs: how to write a basic scraper in five minutes

– data journalism tools: the challenges of scraping hundreds of webpages, dozens of documents, or the invisible contents of databases.
Own laptop required.
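To illustrate what finding structure in HTML means in practice, here is a minimal sketch of a scraper built on Python's standard html.parser module. It is not taken from the class materials: the TableScraper class and the sample page are invented, and the HTML is embedded as a string so the example runs without a network connection.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would download this from a URL.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Amount</th></tr>
  <tr><td>Acme Ltd</td><td>5000</td></tr>
  <tr><td>Beta PLC</td><td>2000</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # A new table row begins.
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            # A new cell begins inside the current row.
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append([cell.strip() for cell in self._row])
            self._row = None


scraper = TableScraper()
scraper.feed(SAMPLE_HTML)
header, *records = scraper.rows
print(header)
print(records)
```

The same pattern extends to many pages by looping over a list of URLs, which is where the challenges of scale mentioned above begin.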