Data-Journalism and scraping skills – Report of a meet-up

  • Isaline Mülhauser

_On Tuesday 27th September, at Liip Lausanne, we had the pleasure of welcoming Barnaby Skinner from SonntagsZeitung and Tages-Anzeiger and Paul Ronga from Tribune de Genève for a meet-up about data-journalism. You'll find the slides and further reading here._

During the summer I came across a news item written by Barnaby Skinner about a three-month course at Columbia University in New York that he was attending with Paul Ronga (from Tribune de Genève) and Mathias Born (from Berner Zeitung). The course was mainly intended for journalists, teaching them to gather data and improve their analytical skills, for example with Python, the pandas library and SQL (and combinations of the three), scraping with BeautifulSoup, and automated scraping with Selenium.

Finding the topic extremely interesting, I invited both Barnaby Skinner and Paul Ronga to Liip Lausanne to tell us more about it.

Why are Scraping Skills Important, Especially For Swiss (and Other European) Journalists, Researchers or App Developers?

You can find the slides here: Datajournalism_Presentation.

The US government, data-driven US companies, NGOs and think tanks make a great deal of data available, at least compared to Swiss and European governments or companies. That's why scraping skills are all the more valuable for Swiss journalists, researchers and app developers: in so many cases the data is actually there. It's just not structured in a way that is easily machine-readable.

Starting with the basics, we will move on to more elaborate and sophisticated scraping techniques, using examples and sharing some sample code.

By Paul Ronga and Barnaby Skinner

Paul and Barnaby started with a brief introduction and a few examples of the difference in data availability between the US and Europe. They showed us, however, that even in Switzerland some data is available. The whole point is to be able to gather it and organise it into something usable.
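To give a flavour of what "gathering and organising" can mean in practice, here is a minimal sketch of my own; it is not taken from the speakers' slides. It assumes the data sits in a plain HTML table, and the table content, the municipality names and the helper names (`TableParser`, `scrape_table`) are all invented for illustration. It uses only Python's standard library instead of BeautifulSoup, so it runs without installing anything.

```python
from html.parser import HTMLParser

# Invented sample page: a stand-in for figures published only as an HTML table.
SAMPLE_HTML = """
<table>
  <tr><th>Municipality</th><th>Population</th></tr>
  <tr><td>Exampleville</td><td>12'345</td></tr>
  <tr><td>Sampleton</td><td>6'789</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (for example whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Turn the first row into column names and the rest into records."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = scrape_table(SAMPLE_HTML)
# Swiss-style thousands separators (12'345) must go before converting to int.
for record in records:
    record["Population"] = int(record["Population"].replace("'", ""))
print(records)
```

On a real page the HTML would be downloaded first and the markup would be messier, which is where libraries such as BeautifulSoup or Selenium come in.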

They continued the presentation by showing us, live, a few tools that we can use for scraping. Then they explained some of their past and current research projects and how they managed to gather the data.

To conclude, Paul and Barnaby raised questions about the legal aspects of scraping.

I might not yet be able to carry out research like they did, but it was highly interesting to see how they proceed. It seems to be all about trying, maybe failing, and being creative with the tools at your disposal.

Further reading and information

Paul Ronga wrote a short text about the main aspects of data-journalism:

  1. Beware of data
  2. Do not rely on your tools
  3. Be cautious about anything interactive
  4. Data-journalism is a type of investigation

→ Read the full text (in French)

Barnaby Skinner suggests further reading on data-journalism in this article.

Thanks again to everyone for joining us, and especially to Paul and Barnaby for their presentation!

Please contact me if you have an idea or would like to propose a topic for a future meet-up about open data or data-journalism!

Added October 11th: Barnaby Skinner presented an example to us during the meet-up; the article is now online at Tages-Anzeiger.
