<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Kirby" -->
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom">

  <channel>
    <title>Tag: ckan &#183; Blog &#183; Liip</title>
    <link>https://www.liip.ch/fr/blog/tags/ckan</link>
    <generator>Kirby</generator>
    <lastBuildDate>Tue, 29 May 2018 00:00:00 +0200</lastBuildDate>
    <atom:link href="https://www.liip.ch" rel="self" type="application/rss+xml" />

        <description>Liip blog articles tagged &#8220;ckan&#8221;</description>
    
        <language>fr</language>
    
        <item>
      <title>The role of CKAN in our Open Data Projects</title>
      <link>https://www.liip.ch/fr/blog/the-role-of-ckan-in-our-open-data-projects</link>
      <guid>https://www.liip.ch/fr/blog/the-role-of-ckan-in-our-open-data-projects</guid>
      <pubDate>Tue, 29 May 2018 00:00:00 +0200</pubDate>
      <description><![CDATA[<h2>CKAN's Main Goal and Key Features</h2>
<p><a href="https://ckan.org/">CKAN</a> is an open source data management system whose main goal is to provide a managed data catalog for Open Data. It is mainly used by public institutions and governments. At Liip we use CKAN mainly to help governments provide their data catalog and publish data to the public in an accessible fashion. Part of our work is supporting data owners in getting their data published in the required data format. We do this by providing interfaces and usable standards that enhance the user experience on the portal and make it easier to access, read and process the data.</p>
<figure><img src="https://liip.rokka.io/www_inarticle/2ea997/bookcase.jpg" alt="bookcase"></figure>
<h3>Metadata-Catalog</h3>
<p>Out of the box, CKAN can be used to publish and manage different types of datasets. They can be grouped by organizations and topics. Each dataset can contain resources, which themselves consist of files in different formats or links to other data sources. The metadata standard can be configured to represent the standard you need, but CKAN already includes a simple and useful metadata standard that can get you started. The data is saved into a PostgreSQL database by default and is indexed using Solr.</p>
<h3>Powerful Action-API</h3>
<p>CKAN ships with an <a href="http://docs.ckan.org/en/latest/api/index.html">API</a> which can be used to browse through the metadata catalog and create advanced queries on the metadata. With authorization, the API can also be used to add, import and update data with straightforward requests.</p>
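<p>As a minimal sketch of such a query, the following Python snippet builds a <code>package_search</code> request URL for the Action API using only the standard library. The portal address, the search term and the helper name <code>build_action_url</code> are assumptions chosen for illustration. The snippet only constructs the URL and does not contact any server.</p>
<pre><code class="language-python">
from urllib.parse import urlencode

def build_action_url(portal, action, **params):
    """Return the URL of a CKAN Action API call (API version 3)."""
    base = portal.rstrip("/") + "/api/3/action/" + action
    if not params:
        return base
    # sorted() keeps the query string stable regardless of argument order
    return base + "?" + urlencode(sorted(params.items()))

# Hypothetical example: search a portal for datasets matching "energy"
url = build_action_url("https://demo.ckan.org", "package_search", q="energy", rows=5)
print(url)
</code></pre>
<p>Fetching that URL with any HTTP client should return a JSON document whose <code>success</code> flag tells you whether the call worked and whose <code>result</code> key holds the matching datasets.</p>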
<h3>CLI Commands</h3>
<p>The standard installation also includes a range of CLI commands which can be used to run different tasks. These can be very useful, e.g. to manage, automate or schedule backend jobs.</p>
<h3>Preview</h3>
<p>CKAN can be configured to show a preview for a number of different file types, such as tabular data (e.g. CSV, XLS), text data (e.g. TXT), images or PDFs. That way interested citizens can get a quick overview of the data itself without having to download it first and use local software merely to get a better idea of what the data looks like.</p>
<figure><img src="https://liip.rokka.io/www_inarticle/16fb9f/statistik-stadt-zurich-preview.png" alt="Preview of data on Statistik Stadt Zürich"></figure>
<h2>Plugins</h2>
<p>While CKAN itself acts as a CMS for data, it really shines when you make use of its extensibility and configure and develop it to fit your business needs and requirements. There is already a wide-ranging list of plugins that have been developed for CKAN, which cover a broad range of additional features or make it easier to adjust CKAN to your use cases and look and feel. A collection of most of the plugins can be found on <a href="http://extensions.ckan.org/">CKAN-Extensions</a> and on <a href="https://github.com/topics/ckanext">GitHub</a>.</p>
<p>At Liip we also help maintain a couple of CKAN's plugins. The most important ones that we use in production for our customers are:</p>
<h3>ckanext-harvest</h3>
<p>The ckanext-harvest plugin offers the possibility to export and import data. First of all, it enables you to exchange data between portals that both use CKAN.</p>
<p>Furthermore, we use this plugin to harvest data on a regular basis from different data sources. At <a href="https://opendata.swiss">opendata.swiss</a> we use two different types of harvesters. Our DCAT harvester consumes XML/RDF endpoints in the <a href="https://handbook.opendata.swiss/en/library/ch-dcat-ap">DCAT-AP Switzerland</a> format, which is enforced on the Swiss portal.</p>
<p>The Geocat harvester consumes data from <a href="https://geocat.ch">geocat.ch</a>. As the data from geocat is in the ISO-19139_che format (the Swiss version of ISO-19139), the harvester converts the data to the DCAT-AP Switzerland format and imports it.</p>
<p>Another feature of this plugin that we use is our <a href="http://opendata.swiss/catalog.xml">DCAT-AP endpoint</a>. It allows other portals to harvest our data and also serves as an example for organizations that want to build an export that can be harvested by us.</p>
<figure><img src="https://liip.rokka.io/www_inarticle/02f3fc/harvesting.png" alt="How our Harvesters interact with the different Portals"></figure>
<h3>ckanext-datastore</h3>
<p>The ckanext-datastore plugin stores the actual tabular data (as opposed to 'just' the metadata) in a separate database. With it, we are able to offer an easy-to-use API on top of the standard CKAN API to query the data and process it further. It also provides basic functionality on the resource detail page to display the data in simple graphs.</p>
<p>The datastore is the most interesting plugin for data analysts who want to build apps based on the data or analyze the data on a deeper level. This is an <a href="https://data.stadt-zuerich.ch/api/3/action/package_show?id=freibad">API example of the Freibäder dataset</a> on the portal of <a href="https://data.stadt-zuerich.ch">Statistik Stadt Zürich</a>.</p>
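<p>To give an idea of how the tabular data can be queried, here is a minimal Python sketch around the <code>datastore_search</code> action. The resource id is a placeholder, the helper names are made up for this example, and the response is a hand-written sample in the shape the API returns rather than real data from the portal.</p>
<pre><code class="language-python">
import json
from urllib.parse import urlencode

def datastore_search_url(portal, resource_id, limit=5):
    """Build a datastore_search request URL for one tabular resource."""
    query = urlencode({"resource_id": resource_id, "limit": limit})
    return portal.rstrip("/") + "/api/3/action/datastore_search?" + query

def extract_records(response_text):
    """Return the rows of a datastore_search response, or raise on failure."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("DataStore request failed")
    return payload["result"]["records"]

# Placeholder resource id; a real one is shown on the resource page of a portal
url = datastore_search_url("https://data.stadt-zuerich.ch", "RESOURCE-ID", limit=2)

# Hand-written sample in the shape of a datastore_search response (not real data)
sample = '{"success": true, "result": {"records": [{"_id": 1, "name": "A"}, {"_id": 2, "name": "B"}]}}'
rows = extract_records(sample)
print(url)
print([row["name"] for row in rows])
</code></pre>
<p>Keeping the URL building and the response parsing in separate functions makes it easy to swap in whichever HTTP client you prefer for the actual request.</p>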
<h3>ckanext-showcase</h3>
<p>We use ckanext-showcase to provide a platform for data analysts by displaying what has been built based on the data the portal is offering. There you can find a good overview of how the data can be presented in meaningful ways as statistics, used as a source in narrated videos, or even built into apps that make everyday life easier. For example, you can browse through the <a href="https://data.stadt-zuerich.ch/showcase">Showcases on the Portal of the City of Zurich</a>.</p>
<h3>ckanext-xloader</h3>
<p>ckanext-xloader is a fairly new plugin which we were able to adopt for the City of Zurich portal. It enables us to load data into the datastore automatically and asynchronously, so that the data is available there after it has been harvested.</p>
<h2>CKAN Community</h2>
<p>The CKAN core and a number of its major plugins are maintained by the CKAN core team. The developers are spread around the globe, and some of them work in companies that run their own open data portals. The community that contributes to CKAN and its plugins is always open to developers who would like to help with suggestions, report issues or provide pull requests on GitHub. It is a strong community that helps beginners, no matter their background. The <a href="https://lists.okfn.org/mailman/listinfo/ckan-dev">ckan-dev mailing list</a> provides help with developing CKAN and is also the platform for discussions and ideas about CKAN.</p>
<h2>Roadmap and most recent Features</h2>
<p>Since release 2.7, CKAN requires Redis in order to use a new system of asynchronous background jobs. This helps CKAN to be more performant and reliable. Just a few weeks ago the new release 2.8 came out. A lot of the work on this release went into driving CKAN forward by updating to a newer version of Bootstrap and deprecating old features that were holding back CKAN's progress.</p>
<p>Another rather new feature is the datatables feature for tabular data. Its intention is to help data owners describe the actual data in more detail by describing the values and how they were gathered or calculated.</p>
<p>The CKAN roadmap has many interesting features ahead. One example is the development of the CKAN Data Explorer, which is a base component of CKAN. It allows you to bring together data from any dataset in the DataStore of a CKAN instance in order to analyze it.</p>
<h2>Conclusion</h2>
<p>It is important to us to support the Open Data movement, as we see value in publishing governmental data to the public. CKAN helps us to support this cause: we work with several organizations to publish their data, and we consult our customers while we develop and improve their portals together.</p>
<p>Personally, I am happy to be a part of the CKAN community, which has always been very helpful and supportive. Helping different organizations make their data available to the public, together with the respectful CKAN community, makes it a lot of fun to contribute to the code and to the community.</p>
<figure><img src="https://liip.rokka.io/www_inarticle/6a421d/opendata-swiss-homepage.png" alt="Open Data on opendata.swiss"></figure>]]></description>
                  <enclosure url="http://liip.rokka.io/www_card_2/f90764/account-black-and-white-business-209137.jpg" length="3167986" type="image/jpeg" />
          </item>
        <item>
      <title>Make open data discoverable for search engines</title>
      <link>https://www.liip.ch/fr/blog/make-open-data-discoverable-for-search-engines</link>
      <guid>https://www.liip.ch/fr/blog/make-open-data-discoverable-for-search-engines</guid>
      <pubDate>Tue, 24 Apr 2018 00:00:00 +0200</pubDate>
      <description><![CDATA[<p>Open data portals are a great way to discover datasets and present them to the public. But they lack interoperability, which makes it even harder to search across them. Imagine that the dataset you’re looking for is just a simple “Google search” away. Historically there are <a href="http://rs.tdwg.org/dwc/index.htm">lots</a> <a href="http://www.dcc.ac.uk/resources/metadata-standards/ddi-data-documentation-initiative">and</a> <a href="https://www.w3.org/TR/vocab-data-cube/">lots</a> <a href="https://frictionlessdata.io/specs/data-packages/">of</a> <a href="https://www.iso.org/standard/26020.html">metadata</a> <a href="https://www.loc.gov/marc/">standards</a>. CKAN, as the de-facto standard, uses a model that is close to <a href="http://dublincore.org/specifications/">Dublin Core</a>, which consists of 15 basic fields to describe a dataset and its related resources.</p>
<p>In the area of Open Government Data (OGD), the metadata standard that is widely used is <a href="https://www.w3.org/TR/vocab-dcat/">DCAT</a>, especially its application profiles (“DCAT-AP”), which are specializations of the DCAT standard for certain topic areas or countries. For CKAN, the <a href="https://github.com/ckan/ckanext-dcat">ckanext-dcat</a> extension provides plugins to expose and consume DCAT-compatible data using an RDF graph. We use this extension on <a href="https://opendata.swiss/">opendata.swiss</a> and <a href="https://data.stadt-zuerich.ch">data.stadt-zuerich.ch</a>, as it provides handy interfaces to extend it to our custom data model. I’m a regular code contributor to the extension.</p>
<p>When Dan Brickley, who works for Google, <a href="https://github.com/ckan/ckanext-dcat/issues/75">opened an issue on the DCAT extension</a> about implementing schema.org/Dataset for CKAN, I was very excited. I only learned about it in December 2017 and thought it would be a fun feature to implement over the holidays. But what exactly was Dan suggesting?</p>
<p>With ckanext-dcat we already have the bridge from our relational (“database”) model to a graph (“linked data”). This is a huge step that enables new uses of our data. Remember the 5 star model of Sir Tim Berners-Lee?</p>
<figure><img src="https://liip.rokka.io/www_inarticle/5a61db/5-star-steps.jpg" alt="5 star model describing the quality of the data"></figure>
<p>Source: <a href="http://5stardata.info/en/">http://5stardata.info/en/</a>, CC-Zero</p>
<p>So with our RDF, we have already reached 4 stars! Now imagine a search engine that takes all those RDF documents, is able to search in them and eventually is even able to connect them together. This is where schema.org/Dataset comes in. Based on the request from Dan, I built a <a href="https://github.com/ckan/ckanext-dcat/pull/108">feature in ckanext-dcat</a> that maps the DCAT dataset to a schema.org/Dataset. By default it returns the data as <a href="https://json-ld.org/">JSON-LD</a>.</p>
<p>Even if you’ve never heard of JSON-LD, chances are that you’ve used it. Google is promoting it under the keyword <a href="https://developers.google.com/search/docs/guides/intro-structured-data">Structured Data</a>. At its core, JSON-LD is a JSON representation of an RDF graph, and Google is pushing this standard forward to enable all kinds of “semantic web” applications. The goal is to let a computer understand the content of a website or any other content that has JSON-LD embedded.<br />
And in the future, Google wants to have a better understanding of the concept of a <a href="https://developers.google.com/search/docs/data-types/dataset">“dataset”</a>, or to put it in the words of Dan Brickley:</p>
<blockquote>
<p>It's unusual for Google to talk much about search feature plans in advance, but in this case I can say with confidence &quot;we are still figuring out the details!&quot;, and  that the shape of actual real-world data will be a critical part of that. That is why we put up the documentation as early as possible. If all goes according to plan, we will indeed make it substantially easier for people to find datasets via Google; whether that is via the main UI or a dedicated interface (or both) is yet to be determined. Dataset search has various special challenges which is why we need to be non-comital on the details at the stage, and why we hope publishers will engage with the effort even if it's in its early stages...</p>
</blockquote>
<p>This feature is deployed on the CKAN demo instance, so let’s look at an example. I can use the API to get a dataset as JSON-LD. For the dataset <a href="https://demo.ckan.org/dataset/energy-in-malaga">Energy in Málaga</a>, I can build the URL like this:</p>
<ul>
<li>Append “.jsonld”</li>
<li>Specify “schemaorg” as the profile (i.e. the format of the mapping)</li>
</ul>
<p>Et voilà: <a href="https://demo.ckan.org/dataset/energy-in-malaga.jsonld?profiles=schemaorg">https://demo.ckan.org/dataset/energy-in-malaga.jsonld?profiles=schemaorg</a></p>
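<p>The two steps above can be sketched in a few lines of Python. The helper name <code>jsonld_url</code> is made up for this example, and the default profile is simply the one used in this post.</p>
<pre><code class="language-python">
from urllib.parse import urlencode

def jsonld_url(dataset_url, profile="schemaorg"):
    """Append the .jsonld suffix and the profiles parameter to a dataset URL."""
    return dataset_url.rstrip("/") + ".jsonld?" + urlencode({"profiles": profile})

print(jsonld_url("https://demo.ckan.org/dataset/energy-in-malaga"))
</code></pre>
<p>For the demo dataset this prints exactly the URL shown above.</p>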
<p>This is the result as JSON-LD:</p>
<pre><code class="language-json">
{
  "@context": {
    "adms": "http://www.w3.org/ns/adms#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "gsp": "http://www.opengis.net/ont/geosparql#",
    "locn": "http://www.w3.org/ns/locn#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "time": "http://www.w3.org/2006/time",
    "vcard": "http://www.w3.org/2006/vcard/ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#"
  },
  "@graph": [
    {
      "@id": "https://demo.ckan.org/dataset/c8689e49-4fb2-43dd-85dd-ee243104a2a9",
      "@type": "dcat:Dataset",
      "dcat:contactPoint": {
        "@id": "_:N71006d3e0205458db0cc7ced676f91e0"
      },
      "dcat:distribution": [
        {
          "@id": "https://demo.ckan.org/dataset/c8689e49-4fb2-43dd-85dd-ee243104a2a9/resource/c3c5b857-24e7-4df7-ae1e-8fbe29db93f3"
        },
        {
          "@id": "https://demo.ckan.org/dataset/c8689e49-4fb2-43dd-85dd-ee243104a2a9/resource/5ecbfa6c-9ea0-4f5f-9fbe-eb39964c0f7f"
        },
        {
          "@id": "https://demo.ckan.org/dataset/c8689e49-4fb2-43dd-85dd-ee243104a2a9/resource/b74584c7-9a9a-4528-9c73-dc23b29c084d"
        }
      ],
      "dcat:keyword": [
        "energy",
        "málaga"
      ],
      "dct:description": "Some energy related sources from the city of Málaga",
      "dct:identifier": "c8689e49-4fb2-43dd-85dd-ee243104a2a9",
      "dct:issued": {
        "@type": "xsd:dateTime",
        "@value": "2017-06-25T17:02:11.406471"
      },
      "dct:modified": {
        "@type": "xsd:dateTime",
        "@value": "2017-06-25T17:05:24.777086"
      },
      "dct:publisher": {
        "@id": "https://demo.ckan.org/organization/f0656b3a-9802-46cf-bb19-024573be43ec"
      },
      "dct:title": "Energy in Málaga"
    },
    {
      "@id": "https://demo.ckan.org/organization/f0656b3a-9802-46cf-bb19-024573be43ec",
      "@type": "foaf:Organization",
      "foaf:name": "BigMasterUMA1617"
    },
    {
      "@id": "https://demo.ckan.org/dataset/c8689e49-4fb2-43dd-85dd-ee243104a2a9/resource/b74584c7-9a9a-4528-9c73-dc23b29c084d",
      "@type": "dcat:Distribution",
      "dcat:accessURL": {
        "@id": "http://datosabiertos.malaga.eu/recursos/energia/ecopuntos/ecoPuntos-23030.csv"
      },
      "dct:description": "Ecopuntos de la ciudad de málaga",
      "dct:format": "CSV",
      "dct:title": "Ecopuntos"
    },
    {
      "@id": "https://demo.ckan.org/dataset/c8689e49-4fb2-43dd-85dd-ee243104a2a9/resource/c3c5b857-24e7-4df7-ae1e-8fbe29db93f3",
      "@type": "dcat:Distribution",
      "dcat:accessURL": {
        "@id": "http://datosabiertos.malaga.eu/recursos/ambiente/telec/201706.csv"
      },
      "dct:description": "Los datos se corresponden a la información que se ha decidido historizar de los sensores instalados en cuadros eléctricos de distintas zonas de Málaga.",
      "dct:format": "CSV",
      "dct:title": "Lecturas cuadros eléctricos Junio 2017"
    },
    {
      "@id": "https://demo.ckan.org/dataset/c8689e49-4fb2-43dd-85dd-ee243104a2a9/resource/5ecbfa6c-9ea0-4f5f-9fbe-eb39964c0f7f",
      "@type": "dcat:Distribution",
      "dcat:accessURL": {
        "@id": "http://datosabiertos.malaga.eu/recursos/ambiente/telec/nodos.csv"
      },
      "dct:description": "Destalle de los cuadros eléctricos con sensores instalados para su gestión remota.",
      "dct:format": "CSV",
      "dct:title": "Cuadros eléctricos"
    },
    {
      "@id": "_:N71006d3e0205458db0cc7ced676f91e0",
      "@type": "vcard:Organization",
      "vcard:fn": "Gabriel Requena",
      "vcard:hasEmail": "gabi@email.com"
    }
  ]
}</code></pre>
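<p>To make the structure of such a graph more tangible, here is a minimal Python sketch that resolves the distribution references of a dataset node. The input is a trimmed, hand-written excerpt in the shape of the response above, with shortened placeholder identifiers, and the function name <code>list_distributions</code> is an assumption for illustration.</p>
<pre><code class="language-python">
import json

# Trimmed, hand-written excerpt in the shape of the response above;
# the example.org identifiers are shortened placeholders, not real URLs.
doc = json.loads("""
{
  "@graph": [
    {
      "@id": "https://example.org/dataset/1",
      "@type": "dcat:Dataset",
      "dct:title": "Energy in Málaga",
      "dcat:distribution": [
        {"@id": "https://example.org/dataset/1/resource/a"},
        {"@id": "https://example.org/dataset/1/resource/b"}
      ]
    },
    {
      "@id": "https://example.org/dataset/1/resource/a",
      "@type": "dcat:Distribution",
      "dct:title": "Ecopuntos",
      "dct:format": "CSV"
    },
    {
      "@id": "https://example.org/dataset/1/resource/b",
      "@type": "dcat:Distribution",
      "dct:title": "Cuadros eléctricos",
      "dct:format": "CSV"
    }
  ]
}
""")

def list_distributions(document):
    """Resolve the distribution references of every dataset in the graph."""
    # Index all nodes by their identifier so references can be looked up
    nodes = {node["@id"]: node for node in document["@graph"]}
    result = {}
    for node in document["@graph"]:
        if node.get("@type") != "dcat:Dataset":
            continue
        refs = node.get("dcat:distribution", [])
        result[node["dct:title"]] = [nodes[ref["@id"]]["dct:title"] for ref in refs]
    return result

print(list_distributions(doc))
</code></pre>
<p>The sketch assumes that the distributions are always serialized as a list. A more robust version would also handle a dataset whose single distribution is given as one object.</p>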
<p>Google even provides a <a href="https://search.google.com/structured-data/testing-tool#url=https%3A%2F%2Fdemo.ckan.org%2Fdataset%2Fenergy-in-malaga.jsonld%3Fprofiles%3Dschemaorg">Structured Data Testing Tool</a> where you can submit a URL and it will tell you if the data is valid.</p>
<p>Of course, knowing the CKAN API is good if you’re a developer, but it is not really the way to go if you want a search engine to find your datasets. So the JSON-LD that you can see above is already embedded on the <a href="https://demo.ckan.org/dataset/energy-in-malaga">dataset page</a> (check out the <a href="https://search.google.com/structured-data/testing-tool#url=https%3A%2F%2Fdemo.ckan.org%2Fdataset%2Fenergy-in-malaga">testing tool with just the dataset URL</a>). So if you have enabled this feature, every time a search engine visits your portal, it’ll get structured information about the dataset it crawls instead of simply the HTML of the page. <a href="https://github.com/ckan/ckanext-dcat#structured-data">Check the documentation</a> for more information, but most importantly: if you’re running CKAN, give it a try!</p>
<p>By the way: if you already have a custom profile for ckanext-dcat (e.g. for a DCAT application profile), check my current <a href="https://github.com/opendata-swiss/ckanext-switzerland/pull/177">pull request mapping DCAT-AP Switzerland to schema.org/Dataset</a>.</p>]]></description>
                  <enclosure url="http://liip.rokka.io/www_card_2/b0552b/23165561351-078fc782cd-k.jpg" length="1088440" type="image/jpeg" />
          </item>
    
  </channel>
</rss>
