User demand driven and machine-readable open data

Open data is undergoing a paradigm change: the focus is shifting to user-demand-driven publication of data in machine-readable formats, with open standards and licenses that are appropriate for the application area. This is often referred to as “liquid information” or “liquid data”, terms described in a 2013 report from McKinsey. The report addresses the potential value that can be achieved if standards, formats and metadata are functional for their intended use. Open data 2.0 is another emerging term; it refers to data that is made available based on demand and that provides means for participation and collaboration, where users can suggest improvements and give feedback on flawed data.

When the phenomenon of open data appeared on the agenda in the mid-2000s, there was an idea that all data would be valuable if only it were open and available. Much of the data published back then was internal data made available in its original format, without common standards and formats. This could be tabular data in PDF files, scanned documents, data in spreadsheets, summarised reports that lacked underlying details, and so on.

From an innovation perspective this data did not prove to be particularly valuable, because it is difficult to apply and to combine with other data; in other words, the data are non-liquid. Part of the problem is that data from internal IT systems contain attributes tied to internal operations, which are not easily translated or understood by people outside the organisation. To make data more useful, there is often a need to harmonise it and make it available in viable standards and formats, with metadata that explains its properties. Harmonising internal data for external use usually comes at a cost. If this cost is not carried by the data owner, it is shifted to the users, who must transform the data to machine-readable formats and standards before they can use it.
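To make the harmonisation step concrete, here is a minimal sketch in Python. It assumes a hypothetical internal export whose column names (`OBJ_NR`, `AREA_KD`, `ANT`) only make sense inside the organisation, and a hypothetical mapping to documented public field names. All names and values are invented for illustration and do not come from any real system or standard.

```python
import json

# Hypothetical mapping: internal attribute -> (public name, type, description).
FIELD_MAP = {
    "OBJ_NR": ("id", str, "Identifier of the record"),
    "AREA_KD": ("area_code", str, "Code of the geographic area"),
    "ANT": ("count", int, "Number of observations"),
}


def harmonise(rows, field_map):
    """Rename internal attributes, cast values, and attach field metadata."""
    fields = [
        {"name": public, "type": cast.__name__, "description": text}
        for public, cast, text in field_map.values()
    ]
    records = [
        {
            public: cast(row[internal])
            for internal, (public, cast, _) in field_map.items()
        }
        for row in rows
    ]
    return {"fields": fields, "records": records}


# Invented sample rows, as they might come out of an internal system.
internal_rows = [
    {"OBJ_NR": "17", "AREA_KD": "A01", "ANT": "42"},
    {"OBJ_NR": "18", "AREA_KD": "B02", "ANT": "7"},
]

published = harmonise(internal_rows, FIELD_MAP)
print(json.dumps(published, indent=2))
```

When the data owner does this once, the cost is paid a single time. Otherwise every user has to rebuild a mapping like `FIELD_MAP` on their own, often without knowing what the internal attributes mean.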

“… and according to European data portal only 26 percent of the total datasets on the portal are machine readable.” – Article from 9 Dec 2016.

User-demand-driven open data (open data 2.0) and liquid data refer to machine-readable data in internationally well-known formats and standards, with appropriate metadata, which enables the data to be used by other IT systems and combined with other data. Geographic information systems (GIS) are an example of an area where data were made appropriate for their application area early on, and where the actors involved realised the strategic importance of collaborating nationally and internationally to enable reuse of data.
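As a rough illustration of what "can be combined with other data" means in practice, the sketch below joins two small CSV datasets on a shared, standardised key. The datasets, column names and values are invented for the example; the point is only that a common identifier and a machine-readable format make the join a few lines of code.

```python
import csv
import io

# Two hypothetical datasets from different publishers, keyed by the same code.
POPULATION_CSV = """area_code,population
A01,1000
B02,2500
"""

STATIONS_CSV = """area_code,charging_stations
A01,12
B02,30
C03,5
"""


def read_keyed(text, key):
    """Parse CSV text into a dict of rows indexed by the given key column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def combine(left, right):
    """Inner join: keep keys present in both datasets and merge their columns."""
    return [{**left[k], **right[k]} for k in sorted(left.keys() & right.keys())]


population = read_keyed(POPULATION_CSV, "area_code")
stations = read_keyed(STATIONS_CSV, "area_code")
combined = combine(population, stations)

for row in combined:
    print(row)
```

Area `C03` is dropped because it appears in only one dataset. With scanned documents or tables embedded in PDF files, the same join would first require manual extraction and a guess at which identifiers correspond.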

In the best of worlds all IT systems would use common data formats and standards and include a detailed metadata description, which would make data comprehensible for both external and internal users. In reality, building an IT system is usually more complicated and messy, because the system is continuously developed and adapted to the business it supports over a long period of time, which often leads to a fragmented system with several ad hoc solutions. There is emerging evidence suggesting that the strategy of only opening up the data, without taking into account the area of application, external demand and interaction with intended users, does not lead to the sought-after results. To achieve better efficacy, participation and collaboration need to be included to make open data more useful.

Several research articles have begun to shed light on the need to address principles of participation and collaboration in order to make better use of open data, by striking a balance between different value dimensions depending on the application area and the user group. In her doctoral thesis The Sustainable Value of Open Government Data (2015), Thorhildur Hansdottir Jetzek defines liquid open data as “data that are available online, free-of-charge and under an open access license, published in machine-readable formats, easily discoverable, accessible and conceptually coherent. Liquid open data can be re-used without discrimination or limitation, linked to other data and streamed across systems”.

The table (Jetzek, 2015, p. 45) is from the dissertation and articulates seven value dimensions of liquid open data that affect the ability to make data useful. It is based on the premise that data owners have limited resources at their disposal and need to prioritise between these dimensions to make data suitable for the intended user group and application area.

Seven dimensions of liquid open data - Jetzek 2015

Source: The Sustainable Value of Open Government Data – Thorhildur Hansdottir Jetzek 2015