Open : Data : Cooperatives – Synopsis

This is a synopsis of the meeting held in Berlin that forms the basis of the upcoming Open : Data : Cooperation event on the 20th October 2014.

Open : Data : Cooperatives

On the evening of the 16th July 2014, in a small bar on Singerstraße in Berlin, a group of Open Knowledge Festival attendees came together to discuss whether cooperatives offer the potential to create formalised structures for the creation and sharing of common data assets, and whether this would enable the creation of value for their stakeholders. This discussion sets the framework for an event that will take place in Manchester, UK on the 20th October 2014.

The discussion was initially broken down into seven themes:

  • Models: How do the varied models of cooperative ownership fit data, and do new forms of cooperative and commons-based structure offer potential solutions?
  • Simplicity: Can one model fit all data, or do different scenarios need tailored solutions?
  • Transparency: How can a cooperative that is steered by its membership along ethical grounds also be considered open?
  • Representation: Do individuals have enough control over their data to enable third-party organisations, such as a cooperative, to represent their data?
  • Negotiation: How can cooperative members balance control over their data with use by third parties?
  • Governance: Is it possible to create an efficient system of governance that respects the wishes of all members?
  • Mechanisms of transaction: Can a data cooperative exist within a federated cooperative structure, and how would it transact and create value?

This is a synopsis of the discussion.

Why create a data cooperative?

Our modern, technologised society runs on data. Our everyday interactions leave a trace that is often invisible and unknown to us. The services we interact with, the daily transactions we make and the way we navigate our everyday lives all generate data, building a picture of who we are and what we do. This data also enables aggregators to predict, personalise and intervene seamlessly and sometimes invisibly. Even for the most technically literate, keeping track of what we do and don't give away is daunting. There is a need to stem the unbridled exploitation of personal data by both public and private organisations, to empower individuals to have more control over the data they create, and for people to have more of a say in the services that are built upon and informed by this data. Data cooperatives may help rebalance the relationship between those that create data and those that seek to exploit it, whilst also creating the environment for fair and consensual exchange.

Structure

Cooperation for the creation of common good is a widely understood concept, and in a world where value is often extracted by large organisations with opaque processes and ethics, cooperatives are starting to be seen as a way of reinvigorating value transactions within smaller, often under-represented communities of interest, and between organisations that create and use data.

Existing data cooperatives are not easy to find. The Good Data, which allows people to control data flow at the browser level, and the Swiss-based Health Bank are two known examples, and as the principles of data custodianship for social good become better understood there is little reason to doubt that more will develop.

There are organisations that exhibit cooperative traits but may not themselves be cooperatives or co-owned structures. OpenStreetMap (OSM) is a resource that is essentially created and administered by the community, with the underlying motivation being the common good. The open source movement was cited as the largest example of technological cooperativism, although the largest platform on which cooperative endeavour is expressed (GitHub) is a privately owned Silicon Valley entity.

There are many forms of cooperative. These have traditionally grown out of the needs of the members who subscribe to them. Such cooperatives have generally been organised around a single class of member – workers, producers, consumers, etc. The single-class structure, although creating an equitable environment for the members of a particular cooperative, can tend towards self-interest, and although members may be bound by the notion of the common good, the mechanism for creating that common good or commons is seldom explicit.

Internationally, new forms of cooperative that explicitly express the development of common good across multiple classes of stakeholder are more abundant. Social cooperatives in Italy and solidarity cooperatives in Canada often provide services such as healthcare, education and social care. Could these types of cooperative be more relevant for our networked and distributed age?

Michel Bauwens, founder of the P2P Foundation, talks about the creation of these new forms of cooperative, and about how it is necessary to wean ourselves off the notion of cooperativism as a means of participation in a capitalist economy, towards one that builds a commons, both material and immaterial. This commons would be subscribed to by other commons-creating entities and licensed to non-commons-creating organisations.

Would a data cooperative necessarily adopt these newer forms of distributed, commons-creating structure? There appears to be a consensus that commons-creating, multi-stakeholder cooperatives are positive, but is this model easily understood? And can individual circumstances, especially when dealing with communities based around sensitive issues, create an environment for sharing beyond a single class? A single-class cooperative may seem a simpler, more immediate solution for a community of people who have specific needs and issues, and where strong trust relationships need to be maintained.

It is understood that personal data empowerment is not just about selling data to the highest bidder, and any organisation acting as a data intermediary would need to accommodate the complex reasons why people donate or give. Even though economic gain might seem an obvious attraction, motivations are more complex, and financial incentives can often be detrimental to the process of participation and giving.

From The Good Data’s perspective, data cooperatives should split the data layer from the service layer. The cooperative should control the data layer and enable or choose others to build the service layer, as data cooperatives are unlikely to have the capacity or expertise to create end-to-end solutions.
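The data/service split described above might be sketched as follows. This is a purely hypothetical illustration in Python, not The Good Data's actual architecture; the `DataLayer` and `StepCountService` names and the consent model are invented for the example:

```python
# Hypothetical sketch: the cooperative controls the data layer; a third
# party builds a service on top, and can only read fields a member has
# consented to share.

class DataLayer:
    """Held and governed by the cooperative."""

    def __init__(self):
        self._records = {}   # member_id -> {field: value}
        self._consents = {}  # member_id -> set of shareable fields

    def store(self, member_id, record, shareable_fields):
        self._records[member_id] = dict(record)
        self._consents[member_id] = set(shareable_fields)

    def read(self, member_id, fields):
        """Return only the requested fields the member has consented to."""
        allowed = self._consents.get(member_id, set())
        record = self._records.get(member_id, {})
        return {f: record[f] for f in fields if f in allowed and f in record}


class StepCountService:
    """A third-party service built on the data layer, not part of it."""

    def __init__(self, data_layer):
        self.data = data_layer

    def weekly_average(self, member_id):
        r = self.data.read(member_id, ["steps"])
        return sum(r.get("steps", [])) / 7
```

The point of the split is that the service layer never sees the raw records: consent is enforced once, at the data layer, regardless of who builds services on top.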

The structure of the data cooperative should encourage maximum participation and consent, although 100% participation and engagement is unrealistic. Flat structures tend towards hierarchy through operational efficiency and founder endeavour. Even if the majority of members align with the aims of the cooperative, they will not necessarily want to be constantly encumbered with the burden of governance.

A certain pragmatism and sensitivity is needed regarding the model of cooperative a group may want to adopt. There are examples of communities maintaining informality in order to be less burdened by expectation, to maintain independence or to minimise liability. Advocates of data cooperatives need to be sensitive to this.

Purpose

Data cooperatives need a simplicity of purpose. What do they do, for whom and why? Is building a data cooperative around a particular issue enough? Or should the data cooperative be seen as a platform that allows the representation of personal data across a broader portfolio of interests?

Although there is a tendency to see a data cooperative as a mechanism for generating bulk, high-worth data that can then be used to draw down value from large organisations, a more appropriate application might be enabling a smaller community of interest, perhaps around a particular health condition, to draw down certain services or to negotiate a better deal. Withholding data from public service providers might be seen as detrimental to the delivery of those services, but it could also create a more balanced decision-making process. It is also known that many service providers collect more data than they actually need for delivery. Empowering people to take more control over their data may curtail the practice of excessive data gathering.

Data literacy

Ideally, for a data cooperative to be most effective, the level of data literacy amongst members would need to be raised so that members could make more informed decisions about what data is given away or used. This ideal might be difficult to achieve without a broader awareness-raising campaign about the power of personal data. Edward Snowden's revelations about how security agencies collect data were sensational, and although they highlighted that we unintentionally give away a lot, they didn't build a wider popular discourse around the protection and usage of personal data.

Raising the level of data awareness amongst cooperative members would create more informed decision making, but this task would need to be delivered in a nuanced way, and ultimately some people might not engage. This could be the case for people who are dependent on services and have little power or real choice in their decisions.

For a data cooperative to represent its membership and control the flow of data, it needs to have legitimacy, to know and understand the data assets of the membership, and to have the authority to negotiate with those data assets on the members' behalf.

Decisions around data sharing and understanding the potential consequences are difficult and complex. As an intermediary the cooperative would need to ensure that individual members were able to give informed consent. We have to know what we have and what it does for us, in order to utilise it.

Mechanisms of consent

Mechanisms for the creation of consent already exist; by and large they create the environment for proxy voting in decision-making processes. In Liquid Feedback – popularised by the Pirate Parties – an individual bestows voting rights on a proxy who aligns with their position, a representative democracy process in which the 'liquid' element allows proxy rights to be revoked at any point in the decision-making process. Other mechanisms might follow the lines of the Platform for Privacy Preferences (P3P) initiative developed by the W3C, which sought to create privacy policies that could be understood by browsers but was ultimately considered too difficult to implement. A potentially easier solution might work on the basis of preset preferences based on trusted individuals, or the creation of archetype- or persona-based preferences that people can select.
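A toy sketch of the two ideas above – revocable proxy delegation in the style of Liquid Feedback, and preset persona-based sharing preferences – might look like this. The `Member` class and the `PERSONAS` profiles are invented for illustration and are not the API of Liquid Feedback or any real system:

```python
# Hypothetical consent mechanisms: preset persona preferences, plus
# delegation to a trusted proxy that can be revoked at any time.

PERSONAS = {  # invented preset preference profiles
    "open":     {"research": True,  "commercial": True},
    "cautious": {"research": True,  "commercial": False},
    "private":  {"research": False, "commercial": False},
}

class Member:
    def __init__(self, name, persona="cautious"):
        self.name = name
        self.prefs = dict(PERSONAS[persona])
        self.proxy = None  # another Member trusted to decide on our behalf

    def delegate(self, proxy):
        self.proxy = proxy

    def revoke(self):
        """The 'liquid' element: delegation can be withdrawn at any point."""
        self.proxy = None

    def consents_to(self, purpose):
        if self.proxy is not None:
            return self.proxy.consents_to(purpose)
        return self.prefs.get(purpose, False)
```

While a member has delegated, the proxy's preferences answer consent questions for them; revoking the delegation immediately restores the member's own preset.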

Can one organisation represent the broader range of ethical positions held within a membership structure? For practical reasons the data cooperative might have a high-level ethical policy, while individuals within the cooperative are empowered to make data sharing choices based on their personal ethical standpoint. This could be enabled by proxy or preset data sharing preferences.

The alternative to data cooperatives with high-level ethical aims that also represent multiple ethical standpoints could be smaller federated or distributed niche organisations, where individuals allow the organisation to use their data on their behalf.

Right to personal data

In order for an individual to allow an organisation to use data on their behalf, they need control over their personal data. Legislation in many countries offers a framework for how personal data is used and shared amongst organisations, but this doesn't necessarily create a mechanism that allows users to retrieve their data and use it for other purposes. Often, within the End User License Agreement (EULA) or Terms of Service that come with software products, an individual may find that their data is inextricably tied up with the function of the service. A function of a data cooperative might be to help individuals understand these agreements and add to the commons of knowledge about them.

How would the argument for greater individual data rights be made when service providers see personal data mediated through their product as part of their intellectual property? Work has been done through the midata initiative and the development of personal data passports, where individuals grant rights to organisations to use data for the delivery of a service. The UK Government has supported this initiative, but has backed away from underpinning the programme with changes in legislation. This lack of regulatory enforcement may limit the efficacy of any initiative that seeks to grant individuals rights and agency over their data.

The development of a personal data licence may aid the creation of data cooperatives but the form of the licence and the mechanism for compliance might be weakened without an underpinning regulatory framework. At present there is a certain level of cynicism around voluntary codes of practice where power imbalances exist between stakeholders. The lack of legislation might also create a chilling effect on the ability of data cooperatives to gain the trust of their membership.

Data empowerment is promoted in Project VRM (Vendor Relationship Management), developed by Doc Searls at Harvard University. The ability of an individual to control their data is an integral component of developing an open market for personal-data-based services, theoretically giving more choice. The criticism voiced about midata and Project VRM is that they are too individualistic and focus on economic rather than social transactions with ethical aims. Even with these criticisms, the development of a market logic that enables large organisations to engage with the process of individual data empowerment might be beneficial for the long-term aims of data cooperatives, and for the development of innovative services for social good.

Ultimately, if individuals are unable to control their data, or the data derived from them, the function of the cooperative would be inhibited.

Creating value from data

Scale could dictate the eventual form of the data cooperative. Many potential clients of a data cooperative might require scale, which could mean building a data asset of upwards of 500,000 users. The Good Data cooperative aims to achieve this scale in order to become viable.

A challenge that all data cooperatives would face is how to maintain a relationship with their membership such that services based upon, or value extracted from, the data are not subject to unforeseen supply-side problems. If a data cooperative represented its membership and entered into licensing relationships with third-party organisations on its behalf, what would be reasonable for a client to expect, especially if individual members had the right to revoke access to data at any time? With larger-scale data cooperatives this may not be too much of a problem, as scale has the potential to dampen unforeseen effects. The Good Data proposes to get around these issues by only holding data for a limited amount of time, essentially minimising disruptions in data supply by creating a buffer.
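The time-limited holding described above might be pictured as a retention buffer. This is an assumption-laden sketch, not The Good Data's implementation; the `RetentionBuffer` class and its behaviour are invented to show how a rolling window can smooth supply while still honouring immediate revocation:

```python
import time

class RetentionBuffer:
    """Hypothetical buffer that only holds data for a limited time."""

    def __init__(self, retention_seconds):
        self.retention = retention_seconds
        self._items = []  # list of (timestamp, member_id, datum)

    def add(self, member_id, datum, now=None):
        ts = now if now is not None else time.time()
        self._items.append((ts, member_id, datum))

    def available(self, now=None):
        """Data still inside the retention window; older items lapse."""
        now = now if now is not None else time.time()
        self._items = [(t, m, d) for (t, m, d) in self._items
                       if now - t < self.retention]
        return [d for (_, _, d) in self._items]

    def revoke(self, member_id):
        """A member withdraws consent: their data leaves the buffer at once."""
        self._items = [(t, m, d) for (t, m, d) in self._items
                       if m != member_id]
```

Because data lapses automatically and revocation removes it immediately, a client sees a continuously refreshed pool rather than a permanent asset that individual withdrawals would erode unpredictably.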

Smaller-scale data cooperatives, especially ones created around single issues, may have difficulty engaging in activity that requires service guarantees. Developing a mechanism for federation, cumulatively creating data at scale, might be a potential solution, but creating a federated system of consent may be more difficult to achieve. As suggested previously, economic activity might be a low priority for such organisations, where the main purpose might be to represent members and create the environment for informed service decisions.

How federated data cooperatives would interact remains undefined. It has been noted that building distributed and federated systems is difficult, and that centralised systems persist due to operational efficiencies. The advent of alternative forms of 'blockchain' transaction could enable distributed organisations to coexist using 'rules-based' or algorithmic democracy. But alternative transaction systems and currencies often face challenges when they interface with dominant and established forms of currency and value.

How data cooperatives could practically use these new mechanisms for exchange needs to be explored.

Attendees:

Reuben Binns
Mark Braggins
Alex Fink
Steven Flower
Robin Gower
Frank Kresin
Marcos Menendez
Annemarie Naylor
Julian Tait
Kristof van Tomme
Ben Webb

Licensing – Why it is so important

This blog post was originally written for FutureEverything as part of their Open Data Cities programme.

I’m no expert but I really need to be – Licensing

Licensing is a subject that comes up a lot with open data. The licence is a key component of a dataset: it defines use and liability, and it shapes what innovation will come from a data release.

As mentioned in the title I am no expert in this area and I would appreciate any correction or amendments to my understanding.

Traditionally, public data has been closed, so the only way to get access to data to build products was to buy a licence. In many cases these licences were expensive and restrictive. To mitigate this cost, the licence would often have some level of service agreement built in: you paid for the licence to the data, and the data provider would provide you with a level of continuity and support. This helps to limit risk and encourage investment in a product.

The closed ‘paid licence’ system generally has a high barrier to entry (the price of the licence), limiting the number of innovative products developed. Innovation ecosystems thrive on many ideas being tried, with most failing; if the price of failure is too high, it can have a chilling effect on the whole system.

One of the first licences used for the release of open data was Creative Commons CC-BY-SA. This licence allowed people to create services and products off the back of the data as long as they attributed where the data came from and shared back any data created from the originally released dataset (value-added data). The original Creative Commons licences were devised as an answer to restrictive copyright laws relating to ‘works’ – articles, text, images, music etc. – as these were deemed increasingly anachronistic in the digital age. It is up for discussion whether data can be deemed a ‘work’ in the context of this licence.

The Open Database Licence (ODbL) developed by Open Data Commons, was created to address the doubt that data could be seen as a ‘work’. It carries the same attribution and share alike clauses and is used by many datastores including the newly opened Paris Datastore.

Anyone can develop products and services that use datasets under these licences, but intellectual property doesn’t extend to the value-added datasets created in the process. Releasing value-added datasets back to the community allows further innovative products to be built on top of them, so the pace of innovation could potentially increase – analogous to the ‘standing on the shoulders of giants’ idea.

Imposing share-alike conditions on value-added data might, however, chill the development of the products that create such data.

With the above licences there is generally no liability or guarantee of service from data providers. This creates a greater-risk scenario: if you were investing in product development, this is potentially a source of concern and may be an inhibiting factor.

In the UK we have the recently released Open Government Licence, developed specifically for government data. It borrows from aspects of the CC-BY-SA licence and the ODbL, but unlike those licences there is no need to share back value-added data.

Would this have any impact on products and services developed from open data? Again, the licence offers no liability or guarantee of service from the data provider, but the developing organisation keeps all the rights to the products and services it develops, including value-added datasets.
The advantage could be that allowing people to keep the rights to the products they develop mitigates the risk posed by the lack of liability and guarantees. The main disadvantage could be that the pace of innovation is curtailed by people having to replicate processes and value-added datasets.