Open : Data : Cooperatives – Synopsis

This is a synopsis of the meeting held in Berlin that forms the basis of the upcoming Open : Data : Cooperation event on the 20th October 2014.

Open : Data : Cooperatives

On the evening of the 16th July 2014, in a small bar on SingerStraße in Berlin, a group of Open Knowledge Festival attendees came together to discuss whether cooperatives offered the potential to create formalised structures for the creation and sharing of common data assets, and whether this would enable the creation of value for their stakeholders. This discussion sets the framework for an event that will take place in Manchester, UK on the 20th October 2014.

The discussion was initially broken down into seven themes:

  • Models: How do the varied models of cooperative ownership fit to data, and do new forms of cooperative and commons based structure offer potential solutions?
  • Simplicity: Can one model fit all data, or do different scenarios need tailored solutions?
  • Transparency: How can a cooperative that is steered by its membership along ethical grounds also be considered open?
  • Representation: Do individuals have enough control over their data to enable third-party organisations, such as a cooperative, to represent their data?
  • Negotiation: How can cooperative members balance control over their data with use by third parties?
  • Governance: Is it possible to create an efficient system of governance that respects the wishes of all members?
  • Mechanisms of transaction: Can a data cooperative exist within a federated cooperative structure, and how would it transact and create value?

What follows is a synopsis of the discussion.

Why create a data cooperative?

Our modern, technologised society exists on data. Our everyday interactions leave a trace that is often invisible and unknown to us. The services that we interact with, the daily transactions that we make and the way we navigate our everyday lives all generate data, building a picture of who we are and what we do. This data also enables aggregators to predict, personalise and intervene seamlessly and sometimes invisibly. Even for the most technically literate, keeping track of what we do and don’t give away is daunting. There is a need to stem the unbridled exploitation of personal data by both public and private organisations, to empower individuals to have more control over the data they create, and for people to have more of a say in the services that are built upon and informed by this data. Data cooperatives may help rebalance the relationship between those that create data and those that seek to exploit it, whilst also creating the environment for fair and consensual exchange.

Structure

Cooperation for the creation of common good is a widely understood concept, and in a world where value is often extracted by large organisations with opaque processes and ethics, cooperatives are starting to be seen as a way of reinvigorating value transactions within smaller, often under-represented communities of interest, and between organisations that create and use data.

Finding existing data cooperatives is not easy. The Good Data, which allows people to control data flows at the browser level, and the Swiss-based Health Bank are two known examples, and as the principles of data custodianship for social good become better understood there is little reason to doubt that more will develop.

There are organisations that exhibit cooperative traits but may not themselves be cooperatives or co-owned structures. OpenStreetMap (OSM) is a resource that is essentially created and administered by its community, with the underlying motivation being the common good. The open source movement was cited as the largest example of technological cooperativism, although the largest platform on which cooperative endeavour is expressed (GitHub) is a privately owned Silicon Valley entity.

There are many versions of cooperatives. These have traditionally grown out of the needs of the memberships who subscribe to them. Their structures have generally been organised around a single class of member – workers, producers, consumers, etc. The single-class structure, although creating an equitable environment for the members of a particular cooperative, can tend towards self-interest, and although such cooperatives may be bound by the notion of the common good, the mechanism for creating that common good or commons is seldom explicit.

Internationally, new forms of cooperative that explicitly express the development of common good across multiple classes of stakeholder are more abundant. Social cooperatives in Italy and solidarity cooperatives in Canada often provide services such as healthcare, education and social care. Could these types of cooperative be more relevant for our networked and distributed age?

Michel Bauwens, founder of the P2P Foundation, talks about the creation of these new forms of cooperative, and how it is necessary to wean ourselves off the notion of cooperativism as a means of participation in a capitalist economy, towards one that builds a commons, both material and immaterial. This commons would be subscribed to by other commons-creating entities and licensed to non-commons-creating organisations.

Would a data cooperative necessarily adopt these newer forms of distributed, commons-creating structure? There appears to be a consensus that commons-creating, multi-stakeholder cooperatives are positive, but is this model easily understood? And can individual circumstances, especially when dealing with communities based around sensitive issues, create an environment for sharing beyond a single class? A single-class cooperative may seem a simpler, more immediate solution for a community of people who have specific needs and issues, and where strong trust relationships need to be maintained.

It is understood that personal data empowerment is not just about selling data to the highest bidder, and any organisation acting as a data intermediary would need to be able to accommodate the complex reasons why people donate or give. Even though economic gain might seem an obvious attraction, motivations are more complex, and financial incentives can often be detrimental to the process of participation and giving.

From The Good Data’s perspective, data cooperatives should split the data layer from the service layer. The cooperative should control the data layer and enable or choose others to build the service layer, as it is likely that data cooperatives would not have the capacity or expertise to create end-to-end solutions.
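As a rough illustration of that split, the sketch below (Python, with hypothetical names of our own) shows a cooperative-controlled data layer that only releases member data through a consent check, while an independent service layer builds on whatever the data layer releases. It is a sketch of the principle rather than The Good Data’s actual design.

```python
# Sketch: the cooperative controls the data layer; third parties build services on top.
# Hypothetical interfaces to illustrate the separation, not a real implementation.

class DataLayer:
    """Owned and governed by the cooperative."""
    def __init__(self):
        self._records = {}   # member_id -> data
        self._consent = {}   # member_id -> set of purposes consented to

    def deposit(self, member_id, data, purposes):
        self._records[member_id] = data
        self._consent[member_id] = set(purposes)

    def query(self, purpose):
        # Only data whose owners consented to this purpose leaves the layer.
        return [d for m, d in self._records.items()
                if purpose in self._consent.get(m, set())]

class HealthInsightsService:
    """A third-party service layer; it never touches raw member records directly."""
    def __init__(self, data_layer):
        self.data_layer = data_layer

    def average_steps(self):
        data = self.data_layer.query("health_research")
        steps = [d["daily_steps"] for d in data if "daily_steps" in d]
        return sum(steps) / len(steps) if steps else None

coop = DataLayer()
coop.deposit("m1", {"daily_steps": 8000}, ["health_research"])
coop.deposit("m2", {"daily_steps": 4000}, [])  # m2 consents to nothing
print(HealthInsightsService(coop).average_steps())  # 8000.0: only m1's data is used
```

The point of the separation is that the service layer never holds raw member records; if consent is withdrawn, data simply stops flowing through the query interface.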

The structure of the data cooperative should encourage maximum participation and consent, although 100% participation and engagement is unrealistic. Flat structures have a tendency towards hierarchy through operational efficiency and founder endeavour. Even though the majority of members may align with the aims of the cooperative, it doesn’t necessarily mean that they want to carry the constant burden of governance.

A certain pragmatism and sensitivity is needed towards the model of cooperative that a group may want to adopt. There are examples of communities maintaining informality in order to be less burdened by expectation, to maintain independence or to minimise liability. Advocates of data cooperatives need to be sensitive to this.

Purpose

Data cooperatives need to have a simplicity of purpose. What do they do, for whom and why? Is building a data cooperative around a particular issue enough? Or should we look at the data cooperative as a platform that allows the representation of personal data across a broader portfolio of interests?

Although there is a tendency to see a data cooperative as a mechanism to generate bulk, high-worth data that can then be used to draw down value from large organisations, a more appropriate application might be in enabling a smaller community of interest, perhaps around a particular health condition, to draw down certain services or to negotiate a better deal. The notion of withholding data from public service providers might be seen as detrimental to the delivery of that service, but it could also create a more balanced decision-making process. It is also known that many service providers collect more data than they actually need for the delivery of that service. Empowering people to take more control over their data may curtail the practice of excessive data gathering.

Data literacy

Ideally, for a data cooperative to be most effective, the level of data literacy amongst members would need to be raised so that members could make more informed decisions about what data is given away or used. This ideal might be difficult to achieve without a broader awareness-raising campaign about the power of personal data. Edward Snowden’s revelations about the ways that security agencies collect data were sensational, and although they highlighted that we unintentionally give away a lot, they didn’t build a wider popular discourse around the protection and usage of personal data.

Raising the level of data awareness amongst cooperative members would create more informed decision making, but this task would need to be delivered in a nuanced way, and ultimately some people might not engage. This could be the case with people who are dependent on services and have little power or real choice in their decisions.

For a data cooperative to represent its membership and control the flow of data it needs to have legitimacy, to know and understand the data assets of the membership, and to have the authority to negotiate with those data assets on the members’ behalf.

Decisions around data sharing and understanding the potential consequences are difficult and complex. As an intermediary the cooperative would need to ensure that individual members were able to give informed consent. We have to know what we have and what it does for us, in order to utilise it.

Mechanisms of consent

Mechanisms for the creation of consent already exist. These by and large create the environment for proxy voting in decision-making processes. A mechanism such as Liquid Feedback – popularised by the Pirate Parties – where an individual bestows voting rights on a proxy who aligns with their position, is a representative democracy process; the ‘liquid’ element allows proxy rights to be revoked at any point in the decision-making process. Other mechanisms might follow the lines of the Platform for Privacy Preferences (P3P) initiative developed by the W3C, which sought to create privacy policies that could be understood by browsers but was ultimately considered too difficult to implement. A potentially easier solution might work on the basis of preset preferences based on trusted individuals, or the creation of archetype- or persona-based preferences that people can select.
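To make these mechanisms concrete, here is a minimal sketch (in Python, all names hypothetical) combining the two ideas above: liquid-style delegation to a trusted proxy that can be revoked at any point, and preset persona-based preferences that a member can adopt wholesale. It illustrates the mechanism, not any existing implementation.

```python
# Minimal sketch of a liquid-consent register for a data cooperative.
# All names are hypothetical; this illustrates the mechanism, not a real system.

PERSONAS = {
    # Preset, archetype-based sharing preferences a member can adopt wholesale.
    "privacy_first": {"share_with_researchers": False, "share_with_commercial": False},
    "research_supporter": {"share_with_researchers": True, "share_with_commercial": False},
    "open_sharer": {"share_with_researchers": True, "share_with_commercial": True},
}

class ConsentRegister:
    def __init__(self):
        self.preferences = {}   # member -> explicit preference dict
        self.proxies = {}       # member -> member they delegate decisions to

    def adopt_persona(self, member, persona):
        self.preferences[member] = dict(PERSONAS[persona])

    def delegate(self, member, proxy):
        self.proxies[member] = proxy

    def revoke(self, member):
        # The 'liquid' element: delegation can be withdrawn at any point.
        self.proxies.pop(member, None)

    def decision(self, member, question, seen=None):
        # Follow the delegation chain (guarding against cycles)
        # until an explicit preference is found; default to not sharing.
        seen = seen or set()
        if member in seen:
            return False
        seen.add(member)
        if member in self.proxies:
            return self.decision(self.proxies[member], question, seen)
        return self.preferences.get(member, {}).get(question, False)

register = ConsentRegister()
register.adopt_persona("alice", "research_supporter")
register.delegate("bob", "alice")                          # bob trusts alice's judgement
print(register.decision("bob", "share_with_researchers"))  # True, via alice
register.revoke("bob")                                     # bob reclaims his choice
print(register.decision("bob", "share_with_researchers"))  # False, no preference set
```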

Can one organisation represent the broad range of ethical positions held within a membership structure? For practical reasons the data cooperative might have a high-level ethical policy, while individuals within the cooperative are empowered to make data-sharing choices based on their personal ethical standpoint. This could be enabled by proxy or preset data-sharing preferences.

The alternative to data cooperatives with high-level ethical aims that also represent multiple ethical standpoints could be smaller federated or distributed niche organisations, where individuals allow the organisation to use their data on their behalf.

Right to personal data

In order for individuals to allow an organisation to use data on their behalf, they need to have control over their personal data. Legislation in many countries offers a framework for how personal data is used and shared amongst organisations, but this doesn’t necessarily create a mechanism that allows users to retrieve their data and use it for other purposes. Often, within the End User License Agreement (EULA) or Terms of Service that come with software products, an individual may find that their data is inextricably tied up with the function of the service. A function of a data cooperative might be to help individuals understand these agreements and add to the commons of knowledge about them.

How would the argument for greater individual data rights be made when service providers see personal data mediated through their product as part of their intellectual property? Work has been done through the midata initiative and the development of personal data passports – where individuals grant rights to organisations to use the data for delivery of a service. The UK Government has supported this initiative, but has backed away from underpinning the programme with changes in legislation. This lack of regulatory enforcement may limit the efficacy of any initiative that seeks to grant individuals rights and agency over their data.

The development of a personal data licence may aid the creation of data cooperatives but the form of the licence and the mechanism for compliance might be weakened without an underpinning regulatory framework. At present there is a certain level of cynicism around voluntary codes of practice where power imbalances exist between stakeholders. The lack of legislation might also create a chilling effect on the ability of data cooperatives to gain the trust of their membership.

Data empowerment is promoted in Project VRM (Vendor Relationship Management), developed by Doc Searls at Harvard University. The ability for individuals to have control over their data is an integral component of developing an open market for personal-data-based services, theoretically giving more choice. The criticism voiced about midata and Project VRM is that they are too individualistic and focus on economic rather than social transactions with ethical aims. Even so, the development of a market logic that enables large organisations to engage with the process of individual data empowerment might be beneficial for the long-term aims of data cooperatives and for the development of innovative services for social good.

Ultimately if the individual isn’t able to have control over their data or the data derived from them then the function of the cooperative would be inhibited.

Creating value from data

Scale could dictate the eventual form of the data cooperative. Many potential clients of a data cooperative might require data at scale, which would mean building a data asset containing upwards of 500,000 users. The Good Data cooperative aims to achieve this scale in order to become viable.

A challenge that all data cooperatives would face is how to maintain a relationship with their membership so that services based upon, or value extracted from, the data are not subject to unforeseen supply-side problems. If a data cooperative represented its membership and entered into licensing relationships with third-party organisations on its members’ behalf, what would be reasonable for a client to expect, especially if individual members had the right to revoke access to data at any time? With larger-scale data cooperatives this may not be too much of a problem, as scale has the potential to dampen unforeseen effects. The Good Data proposes to get around these issues by only holding data for a limited amount of time, essentially minimising disruptions in data supply by creating a buffer.
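A rough sketch of that buffering idea (Python, with assumptions of our own rather than The Good Data’s actual mechanism): data is licensed out in fixed-term snapshots, so a member’s revocation takes effect at the next snapshot rather than invalidating supply mid-contract.

```python
# Sketch: fixed-term snapshots buffer a client's data supply against revocations.
# Hypothetical design, not The Good Data's actual mechanism.

class BufferedSupply:
    def __init__(self, retention_days=30):
        self.retention_days = retention_days
        self.members = {}      # member_id -> data
        self.snapshot = []     # data licensed to clients for the current term

    def contribute(self, member_id, data):
        self.members[member_id] = data

    def revoke(self, member_id):
        # Removal applies to future snapshots; the current
        # snapshot runs to the end of its fixed term.
        self.members.pop(member_id, None)

    def roll_snapshot(self):
        # Called once per retention period: revocations take effect here,
        # so clients see a stable dataset for each term.
        self.snapshot = list(self.members.values())
        return self.snapshot

supply = BufferedSupply(retention_days=30)
supply.contribute("m1", {"spend": 120})
supply.contribute("m2", {"spend": 80})
supply.roll_snapshot()              # clients licensed [m1, m2] for this term
supply.revoke("m2")                 # m2 leaves; current term unaffected
print(len(supply.snapshot))         # 2: the buffer absorbs the revocation
print(len(supply.roll_snapshot()))  # 1: the next term reflects it
```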

Smaller-scale data cooperatives, especially ones created around single issues, may have difficulty engaging in activity that requires service guarantees. Developing a mechanism for federation, cumulatively creating data at scale, might be a potential solution, but creating a federated system of consent may be more difficult to achieve. As suggested previously, economic activity might be a low priority for such organisations, where the main purpose might be to represent members and create the environment for informed service decisions.

The challenge facing federated data cooperatives, and how they would interact, is undefined. It has been noted that building distributed and federated systems is difficult, and that centralised systems persist due to operational efficiencies. The advent of alternative forms of ‘blockchain’ transaction could enable distributed organisations to coexist using ‘rules-based’ or algorithmic democracy. But alternative transaction systems and currencies often face challenges when they interface with dominant and established forms of currency and value.

How data cooperatives could practically use these new mechanisms for exchange needs to be explored.

Attendees:

Reuben Binns
Mark Braggins
Alex Fink
Steven Flower
Robin Gower
Frank Kresin
Marcos Menendez
Annemarie Naylor
Julian Tait
Kristof van Tomme
Ben Webb

The Economics of Open Data

Data doesn’t make for a very good tradable commodity. Its benefits spread well beyond the people who trade in it, it’s almost impossible to stop people from copying and sharing it, and it can be enjoyed by multiple people at the same time.

In a post written for Open Data Manchester on The Economics of Open Data, regular member Robin Gower explains how these characteristics mean that data will have a much greater economic and social impact if it is made available as open data. He also discusses the implications for established “closed-data” business models and for the government.

ODM Response to the Public Data Corporation consultation

Charging for Public Data Corporation information

1. How do you think Government should best balance its objectives around increasing access to data and providing more freely available data for re-use year on year within the constraints of affordability? Please provide evidence to support your answer where possible.

This question is framed incorrectly. For open data to be truly sustainable there has to be a shift away from the notion of affordability and access. Open data is part of a transformation of how services are delivered within and by government, and of how government relates to people and business. What we should be moving towards is the notion of government as a platform, where the data that government uses for its own purposes is also seamlessly available for reuse.

2. Are there particular datasets or information that you believe would create particular economic or social benefits if they were available free for use and re-use? Who would these benefit and how? Please provide evidence to support your answer where possible.

We see that there are a number of core ‘infrastructure’ datasets that have allowed systems to be developed within the UK, the majority run by trading funds. Consolidating their charging position within the PDC will have a chilling effect not only on the direct creation of applications and services but on an underlying data ecosystem that would create social and economic value. It will also impact future technological developments where applications need to be aware of their relation to core data infrastructure. This is particularly important with the emerging development of the Internet of Things and pervasive technologies.
Whilst developing the Open Data Cities project in 2009 and DataGM – The Greater Manchester Datastore – with Trafford Council, it became apparent that local authority and community access to certain data, such as Land Registry data, was creating problems. Anecdotally, it had been suggested that easy and open access to Land Registry data would help combat cross-boundary housing benefit fraud and would have eliminated the MPs’ second home scandal.

3. What do you think the impacts of the three options would be for you and/or other groups outlined above? Please provide evidence to support your answer where possible.
The charging options outlined will all have an impact on the development of open data services and applications, and on future technologies where open data is an enabler.
All three models are flawed in that they try to predict and extract value from an emergent field. They fail to take into account what is needed to create a sustainable, innovative and disruptive data ecosystem. Disruptive innovation in emerging fields needs a low barrier to entry and an ecosystem where ideas can be tested, fail and succeed at marginal cost.

4. A further variation of any of the options could be to encourage PDC and its constituent parts to make better use of the flexibility to develop commercial data products and services outside of their public task. What do you think the impacts of this might be?
Encouraging public organisations to develop services outside the public task has the potential to distort an emerging market and should be treated with caution. The knowledge that many public organisations hold in regard to their task is unique, and its use could be encouraged as long as the underlying raw data resources are available to all.

5. Are there any alternative options that might balance Government’s objectives which are not covered here? Please provide details and evidence to support your response where possible.

There needs to be an appraisal of the wider value and impact of releasing public data. This impact should not be seen as simple transactional value but as a broader impact on the engagement and wellbeing of society.

Licensing
 
1. To what extent do you agree that there should be greater consistency, clarity and simplicity in the licensing regime adopted by a PDC?
It is understood that having multiple licensing regimes can create confusion and hence hinder the development of interpretations, applications and services. The danger of ‘double licensing’ is real, especially as products become more complex. Adoption of the OGL should be the default position for raw open public data. At the moment, public datastores such as DataGM carry numerous licensing options, most with the potential to cause confusion and contaminate downstream data usage. This confusion has also been used as an excuse for not releasing data.

2. To what extent do you think each of the options set out would address those issues (or any others)? Please provide evidence to support your comments where possible.
Allowing different organisations within the PDC to define their own licences to suit different uses of data presupposes that the data provider has an appreciation of the potential uses of the data. This may work in an environment where products are developed in one specific domain, but when innovation is cross-cutting the need for standardisation and clarity becomes clear. Whilst the third option, a single PDC licence with adapted schedules of use, would seem easiest, the question fails to recognise that raw open public data should be free by default, with exemptions rigorously justified.

3. What do you think the advantages and disadvantages of each of the options would be? Please provide evidence to support your comments
Please see above

4. Will the benefits of changing the models from those in use across Government outweigh the impacts of taking out new or replacement licences?
Yes. The current licensing regime is opaque and hinders innovation, and innovation drives the economy.

Oversight

1. To what extent is the current regulatory environment appropriate to deliver the vision for a PDC?
You can’t have a system of oversight which fails to engage users. It is necessary to have one robust and representative regulatory environment with real powers to make PDC-based organisations compliant. Representation should be a balance of suppliers and users of data.

2. Are there any additional oversight activities needed to deliver the vision for a PDC and if so what are they?
Apart from making sure that raw public data is made open and freely available, no.

3. What would be an appropriate timescale for reviewing a PDC or its constituent parts public task(s)?
Six-monthly initially, then less often once the initiative becomes embedded.

Licensing – Why it is so important

This blog post was originally written for FutureEverything as part of their Open Data Cities programme.

I’m no expert but I really need to be – Licensing

Licensing is a subject that comes up a lot with open data. The licence is a key component of a dataset: it defines use and liability, and it shapes how and what innovation will come from data release.

As mentioned in the title I am no expert in this area and I would appreciate any correction or amendments to my understanding.

Traditionally, public data has been closed, so the only way you could get access to data to build products was by buying a licence. In many cases these licences were expensive and restrictive. To mitigate this cost, the licence would often have some level of service agreement built in: you paid for the licence for the data, and the data provider would provide you with a level of continuity and support. This helps to limit risk and encourage investment in a product.

The closed ‘paid licence’ system generally has a high barrier to entry – the price of the licence – limiting the number of innovative products developed. Innovation ecosystems depend on many ideas being tried, with most failing; if the price of failure is too high, it could have a chilling effect on the whole system.

One of the first licences used for the release of open data was Creative Commons CC-BY-SA. This licence allowed people to create services and products off the back of the data as long as they attributed where the data came from and shared back any data created off the back of the originally released dataset (value-added data). The original Creative Commons licences were devised as an answer to restrictive copyright laws relating to ‘works’ – articles, text, images, music etc. – as these were deemed increasingly anachronistic in the digital age. It is up for discussion whether data can be deemed a ‘work’ in the context of this licence.

The Open Database Licence (ODbL), developed by Open Data Commons, was created to address the doubt that data could be seen as a ‘work’. It carries the same attribution and share-alike clauses and is used by many datastores, including the newly opened Paris Datastore.

Anyone can develop products and services that use datasets with these licences, but intellectual property doesn’t extend to the value-added datasets created in the process of developing those products. Releasing value-added datasets back to the community allows further innovative products to be built on them, so the pace of innovation could potentially increase – it is analogous to the ‘standing on the shoulders of giants’ idea.

Imposing conditions on the further use of value-added data by other organisations, however, might chill the development of products that create value-added data in the first place.

With the above licences there is generally no liability or guarantee of service from data providers. This creates a greater-risk scenario: if you were investing in product development, this is potentially a source of concern and may be an inhibiting factor.

In the UK we have the recently released Open Government Licence, which was developed specifically for government data. It borrows from aspects of the CC-BY-SA licence and the ODbL, but unlike those licences there is no requirement to share back value-added data.

Would this have any impact on products and services developed from open data? Again, under this licence there is no liability or guarantee of service from the data provider, but the developing organisation gets to keep all the rights to the products and services it develops – including value-added datasets.
The advantage could be that allowing people to keep the rights to the products they develop mitigates the exposed risk posed by the lack of liability and guarantee. The main disadvantage could be that the pace of innovation is curtailed by people having to replicate processes and value-added datasets.

Why Open Data?

Back in May 2009, after the final presentations at Futuresonic 09, I sat down with Adam Greenfield and we talked about how cities evolve and grow, and how they develop inequalities between those who have access to information and those who don’t. This, coupled with an individual’s ability to act on that information in a meaningful way, begged the question: if all information and data were open and available, how would a city evolve? Would it grow with the same asymmetries? As Adam suggested in his Futuresonic presentation, is this inequality a preconfigured state?

At the time few cities had embarked down the route of fully opening up their datasets, although some cities in North America had started a process that would eventually, as in the case of Vancouver, lead to the adoption of open source, open standards and open data principles.

It was through seeing this emergence of open systems that the Open Data City project began to evolve. Data is the lifeblood of our modern, technologised society. It tracks, evidences and creates mechanisms for decisions. Much of this data doesn’t exist outside the confines of City Hall, but we see evidence of its impact every day. Speed humps suddenly appear on your road, or your bus doesn’t turn up when you thought it would. Bins only get emptied every two weeks, or your local school closes down. This is the physical manifestation of publicly held data that few have access to.

The inability to connect action taken by a public body with the evidence on which its decisions are made can have an insidious and corrosive effect on the relationship between the citizenry and government. Just as Louis Brandeis said ‘Sunlight is the best disinfectant’ with regard to transparency and corruption, the opposite is also true. In a closed system, even though decisions might be taken with the most honourable of intentions, the lack of evidence for a decision creates doubt, rumour and misrepresentation. In a closed system the power of the media increases as trust in the political sphere decreases. The media becomes the interlocutor, which can interfere with the relationship between citizen and government. This all presumes that those who govern have nothing to hide. The lack of transparency in government creates the opportunity for the media to expose the bad apples using a system of clandestine briefings and investigative reporting. This process of exposé undermines the trust the public has in the system of government, because there is no evidence to the contrary, or because the evidence that people can see appears to derive from an arbitrary decision-making process.

The opportunity has arisen for public bodies to create a new relationship with the people they serve. A more transparent and open system can lead to a more equitable environment, where the citizen is not a customer or passive consumer of services and information, but an engaged citizen who is able to make decisions based upon facts, not rumour, and who can hold to account public servants with less than honest intentions.

The Sunlight Foundation (www.sunlightfoundation.com), named after the Louis Brandeis quote, is an American lobby group advocating transparency in government. It has produced a graphic it calls the Cycle of Transparency, which aptly illustrates the benefits of transparency in government. Each element of the Cycle of Transparency moves forward concurrently, bringing about the changes needed to create a more transparent government whilst identifying new needs.

The Cycle of Transparency highlights the use of technology to make information open and accessible. It can be argued that transparency and openness have been enabled by digital technology. People are now able to access, interpret and distribute information easily. Until quite recently, the channels for making information open and accessible were limited and, to a certain extent, controlled.

The landscape is changing. The opening up of data will have a seismic effect on the way we access and share information. New services will be created, as citizens and institutions demand the ability to interpret and navigate through data in the way they want. It will create a more efficient data environment where information is shared rather than duplicated, and it will highlight errors in the system with anomalies being addressed rather than hidden.