Chapter 4 – Charging for PDC information
1. How do you think Government should best balance its objectives around increasing access to data and providing more freely available data for re-use year on year within the constraints of affordability? Please provide evidence to support your answer where possible.
Placr Ltd. published an open letter to Andrew Tyrie MP on our blog which presented arguments in response to this question:
Monopolies are bad; government monopolies are worse
All of the government’s options for the PDC in this consultation envisage the creation of a state trading body with a monopoly over the sale of the most important items of UK digital data infrastructure, viz. maps, land records, addresses and weather data. They are likely to be joined by company records, court proceedings and statistical data after the PDC is established. This data is already collected and paid for by government as part of its public task. In the digital era it is not necessary or desirable for the government to monopolise the distribution of this data: it can be released at marginal cost as open data through data.gov.uk. Monopolies face no pressure on their costs, and they act anti-competitively by instinct… yet almost every digital service business will have to pay a stealth tax to this state monopoly if the government proceeds with its plan for the PDC.
Growth needs free markets
Entrepreneurs can only invest with confidence when markets are free. The presence of a monopoly state body in the market for digital services is bad for investor confidence, and the uncertainty this engenders is hindering SMEs like mine from raising funding to power growth. By taking a significant amount of revenue out of this market as a new tax, the government will make it harder for UK digital service businesses to grow quickly and create jobs. Meanwhile our competitors (think Ireland, Netherlands, USA) are liberalising their markets by releasing this core reference data freely and ‘getting out of the way of innovation’, as Michael Cross pointed out in his excellent article on the PDC in the Telegraph.
A failure to understand digital infrastructure
The government’s consultation document on the PDC fails to see digital core reference data as part of 21st century digital infrastructure. The use of addresses, maps, weather, statistics etc. in digital form is so pervasive in public administration and business that wide availability and low cost are necessary components of growth across the economy. The agencies to be brought together into the PDC have just created the first common national address database (known as Geoplace). Despite being a crucial piece of digital infrastructure, it is only available as a high cost data product. This will limit the use of a dataset that should be used by absolutely every organisation in the country, with the loss of all the standardisation and efficiency benefits that this would bring.
Charging for transparency
Open data releases have created new services (e.g. live bus departures through Traveline), increased transparency (e.g. Treasury COINS database) and shown that key policy-making datasets (e.g. crime data) have major errors that need fixing. Every one of these open datasets will be dependent on the core reference data that the proposed PDC would control and tax. The government has seriously conflicted its policy on transparency by proposing to create a state monopoly that will hold a veto over data pricing and distribution. As an example, national transport datasets such as public transport performance, accident statistics and traffic counts will be free, but the maps against which they need to be referenced will be charged for by the PDC. New Zealand’s attempt to create a PDC in 2001 led to disaster with the Terralink trading fund going bankrupt without the possibility of a state rescue. The private sector picked up the state assets very cheaply from the liquidator leading to a total loss of the core data from the public sector.
Government trading with itself
In the consultation the government rejects the ‘Data utility’ (aka ‘open data’ publication) option for the PDC as it is “unaffordable” (para 4.16), and it excludes this option from the consultation. The existing agencies are trading funds with a combined turnover of £675m and costs of £585m, yet most of their revenues are from government itself (e.g. Met Office 85%, Ordnance Survey 60%, from their Report & Accounts). These agencies are also spending heavily on business development: for example, the Ordnance Survey Accounts for 2010-11 show that it employs 130 sales/marketing staff, which is 12% of the total. Removing internal trading and cutting out business development makes the release of core reference data eminently affordable. Some income would be retained in services (say £100m), and possibly a third of the costs could be saved, reducing costs to £410m. Given its importance as infrastructure for key growth industries like digital services, the more important question is… can we afford not to free this data at a cost of c£300m invested in digital infrastructure?
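The affordability sums above can be checked with a minimal sketch. It uses only the figures quoted in the letter (in £m); the variable names are illustrative and are not taken from any published accounts. On these figures the £410m cost base implies savings of roughly 30%, a little under a full third, and the net cost of open release comes out close to the c£300m quoted.

```python
# All figures are in £m and are copied from the paragraph above.
combined_costs = 585           # combined costs of the existing trading funds
costs_after_savings = 410      # cost base after the savings the letter suggests
retained_service_income = 100  # income assumed to be retained from services

# Savings implied by moving from £585m to £410m.
savings = combined_costs - costs_after_savings
savings_share = savings / combined_costs

# Net cost of releasing the data openly: remaining costs less retained income.
net_cost = costs_after_savings - retained_service_income

print(f"Implied savings: £{savings}m ({savings_share:.0%} of costs)")
print(f"Net cost of open release: £{net_cost}m")
```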
2. Are there particular datasets or information that you believe would create particular economic or social benefits if they were available free for use and re-use? Who would these benefit and how? Please provide evidence to support your answer where possible.
Some of the most important datasets created by the government as part of its public task are still not released or subject to charging or access constraints. These datasets could be powering growth but government monopoly producers are blocking release by pursuing their institutional self-interest rather than the national interest. The attitude of many data owners in government is that: “we like the control that our data monopoly gives us, we keep the (relatively small) amounts of money we make, and no one can ask awkward questions about what we do.” This must end if open data is to reduce data costs across the economy and to enable SMEs to start new businesses powered by open data.
The most important datasets/services that would produce economic/social benefits are:
- GEOPLACE, comprising addresses, postcodes, national land and street gazetteers. This data is critical to the success of re-use of almost all of the rest of the government’s open data as every activity has to be referenced against these datasets. The sheer irony and self-contradiction of charging for the core datasets and releasing other dependent data openly is astonishing as Geoplace is the core of the UK’s digital data infrastructure. This is the most important reference dataset held by government: growth will be taxed and constrained in every other area by retaining this behind a paywall. Each of the datasets below, viz. NPAR, CH, Edubase and Trust, needs Geoplace to be reused, as do transport, streetworks, health and crime data. Placr is forced to use OpenStreetMap to publish bus departures from Traveline and to support bus stop defect reporting by travellers.
- NATIONAL PLANNING APPLICATION REGISTER, comprising a record of all planning applications to local authorities and aggregated by the government’s Planning Portal. This was operated as a 5-year pilot through private contractor Emap Glenigan from 2005-10. When the service was closed down no attempt was made to offer the service as open data. Planning applications are of considerable public interest and there could be significant innovation in property searching and valuation if this data were made freely available as open data. Placr is looking for opportunities to aggregate and freely distribute open data where there is actionable information, as in this case. We have published our business models for this type of data in a blog post, “The growth case for open data”.
- COMPANIES HOUSE RECORDS, comprising details of all private companies that are collected by law. The data is behind a paywall and there is no open API. This is preventing the state register from becoming the dominant way to reference companies, as a commercial competitor, Dun & Bradstreet, is charging both government and commerce for use of its DUNS number. This is a classic case of government failing to return benefits from fulfilment of its public task.
- EDUBASE, comprising the register of educational establishments in England & Wales. This database is available for reuse under the OGL, yet bizarrely you have to pay for more than two database extracts per year: “Under the Power of Information framework, public and commercial users are entitled to two free extracts per year, after which there is a service fee”, from FAQ. Placr already publish Schoolbrowser with attainment data, but we need to be able to add data from Edubase to add sufficient value to be able to make a return on the data.
- TRUST, comprising all Network Rail short (<24 hours), medium (weekly) and long term timetables and real-time running data as exclusively sub-licensed to the Association of Train Operating Companies. This data is produced as part of Network Rail's public task funded by the taxpayer and is already distributed to the rail industry through the Network Rail External Services Gateway and TDnet. Without this data no independent scrutiny of rail performance can be undertaken. Placr already publishes a national bus timetable service at placr.mobi with a social media feed for every bus stop in the UK. We cannot turn this into a free national public transport app without access to open rail departures, and this omission restricts the value addition we can achieve.
- COURT LISTINGS, comprising details of court cases in the Royal Courts of Justice, Crown Courts, selected County Courts, along with Case Archives and Legal News. This data is published via exclusive licences through Courtel and Bailii but is not available as open data. Thus it is impossible to compare open crime data with court outcomes without paying to access one of the licensed services.
3. What do you think the impacts of the three options would be for you and/or other groups outlined above? Please provide evidence to support your answer where possible.
The consultation document identifies five charging options and then dismisses two of them (‘Data utility’ and ‘profit maximisation’ models), leaving only models based on charging. As the ‘Data utility’ model is equivalent to the open data model, this would leave the extraordinary situation of core reference data behind a paywall and most other government data released openly, but dependent on the data behind the paywall. Placr are concerned that the dismissal of the ‘Data utility’ model on the basis of affordability (§4.16) has been done on anecdotal evidence from 4 unrepresentative Cabinet Office ‘data user seminars’ whose participants were asked ‘leading’ questions. In §4.3 a 146-page Treasury-commissioned Cambridge University study is set against “anecdotal evidence” in a way that reflects very badly on the methodology of the Consultation. If the authors of the Consultation wished to challenge the conclusions of the ‘Cambridge Study’ then they should have presented evidence of equal weight and detail to support their argument. In the circumstances the exclusion of the ‘Data utility’ option from consultation is a serious mistake without sufficient justification, and Placr believe that it should be examined formally as part of the Consultation outcome, despite the position taken in the document.
As argued in the open letter on the PDC quoted above in (1), all charging options will damage growth opportunities for re-users of government data by entrenching a monopoly state agency at the heart of the UK’s digital infrastructure. The PDC will need to find business models to sell government data already collected and paid for as part of the public task so that it can cover its own costs: these charges will be an additional tax. Yet there will be no market mechanism to bear down on costs and prevent excessive pricing, and we do not believe that the Shareholder Executive can provide market-equivalent disciplines.
There will also be a structural governance problem in the PDC if it is set up to charge for government data collected as part of the public task. Under any of the charging models, the PDC management will supervise the market as the monopoly provider. This model has already caused conflict between the private sector and trading funds, as monopoly providers conduct ‘beauty contests’ to decide which business partner to select when technology enables new products and services to be created. There has been litigation over licensing decisions, e.g. between Ordnance Survey and GetMapping (http://www.guardian.co.uk/media/2002/feb/25/newmedia). This governance problem creates uncertainty for investors and makes it hard for SMEs to enter the market for data products, as the PDC is required to risk its rate of return with unproven new companies. The Trading Funds have rarely been willing to do this and have taken the safe option of cooperating with larger corporations. In the Data Utility option, SMEs take the risks of innovating with open data using their own capital to create new products and services.
If the PDC is allowed to charge for data using Freemium models there will also be arbitrary divisions between datasets selected for free release and those that become chargeable, as for example with Edubase, which allows two free extracts per year. Even if the ‘harmonisation of charging’ option were adopted, the sheer complexity of the various datasets would make it hard to achieve consistency.
These problems can only be mitigated by adopting the Data Utility option. This option also offers the growth advantages of a tax cut (e.g. removing Land Registry map charges from house purchase property searches) and a market liberalisation for digital services. It also eliminates government-to-government trading by a monopoly state corporation, surely the most economically inefficient mechanism ever created.
4. A further variation of any of the options could be to encourage PDC and its constituent parts to make better use of the flexibility to develop commercial data products and services outside of their public task. What do you think the impacts of this might be?
Allowing the government to operate in the market as a commercial player in digital services would be an unprecedented step for a government that recently attempted to roll back the state with a bonfire of the quangos. In the Trading Fund model of trading outside public task the government would create a monopoly state corporation with no pressure on its costs and a management with no risk capital to provide commercial and market disciplines. In a high profile previous experiment with this model the New Zealand government created a state corporation called Terralink to group together the same agencies as the PDC would have. Terralink went bankrupt in 2001, after 2 years of operation, when it could not fulfil a commercial contract because it had accepted very stringent terms that no entrepreneur risking their own capital would have agreed to. The data assets were sold to the private sector by the liquidator and they were lost to the public sector. With its structural governance problem, a state monopoly like the PDC would be vulnerable to making wrong decisions when dealing with global corporations with massive market capitalisations. Note that the Department for Transport has already given Google access to real time traffic data on a privileged basis, on the grounds that Google can reach the market with its services. The PDC would be likely to make such deals again even though the long-term security of the government’s data would be much enhanced by a large number of SMEs all innovating with open data using their own capital.
Rather than create a state corporation to operate a digital services monopoly the government should do the opposite: release the data as open data, following the excellent example of a number of publicly funded bodies. For example, NHS Choices releases a large amount of health data to a heterogeneous collection of corporations and SMEs. Traveline, the national public-private bus information partnership, releases its bus departure information freely and there are now 50+ apps for smartphones available across the country remixing bus departures with other data, as placr.mobi does with social media updates. London Underground releases all its live tube running information freely and has struck a sponsorship deal with Microsoft for free use of its Azure cloud services platform to power real time distribution to dozens of apps such as Busmapper and UK TravelOptions, which use Placr data-as-a-service solutions.
5. Are there any alternative options that might balance Government’s objectives which are not covered here? Please provide details and evidence to support your response where possible.
Placr offered an alternative vision for the PDC in the conclusion to its open letter, as follows:
An alternative vision for the PDC
The government is currently allowing an administrative reorganisation to set the agenda for 21st century digital infrastructure planning. However, we should not be listening to civil service managements when we are faced with global power shifts in digital services and media content. The Apple iTunes App store opened just over 3 years ago and it has already restructured the newspaper and mobile phone industries, with publishing and transport next in line. The way to grab the biggest slice of this massive new global market is liberalisation at home so that the UK is the natural place to base investments. At this precise moment, when we desperately need growth, if the PDC freed the maps, land search charges, company data, statistics, addresses etc. it would be akin to a tax cut across the economy *and* a simultaneous industrial stimulus in a key growth area.
PDC plans must be re-thought
Our company is one of hundreds of UK start-ups in digital services that can immediately exploit liberalised core reference data alongside open data releases. We re-use government data in transport, education, crime and health to feed smartphone apps and provide business services. We are an early stage business that creates jobs in London and pays UK taxes, already turning over £120K a year from organic growth after a year and a half. We want to expand in the UK and export our expertise. But if we have to pay a tax for the use of core reference data to support a government monopoly, it becomes hard to develop profitable business models around open data and the scope for growth is greatly reduced. This is a decision of huge importance for the UK digital services industry… one of the few new industries that can really get us back to growth.
Chapter 5 – Licensing
6. To what extent do you agree that there should be greater consistency, clarity and simplicity in the licensing regime adopted by a PDC?
Placr view licensing questions as greatly subservient to charging policy and governance. If the Data Utility model is adopted, then the Open Government Licence can be used everywhere. Creating a huge state monopoly to try to protect intellectual property to recover income is exactly the opposite of the recommendations of the Hargreaves Review. One of the great advantages of open data releases is the simplicity of the mission: to release data freely and to enhance the operation of the market, in particular to ensure:
- access follows public money
- government itself does not compete with open data users
- there is equality of access (net neutrality for data)
- there is stability of distribution
- the distributing agencies do not act like monopolies
7. To what extent do you think each of the options set out would address those issues (or any others)? Please provide evidence to support your comments where possible.
Use of charging models necessitates a more commercial licensing regime. Placr’s experience is that these licences are complex and expensive to administer and enforce. Use of the open data model mitigates these risks and reduces the cost of managing licensing. SMEs in particular cannot afford to go to law and the need to sign licences with the PDC will be a show-stopper for many innovators.
8. What do you think the advantages and disadvantages of each of the options would be? Please provide evidence to support your comments.
See answer to question 7.
9. Will the benefits of changing the models from those in use across Government outweigh the impacts of taking out new or replacement licences?
See answer to question 7.
Chapter 6 – Regulatory oversight
10. To what extent is the current regulatory environment appropriate to deliver the vision for a PDC?
The current regulatory environment around data is very complex as it involves the Office of Public Sector Information, the Office of Fair Trading, Consumer Focus, Passenger Focus, the National Audit Office, the Office of Rail Regulation, the Patient Advice and Liaison Service, Ofcom, the Advisory Panel on Public Sector Information, the Information Commissioner’s Office, the Transparency Board and many others. The introduction of a PDC can help integrate these functions and supervise a Right to Data. However, it will be very difficult for a PDC operating in the market to fulfil this role as there will be a conflict of interest. Placr believe that the vision of the PDC as a coordinating and regulatory agency would be best served by the open release of government data produced as part of the public task.
11. Are there any additional oversight activities needed to deliver the vision for a PDC and if so what are they?
There are some problems of regulation already inhibiting growth from open data. Giving the PDC statutory regulatory powers would help mitigate these issues, for example:
- approving contracts for commercial processing of government data to ensure that the system integrators who do this work do not acquire any unintended rights over government data or its distribution
- reforming isolated examples of data distribution that do not conform to the Transparency principles, e.g. harmonising legal data releases by the Courts Service and Registry Trust, a non-profit body established to distribute court judgments on credit and debt
- managing data.gov.uk as a national metadata service for data
12. What would be an appropriate timescale for reviewing a PDC or its constituent parts public task(s)?
As most departments of government do not have a formal statement of their public task, a first step should be for a PDC to identify this and express it as a set of data and services produced by it and subject to data release. This role can be overseen by Cabinet Office and Parliament. However, Placr believe it is hard to see a PDC fulfilling this role if it is trading in the market.
What it will cost to free the rest of UK government data (spoiler: £0)
First, the good news. The UK government has made good on its promises to release open data across government in 2011, and this year has seen a dizzying sequence of open data announcements, most recently in the Open Data Measures in the Autumn Statement. Not only has the government opened the data, but it has put in place institutions (like the Transparency Board), portals (like data.gov.uk) and funding (through Technology Strategy Board). This is all profoundly good news and has enabled the growth of a cadre of open data companies like Cycle Streets, Open Corporates and my own company Placr. We are racing to build new companies built on the open data and we are already paying taxes that go back into the Exchequer, offering free services to the public and value-added offerings to businesses.
However, there is still a cloud on the horizon. Some of the most important reference data is still locked up, like the detailed maps, addresses, land records, school databases, the national planning application register and court records (details in my blog post here). The government held a consultation in the summer over the formation of a Public Data Corporation (PDC), and we presented arguments as to why embedding a government trading monopoly at the heart of open data was a bad idea, and this seemed to resonate. However, when we read the government’s Open Data Measures in the Autumn Statement we were very disappointed to see that all they have done is change the acronym from Public Data Corporation (PDC) to Public Data Group (PDG), and kept the substance of the previous proposal. This leaves us with a problem: we are not going to become the “world leader in open data” that George Osborne wants if we tax every digital service transaction in the UK for the core reference data that the government has already paid for!
To understand why this is happening and how to fix it, we need to see what the Autumn Statement is proposing to do. I have drawn the following diagram after a brainstorming session at the Open Rights Group to show the government’s plan:
The Public Data Group will be a merger of the Land Registry, Meteorological Office, Ordnance Survey and Companies House. Analysis of their 2010-11 Report and Accounts shows that the agencies collectively have revenues of £741m, costs of £649m and that they make profits of £92m. We are told that this trading is necessary to save the taxpayer the cost of these activities. However, when you realise that 83% of Met Office and 58% of Ordnance Survey revenues are from government itself, you can see that these costs are mainly being paid by taxpayers anyway. The non-government sales income from these two agencies (MO/OS) is only £84m. Companies House and Land Registry are different because they operate registries and manage transactions for business and house buyers, and so government usage is low, with users paying all the costs.
In the rest of the diagram you see how the government proposes to add two new agencies into the mix to moderate the operation of the PDG government trading monopoly. The Data Strategy Board (DSB) will take some dividends from the PDG trading operation and will be able to spend this on buying data to be freed, or on services. However, the scale of its suggested funding is orders of magnitude lower than the cost of freeing all the data outright and so its influence will be marginal unless it also directs the PDG business plan. In this system the taxpayer would be paying twice for the data: once for the core operations through taxes and then a second time to free the data through the DSB dividend income, which the Treasury would forgo. Meanwhile the Open Data Institute will spend its money on research over the long term… and cannot influence the trading activities.
Surely this is the wrong model for the future of government data as the paywall around the core reference data will continue to inhibit private sector investment and reduce the tax revenues from innovation. There are too many new players in the system, all incurring costs and adding friction to the movement of the data. I think the government should instead release all of the data freely to stimulate private sector growth, as I show in this diagram:
If government releases the data freely and encourages the agencies to do consultancy and packaging of the free data, the costs of these agencies will have to be (or are already being) restructured as follows:
This analysis (Excel spreadsheet here) implies that the only real shortfall on full release for these agencies would be the loss of OS non-government income of £32m net of savings on business-as-usual, plus perhaps £5m for the Land Registry to give open access to its data (=£37m). To keep the costs low the Land Registry could keep its registration income, but provide access to its land records as open data.
The moves by the Met Office and Companies House show that the government believes that agencies can release their data, cut their costs and still deliver core data. Therefore I don’t think free data release is “unaffordable” in any sense, and the Ordnance Survey also needs to be restructured to deliver this change in a broadly revenue neutral way. For example, it could close services that duplicate the private sector such as the ‘Get a Map’ service. Crucially, amongst the OS data releases would be AddressBase… the full UK national address database, which is a vital open dataset needed to power almost all digital services. The overall value to the economy of removing monopoly pricing from detailed maps and addresses would be a marginal increase in economic efficiency across a vast range of transactions.
I have shown here how the costs of these four agencies are either internal government trading, being cut voluntarily or being levied as transaction costs (which could remain while the data was opened). However, in theory losing the £92m profits of these agencies would also be a cost to government funds. On closer analysis… although MO, OS and CH returned £15m to the government, Land Registry had to pay a massive £87m for restructuring costs and returned no cash to the government. So if we add the “loss” of the £15m profit to the costs of releasing data, then the total bill for releasing all the data and losing profits could reach £52m on a business-as-usual scenario, though restructuring OS can probably eliminate most of this cost. If LR continued to trade and charge for transactions (whilst opening its data), then in a ‘normal’ year its gross profits of £69m could actually pay all the bills for data releases elsewhere!
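As a rough check on these sums, here is a minimal sketch that reproduces the arithmetic using only the figures quoted in this post (all in £m). The variable names are illustrative only and do not come from the agencies' accounts or from the spreadsheet referred to above.

```python
# All figures are in £m and are copied from the post.
pdg_revenues = 741             # combined 2010-11 revenues of the four agencies
pdg_costs = 649                # combined 2010-11 costs
os_net_income_loss = 32        # OS non-government income lost, net of savings
lr_open_access_cost = 5        # tentative estimate for Land Registry open access
cash_returned = 15             # cash returned to government by MO, OS and CH
lr_normal_year_profit = 69     # Land Registry gross profit in a 'normal' year

# Headline profit of the four agencies.
pdg_profit = pdg_revenues - pdg_costs

# Shortfall from releasing the data, before counting any forgone profit.
release_shortfall = os_net_income_loss + lr_open_access_cost

# Total bill once the cash actually returned to government is also counted.
total_bill = release_shortfall + cash_returned

# What would be left if Land Registry profits covered the whole bill.
remaining = lr_normal_year_profit - total_bill

print(f"Combined profit: £{pdg_profit}m")
print(f"Release shortfall: £{release_shortfall}m")
print(f"Total bill: £{total_bill}m")
print(f"Left over from LR profits: £{remaining}m")
```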
So… given the “unaffordable” costs of data release are actually a mirage… the government should be acting boldly to release all the data and stimulate the economy. Therefore the only remaining job that the government has to do to get the final open data #WIN is to reform these agencies. As far as I can see the only people who still oppose this are the managements of these agencies… and you would expect that wouldn’t you?
Jonathan Raper