Glossary

Anonymisation
The process of adapting data so that individuals cannot be identified from it.
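As a rough illustration of the idea, the sketch below removes direct identifiers from a few records and replaces exact ages with age bands. The field names (`name`, `email`, `age`, `city`), the helper `anonymise` and the sample records are all hypothetical. Real anonymisation usually needs more than this, because combinations of the remaining fields can still identify a person.

```python
def anonymise(records, direct_identifiers=("name", "email"), age_band=10):
    """Return copies of the records without direct identifiers and with ages banded."""
    result = []
    for record in records:
        # Drop fields that identify a person directly.
        cleaned = {k: v for k, v in record.items() if k not in direct_identifiers}
        # Generalise the exact age into a band such as "30-39".
        if "age" in cleaned:
            lower = (cleaned["age"] // age_band) * age_band
            cleaned["age"] = f"{lower}-{lower + age_band - 1}"
        result.append(cleaned)
    return result


# Made-up sample records, for illustration only.
people = [
    {"name": "Alice Example", "email": "alice@example.org", "age": 34, "city": "Helsinki"},
    {"name": "Bob Example", "email": "bob@example.org", "age": 58, "city": "Tampere"},
]

anonymised = anonymise(people)
print(anonymised)
```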
Anonymization
See Anonymisation.
API
See Application Programming Interface.
Application Programming Interface
A way computer programs talk to one another. Can be understood in terms of how a programmer sends instructions between programs.
AR
See Information Asset Register.
Attribution Licence
A licence that requires that the original source of the licensed material is cited (attributed).
Attribution License
See Attribution Licence.
BitTorrent
BitTorrent is a protocol that spreads the bandwidth needed to transfer very large files across the computers participating in the transfer. Rather than downloading a file from a single source, peers download from each other.
Connectivity
Connectivity relates to the ability of communities to connect to the Internet, especially the World Wide Web.
Copyright
A right for the creators of creative works to restrict others’ use of those works. An owner of copyright is entitled to determine how others may use that work.
DAP
See Data Access Protocol.
Data Access Protocol
A system that allows outsiders to be granted access to databases without overloading either system.
Data protection legislation
Data protection legislation is not about protecting the data, but about protecting the right of citizens to live without fear that information about their private lives might become public. The law protects privacy (such as information about a person’s economic status, health and political position) and other rights such as the right to freedom of movement and assembly. For example, in Finland a travel card system was used to record all instances when the card was shown to the reader machine on different public transport lines. This raised a debate from the perspective of freedom of movement and the travel card data collection was abandoned based on the data protection legislation.
Database rights
A right to prevent others from extracting and reusing content from a database. Exists mainly in European jurisdictions.
EU
European Union.
EU PSI Directive
The Directive on the re-use of public sector information, 2003/98/EC. It “deals with the way public sector bodies should enhance re-use of their information resources.” (Source: Legislative Actions - PSI Directive.)
IAR
See Information Asset Register.
Information Asset Register

IARs are registers specifically set up to capture and organise metadata about the vast quantities of information held by government departments and agencies. A comprehensive IAR includes databases, old sets of files, recent electronic files, collections of statistics, research and so forth.

The EU PSI Directive recognises the importance of asset registers for prospective re-users of public information. It requires member states to provide lists, portals, or something similar. It states:

Tools that help potential re-users to find documents available
for re-use and the conditions for re-use can facilitate
considerably the cross-border use of public sector documents.
Member States should therefore ensure that practical arrangements
are in place that help re-users in their search for documents
available for reuse. Assets lists, accessible preferably online,
of main documents (documents that are extensively re-used or
that have the potential to be extensively re-used), and portal
sites that are linked to decentralised assets lists are examples
of such practical arrangements.

IARs can be developed in different ways. Government departments can develop their own IARs and these can be linked to national IARs. IARs can include information which is held by public bodies but which has not yet been – and maybe will not be – proactively published. Hence they allow members of the public to identify information which exists and which can be requested.

For the public to make use of these IARs, any register of information held should be as complete as possible, so that users can be confident that documents can be found. The incompleteness of some registers is a significant problem: it makes them unreliable, which may discourage people from using them to search for information.

The metadata in IARs must be comprehensive so that search engines can function effectively. In the spirit of open government data, public bodies should make their IARs available to the general public as raw data under an open licence so that civic hackers can make use of the data, for example by building search engines and user interfaces.
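To make the last point concrete, here is a minimal sketch of the kind of search a civic hacker might build on top of an IAR released as raw data. The register entries, their field names (`title`, `department`, `description`, `published`) and the helper `search_register` are invented for illustration and do not follow any particular government's schema.

```python
# Hypothetical register entries; a real IAR would be loaded from a published file.
register = [
    {"title": "Road traffic counts", "department": "Transport",
     "description": "Annual statistics on vehicle counts", "published": True},
    {"title": "Hospital waiting times", "department": "Health",
     "description": "Monthly statistics by region", "published": False},
    {"title": "School inspection files", "department": "Education",
     "description": "Archived paper files", "published": False},
]


def search_register(entries, keyword):
    """Return entries whose title or description contains the keyword."""
    keyword = keyword.lower()
    return [
        entry for entry in entries
        if keyword in entry["title"].lower() or keyword in entry["description"].lower()
    ]


matches = search_register(register, "statistics")
for entry in matches:
    # Unpublished entries still appear, so people know what they can request.
    status = "published" if entry["published"] else "can be requested"
    print(f'{entry["title"]} ({entry["department"]}): {status}')
```

The search only works as well as the metadata allows: an entry with an empty or vague description would never be found, which is why completeness matters.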

Intellectual property rights
Monopolies granted to individuals for intellectual creations.
IP rights
See Intellectual property rights.
Machine-readable
Machine-readable formats are those from which computer programs can easily extract data. PDF documents are not machine readable: computers can display the text nicely, but have great difficulty understanding the context that surrounds the text.
Open Data
Open data can be used by anyone for any purpose. More details can be read at opendefinition.org.
Open Government Data
Open data produced by the government. This is generally accepted to be data that is gathered during the course of business-as-usual activities and that does not identify individuals or breach commercial sensitivity. Open government data is a subset of Public Sector Information, which is broader in scope. See http://opengovernmentdata.org for details.
Open standards
Generally understood as technical standards which are free from licensing restrictions. Can also be interpreted to mean standards which are developed in a vendor-neutral manner.
PSI
See Public Sector Information.
Public domain
No copyright exists over the work. Does not exist in all jurisdictions.
Public Sector Information
Information collected or controlled by the public sector.
Re-use
Use of content outside of its original intention.
Share-alike Licence
A licence that requires users of a work to provide the content under the same or similar conditions as the original.
Share-alike License
See Share-alike Licence.
Tab-separated values
Tab-separated values (TSV) are a very common form of text file format for sharing tabular data. The format is extremely simple and highly machine-readable.
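A short sketch of why TSV counts as machine-readable: Python's standard `csv` module can read it directly when told that the delimiter is a tab. The column names and values below are made up for illustration.

```python
import csv
import io

# Made-up sample data: one header line, then one record per line,
# with fields separated by tab characters.
tsv_text = "region\tyear\tvalue\nNorth\t2010\t120\nSouth\t2010\t95\n"

# DictReader uses the header line to name the fields of each row.
reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
rows = list(reader)

# Values arrive as strings, so convert them before doing arithmetic.
total = sum(int(row["value"]) for row in rows)
print(rows[0]["region"], total)
```

For a file on disk, `io.StringIO(tsv_text)` would be replaced by a file opened with `open(path, newline="")`.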
Web API
An API that is designed to work over the Internet.