In the age of Big Data, the value of text and data does not lie in either considered in isolation, but in what can be extracted from them. This requires that text and data be analyzed so as to enable the discovery of new patterns and relations. While a task of this kind would be virtually impossible to perform manually, text and data mining (TDM) techniques make it readily achievable.
In 2016 the EU Commission issued a proposal for a Directive on Copyright in the Digital Single Market (DSM Directive) which, among other things, includes a new mandatory exception allowing research organizations to make reproductions and extractions in order to carry out TDM of works or other subject-matter to which they have lawful access for the purposes of scientific research. Although the exception would cover commercial and non-commercial uses alike, its beneficiaries would be limited (at least in the EU Commission’s original proposal) to universities, research institutes, and non-profit or public interest research-intensive organizations. In principle the draft directive does not exclude applicability of the TDM exception to public-private partnerships, but it rules this out where a commercial undertaking has a decisive influence and control over the research organization in question.
Does it make sense to limit the availability of a mandatory TDM exception to research organizations alone? Probably not, and this is so for three main reasons.
First, at least in Europe, it is not ‘research organizations’ that are mostly or solely engaged in TDM activities. As the EU Commission itself noted in its Impact Assessment accompanying the proposed DSM Directive, there seems to be general agreement among stakeholders that TDM ‘is still a nascent tool, in particular in the non-business sector, ie for research carried out by organisations such as universities or research institutes’.1
The second reason is connected to the realities of licensing practices. Although certain publishers offer the possibility of undertaking TDM activities as part of their licensing models, not all of them do. At the time when the UK was discussing the introduction of a TDM exception into its own law (this eventually occurred in 2014 within the framework offered by Article 5(3)(a) of Directive 2001/29, ie the InfoSoc Directive), the UK Government deemed it preferable not to impose any particular restrictions as regards the beneficiaries of the resulting exception. Although the UK Government acknowledged ‘that some publishers take an active role in developing text and data analytic technologies, and that some offer contracts that support the use of these technologies’, it observed that ‘under current conditions, research projects may in some cases require specific permissions from a large number of publishers in order to proceed’, and that this ‘is in some cases an insurmountable obstacle, preventing a potentially significant quantity of research from taking place at all’.2
The final reason relates to the nature of TDM, which is not about competing with existing content or disrupting existing models. Accessing, extracting and copying content for TDM purposes are all incidental stages that do not ultimately result in the external re-use of the protectable (expressive) parts of a work. Rather, they are functional to accessing those parts that are unprotected, including ideas, data, and facts considered on their own.
In conclusion, the EU legislature should not limit the catalogue of beneficiaries of its forthcoming TDM exception. Any alternative model would lead to legal uncertainties (those that the EU Commission intended to remove), hinder the development of TDM practices in Europe and, above all, overlook the fact that TDM is not about displacing existing content but rather about extracting further knowledge from it and, in doing so, rendering it more valuable.


1. EU Commission (2016), Commission Staff Working Document – Impact Assessment on the Modernisation of EU Copyright Rules Accompanying the Document Proposal for a Directive of the European Parliament and of the Council on Copyright in the Digital Single Market and Proposal for a Regulation of the European Parliament and of the Council laying down Rules on the Exercise of Copyright and Related Rights Applicable to Certain Online Transmissions of Broadcasting Organisations and Retransmissions of Television and Radio Programmes, SWD(2016) 301 final, Part 1/3, §4.3.1 (emphasis added).
2. HM Government (2012), Modernising copyright: a modern, robust and flexible framework. Government response to consultation on copyright exceptions and clarifying copyright law, p 37.