Reverse engineering and copyright in programming languages and data file formats

Author: Francis Davey

SAS Institute Inc v World Programming Limited [2013] EWHC 69 (Ch), High Court of Justice, England and Wales, 25 January 2013

Journal of Intellectual Property Law & Practice (2013), doi: 10.1093/jiplp/jpt068, first published online: May 31, 2013

In rejecting an application to amend by SAS Institute, the High Court expressed the provisional view that computer languages and data file formats were probably not capable of being works protected by copyright within the meaning of the Information Society Directive (Infosoc). The court also held that World Programming Limited (WPL) was permitted to reverse engineer SAS Institute's software even though, in doing so, WPL went beyond the scope of its software licence.

Legal context

The vendor of a software product will often want to reduce the risk that a competitor will use reverse engineering to produce a competing product. Such a vendor might want to rely on copyright in its computer program, which is protected by copyright harmonized as a result of the Computer Programs Directive (91/250, the ‘Software Directive’), or it might rely on copyright in other aspects of its work, for example the user interface or the data file formats.

In principle, these might be protected as computer programs or as more ‘general’ copyright works as defined in the Information Society Directive (Infosoc Directive 2001/29) or even as works subject to database copyright, harmonized by the Database Directive (96/9).

In C-393/09 Bezpečnostní softwarová asociace—Svaz softwarové ochrany v Ministerstvo kultury [2010] ECR I-13971 the Court of Justice of the European Union (CJEU) held that a graphical user interface was not protectable as a computer program, but that it could be a work protected by general copyright. On a reference in the instant case, the CJEU held that neither the functionality of a computer program, nor ‘the’ programming language (ie the programming language which it interprets), nor its data file formats were protectable as computer programs (Case C-406/10 SAS Institute Inc v World Programming Ltd (2 May 2012), discussed in David Nickless, ‘Functionality of a computer program and programming language cannot be protected by copyright under the Software Directive’ (2012) 7(10) JIPLP 709).

The CJEU left open the question of whether a programming language or a data file format could be protected as a general copyright work.

Article 5(3) of the Software Directive is intended to permit reverse engineering the purpose of which is to create a new, but differently expressed, computer program that will behave equivalently to the one reverse engineered—in other words a clone.

In the High Court, Arnold J considered (among other things) two questions: first, could a computer language or data file format be a work protected by general copyright and, secondly, could the licensor of software place restrictions on the use of its licensed software which would have the effect of preventing the purchaser of the licence from carrying out reverse engineering?


Facts

The SAS Institute (SAS) sells a data analysis system consisting of a core system (SAS Base) and numerous optional components which may be separately licensed. The system is driven by a special purpose programming language (the SAS language). SAS originated in the late 1960s and has been continuously developed since. A very large body of programs has been written in the SAS language.

WPL created a competitor system (WPS) that was intended to emulate the behaviour of several SAS components, including the core, referred to collectively in the case as the ‘SAS components’. Of necessity, WPS needed to be able to run the SAS programming language and to read and write the same file formats as SAS.

In order to ensure compatibility, WPL licensed a version of the SAS system known as the ‘Learning Edition’ in order to reverse engineer the behaviour of the SAS components. The licence under which SAS distributed the Learning Edition restricted it to use by a single individual for the purposes of their learning how to use the SAS system and prohibited its use for production purposes.

WPL also produced documentation (the WPL manuals and the WPS guides), which included a description of the operation of each command. Those descriptions inevitably bore a strong resemblance to descriptions of SAS commands in SAS's manuals.

SAS claimed infringement of copyright in (i) the SAS Manuals, by creating WPS and the WPL Manuals and WPS guides; (ii) the SAS components, by copying the SAS Manuals; and (iii) the Learning Edition, by contravention of the terms of use.

At the trial ([2010] EWHC 1829 (Ch)) the judge found that WPL had infringed copyright in the SAS Manuals by creating the WPS Manual, but he thought that further assistance was required from the CJEU on the remaining points. Nine questions were referred to the court.

On return to the High Court, SAS sought to argue that the SAS language and data file formats were works protected by general (Infosoc) copyright.


Analysis

The SAS Institute had pleaded that the SAS language and data file formats were protected by software copyright rather than that they were works protected by general copyright. The judge ruled that in order to argue for general copyright protection, the SAS Institute would have to apply to amend its pleadings. He would refuse such an application on two principal bases. First, the application to amend would be at a very late stage, after trial and after the matter had been referred to the CJEU. Secondly, SAS's case would require the investigation of wholly new factual issues.

For example, in order to decide whether the SAS language was a copyright work, its authorship would need to be determined—the language's very long history would make that a complex inquiry—and then there would have to be an assessment as to whether it exhibited its authors' own intellectual creation. This would require expert evidence. SAS had argued that a computer language was an abstraction, much like the plot of a novel, but in that case was it an abstraction from the computer programs or the manuals? That would again require factual inquiry.

In considering whether the SAS Language was capable of being a work, Arnold J thought that, even if, in the light of decisions of the CJEU such as C-5/08 Infopaq v Danske Dagblades Forening [2009] ECR I-06569, a work did not have to be one of the kinds of work listed in s 1(1)(a) of the Copyright, Designs and Patents Act 1988, it must still be a literary or artistic work within the meaning of Article 2(1) of the Berne Convention. That definition was not unlimited in scope.

The SAS Institute had argued that because the SAS Language was an intellectual creation it was therefore a work. Arnold J rejected this as a non sequitur. A scientific theory is an intellectual creation but it is not a work.

Arnold J expressed the provisional view that a programming language could not be a ‘work’ within the meaning of Infosoc at all. A dictionary and a grammar describe a language, but the language itself is not a work, and the same reasoning applies to a programming language. A grammar and dictionary are a set of rules for generating meaningful statements. Even if the statements are protected by copyright, it does not follow that the language ought to be.

The same considerations arguing against copyright protection applied at least as strongly to the data file formats, but here there was a further question of fixation. It was also hard to see where the authors had stamped their ‘personal touch’ through the creative choices they had made, a required element of the ‘own intellectual creation’ threshold according to the CJEU (C-145/10 Painer v Standard Verlags, 1 December 2011). Evidence would need to be adduced to show that this requirement was satisfied.

On the licence question, the CJEU held that:
a person who has obtained a copy of a computer program under a licence is entitled, without the authorisation of the owner of the copyright, to observe, study or test the functioning of that program so as to determine the ideas and principles which underlie any element of the program, in the case where that person carries out acts covered by that licence and acts of loading and running necessary for the use of the computer program.
SAS argued that WPL's acts were not ‘covered by [the] licence’ of the Learning Edition because WPL had gone beyond the scope of the licence. As a result, WPL could not take advantage of Article 5(3)'s protection of reverse engineering.

The judge disagreed. Although the Learning Edition licence was a single-user licence, and the licensee was defined to be the individual who had purchased the licence online from the SAS Institute, the licensee had done so on behalf of WPL. That meant that WPL was ‘a person who [had] obtained a copy of a computer program under a licence’. It was immaterial that WPL was not the licensee.

The judge held that the ‘acts’ referred to by the CJEU were the acts of loading and running necessary for the use of the Learning Edition. The CJEU had not meant to narrow the definition of ‘acts’ to those that were permitted by the licence. To do so would be to bypass the anti-contracting-out provisions of Article 5(3). This meant that even though unlicensed employees of WPL had used the Learning Edition in order to carry out reverse engineering and even though doing so went beyond the scope of the Learning Edition's licence, that reverse engineering was lawful under Article 5(3).

Practical significance

The creators of general purpose programming languages rarely try to restrict their use by asserting intellectual property rights. This is partly due to computer science tradition—computer languages were never treated as ‘owned’ by their authors in that sense—but it will also usually make sense. It will almost always be true that the wider the community of those using a language the better it is for the language's development. The copying of a programming language by making compatible interpreters or compilers is a driver for that broader reach.

But application-specific programming languages, such as the SAS language or the commands used in Navitaire v Easyjet [2004] EWHC 1725 (Ch), are intended to drive specific programming systems. The vendors of those systems have a strong commercial interest in preventing the creation of a competing system. Such a system would need to be able to interpret exactly the same programming language, so that if that language were protected by copyright the system vendor could prevent competition.

The situation with data file formats is the same. Where a data file format has become a de facto standard, for example Microsoft Word documents and Adobe's Portable Document Format, any competitor seeking to enter the market will need to be able to create a compatible application capable of reading the same file formats. File format compatibility is implicated in a much wider class of software applications than programming languages, so the decision on data file formats is more significant for the market as a whole.

More significant is the analysis of Article 5(3) of the Software Directive. If the phrase ‘which he is entitled to do’ were limited to those acts permitted under the licence and if a licensor were able to limit the purpose for which those acts were done, it would be very easy to prevent reverse engineering by writing suitably restrictive licences. Such an interpretation would make the phrase ‘without the authorization of the rightholder’ useless.

The ability to obtain a restricted licence to software and still reverse engineer it sufficiently to create a competing work (one that is nonetheless original in the copyright sense) is very important. If followed, this decision will make life easier for those seeking to break into existing software markets.
