The European Union is in the process of reforming copyright laws that date back to 2001, as part of a wider strategy to establish a Digital Single Market across the 28 Member States of the bloc, aiming to break down regional barriers to e-commerce.

Earlier this year an agreement was reached on ending geoblocks on travelers’ digital subscriptions by 2018. And EU consumers are set to say adios to mobile roaming fees from this June. So far, so good, you could say.

But when the European Commission’s draft proposals for digital copyright reform were published last September, they were criticized by tech companies as regressive, and by copyright reformists as a missed opportunity to modernize ill-fitting laws to make them fit for purpose in the internet age.

There have also been warnings of the potential impact on startups of the copyright reform, although it’s fair to say that the loudest complaints are coming from big U.S. tech companies that appear to be core targets in the European Commission’s draft proposal, on account of the size and power of their content-sharing platforms.

In the supporters’ camp, EU sources argue that the Commission’s proposals will help European creative and copyright-centered industries flourish in a Digital Single Market, and European authors reach new audiences — while making regional works widely accessible to EU citizens and across borders.

“The aim is to ensure a good balance between copyright and relevant public policy objectives such as education, research, innovation and the needs of persons with disabilities,” one EU source told us. “We trust that the discussions in the Council and the European Parliament will aim to maintain this ambition and will facilitate access to and use of copyright-protected content online and ensuring a well-functioning copyright marketplace.”

At the launch of the draft proposals last year, Andrus Ansip, VP for the Digital Single Market, summed up the balance the EC is seeking to strike thus: “Europe’s creative content should not be locked-up, but it should also be highly protected, in particular to improve the remuneration possibilities for our creators.”

Neighboring rights for news

Among the most controversial elements of the proposals is an extra copyright provision for using snippets of journalistic content online: a so-called “neighboring right” for news sites, which critics describe as an attack on the hyperlink.

This could apply to link previews generated by news aggregators like Google News, for example, or social network sites like Facebook linking out to news articles. But there are also suggestions it may disproportionately impact startups in the news aggregation and/or media monitoring space.

EU sources emphasize, however, that there is no requirement that publishers levy a charge for their content. Rather, it is up to publishers to decide on conditions for use of their content, the argument being that the neighboring right would give publishers a stronger legal basis to negotiate with third parties.

A similar law was enacted in Germany in 2013, but uncertainty remains about what actually constitutes a snippet — and local publishers ended up offering Google free consent to display their snippets after they saw traffic fall substantially when Google stopped showing their content rather than pay for using them.

Spain also enacted a similar ancillary copyright law for publishers in 2014, but its implementation required publishers to charge for using their snippets — leading Google to permanently close its news aggregation service in the country. A subsequent economic study found the significant drop in traffic associated with the shuttering of Google News in Spain mostly affected smaller, niche or newcomer publishers. But even large media entities there have come out against the law.

Content monitoring

Another highly controversial portion of the copyright proposal is a requirement on websites that host large amounts of user-generated content to monitor user behavior to identify and prevent copyright infringement. So, in other words, to shift from reviewing reported content after it has been published to proactively scanning at the point of upload to try to prevent copyright infringements happening in the first place.

Critics complain this approach would compel private companies to police the internet on behalf of rights holders. They also suggest it’s a surveillance risk and that requiring indiscriminate monitoring of citizens’ online activity is disproportionate and therefore potentially violates fundamental EU privacy rights.

Countering these criticisms, EU sources emphasize that the Commission’s proposals are specifically targeted at services that store and give access to “large amounts of copyright protected content” — pointing out that such platforms have become important players on the content market.

“Due to the nature and significance of these services for the distribution of copyright protected content, they are required to take certain measures to allow a better functioning of the content market,” one source told us.

Measures taken must also be “proportionate,” and should not be “unnecessarily complicated or costly for the service providers,” according to the source. Nor are specific technologies or solutions imposed.

“It is for the services to find the appropriate and proportionate measures, which could be developed either internally or using for example third party services, as done by a number of services already today,” the source added, arguing that the proposal “strikes a balance between different interests.”

“It imposes obligations on platforms with large amounts of copyright protected content, which can be expected, due to their role on the content market, to have certain responsibilities. It also introduces safeguards for businesses and users. It does not introduce a general obligation to monitor content.”

Text and data mining

The copyright reform also proposes to establish a new EU-wide exception for text and data mining, but only for research institutions conducting scientific research, which has raised questions over whether commercial data-mining activity might suddenly be considered to lie outside the law.

Responding to this concern, EU sources argue that the proposal does not regulate or extend access to data for any stakeholders, nor does it change the current situation for other users of text and data mining — adding that these users “can continue exercising their activities under the same conditions as today.”

There has also been disappointment among copyright reformists that the EC has not sought to harmonize rules across the EU to recognize and put beyond legal doubt digital remix culture, such as the ability to create GIFs, memes, supercuts etc. — types of digital content which may currently, at least technically, be copyright infringements in some EU member states.

The European Parliament has been debating the copyright reform proposals for the past few months, as it formulates its official reaction to the draft proposals and seeks to push for specific changes. And this week members of the European Parliament submitted their amendments to the Commission’s proposals, as part of that process.

I’ve submitted my amendments to the #copyright reform. At least 78 MEPs have tabled amendments against #Ancillarycopyright to #SaveTheLink! pic.twitter.com/VKINqfFIqe — Julia Reda (@Senficon) April 13, 2017

TechCrunch spoke to MEP Julia Reda, a long-time proponent of copyright reform who has called for bold and ambitious reforms, yet instead finds herself fighting a set of proposals that she argues could usher in additional restrictions on web users while also disadvantaging regional startups.

TC: What was the original impetus for the EU’s digital copyright reform, and what did the Commission eventually propose?

Reda: When the EU first announced the copyright reform it was saying that the purpose was to really make life easier for everybody — businesses who wanted to scale up throughout the entire EU but also for citizens or consumers who wanted to access different services across borders. And what would be needed for that would be a more European copyright. We’re currently stuck with 28 different national laws that are often contradictory, and that’s often causing problems in the online environment. So when the Commission came out with this proposal there was very little of this ambition to be found. There are a few exceptions that the Commission is proposing to make mandatory across the EU when it comes to teaching — so use of digital content in teaching, and preservation copies being made by libraries and archives, but this is really — while it’s a step in the right direction, it doesn’t really do much more for the market.

At the same time, when it comes to the measures that are proposed on the marketplace I think they are actively harmful. On the one hand you have a provision that would impose on any company, or not even company, any host provider that is basically giving users the possibility to upload content on their own, an obligation to monitor what the users are doing — and this is not only extremely costly for all the providers, it could be anyone from Wikipedia to GitHub to photo communities, but it’s also a violation of fundamental rights. In the past the European Court of Justice has made it very clear that member states are not allowed to impose a general obligation on internet providers to monitor what users are doing. And this is exactly what this law would do. But this is the one big criticism that I think is relevant when it comes to how this would affect the internet ecosystem.

The other one is the proposal to extend copyright for press publishers and allow them to ask for licence fees for the reproduction of even the smallest snippets of content — so, for example, the headline of a news article. This directly interferes with the possibility to link to content on the internet because of course if you’re linking to something you want the link to be meaningful, and at the very least to include the title of the article you’re linking to.

TC: How have we arrived here? Who most stands to benefit from the most controversial proposals?

Reda: I think both of these proposals are examples of really blatant industry lobbying. So in the case of these content monitoring provisions, this has been very clearly pushed for by the music industry. And it’s actually a parallel development to the discussions that are going on in the U.S. So the music industry has quite successfully convinced a lot of lawmakers that they basically need to be paid more by YouTube. The entire purpose of this article is really to settle a fight between music labels and YouTube. The problem with this proposal is of course that its effects would go far beyond YouTube. And, in fact, probably YouTube would be one of the only hosting websites that could easily comply with this law because they already have a content monitoring facility in place. So even though it’s intended to strengthen the position of the music industry when it’s negotiating with YouTube, probably the collateral damage on other hosting websites would be a lot higher. But this is simply not something that the Commission has been thinking about when it was drafting this law. It’s very clear that they had a very specific type of website and a very specific type of content in mind, where such automated filtering may be more realistically possible.

Because if you’re trying to find a music recording, at least technologically this is comparatively simple because a music recording is more or less unique. But copyrighted content is a lot more than that. And if, for example, software would have to detect any type of copyright infringement — which is basically what this law is saying — the technology for that doesn’t even exist. So it could be things like being able to detect translations of a text that can be a copyright infringement, or pictures of a sculpture from different angles. It can be compositions rather than just musical recordings. So it’s really a huge technological challenge and it’s very clear from the fact that in all its reporting documents the Commission is only talking about the music industry that this is really what they had in mind. And there has been quite clear lobbying from the industry for this.

And in the case of the extra copyright for press publishers, it’s not even the publishing industry in general that’s in favor of this. It’s a relatively small number of — in particular two German publishing houses — that want to have this. And everybody else is a bit more puzzled by it. But because we had a German commissioner at the time that this proposal was being produced, they had very easy access to the highest levels of the Commission. But there are a lot of publishers who are actually quite critical of this proposal because they are saying that being able to be found on news aggregators and being able to be linked to by people on social media is absolutely crucial to their business model and to finding their audience. So it’s not like the entire publishing industry is in favor of this either.

[In Germany an ancillary copyright] was passed into law in 2013, and since then there have been court battles going on about what it actually means. Like how many words are you allowed to use before it becomes an infringement? And none of these questions has been solved by now. But a number of startups who have been doing media monitoring and stuff like that have had to go out of business because of the legal uncertainty, and they just can’t get funding — if they don’t know whether what they are doing is legal. And they’re probably not going to find out for several years.

TC: Setting aside the problem of a lack of ambition in the reform, it sounds like it has been overly broadly drafted. Could the Commission fix what it has, or do you think it should be scrapped entirely?

Reda: I think it should be scrapped because there’s not one problem with the proposal but several. So I think it’s a fundamentally bad idea to write content recognition technology into law. Not just because it’s extremely invasive but because it systematically ignores users’ rights. So the way that copyright is designed in Europe is that we have exclusive rights, and then have a list of specific exceptions under which users are allowed to use copyrighted content. So, for example, in most member states of the EU you are allowed to use works for purposes of quotation, within certain limits of course. The technology is not able to distinguish between a lawful use of copyrighted content under an exception, and an unlawful use, so it simply takes down every use of the content that is not licensed. And this of course leads to takedowns of lots of EU content and it systematically undermines the purpose of the exception, which is usually the protection of freedom of expression. So I think as long as this proposal talks about forcing anyone to use content recognition technologies it’s systematically undermining the copyright exceptions and it’s basically throwing the copyright system even more out of balance. So I find it very difficult to imagine how this could be fixed.

The other problem is that it’s trying to misrepresent the legal status of hosting providers in the EU. Because at the moment, if a user uploads something to a platform it’s primarily the user who is responsible for it, so they are the ones who have to check whether the content they are uploading is legal and so on — and this makes sense because otherwise it wouldn’t be possible to run a platform that has a lot of user-uploaded content. If you had to check every YouTube video before it’s uploaded or every picture before it can be used on Wikipedia, these platforms simply wouldn’t work the way that they work today. And so that’s why there is a limited liability for these host providers that, no, they don’t have to pro-actively check everything that is uploaded. But in return they have to take down content once they’re informed, or once they learn that there’s something illegal there. And they’re doing this. So I think that as long as the proposal first of all doesn’t recognize this legal regime and this limited liability, and at the same time speaks about content recognition, I don’t see how it can be fixed.

TC: At the moment in the EU there’s a lot of political pressure on social platforms to get better and faster at taking down problem content such as hate speech, terrorist propaganda and child abuse imagery — including governments talking about wanting the tech companies to build tools to help automate this process. Might this sort of thinking be feeding into the Commission’s proposals on copyright, too?

Reda: I think the problems associated with copyright infringement, with hate speech and with images of child abuse, are fundamentally different. So, first of all, with hate speech, the biggest problem is that according to numbers by the Council of Europe, only 15 percent of hate speech is even illegal in the first place. So the companies are often being asked to take down content that is technically legal. And then of course it’s extremely difficult because then the problem is not that the companies are not complying with their obligations under the limited liability regime, but the problem is that the laws are not fit for purpose to actually address hate speech, so there we have a problem, and it’s the problem with the criminal provisions in the member states and not with the enforcement of the law by the platforms.

Then in the case of images of child abuse, it’s relatively clear — the legal situation is essentially the same all around the world. These images are illegal to spread and therefore if you have an exact copy of the same content then it’s very easy for a platform to say this is illegal, this needs to be taken down. And there I think the use of automated recognition of these images can be justified. And then it can be taken down at the source. The problem is this doesn’t work for copyright because with the copyright exceptions, just because something is using copyrighted content does not mean that it is actually infringing. And the problem is of course if you start putting in place infrastructure for one type of content — perhaps it’s justified with terrorism — then there will invariably be a strong push to use it for all types of other content where it is not justified. And I think — well, there are lots of examples for this — but I think for copyrighted content these automated tools simply undermine copyright exceptions. And they are not proportionate. I mean we are not talking about violent crimes here in the way that terrorism or child abuse are. We’re talking about something that is a really widespread phenomenon and that’s dealt with by providing attractive legal offers to people. And not by treating them as criminals.

TC: How do you believe startups might be disadvantaged by the current proposals for the EU copyright reforms? Big companies like Google have some clear risks but also big resources to respond to new laws. What specific risks do you see for startups?

Reda: There’s a certain cognitive dissonance among a lot of the regulators in Europe because on the one hand they are kind of upset about the fact there are so few European startups and they’re wondering how we can better compete with the U.S., but at the same time they’re putting in place laws that are targeted at the big U.S. tech giants but that actually end up hitting the domestic startups a lot harder because they have to comply with pretty strict regulations from the start that they’re not equipped to actually deal with, and that often hampers their possibility to get funding.

I mean something that an investor certainly does not want to have is legal uncertainty. And a big flaw of the proposals that are put on the table by the Commission is that they are unclear. If you took, for example, the neighboring right for press publishers at its word you would have to conclude that taking a single word, or even a single letter from a publication would be an infringement because, unlike copyright, neighboring rights do not have a threshold of originality. But at the same time, of course, common sense dictates that you cannot have an exclusive right on a single word or a single letter. So it’s clear that interpreting what exactly this law protects would be up to the courts. And probably the courts in different countries would come to different conclusions. So this is a huge source of legal uncertainty and it’s particularly hitting those who are trying to create new and innovative business models. And I think this is quite tragic. It’s precisely startups that have the possibility to actually find the new business models that the cultural sector so dearly needs. It’s just that the large incumbents, such as those two publishing houses that are behind the press publishers’ rights, don’t have a particular interest in having new competitors on the market that might be more efficient at bringing the news to people. So they have a clear interest in introducing this law. Even if they don’t think that they’re actually going to get any money from Google for using their snippets — it’s simply about making it more difficult for new market entrants to compete with them.

For the neighboring right [the biggest impact will be on] news startups, everybody who is dealing with news analysis. We had a couple of examples of startups like that that are, for example, trying to find ways to detect fake news, or to give people different sources or propose different sources to try to corroborate a story. Things like that would be extremely difficult with the neighboring right. It would also affect companies that are engaged in big data mining, because there is a new exception in the proposal that explicitly allows text and data mining for research organizations but not for anybody else. So this is an area where it’s currently quite unclear whether big data mining constitutes copyright infringement in the first place. But if you explicitly allow it for some then it kind of implies that it’s forbidden for others.

And I think the third kind of startup that is particularly affected by this is any kind of platform for sharing user-generated content. For example, we had an example of a Belgian startup called MuseScore, which is quite a popular platform for people to exchange sheet music — and it’s usually people simply sharing their own compositions. But of course there is no software that could automatically detect copyright infringements in sheet music because it’s not simply somebody copying the sheet music one to one. Rather, whenever a composition to which the person who uploads the sheet music doesn’t have the rights is included there, this would constitute a copyright infringement, so you would have to somehow technologically make the leap from a particular melody to that melody being expressed in sheet music, and that technology is not available.

TC: Could this reform mean companies using large amounts of data for building AI models might technically be committing a copyright infringement — if they’re using copyrighted data to train a machine learning algorithm?

Reda: Yeah, if they’re, for example, learning to detect cats in pictures and using a bunch of cat pictures from the internet to train their algorithms, then the argument goes that by copying these images they are using a copyrighted work and they would need a license for that. In most countries it’s kind of clarified either that this kind of use is fair use or there’s specific exceptions for text and data mining — for example Japan has introduced a text and data-mining exception that clarifies that it’s not a copyright infringement. But there’s also the question should this be covered by copyright in the first place? Because you are not using the work as an intellectual creation you are just using the data in the work. For example, if you’re mining text and you’re looking for particular patterns, you’re not really interested in what the text means, you’re interested in how often a particular word is used or something like that. So arguably this is not actually a use of the work as such but rather just of the data that’s carrying this work. So if we introduce a text and data-mining exception only for certain organizations and startups are not included in that, then we’re basically saying that any kind of startup that is using copyrighted content for training its AI would be committing a copyright infringement.

TC: On the flip side, you could argue that while algorithms may not be using the work itself there is a kind of value exchange going on, based on extracting something useful (and potentially profitable) from the data…

Reda: Copyright law was never designed to be based on whether or not you are commercially benefiting from the use — I mean, if this were the case then all non-commercial use of copyrighted works should be legal, but it’s not. It’s always based on whether or not you’re performing certain protected uses such as making a copy. And in the digital world you just need to make copies a lot more than in the analogue world. Something that would have been perfectly legal in an analogue context, such as reading a book and counting the number of times a certain word is used, is not a copyright-relevant act in any way. And just when you’re using a computer to do the same thing, then it suddenly is.

The other issue is that it only makes sense to require people to get a license if it’s actually possible to get a license. But how would this work? If somebody’s just scraping loads of images off social media, for example, the rights holders of those images are spread all over the world — there are millions of them, and if you actually contacted them and said “hey I want to use your cat picture that you posted on Twitter for training my AI, can you please give me a license,” they would not know what the hell you’re talking about. The transaction cost of actually trying to do this legally would be so high that it would simply not pay to do this kind of research anymore. So basically by saying this is something that requires a license you are guaranteeing that it is simply not going to be done legally. But you’re not actually creating new business opportunities for anyone.

TC: I haven’t personally heard many European startups voicing concerns about the EU copyright reform — do you think there’s an awareness problem here? Or maybe they don’t yet realize the potential implications down the line?

Reda: I have a somewhat different impression. Because when we invited some startups to come to Brussels to speak about their experience it was extremely easy to find startups that were concerned about this, and had very specific concerns about either the neighboring right or the content monitoring. Of course, if you’re a startup founder you probably don’t have the resources to lobby in the same way that a large company does because you’re basically spending all of your time on developing your product, but nevertheless there are a number of startups that are actually coming to Brussels and talking to policymakers. They have formed a business association — Allied for Startups — which is also organizing their activities. And they focus quite a lot on copyright — so for example Allied for Startups has done this startup manifesto — scale up manifesto — that they have presented to the European Commission where they are extremely critical of these proposals. So of course I don’t expect every startup founder in the EU to know about this because it is still quite a complex legislative process. But I wouldn’t share the impression that they’re not concerned about this. My impression is more that if they know about it they are concerned.

TC: What arguments are you hearing from larger tech companies — like Apple, Google, Facebook, Spotify — about the copyright reform?

Reda: Apple, I have to say, has not been particularly active on this. And also Google. They’re mostly active through their business associations. So it’s extremely difficult to say what exactly is the position of which particular players. Google was invited to one of the hearings that we had in the legal affairs committee. And they were basically spending their time explaining how Content ID works, what they’re already doing voluntarily, and kind of also explaining the limits of what the technology can do — so, for example, they were quite open about the fact that it’s not capable of interpreting copyright exceptions and limitations.

Generally I would say the tech companies have been most concerned about the content-monitoring provision. Because it really affects a very broad range of companies, where the neighboring right is more targeted at a specific kind of company that is active in the news sector in some way.

I met with Apple this week but they were more concerned about the Electronic Communications Code, so the telecoms review that is going on at the moment. They did have concerns about the content-monitoring provision… I’ve spoken to SoundCloud and they are really quite concerned about this, and they were quite open in saying that if this kind of provision had existed when they started out, they would have never managed to survive. And nevertheless, they are kind of a licensed service nowadays and are able to work with the rights holders. So they’ve been quite active on this… I’ve met with Facebook at some point. And I mean they were just reiterating their concerns about the content monitoring and the neighboring right. It’s certainly on their radar.

I think generally [the big tech companies are] trying to emphasize that they’re already doing a lot of things on a voluntary basis.

TC: You’ve personally been pushing for copyright reform for years — and made it your legislative priority. Why is that? And what would you really like to see happen? What would be your ideal copyright reform?

Reda: I think that copyright reform is absolutely crucial for access to knowledge and empowerment of people. I think the cultural sector is just one small element of this. I think where the negative effects of the copyright system are much more apparent is the academic sector, where basically you have a small number of extremely powerful publishing companies that have profit margins of upwards of 30 percent, that are basically living off getting articles for free from researchers at universities and then selling them back to the universities at astronomical prices. And I think this is an extremely unhealthy system, it’s contributing to global inequality because basically universities in developing countries and increasingly also in industrialized countries are not able to afford access to the content that is actually necessary to get a good education. So this is really what my motivation behind this copyright reform is.

I’ve worked as a student assistant at a university — and I know firsthand the problems that exist with simply being able to access the knowledge that has been produced with public money because of the way that the copyright system is set up. What I would really like to see — I think where a huge mistake has been made in translating the copyright system to the digital world is that copies that are made in a digital environment should not be treated the same way as copies in the analogue age. If you have 20,000 copies of a printed book in your basement it’s very clear that your intention is to distribute them and so it’s kind of a short cut of the law to simply make the copies themselves illegal, and not just the distribution. But with digital technology that’s completely different because any kind of use of digital technologies requires the making of copies and it is not implied that just because you’re making copies your intention is to give those copies to somebody else.

Just to give you an example, a friend of mine has a digital hearing aid — a cochlear implant which is basically implanted into his brain and it translates an audio signal into a digital signal, and that’s why he’s able to hear again. And if there were no exceptions to copyright that allow for example this copy from analogue to digital then he would be committing a copyright infringement every time he’s listening to music. And this obviously doesn’t make any sense. So what I would really like to see would be a reform that simply does not take digital copies as the basis for what is considered to be a copyright infringement anymore.

TC: What do you see as the likely result of the copyright reform process — are you hopeful of being able to make substantial changes to the proposals?

Reda: I’m quite optimistic that we’re going to be able to defeat the neighboring right. It’s a wildly unpopular measure wherever it has been introduced, in Germany and in Spain. The parliament has already voted against it several times. I’m of course concerned about the really intense lobbying from some publishers who are trying to shift the position of the parliament. But so far most of the parliament reports that have come out, including the one from the Legal committee, have all been proposing to get rid of the neighboring right.

I am more pessimistic when it comes to the content-monitoring provision because there it’s extremely difficult to change this proposal into something that is not harmful. It’s a very complex ecosystem and I think not everybody is aware of the problems associated with content recognition technologies. And as you were saying, it’s kind of mixed up with the discussions around terrorism and hate speech. And I think that’s always a very bad starting point for having a really targeted copyright reform that is not mixing up a lot of different issues. So there I’m a lot more skeptical.

TC: What happens next? What’s the timeline from here?

Reda: The European parliament has presented its report, and the deadline for amendments to that is actually today [last Wednesday]. So after everybody has tabled their amendments the person who wrote the report, the rapporteur, is going to take those amendments and form them into compromises. Then we’re going to vote on it in the committee, probably in June or July, and then it will go to the plenary vote and to negotiations with the Council. So a final text could be expected maybe in a year or so.

TC: So there’s still a chance for substantial amendments?

Reda: Basically so far the proposal from the Commission is only the starting point. And nobody is bound by what the Commission has proposed. And actually Council as well — there are a lot of national governments who are completely unconvinced by the neighboring right, and are asking a lot of critical questions, so it’s very possible that we can get rid of these proposals if we keep up the public pressure and convince national governments, too, that this is not in their interest.

This interview has been lightly edited and condensed for clarity.