Are online platforms responsible for illegal third-party content published on their networks? Many would unequivocally say yes, and recent attempts to go after Facebook for its alleged role in facilitating terrorism are a vivid testament to that. In recent months, calls urging tech giants to do more to police their space have grown louder internationally as well as in major European capitals. The most stringent came from Berlin: under a new German law that came into force this month, social media companies must remove hate speech within 24 hours (in straightforward cases) or face fines of up to 50 million euros. Viewed in this context, the set of guidelines for online platforms issued by the European Commission last week was anything but unexpected, and the way it was presented suggests the EU is willing to go beyond mere rhetoric and pass legislation should those on the receiving end of the message fail to comply. So what are platforms being asked to do? The request essentially boils down to better detection, prevention and faster removal of illegal content, coupled with greater transparency. In each of these areas the Commission highlighted specific measures it wants online platforms to implement.
Trusted flaggers are widely used by Google to weed out inappropriate content on its services, in particular YouTube. Individuals and organisations can apply to become trusted reporters responsible for flagging content that violates YouTube’s Community Guidelines. Once flagged, the content is reviewed internally by trained teams who decide whether a reported video should be removed. Google claims trusted flaggers are accurate 90% of the time, or three times more often than the average user. The likes of Facebook and Twitter also rely on trusted flaggers, but unlike Google, which openly describes its program on its website, these companies hardly share any details with the public. Small bits of information are provided by the Commission and various third parties, but the intermediaries themselves are surprisingly tight-lipped on the matter. More transparency in the way Facebook and Twitter operate their trusted reporter programs is therefore needed, not only to help prospective adopters but also to make their own actions (takedowns) more legitimate.
Besides trusted reporters, the Commission encourages platforms to “establish easily accessible mechanisms [for] users to flag illegal content.” Since the guidelines give no specifics, one can only speculate as to what these mechanisms are, as many channels are available for users to report violations. Intuitively, they probably include online forms, flagging buttons and email, although many intermediaries also accept notifications by post. It is not uncommon for intermediaries to have two or even three channels in place. Amazon is a case in point: it provides a form for reporting trademark and copyright infringements, each posting and comment can be flagged as inappropriate by logged-in users, and there is also a postal address for reporting defamatory content.
Historically, email notices dominated up until 2010, but since 2011 their relative numbers have gradually decreased, giving way to web forms, which have displaced all other notice formats. Where forms are provided, different types of illegal content can be reported using a single form or several. For example, XS4All, a Dutch internet service provider, offers one form to report child pornography, discrimination, libel, slander, IP infringements, unfair competition, violation of trade secrets, spam, malware and criminal offences. By contrast, Twitter has a separate form for each of the following: abusive or harassing behaviour, impersonation, trademark infringement, counterfeit goods, copyright infringement, privacy, private information, spam, self-harm, Ads and components of a Moment. The type and number of notification channels have to be an individual choice of each platform – that goes without saying – but if effective monitoring is a priority then intermediaries should opt for the channels most conducive to it, such as online forms and flagging buttons.
Automatic detection technologies
The different channels described above tend to involve “manual” work on the part of both senders and receivers (i.e. intermediaries). But a lot of online content is identified and, if necessary, blocked by automatic filtering systems, and this is one area where the Commission wants to see more progress, namely through investment. Those of us with experience uploading files to YouTube will know what “fingerprinting” means in practice. The process works by scanning your files and matching them against a database of copyrighted material. If there is a match, what happens next depends on the rightsholder’s preference: they can choose to block, monetise or track your video.
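The match-then-apply-policy workflow can be sketched in a few lines. This is only an illustrative toy: real systems such as Content ID use robust perceptual fingerprints (so altered copies still match), not exact cryptographic hashes, and the database and policy names below are invented for the example.

```python
import hashlib

# Toy stand-in for a real perceptual fingerprint. SHA-256 only matches
# byte-identical files; it is used here purely to illustrate the flow.
def fingerprint(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Hypothetical reference database supplied by rightsholders, mapping
# each work's fingerprint to the action chosen for matches.
REFERENCE_DB = {
    fingerprint(b"<copyrighted work>"): "monetise",
}

def check_upload(data: bytes) -> str:
    """Return the rightsholder's chosen action, or 'allow' if no match."""
    return REFERENCE_DB.get(fingerprint(data), "allow")
```

An upload matching a registered work returns the rightsholder’s policy (here “monetise”); anything else falls through to “allow”.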
For violations like child sexual exploitation a similar approach called hashing is used. PhotoDNA, developed by Microsoft, has become something of an industry standard, with major tech giants, among them Facebook, Google and Twitter, all using it to block child abuse images on their platforms. The technique works by converting an image to black-and-white and then breaking it into a grid. The intensity gradients in each cell are used to create the photo’s “DNA”. PhotoDNA hashes are resistant to alterations, whether resizing, resaving or digital editing, which makes them powerful digital identifiers of harmful content.
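The grid-and-gradients idea can be sketched as a toy perceptual hash. PhotoDNA itself is proprietary and its exact algorithm is not public; the sketch below follows the publicly described steps (greyscale image, grid partition, intensity gradients between cells) in the manner of the open “dHash” technique, with all function names invented for the example.

```python
def cell_means(gray, grid=4):
    """Average intensity of each cell in a grid x grid partition
    of a greyscale image (a 2D list of pixel values)."""
    h, w = len(gray), len(gray[0])
    means = []
    for gy in range(grid):
        row = []
        for gx in range(grid):
            ys = range(gy * h // grid, (gy + 1) * h // grid)
            xs = range(gx * w // grid, (gx + 1) * w // grid)
            vals = [gray[y][x] for y in ys for x in xs]
            row.append(sum(vals) / len(vals))
        means.append(row)
    return means

def perceptual_hash(gray, grid=4):
    """Bit string recording, for each cell, whether it is brighter
    than its right-hand neighbour (a coarse intensity gradient)."""
    m = cell_means(gray, grid)
    return "".join(
        "1" if m[y][x] > m[y][x + 1] else "0"
        for y in range(grid) for x in range(grid - 1)
    )
```

Because only the direction of coarse intensity change survives, small alterations such as resaving or mild brightness shifts leave the hash unchanged, which is what makes this family of hashes robust identifiers.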
The same companies that implemented PhotoDNA recently decided to use hash-based technologies to curb the spread of online terrorist content. The outcome of this collaboration is a shared database of hashes of previously removed images and videos. Facebook, Microsoft, Twitter and YouTube add hashes of illegal files to the database for the benefit of others interested in identifying similar content on their services, reviewing it against their own policies and removing matching content as appropriate.
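The shared-database workflow amounts to a contribute-and-lookup protocol, sketched below. The names are illustrative, and real deployments use perceptual hashes so that edited copies also match; plain SHA-256 stands in here only to show the flow.

```python
import hashlib

# Pooled set of hashes contributed by all participating platforms.
shared_hashes = set()

def contribute(removed_file: bytes) -> None:
    """A participant adds the hash of content it has already removed."""
    shared_hashes.add(hashlib.sha256(removed_file).hexdigest())

def is_known(upload: bytes) -> bool:
    """Any participant checks a new upload against the pooled hashes;
    what to do with a match is left to each platform's own policies."""
    return hashlib.sha256(upload).hexdigest() in shared_hashes
```

Note that only hashes circulate, not the files themselves, and a match triggers each platform’s own review rather than automatic removal.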
The cost of this particular initiative is not known but, for example, we know that YouTube’s Content ID was anything but cheap – 60 million US dollars. SoundCloud’s system was cheaper but still quite expensive – 5 million euros. And although PhotoDNA is now available as a free cloud service, its use inevitably involves staff costs that not all SMEs may be able to afford. Indeed, costs are one of two reasons why automatic detection technologies are not widely adopted by (smaller) intermediaries; the second is fear of losing liability protection. Article 14 of the E-Commerce Directive grants information society service providers – the term used in the legislation – immunity on condition that (a) they have no “actual knowledge of illegal activity or information” and (b) if they do obtain such knowledge they “act expeditiously to remove or disable access to the information.” But what counts as expeditious removal?
In Germany’s case, as mentioned before, it is 24 hours if the content is easy to verify and seven days if the work requires more effort. The Commission is less prescriptive in its guidelines but considers a situation in which removals take more than a week “unsustainable.” The Austrian Supreme Court likewise considered removals lasting one week too long. However, in Capitol Records v. Vimeo, the district court found that “given the number of infringing videos at issue, the three and one-half week period it took Vimeo to comply with the notice constitutes expeditious removal.” The district court also held that a “one-day response time” to remove between one and six videos referenced in a notice was expeditious. Other courts have similarly held that removal of infringing content within one to several days of receiving a notice constituted expeditious removal.
Starting in 2014, more and more intermediaries began publishing regular updates on notices and their outcomes in special transparency reports. Most reports focus on IP-related infringements, although some also include data on government requests for private information. Some reports are available in machine-readable formats, others as PDF documents. Some are published twice a year, others only once. The level of detail also varies from intermediary to intermediary. Twitter, for instance, provides statistics on a wide range of outcomes, including the number of takedown notices received, the number of accounts and tweets affected, the number of counter-notices filed and the percentage of materials eventually restored. A platform like WordPress, on the other hand, shares data only on the number of DMCA takedown notices received, the percentage of copyright notices where some or all content was removed and the number of counter-notices submitted.
Information on notices may be patchy, but at least it is available for a handful of platforms (all of which, incidentally, tend to be US-based). In comparison, information on the effectiveness of automatic detection technologies is shrouded in complete secrecy. How effective is PhotoDNA? How many pieces of content are affected by Content ID daily? Can users dispute a claim if their content was blocked by one of these systems? The answer is: we don’t know, because intermediaries guard this information like a trade secret. Some even fail to admit they have automatic systems in place. To this camp belong Vimeo and Dropbox, both of which use digital fingerprinting to prevent copyright infringements but say nothing about these measures through official channels; relevant information can only be found on third-party sites.
In its guidelines, the Commission is only asking for transparency reports “detailing the number and types of notices received.” Considering that no such information has yet been published by European platforms, the first transparency report by Soundcloud, Dailymotion, Allegro, DaWanda or any other strictly EU-based intermediary would be a major milestone. But it’s important to also keep in mind the secrecy surrounding automatic detection systems; the Commission should be explicit about the issue in the future if it’s serious about promoting complete transparency.