18 August 2022

My spy is always with me

Comments on the planned obligations of Internet service providers to combat sexualised violence against children (so-called "chat control" regulation)

On 11 May this year, the European Commission presented its draft of a “Regulation laying down rules to prevent and combat child sexual abuse”. The planned regulation is intended to counteract sexualised violence against minors as well as its initiation on the Internet through a complex structure of obligations for the most important intermediaries of network communication. It also aims to establish a cross-border authority infrastructure with a new European Union agency as the central hub.

Initial reactions to the draft have been controversial. For example, numerous child protection associations have supported the Commission proposal in an open letter. In contrast, civil rights organisations, industry associations and data protection supervisory authorities have criticized the draft, in some cases sharply. The German Federal Government also revealed a sceptical attitude when it sent the Commission an extensive list of questions, some of which were recognizably critical.

On closer inspection, the draft on the one hand bundles measures to combat sexualised violence online that have been in use for some time, but whose technical problems and legal issues have still not been fully resolved. On the other hand, it also contains a genuine novelty: under the proposed regulation, individual communications on certain communications services are to be searched for certain content on a large scale, in part without any probable cause. This part of the draft is particularly problematic.

Problematic terminology of the draft regulation

The Regulation places far-reaching obligations on various intermediaries of internet communication in order to address the misuse of their services for “online child sexual abuse”. Referring to Directive 2011/93/EU on combating the sexual abuse of children, the proposed Regulation defines this as the dissemination of “child sexual abuse material” (so-called “child pornography”) on the one hand, and the solicitation of children for sexual purposes (often referred to as grooming) on the other, insofar as information society services are used for this purpose. In the following, we refer to both types of content by the generic term “incriminated content”. In addition, instead of “child sexual abuse material”, we speak of depictions of sexualised violence against children. This is because the terminology of the Regulation is itself problematic: the term “child abuse” is viewed critically by many victims, as it could imply that there is a legitimate “use” of children. We therefore use this term only when quoting the draft, whose terminology it is.

Overview of the main contents of the draft

Under the proposed Regulation, obligations would extend to the providers of four types of services:

  1. hosting services that store information on behalf of users, in many cases to make it available to third parties (this includes social media such as Facebook, Twitter or Instagram),
  2. interpersonal communications services that enable the direct exchange of information between selected individuals (such as e-mail services or instant messengers like WhatsApp or Signal),
  3. software application stores,
  4. internet access services.

Service providers are subject to specific obligations. Some of these obligations arise directly from the proposed Regulation; others would be created by order of a body appointed to implement the Regulation.

Risk assessment and mitigation obligations

The proposed Regulation directly obliges most service providers to implement ongoing risk management. Providers of hosting services and interpersonal communications services would have to assess, at least every three years, the risk of their service being used for the distribution of incriminated content. On this basis, they would have to take measures to mitigate the identified risk, which may include content curation or age verification. In any case, providers of hosting services and interpersonal communications services must set up a reporting system so that their users can alert them to incriminated content. Providers of software application stores must evaluate the extent to which the applications offered on their platform pose a risk of children being solicited for sexual purposes. If necessary, they must set age limits for the use of such applications.

Removal and blocking of specific incriminated content

Further obligations on service providers may arise from an order of a court or an independent administrative authority. Under the proposed Regulation, such an order can only be issued upon request of the Member State authority primarily responsible for enforcing the Regulation (referred to in the Regulation as the coordinating authority). The powers to issue orders are each accompanied by requirements that should ensure transparency and effective redress. The obligations resulting from an order are partly concerned with specific content, the storage location of which is already known, and partly with any content which falls within certain categories.

Two possible obligations relate to specific content: Firstly, the providers of hosting services may be ordered to remove certain depictions of sexualised violence. Secondly, the providers of internet access services can be obliged to block access to certain depictions of sexualised violence if they cannot be removed because they are distributed by non-cooperative hosting services from non-cooperative third countries. The content to be removed or blocked is designated by means of Internet addresses (Uniform Resource Locators or URLs for short) that indicate a specific storage location. An order to remove or block depictions of sexualised violence therefore does not oblige the service provider concerned to actively search its service for specific content.

Detection and reporting obligations

However, such an obligation of hosting service providers and interpersonal communications service providers to actively search for potentially incriminated content may arise from a detection order. A detection order may concern three types of content:

  1. already known depictions of sexualised violence,
  2. new, hitherto unknown depictions of sexualised violence,
  3. for interpersonal communications services only: solicitation of children for sexual purposes.

If the provider of a hosting service or an interpersonal communications service becomes aware of possible incriminated content on its service, it must report the content concerned. This reporting obligation applies irrespective of how the provider learned of the content. It therefore notably covers knowledge the provider gained through its own risk management as well as knowledge resulting from the implementation of a detection order.

The report is to be addressed to the EU Centre on Child Sexual Abuse, a new European Union agency to be created by the Regulation. The Regulation conceives the EU Centre as an information, coordination, and service point. The EU Centre has no operational powers of its own, but it is to provide support and liaison to service providers and Member State authorities in a variety of ways. For example, the EU Centre shall maintain databases with indicators for the detection of online child sexual abuse. Service providers must use these indicators when implementing removal, blocking or detection orders.

If the EU Centre receives a report, it shall check whether the report is manifestly unfounded. If this is not the case, the EU Centre shall forward the report to Europol and to the law enforcement authorities of the Member State presumed to have jurisdiction.

Known measures: Obligations of providers of internet access services and hosting services

The measures to combat depictions of sexualised violence on the internet provided for in the Regulation are mostly familiar, albeit controversial. Some service providers have already implemented them voluntarily, and some measures have already been the subject of legal obligations. The discussion so far has highlighted several technical problems and legal doubts associated with these measures.

Re-launch of controversial internet blocking by access providers

Insofar as providers of internet access services can be obliged to block certain depictions of sexualised violence on the basis of URLs, the Regulation amounts, from a German perspective, to a re-run of the Zugangserschwerungsgesetz (Access Impediment Act), which entered into force in 2010 but was never applied and was repealed as early as 2011. This law earned the then German Federal Minister for Family Affairs (and now President of the European Commission) Ursula von der Leyen the derisive nickname “Zensursula”.

Blocking URLs by access providers is problematic for two reasons, which have already been highlighted in the discussion on the Access Impediment Act:

First, there are doubts about the effectiveness of this measure. Some of the common blocking mechanisms can easily be circumvented. Moreover, in the case of encrypted connections (e.g. HTTPS/TLS), the access provider cannot read the URL in full, so it cannot implement the blocking without elaborate technical measures (e.g. forcibly redirecting traffic through proxies).

Secondly, many blocking mechanisms are unable to limit blocking to the specific content to be blocked. Instead, they block content rather coarsely, creating considerable collateral damage in the form of overblocking. The Regulation does not specify how blocking is to be implemented technically and thus does not set clear limits to such excessive blocking. However, the transparency and redress provisions at least enable service providers and users to take legal action against excessive blocking practices. Moreover, blocking orders are limited to content whose deletion cannot be achieved.

Overall, the authorisation seems acceptable to us, provided that it is supplemented by concrete and practically effective precautions against overblocking. This is particularly important because it is technically much easier for providers of internet access services to implement coarse-grained blocking, e.g. on the basis of IP addresses or by manipulating the domain name system, than to actually block individual content on the basis of URLs. Without concrete and effective provisions to prevent overblocking, the Regulation could thus create significant misguided incentives.
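For illustration, the following minimal sketch (in Python, with hypothetical URLs and helper names of our own) contrasts blocking an individual URL with the coarser domain-level blocking that an access provider can implement far more easily, for instance by manipulating the domain name system. The coarse variant inevitably catches unrelated content on the same host.

```python
# Minimal illustration (hypothetical URLs): URL-level blocking versus the coarser
# domain-level blocking that access providers can implement more easily.
from urllib.parse import urlparse

BLOCKED_URLS = {"https://example.org/abuse/image123.jpg"}       # URL designated in a blocking order
BLOCKED_DOMAINS = {urlparse(u).hostname for u in BLOCKED_URLS}  # coarse fallback: block the whole host

def blocked_by_url(url: str) -> bool:
    """Fine-grained blocking: only the exact storage location is affected."""
    return url in BLOCKED_URLS

def blocked_by_domain(url: str) -> bool:
    """Coarse blocking (e.g. via DNS manipulation): the entire host is affected."""
    return urlparse(url).hostname in BLOCKED_DOMAINS

harmless = "https://example.org/blog/holiday-post"
print(blocked_by_url(harmless))     # False - URL-level blocking leaves it reachable
print(blocked_by_domain(harmless))  # True  - domain-level blocking overblocks it
```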

Removal and detection obligations of hosting providers

The obligations of the providers of hosting services provided for in the draft call for a differentiated assessment. The authorisation to issue removal orders relating to specific stored content is technically and legally unobjectionable and should be welcomed in terms of legal policy. According to press reports, hosting services already cooperate without exception, even when alerted by non-governmental organisations, so that this obligation would, as it were, preach to the converted. More problematic are detection orders that oblige hosting service providers to actively search their stored content for known or unknown depictions of sexualised violence against children.

The Regulation explicitly leaves it up to service providers to decide on the technical means of implementing detection orders, although the newly established EU Centre is to provide them with certain detection technologies free of charge. In any case, detection must be based on indicators of depictions of sexualised violence against children provided by the EU Centre. Again, the Regulation does not specify exactly what information these indicators contain. However, according to the current and foreseeable medium-term state of the art, the detection requirements amount to two types of technical implementation.

Hash value-based detection of known depictions of sexualised violence

To identify already known depictions, it is common to use hash values. A hash value is a kind of digital fingerprint that is calculated from a file and identifies it; conversely, the underlying file cannot be reconstructed from the hash value. A hash-based detection mechanism compares the hash values of the files present on the hosting service with the hash values of known depictions supplied by the EU Centre. The aim is to ensure as far as possible that a depiction is recognised even after minor changes (such as resizing or recolouring). For this purpose, so-called perceptual hashing is used, which calculates the hash value not from the entire file but from certain structural properties of the image material it contains. However, precisely because perceptual hashing is robust to such changes, it is also susceptible to false positive hits. For example, an innocuous image could yield the same hash value as a depiction of sexualised violence because it has a similar brightness distribution.
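As an illustration, the following sketch shows such a matching step, using the open-source Python library imagehash as a stand-in for whatever perceptual hashing technology the EU Centre's indicators would actually be based on; the distance threshold and the synthetic test images are purely illustrative.

```python
# Sketch of hash-based matching with perceptual hashes, using the open-source
# "imagehash" library as a stand-in for the indicators the EU Centre would supply.
from PIL import Image
import imagehash

def matches_known_material(image: Image.Image, known_hashes, max_distance: int = 8) -> bool:
    """True if the image's perceptual hash lies within a small Hamming distance of a
    known hash. A tolerant threshold catches resized or recoloured copies of known
    depictions, but also raises the risk of false positives on similar, harmless images."""
    h = imagehash.phash(image)
    return any((h - known) <= max_distance for known in known_hashes)

# Purely illustrative: synthetic solid-colour images stand in for real image files.
known = {imagehash.phash(Image.new("RGB", (256, 256), "grey"))}
upload = Image.new("RGB", (128, 128), "grey")   # a resized copy of the "known" image
print(matches_known_material(upload, known))    # True: the perceptual hash still matches
```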

Wonder-weapon “artificial intelligence”?

In contrast, a hash-based approach is not an option for new, hitherto unknown depictions of sexualised violence against children. To detect such depictions automatically, only self-learning algorithms can be used: they identify patterns in training data and apply them to the content present on the hosting service. In any case, this method is likely to produce a much higher number of false positive classifications, which can be reduced, but probably not eliminated, by refining the analysis technique. For example, whether content depicts violence may depend on the context of its production and distribution. Moreover, complex balancing of conflicting values may be required for content that does not depict actual events, for instance when content is to be classified as a work of art. The extent to which self-learning algorithms will be able to carry out the necessary contextual assessments and evaluations in the foreseeable future remains to be seen.
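For illustration only, the following sketch shows the basic pattern of such a learning-based classifier: it is fitted to labelled training data, then scores new content and flags items above a confidence threshold. The random feature vectors merely stand in for real image features; actual systems use deep neural networks trained on the images themselves.

```python
# Highly simplified sketch of the self-learning approach: fit a classifier on labelled
# training data, then apply it to new, unseen content. Random vectors stand in for
# image features; real systems use deep neural networks on the images themselves.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 32))        # hypothetical image feature vectors
y_train = rng.integers(0, 2, size=1000)      # labels: 1 = incriminated, 0 = harmless

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

X_new = rng.normal(size=(5, 32))             # content found on the hosting service
scores = clf.predict_proba(X_new)[:, 1]      # estimated probability of being incriminated
flagged = scores > 0.9                       # report only above a confidence threshold
print(list(zip(scores.round(3), flagged)))
```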

How can the reliability of detection systems be measured?

To assess and evaluate the risk of classification errors, one can make use of common metrics for classification models:

Accuracy indicates the proportion of correct classifications in the total number of classification operations. However, it says nothing about the distribution of classification errors among the classes. For example, if there are two depictions of sexual violence among 10,000 images and a detection system classifies all images as harmless, it commits only two classification errors per 10,000 classification processes and accordingly has a high accuracy of 99.98%. Nevertheless, the system is useless because it does not detect the depictions of violence.

Precision puts the true positive classifications in relation to all positive classifications. If, in the example, the system recognizes both depictions of sexualised violence in 10,000 classification processes and also classifies two harmless images as depictions of sexualised violence, its precision is only 50%, while its accuracy is 99.98%.

Sensitivity indicates the proportion of true positive classifications among the actual positives. In the example just given, sensitivity would be 100%, since the two depictions of violence were correctly recognized.
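The figures of the worked example can be recomputed directly:

```python
# Recomputing the metrics for the example above: 10,000 images, 2 of which are
# depictions of sexualised violence; the detector finds both, but also flags
# two harmless images (true/false positives and negatives taken from the text).
tp, fp, fn, tn = 2, 2, 0, 9996

accuracy    = (tp + tn) / (tp + tn + fp + fn)   # 0.9998 -> 99.98%
precision   = tp / (tp + fp)                    # 0.5    -> 50%
sensitivity = tp / (tp + fn)                    # 1.0    -> 100% (also called recall)

print(f"accuracy={accuracy:.2%}, precision={precision:.0%}, sensitivity={sensitivity:.0%}")
```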

Precision and sensitivity cannot be optimised simultaneously. Rather, improvements in one metric are usually at the expense of the other. For example, if a detection system is set to detect as many depictions of sexualised violence as possible, it will typically classify more harmless depictions as (false) positives.

Risk of incorrect classifications

The risk that a detection system poses to the freedoms of communication depends significantly on the system's precision. High values are reported for existing systems for detecting depictions of sexualised violence. For example, the hash-based detection system PhotoDNA, developed by Microsoft and used by many large service providers, is said to have an extremely low error rate of 1 in 50 billion. The accuracy of such a system would be so high that a very high precision can also be assumed. Comparably high precision cannot be achieved with systems based on machine learning, at least not at present. However, the Commission's impact assessment, published together with the draft Regulation, mentions the commercial classification system Safer, which, according to the assessment, achieves 99.9% precision and 80% sensitivity simultaneously. However, it is not clear from the impact assessment how exactly this encouraging result was calculated.
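To see why such precision figures hinge on how rare incriminated content is among the scanned material, a back-of-the-envelope calculation may help (all numbers are illustrative and not taken from the impact assessment):

```python
# Back-of-the-envelope calculation (all figures illustrative): expected precision
# follows from a detector's sensitivity, its false-positive rate and the prevalence
# of incriminated content in the scanned material: precision = TP / (TP + FP).
def expected_precision(sensitivity: float, false_positive_rate: float, prevalence: float) -> float:
    tp = sensitivity * prevalence
    fp = false_positive_rate * (1 - prevalence)
    return tp / (tp + fp)

# A hash-based system with a false-positive rate on the order of 1 in 50 billion keeps
# precision near 100% even if only 1 in a million scanned images is incriminated.
print(expected_precision(0.9, 1 / 50e9, 1e-6))   # ~0.99998

# A learning-based system with a 0.1% false-positive rate at the same prevalence
# would flag mostly harmless images.
print(expected_precision(0.8, 1e-3, 1e-6))       # ~0.0008
```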

Requirements of the CJEU for so-called “upload filters”

The risk that automatic detection of infringements on hosting services leads to an excessive number of false positive misclassifications, thereby unduly restricting the freedoms of communication, does not arise for the first time in the case of depictions of sexualised violence. For example, Directive (EU) 2019/790 obliges certain hosting service providers (so-called online content-sharing service providers) to ensure that certain copyright-protected works are not available on their platforms and are not uploaded to them. In practice, this requirement can only be implemented through automatic filtering systems, which are technically very similar to the detection techniques used for depictions of sexualised violence. The Court of Justice of the European Union has upheld this rule. In justification, the Court referred to requirements of the Directive which limit the risk of overblocking due to false positive detection. In addition, the Court noted that the Directive only requires service providers to detect and block content that has been designated to them by the rights holders. Moreover, service providers do not have to block content if doing so would require them to assess the unlawfulness of the publication themselves in a detailed legal examination based on copyright limitation rules. Finally, the Directive provides procedural safeguards to protect the freedoms of communication.

Basic viability of the detection obligation and need for changes in detail

The current draft regulation also contains technical and procedural requirements to reduce the risk of excessive detection and to ensure transparency and legal protection. In addition, the EU Centre is interposed as a control body without operational powers within the framework of the reporting procedure. The plausibility check by the EU Centre reduces the risk that false positive hits of the detection system have severe consequences for the users concerned.

In view of this, the authorisation to issue detection orders against the providers of hosting services appears to us to be justifiable in principle under fundamental rights, provided that the stored content is published on the internet (as opposed to being kept in private online storage services such as Dropbox) and is meant to be downloaded by an indefinite number of persons. In these cases, the analysis of the stored content by the service provider does not violate any reasonable expectation of confidentiality of communications.

However, as things stand, the use of self-learning algorithms to detect new depictions of sexualised violence raises considerable technical problems. Even though such depictions should generally be easier to detect than copyright infringements, their classification may in certain situations still raise difficult problems of contextualisation and balancing. The draft regulation should therefore be amended to the effect that orders for the detection of new depictions of sexualised violence against children are only permissible if

  • a very high level of precision can be achieved with available detection systems based on robust findings,
  • the use of sufficiently precise systems is mandatory for service providers, and
  • the precision of the system is evaluated during its operation.

As a consequence, it may be that only certain categories of unknown depictions of violence can be searched for, e.g. those that can be identified with greater precision. If the precision achieved in practical use falls significantly below a predefined threshold within a defined time window, scanning must be discontinued.

New measure: detection obligation for interpersonal communication services (“chat control”)

The Regulation treads on uncharted territory by authorising detection orders also against the providers of interpersonal communications services. Such an authorisation has not existed before, although Apple's plan, announced in 2021 and postponed after massive criticism, to scan iCloud uploads for known incriminated material was a forerunner in some respects. The planned authorisation appears to us for the most part untenable from a fundamental rights perspective.

Particularly high classification risk

The detection of incriminated material carries the risk of false positives, which can lead to excessive blocking and, in the worst case, to serious measures being taken against the communicating parties by the criminal justice authorities of EU Member States. Compared to hosting services, this risk seems to us to be considerably higher here.

Firstly, images and videos that are sent to selected recipients via interpersonal communications services often contain content that can only be classified on the basis of its context. This is likely to considerably reduce the precision of algorithmic classification systems. For example, a picture of a naked child on the beach can be classified as a depiction of sexualised violence or as a harmless family photo only in light of the context in which it is shared.