My spy is always with me

Bäcker, Matthias; Buermeyer, Ulf

doi:http://dx.doi.org/10.17176/20220818-181949-0

Matthias Bäcker

Ulf Buermeyer

18 August 2022

Comments on the planned obligations of Internet service providers to combat sexualised violence against children (so-called "chat control" regulation)

On 11 May this year, the European Commission presented its draft of a “Regulation laying down rules to prevent and combat child sexual abuse”. The planned regulation is intended to counteract sexualised violence against minors as well as its initiation on the Internet through a complex structure of obligations for the most important intermediaries of network communication. It also aims to establish a cross-border authority infrastructure with a new European Union agency as the central hub.

Initial reactions to the draft have been controversial. For example, numerous child protection associations have supported the Commission proposal in an open letter. In contrast, civil rights organisations, industry associations and data protection supervisory authorities have criticized the draft, in some cases sharply. The German Federal Government also revealed a sceptical attitude when it sent the Commission an extensive list of questions, some of which were recognizably critical.

On closer inspection, the draft on the one hand bundles measures to combat sexualised violence online that have been common for some time, but whose technical problems and legal issues have still not been fully solved. On the other hand, it also contains something genuinely new: according to the proposed regulation, individual communications on certain communications services are to be searched for certain content on a large scale and in part without any probable cause. This part of the draft is particularly problematic.

Problematic terminology of the draft regulation

The Regulation places far-reaching obligations on various intermediaries of internet communication in order to address the misuse of their services for “online child sexual abuse”. Referring to Directive 2011/93/EU on combating the sexual abuse of children, the proposed Regulation defines this as the dissemination of “child sexual abuse material” (so-called “child pornography”) on the one hand, and the solicitation of children for sexual purposes (often referred to as grooming) on the other, insofar as information society services are used for this purpose. In the following, we refer to both types of content by the generic term “incriminated content”. In addition, instead of “child sexual abuse material”, we refer to depictions of sexualised violence against children. This is because the terminology of the regulation is not without its flaws. The term “child abuse” is viewed critically by many victims because it could imply that there could be a legitimate “use” of children. We therefore use this term here only in quotations, given that it is the terminology of the draft.

Overview of the main contents of the draft

Under the proposed Regulation, obligations would extend to the providers of four types of services:

hosting services that store information on behalf of users, in many cases to make it available to third parties (this includes social media such as Facebook, Twitter or Instagram),
interpersonal communications services that enable the direct exchange of information between selected individuals (such as e-mail services or instant messengers like WhatsApp or Signal),
software applications stores,
internet access services.

Service providers are subject to specific obligations. These obligations arise in part directly from the proposed Regulation, in part they would be created by order of a body appointed to implement the Regulation.

Risk assessment and mitigation obligations

The proposed Regulation directly obliges most service providers to implement ongoing risk management. Providers of hosting services and interpersonal communications services would have to assess the risk of their service being used for the distribution of incriminated content at least every three years. On this basis, they would have to take measures to mitigate the identified risk, which may include content curation or age verification. In any case, providers of hosting services and interpersonal communications services must set up a reporting system so their users can alert them to incriminated content. Software application store providers must evaluate the extent to which the applications offered on their platform pose a risk of the solicitation of children for sexual purposes. If necessary, they must set age limits for the use of such applications.

Removal and blocking of specific incriminated content

Further obligations on service providers may arise from an order of a court or an independent administrative authority. Under the proposed Regulation, such an order can only be issued upon request of the Member State authority primarily responsible for enforcing the Regulation (referred to in the Regulation as the coordinating authority). The powers to issue orders are each accompanied by requirements that should ensure transparency and effective redress. The obligations resulting from an order are partly concerned with specific content, the storage location of which is already known, and partly with any content which falls within certain categories.

Two possible obligations relate to specific content: Firstly, the providers of hosting services may be ordered to remove certain depictions of sexualised violence. Secondly, the providers of internet access services can be obliged to block access to certain depictions of sexualised violence if they cannot be removed because they are distributed by non-cooperative hosting services from non-cooperative third countries. The content to be removed or blocked is designated by means of Internet addresses (Uniform Resource Locators or URLs for short) that indicate a specific storage location. An order to remove or block depictions of sexualised violence therefore does not oblige the service provider concerned to actively search its service for specific content.

Detection and reporting obligations

However, such an obligation of hosting service providers and interpersonal communications service providers to actively search for potentially incriminated content may arise from a detection order. A detection order may concern three types of content:

already known depictions of sexualised violence,
new, hitherto unknown depictions of sexualised violence,
for interpersonal communications services only: solicitation of children for sexual purposes.

If the provider of a hosting service or an interpersonal communications service becomes aware of possible incriminated content on its service, it must report the content concerned. This reporting obligation exists irrespective of how the provider learned of the content. It therefore includes notably such knowledge as the provider may have gained as a result of its independently established risk management and as a result of a detection order.

The report is to be addressed to the EU Centre on Child Sexual Abuse, a new European Union agency to be created by the Regulation. The Regulation conceives the EU Centre as an information, coordination, and service point. The EU Centre has no operational powers of its own, but it is to provide support and liaison to service providers and Member State authorities in a variety of ways. For example, the EU Centre shall maintain databases with indicators for the detection of online child sexual abuse. Service providers must use these indicators when implementing removal, blocking or detection orders.

If the EU Centre receives a report, it shall check whether the report is manifestly unfounded. If this is not the case, the EU Centre shall forward the report to Europol and to the law enforcement authorities of the Member State presumed to have jurisdiction.

Known measures: Obligations of providers of internet access services and hosting services

The measures to combat depictions of sexualised violence on the internet provided for in the Regulation are mostly familiar, albeit controversial. Some service providers have already set them up voluntarily. Some measures have been the subject of legal obligations. The discussion so far has highlighted some technical problems and legal doubts associated with these measures.

Re-launch of controversial internet blocking by access providers

Insofar as providers of Internet access services can be obliged to block certain depictions of sexualised violence on the basis of URLs, the regulation contains, from a German perspective, a new edition of the Zugangserschwerungsgesetz (Access Impediment Act), which came into force in 2010 but was never applied and was repealed as early as 2011. This law earned the then German Federal Minister for Family Affairs (and now President of the European Commission) Ursula von der Leyen the unflattering nickname “Zensursula”.

Blocking URLs by access providers is problematic for two reasons, which have already been highlighted in the discussion on the Access Impediment Act:

First, there are doubts about the effectiveness of this measure. Some of the common blocking mechanisms can be easily circumvented. Moreover, in the case of encrypted connections (e.g. HTTPS, which relies on TLS/SSL), the access provider cannot read the URL in full and therefore cannot implement the blocking without elaborate technical measures (e.g. forced traffic redirection to proxies).

Secondly, many blocking mechanisms are unable to limit blocking to the specific content to be blocked. Instead, they block content coarsely, causing vast collateral damage in the form of overblocking. The Regulation does not specify how blocking is to be implemented technically, and thus does not set clear limits to such excessive blocking. However, the transparency and redress provisions at least enable service providers and users to take legal action against excessive blocking practices. Moreover, blocking orders are limited to content whose deletion cannot be achieved.

Overall, the authorisation seems acceptable to us, provided that it is supplemented by concrete and practically effective precautions to prevent overblocking. This is particularly important because it is technically much easier for providers of internet access services to implement coarse-mesh blocking, e.g. on the basis of IP addresses or by manipulating the domain name system, than to actually block individual content on the basis of URLs. Without concrete and effective provisions to prevent overblocking, the Regulation could create significant misguided incentives.
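The difference between URL-level and host-level blocking can be illustrated with a minimal Python sketch. The URLs and function names are hypothetical and do not come from the draft. The sketch assumes that, on an encrypted connection, an access provider can at best act on the host name, so a block aimed at one URL ends up covering every page on the same host.

```python
from urllib.parse import urlsplit

# Hypothetical block list containing a single URL (illustrative only).
BLOCKED_URLS = {"https://host.example/illegal/item"}


def url_level_block(url, blocked_urls):
    """Block only the exact storage location named in the order."""
    return url in blocked_urls


def host_level_block(url, blocked_urls):
    """Coarse block: every URL on a listed host is treated as blocked."""
    blocked_hosts = {urlsplit(entry).hostname for entry in blocked_urls}
    return urlsplit(url).hostname in blocked_hosts


# A harmless page that merely shares the host with the listed URL.
harmless_page = "https://host.example/legal/page"

print(url_level_block(harmless_page, BLOCKED_URLS))   # exact match only
print(host_level_block(harmless_page, BLOCKED_URLS))  # whole host affected
```

In this toy setup the harmless page passes the URL-level check but is caught by the host-level check, which is the overblocking effect described above.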

Removal and detection obligations of hosting providers

The obligations of the providers of hosting services provided for in the draft call for a differentiated assessment. The authorisation to issue removal orders relating to specific stored content is technically and legally unobjectionable and should be welcomed in terms of legal policy. According to press reports, hosting services are cooperative without exception, even in the case of information from non-governmental agencies, so that this obligation would, as it were, preach to the converted. More problematic are detection orders that oblige hosting service providers to actively search their stored content for known or unknown depictions of sexualised violence against children.

The Regulation explicitly leaves it up to service providers to decide on the technical means to implement detection orders, although the newly established EU Centre is to provide them with some detection technology free of charge. In any case, the detection must be based on indicators of depictions of sexualised violence against children provided by the EU Centre. Again, the regulation does not specify exactly what information these indicators contain. However, according to the current and medium-term foreseeable state of the art, the detection requirements come down to two technical approaches.

Hash value-based removal of known depictions of sexualised violence

To identify already known depictions, it is common to use hash values. A hash value is a kind of digital fingerprint that is calculated from a file and identifies it. The underlying file, however, cannot be reconstructed from the hash value. A hash-based detection mechanism compares the hash values of the files present on the hosting service with the hash values of known depictions supplied by the EU Centre. The aim is to ensure as far as possible that a depiction is recognised even after minor changes (such as changes in size or a new colouring). For this purpose, so-called perceptual hashing is used, which calculates the hash value not from the entire file, but from certain structural properties of the image material it contains. However, precisely because perceptual hashing is robust to change, it is also susceptible to false positive hits. For example, an innocuous image could yield the same hash value as a depiction of sexualised violence because it has a similar brightness distribution.
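The robustness and the collision risk of perceptual hashing can both be shown with a toy sketch in Python. It uses a simple "average hash" over a 4x4 grid of brightness values, which is an assumption chosen for illustration and not the method of PhotoDNA or of any system named in the draft. All images are made-up numbers.

```python
import hashlib


def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set if brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def file_hash(pixels):
    """Exact hash over the raw bytes, for comparison."""
    flat = bytes(value for row in pixels for value in row)
    return hashlib.sha256(flat).hexdigest()


def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two perceptual hashes."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


original = [
    [200, 200, 50, 50],
    [200, 200, 50, 50],
    [50, 50, 200, 200],
    [50, 50, 200, 200],
]
# The same picture, uniformly brightened (a "minor change").
brightened = [[value + 30 for value in row] for row in original]
# A different picture whose brightness distribution happens to be similar.
unrelated = [
    [180, 230, 10, 90],
    [220, 170, 70, 30],
    [20, 80, 240, 160],
    [60, 40, 190, 210],
]

# The exact hash changes after brightening, the perceptual hash does not.
print(file_hash(original) == file_hash(brightened))
print(hamming_distance(average_hash(original), average_hash(brightened)))
# The unrelated picture yields the same perceptual hash: a false positive.
print(hamming_distance(average_hash(original), average_hash(unrelated)))
```

The same property that makes the toy hash survive the brightening also makes it match the unrelated picture. Real perceptual hashes are far more discriminating, but the underlying tension is the same.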

Wonder-weapon “artificial intelligence”?

In contrast, a hash-based detection approach cannot be considered for new, hitherto unknown depictions of sexualised violence against children. To automatically detect such depictions, only self-learning algorithms can be used. The algorithms identify patterns in training data and apply them to the content present at the hosting service. In any case, however, this method is likely to produce a much higher number of false positive classifications, which can be reduced, but probably not eliminated, by refining the analysis technique. For example, the classification of content as depicting violence may depend on the context of production and distribution. Moreover, complex balancing operations of conflicting values may be required for content that does not depict actual events, such as when content is to be classified as a work of art. The extent to which self-learning algorithms will be able to carry out the necessary contextual assessments and evaluations in the foreseeable future remains to be seen.

How can the reliability of detection systems be measured?

To assess and evaluate the risk of classification errors, one can make use of common metrics for classification models:

Accuracy indicates the proportion of correct classifications in the total number of classification operations. However, it says nothing about the distribution of classification errors among the classes. For example, if there are two depictions of sexual violence among 10,000 images and a detection system classifies all images as harmless, it commits only two classification errors per 10,000 classification processes and accordingly has a high accuracy of 99.98%. Nevertheless, the system is useless because it does not detect the depictions of violence.

Precision puts the true positive classifications in relation to all positive classifications. If, in the example, the system recognizes both depictions of sexualised violence in 10,000 classification processes and also classifies two harmless images as depictions of sexualised violence, its precision is only 50%, while its accuracy is 99.98%.

Sensitivity indicates the proportion of true positive classifications to the actual positives. In the example just given, sensitivity would be 100%, since the two depictions of sexualised violence were correctly recognised.
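The three metrics can be recomputed from the counts in the two examples above. The following Python sketch is only an illustration; the function name and the variable names are our own.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, sensitivity) from the four outcome counts.

    Precision or sensitivity is None when its denominator is zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else None
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    return accuracy, precision, sensitivity


# Example 1: 2 depictions among 10,000 images, everything classified harmless.
all_harmless = classification_metrics(tp=0, fp=0, tn=9998, fn=2)

# Example 2: both depictions found, plus two harmless images flagged.
two_false_alarms = classification_metrics(tp=2, fp=2, tn=9996, fn=0)

print(all_harmless)      # accuracy 99.98%, precision undefined, sensitivity 0%
print(two_false_alarms)  # accuracy 99.98%, precision 50%, sensitivity 100%
```

Both systems reach the same accuracy, although the first detects nothing and half of the second system's hits are wrong. This is why accuracy alone says little when the content searched for is rare.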

Precision and sensitivity cannot be optimised simultaneously. Rather, improvements in one metric are usually at the expense of the other. For example, if a detection system is set to detect as many depictions of sexualised violence as possible, it will typically classify more harmless depictions as (false) positives.
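This trade-off can be made visible with a small Python sketch. It assumes a hypothetical detection system that assigns each item a score between 0 and 1 and flags everything at or above a threshold. The scores and labels are invented for illustration.

```python
# (score assigned by the hypothetical system, item is actually a depiction)
SCORED_ITEMS = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True),
    (0.60, False), (0.40, False), (0.30, True), (0.10, False),
]


def precision_and_sensitivity(items, threshold):
    """Flag every item scoring at or above the threshold, then evaluate."""
    tp = sum(1 for score, positive in items if score >= threshold and positive)
    fp = sum(1 for score, positive in items if score >= threshold and not positive)
    fn = sum(1 for score, positive in items if score < threshold and positive)
    precision = tp / (tp + fp) if (tp + fp) else None
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    return precision, sensitivity


# Strict threshold: no false alarms, but half of the depictions are missed.
strict = precision_and_sensitivity(SCORED_ITEMS, threshold=0.85)
# Lenient threshold: every depiction is found, but three harmless items are flagged.
lenient = precision_and_sensitivity(SCORED_ITEMS, threshold=0.25)

print(strict)
print(lenient)
```

With the strict threshold, precision is 100% and sensitivity 50%. With the lenient one, sensitivity rises to 100% while precision drops to four in seven.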

Risk of incorrect classifications

The risk of a detection system for the freedoms of communication depends significantly on the precision of the system. In this respect, high values are given for the already existing systems for the detection of depictions of sexualised violence. For example, the hash-based detection system PhotoDNA, developed by Microsoft and used by many large service providers, is said to have an extremely low error rate of 1 in 50 billion. The accuracy of the system would then be so high that a very high precision can also be assumed. Comparably high precision cannot be achieved with systems based on machine learning, at least not at present. However, the Commission’s impact assessment, which was published together with the draft Regulation, mentions the commercial classification system Safer. According to the assessment, this system can achieve 99.9% precision and 80% sensitivity simultaneously. However, it is not clear from the impact assessment how exactly this encouraging result was calculated.

Requirements of the CJEU for so-called “upload filters”

The risk that automatic detection of infringements on hosting services leads to too high a number of false positive classifications, thereby unduly restricting freedoms of communication, does not arise for the first time in the case of depictions of sexualised violence. For example, Directive (EU) 2019/790 provides for the obligation of certain hosting service providers (so-called online content-sharing service providers) to ensure that certain copyright-protected works are not available on their platforms and are not uploaded to them. In fact, this requirement can only be implemented through automatic filtering systems, which are technically very similar to the detection techniques used for depictions of sexualised violence. The Court of Justice of the European Union has upheld this provision. In justification, the Court referred to requirements of the Directive which limit the risk of overblocking due to false positive detection. In addition, the Court stated that the Directive only requires service providers to detect and block content that has been designated to them by the rights holders. Moreover, service providers would not have to block content if, in order to do so, they had to assess the unlawfulness of the publication themselves in a detailed legal examination based on copyright limitation rules. Finally, the Directive provides procedural safeguards to protect the freedoms of communication.

Basic viability of the detection obligation and need for change in detail

The current draft regulation also contains technical and procedural requirements to reduce the risk of excessive detection and to ensure transparency and legal protection. In addition, the EU Centre is interposed as a control body without operational powers within the framework of the reporting procedure. The plausibility check by the EU Centre reduces the risk that false positive hits of the detection system have severe consequences for the users concerned.

In view of this, the authorisation to issue detection orders against the providers of hosting services appears to us to be justifiable in principle under fundamental rights, provided that the stored content is published on the internet (as opposed to private online storage services such as Dropbox) and is meant to be downloaded by an indefinite number of persons. In these cases, the analysis of the stored content by the service provider does not violate any reasonable expectation of confidentiality of communications.

However, as things stand, the use of self-learning algorithms to detect new depictions of sexualised violence in particular raises considerable technical problems. Even though such depictions should generally be easier to detect than copyright infringements, the classification in certain situations may still cause difficult contextualisation and balancing problems. Therefore, the draft regulation should be amended to the effect that orders for the detection of new depictions of sexualised violence against children are only permissible if

a very high level of precision can be achieved with available detection systems based on robust findings,
the use of sufficiently precise systems is mandatory for service providers, and
the precision of the system is evaluated during its operation.

If necessary, this may cause only certain categories of unknown depictions of violence to be searched for, e.g. those that can be identified more precisely. If it becomes apparent that the precision during practical use falls significantly below a predefined threshold within a defined time window, the scanning must be discontinued.

New measure: detection obligation for interpersonal communication services (“chat control”)

The regulation enters uncharted territory by authorising detection orders also against the providers of interpersonal communications services. Such an authorisation has not existed before (although Apple’s plan, announced in 2021 and postponed after massive criticism, to check iCloud uploads for known incriminated material was a forerunner in some respects). The planned authorisation appears to us predominantly untenable from a fundamental rights perspective.

Particularly high classification risk

Detection of incriminated material carries the risk of false positives, which can lead to excessive blocking and, in the worst case, to serious measures inflicted on communicating parties by EU Member States’ criminal justice authorities. Compared to hosting services, this risk seems to us considerably increased.

Firstly, images and videos that are sent to selected recipients via interpersonal communications services often contain content that can only be classified based on contextualisation. This is likely to considerably reduce the precision of algorithmic classification systems. For example, a picture of a naked child on the beach can be classified as a depiction of sexualised violence or as a harmless holiday greeting for family members, depending on the context of production and distribution.

Secondly, in the case of interpersonal communications services, detection orders may also cover solicitation for sexual purposes (“grooming”). Based on such an order, the service provider must identify communications between children and adults and examine them for sufficiently substantiated indications of solicitation. This requires a complex assessment of the content and context of the communication. It is true that there are initial technical solutions that are intended to provide this detection service, such as Project Artemis, which was developed by Microsoft together with partner companies. However, it is doubtful that a sufficient level of precision can be achieved in the foreseeable future. For example, the Commission’s impact assessment indicates an accuracy (apparently not precision) of 88% for Project Artemis, which means that roughly one in eight classification results is wrong. Since there are far more harmless communications than solicitations for sexual purposes, the vast majority of positive classifications would likely be false positives. With billions of interpersonal communications per day in the European Union, such an imprecise system would generate many millions of false positives every single day. The consequences would be high risks for users of interpersonal communications services and at the same time an extreme burden for the EU Centre and Member States’ law enforcement authorities.
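The scale of the problem can be estimated with a short Python calculation. Only the 88% figure comes from the impact assessment. The sketch assumes, purely for illustration, that this figure applies equally to harmful and harmless communications, that one in 10,000 communications is a solicitation, and that one billion communications are scanned per day.

```python
def expected_flags(messages, prevalence, sensitivity, specificity):
    """Expected true positives, false positives and precision per day."""
    positives = messages * prevalence
    negatives = messages - positives
    true_positives = positives * sensitivity
    false_positives = negatives * (1 - specificity)
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision


# Hypothetical inputs, see the assumptions stated above.
tp, fp, precision = expected_flags(
    messages=1_000_000_000,  # assumed daily volume
    prevalence=1 / 10_000,   # assumed share of solicitations
    sensitivity=0.88,        # assumption: 88% of solicitations are detected
    specificity=0.88,        # assumption: 88% of harmless chats pass
)

print(round(tp))            # correctly flagged communications
print(round(fp))            # harmless communications flagged
print(f"{precision:.4%}")   # share of flags that are correct
```

Under these assumptions, roughly 120 million harmless communications would be flagged per day, against fewer than 100,000 correct hits, so that well under 1% of all hits would be correct. Other assumptions shift the numbers, but not the basic pattern: when the content searched for is rare, even a moderate error rate produces mostly false alarms.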

Technical variants of “chat control”

In addition, a specific legal problem of detection orders in the case of interpersonal communications services is that detection breaches the confidentiality of communications, which is protected by fundamental rights. In this respect, such detection orders differ substantially from orders against the providers of hosting services meant for the general public: If – and only if – all content is directed at an indeterminate number of individuals, the service provider’s knowledge and analysis of the stored content does not undermine any reasonable expectation of privacy. As far as interpersonal communications are concerned, however, such an expectation is as evident as it is justified. Therefore, scanning such communication constitutes an interference with the confidentiality of interpersonal communication. And this breach of confidentiality is basically independent of how the detection mechanism is technically implemented, although the implementations vary in their depth of intrusion. In this respect, centralised, decentralised or hybrid solutions are conceivable:

Central scanning on the server: dangerous disincentives against effective encryption

The detection can take place centrally on a server of the service provider. However, this approach is technically unfeasible if the interpersonal communication is end-to-end encrypted (E2E) – as is urgently advisable – since in this case the service provider cannot access the communication in plain text. Yet the draft does not rule out the possibility of dispensing with E2E for the purpose of scanning. The regulation therefore threatens to create significant false incentives for service providers: They might be tempted to weaken the protection of their users’ privacy across the board by doing away with E2E to comply with detection orders more easily.

Decentralised scanning on the end device: inefficient and dangerous

Instead of scanning on the service providers’ servers, the detection can be shifted completely or partially to the users’ devices (so-called client-side scanning).

For example, to detect known depictions of sexual violence, the device could calculate the hash values of images or videos to be sent via a particular communications service. The device could match these hash values with a list of hash values of known depictions provided by the service provider or transmit them anonymously to the service provider for central matching.
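The two matching variants can be sketched in a few lines of Python. The sketch uses an exact SHA-256 hash as a stand-in, whereas a real system would presumably use perceptual hashes. All function names and the placeholder data are hypothetical.

```python
import hashlib


def fingerprint(file_bytes: bytes) -> str:
    """Stand-in fingerprint; an exact hash, not a perceptual one."""
    return hashlib.sha256(file_bytes).hexdigest()


def match_on_device(file_bytes: bytes, indicator_list: set) -> bool:
    """Variant 1: the indicator list is stored on the device itself."""
    return fingerprint(file_bytes) in indicator_list


def match_centrally(transmitted_hash: str, indicator_list: set) -> bool:
    """Variant 2: the device sends only the hash, the provider compares it."""
    return transmitted_hash in indicator_list


# Placeholder indicator list with a single made-up entry.
indicators = {fingerprint(b"placeholder for a known file")}

outgoing = b"placeholder for a holiday photo"

# Variant 1 runs entirely on the device before sending.
print(match_on_device(outgoing, indicators))
# Variant 2: the hash leaves the device, the file content does not.
print(match_centrally(fingerprint(outgoing), indicators))
```

In the first variant the full indicator list has to be distributed to every device. In the second, the provider learns a fingerprint of every file that is about to be sent.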

It would also be conceivable for self-learning algorithms to run on the terminal device to detect new depictions of sexualised violence or solicitation for sexual purposes. However, it seems questionable to us to what extent these processes are technically feasible at all on today’s devices. The hardware requirements of powerful AI systems are likely to exceed the capabilities of many smartphones, so that at least partially server-based solutions would have to be used. In addition, in the case of apps with end-to-end encryption, the technology for scanning content would have to be built into each individual app. This would mean that similar technology would have to be installed several times – for each installed app – on every individual device. In particular, the extensive lists of hash values and other indicators to detect incriminated content would each occupy scarce storage space. Moreover, the technical challenges of client-side scanning would once again create significant disincentives against the use of E2E. Finally, they would massively increase the barriers to market entry for new providers of communications services and all but outlaw community-built and open-source communication systems.

Infringement of the confidentiality of communication

Regardless of centralised or decentralised implementation, however, a detection order inevitably means that the entire traffic of the service must be reviewed for suspicious content, unless the detection order is limited to certain users or groups of users. The implementation only decides who will perform this screening. Under a centralised solution, the service provider is put in charge as an auxiliary police officer. In the decentralised approach, the end device is equipped with a spying function, so that the users’ own devices monitor their owners – but without being able to influence the modalities and extent of the monitoring. In all scenarios, hits are reported to a central authority, initially without even informing concerned individuals, and can then lead to severe measures. The various approaches to scanning therefore do not differ significantly in their effects on users.

Much discussion has already taken place on the question of whether, from a technical point of view, detection breaks end-to-end encryption or leaves it untouched. This question, however, turns out to be largely a pseudo-problem for the fundamental rights assessment of scanning for incriminated content: While the draft does not prohibit E2E outright, it does set various incentives for providers not to use this technology. And even if communication should still be encrypted in transit, users can no longer trust the confidentiality of their communication, because in substance the encryption is circumvented in any case. Furthermore, the seemingly less intrusive approach using client-side scanning creates considerable risks for data security. These go far beyond a central solution, where the detection system can at least be protected comparatively well against unauthorised access. The reason is that in a decentralised approach, the analysis technology and the indicators must be at least partially transferred to the users’ devices. There, this very sensitive data processing can be protected against unauthorised access far less effectively than on a centralised server.

“Chat control” in essence not compatible with fundamental rights

In our view, a detection obligation of the providers of interpersonal communications services potentially relating to the entire service – i.e. a comprehensive scanning of all chat histories irrespective of probable cause against individual communicators – cannot be justified under European fundamental rights. In such cases, the detection constitutes a total monitoring of at least certain file types, and, in the case of an order to detect both depictions of sexualised violence and the solicitation of children, potentially of the entire traffic transmitted via the service.

The closest comparison can be made to search term-based so-called “strategic” telecommunications surveillance, or bulk surveillance, as carried out by the intelligence services of many Member States for foreign reconnaissance purposes. It is true that both the European Court of Human Rights and the German Federal Constitutional Court have approved such surveillance programs under narrow conditions, in particular demanding tight measures to reduce the data collected as well as additional hurdles for the transfer of the data to authorities with executive competences. We can, however, hardly imagine that this permissive attitude may be transferable to the surveillance intended by the proposed Regulation: monitoring communications without any probable cause with the intent to fight possible crimes, where the data – by the very nature of the Regulation – shall be passed on to law enforcement authorities not on an exceptional basis, but intentionally and systematically. The German Federal Constitutional Court has even explicitly stated that warrantless strategic surveillance must be limited to foreign reconnaissance. Moreover, with regard to the fundamental rights of the European Union, against which the Regulation would have to be measured, the Court of Justice of the European Union has shown itself significantly less open to legitimising strategic mass data collection, at least with regard to the security authorities of the United States.

Conceivable: Search for known content only and suspicion-independent content control without reporting obligation

At most, a detection obligation for providers of interpersonal communication services seems justifiable to us if it is limited to already known depictions of sexualised violence. The existence of such a depiction may be regarded as a sufficiently concrete reason to justify a larger-scale search in a multitude of communication processes, at least as far as the respective sender is concerned. The detection of solicitations and new depictions of violence in the confidential communications of persons not under suspicion, on the other hand, would merely be based on a risk analysis that is independent of factual circumstances, as is characteristic of strategic surveillance, also known as bulk collection.

Finally, detection obligations might be assessed differently if they were not linked to reporting obligations to the new EU Centre and/or law enforcement. The detection system could instead be limited to refusing the transmission of content that is identified as a depiction of sexualised violence. Such systems might also issue a warning to the person sending such content that they may be committing a criminal offence. Users of interpersonal communications services would thus no longer be exposed to the risk of burdensome follow-up measures; this would reduce the intrusive weight of detection. The remaining risk of false positives could be contained, as already envisaged, by transparency-creating provisions and a redress mechanism. Nevertheless, such warnings could have a preventive effect, as users would be given the impression that they had been caught (“Big Brother effect”).

Conclusion

The prevention and prosecution of sexualised violence against children can, as a matter of great importance, legitimise intensive limitations of fundamental rights. Insofar as the planned regulation places hosting providers under far-reaching obligations, the interferences with fundamental rights, in particular with freedoms of communication, may be justified in principle, even though they are in part associated with considerable risks. However, as outlined above, the Regulation should be worded more narrowly in some regards.

The authorisation to issue detection orders for interpersonal communications services (“chat control” in the narrower sense), on the other hand, clearly goes too far. It potentially undermines the confidentiality of the concerned services across the board and without any probable cause. Not even the important concern of child protection can justify prescribing the use of personal spies to continuously peek into the communications of large parts of the population in order to generate suspicions in the first place.

This article is a translation of “Mein Spion ist immer bei mir”.

LICENSED UNDER CC BY-SA 4.0

SUGGESTED CITATION Bäcker, Matthias; Buermeyer, Ulf: My spy is always with me: Comments on the planned obligations of Internet service providers to combat sexualised violence against children (so-called "chat control" regulation), VerfBlog, 2022/8/18, https://verfassungsblog.de/my-spy-is-always-with-me/, DOI: 10.17176/20220818-181949-0.
