Katholieke Universiteit Leuven

Department of Computer Science

SIGNAL PRIVACY IMPACT ASSESSMENT


H00Y2A Privacy and Big Data

Håkon Stene Ness


Marius Andreas Arder
Petter Leine Alnes
Thibault Heintz
Academic year 2023/2024
Contents
1 Application Description
1.1 Introduction
1.2 Functionality
1.3 Stakeholders
1.4 Data Collection
1.5 Design and Implementation

2 LINDDUN Analysis
2.1 Linking
2.2 Identifying
2.3 Non-Repudiation
2.4 Detecting
2.5 Data Disclosure
2.6 Unawareness & Un-intervenability
2.7 Non Compliance

3 Risks
3.1 Technical risks
3.1.1 Phone Numbers & Privacy
3.1.2 Contact Discovery mechanism
3.1.3 Contact Discovery crawling
3.1.4 Metadata capture on peer-to-peer calls
3.2 Legal risks
3.3 Ethical risks

4 Testing
4.1 Contact Discovery Service
4.2 Meta-Data Capture

5 Recommendations
5.1 Improve Membership Privacy
5.1.1 Mutual Contact Discovery
5.1.2 Mitigation strategy for crawling
5.1.3 Optional Hidden Account
5.2 Alternatives to phone numbers
5.3 Upcoming Signal Implementations

6 Conclusion

A Appendices
A.1 Signal Code snippet
A.2 Proposed unit test
A.3 Scenario tests
A.3.1 Arbitrary contact discovery
A.3.2 Sending a message to an arbitrary number

List of Figures
1 Signal application diagram
2 Old Contact Discovery
3 Wireshark Capture Regular Call
4 Wireshark Capture VPN Call
5 Wireshark Capture Proxy Call
6 Wireshark Capture VPN & Proxy Call
7 Scenario test 1
8 Scenario test 2

1 Application Description
1.1 Introduction
This report presents an in-depth privacy impact assessment of the Signal messaging app on Android, with a
focus on the use of phone numbers and the contact discovery feature. It evaluates how these aspects potentially
affect user privacy using the LINDDUN framework and provides recommendations to improve upon them.

1.2 Functionality
Signal is known for its commitment to user privacy through robust end-to-end encryption and minimal data
collection. Whether it’s messages, calls, or video calls, Signal uses the Signal Protocol to ensure that the
content of these communications remains exclusively accessible to the intended recipients. As a result of
state-of-the-art encryption, neither malicious actors nor Signal itself can decipher the data being transmitted,
establishing a high standard of security and confidentiality.
In addition to its security measures, Signal also offers other features. These include customizable chat
options with various colors and themes, scheduling messages for future delivery, and the ability to set messages
to disappear after a specified time. Users can also manage their data usage preferences, utilize an incognito
keyboard for added privacy, and make in-app payments securely [1].
Signal is available on multiple platforms, including iOS, Android, and Desktop. However, old messages are
not synchronized between devices, as Signal prioritizes data privacy. All data, including
messages, is stored locally on individual devices, enhancing security and maintaining user privacy.

1.3 Stakeholders
The main stakeholders in Signal include its developers and donors, consisting of entities like the Signal
Technology Foundation, Signal Messenger LLC, and a global network of individuals who contribute to its
open-source code. The initial development of Signal, under Open Whisper Systems, was funded through a
combination of consulting work, donations, and grant money.
Countries and states are crucial stakeholders when considering the intricate dynamics surrounding Signal.
National governments and regional authorities worldwide are constantly seeking to balance the rights of their
citizens to privacy with the imperatives of national security and law enforcement. Signal, with its strong data
privacy features and end-to-end encryption, significantly limits the amount of user data accessible to external
parties, including governmental agencies. The NSA has identified Signal as a significant challenge in their
mission to access users’ private data [2].

1.4 Data Collection


• User phone number: Required to register for any of Signal’s services [3]. This is the only information
Signal has which can be directly linked to a user’s persona.

• User profile name and picture (optional): Users can optionally add this information to their profile
to make them recognizable to others. However, this information is end-to-end encrypted and not stored
on Signal’s servers [4].
• Contacts (optional): Users have the option to give Signal access to their contacts to check whether they
are registered users. The application can then browse through the system contacts in order to discover
which contacts are registered Signal users. The contact discovery process is designed in such a way that
Signal never obtains information about your contacts [5].
• Messages: Only stored locally on user devices. Encrypted so that Signal cannot read them [3].
• Technical information: Signal stores technical information such as randomly generated authentication
tokens and keys. These are not associated with the user, and are solely used for setting up calls or
transmitting messages [3].
• Account timestamps: Unix timestamps of when the user was last logged into the app, and when the user
created their account [6].

1.5 Design and Implementation
Signal is well-known for its strong commitment to user privacy and data protection. The application incorpo-
rates several key design and implementation choices that prioritize user privacy.
On the client side, the application incorporates an encrypted SQLite database for local chat storage on
users’ devices [7]. This database enables users to access their chat history conveniently. Because the history
is stored only locally, a user’s chat history is unavailable on any other device logged in to the same account
unless it is transferred [8].
The server-side, on the other hand, comprises Signal’s cloud servers, which act as intermediaries in the
messaging process. These servers play a crucial role in securely transmitting messages between users. When
a message is sent, it is temporarily stored in an encrypted form on Signal’s cloud servers until it reaches its
intended recipient. Upon the recipient opening the Signal application, the message is retrieved from the cloud
server and delivered. Once the delivery is successfully completed and confirmed, the encrypted message is
promptly removed from Signal’s database [3].

Figure 1: Signal application diagram

At the heart of the Signal application is end-to-end encryption (E2EE), which ensures the privacy of
the content of the messages sent through the application. To understand how this concept is implemented,
we must understand the Signal Protocol. The Signal Protocol uses state-of-the-art encryption techniques
to ensure safe transfer of user data. The protocol combines the Double Ratchet algorithm, prekeys, and a
triple Elliptic-curve Diffie–Hellman (3-DH) handshake, and uses Curve25519, AES-256, and HMAC-SHA256
as primitives [9]. The key takeaway is that together they ensure that only the intended recipient can decipher
messages. This design choice ensures that even Signal cannot access the content of your messages [10].
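
To give an impression of how HMAC-SHA256 is used as a primitive, the following is a minimal Python sketch of
a symmetric-key ratchet, one building block of the Double Ratchet algorithm. It is an illustration only and
not Signal's implementation: the function names are hypothetical, the input constants 0x01 and 0x02 follow
the example given in the public Double Ratchet specification, and the Diffie–Hellman ratchet, the handshake
and the actual AES encryption of messages are left out.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """Advance the chain by one step and return (next_chain_key, message_key).

    Both outputs are derived from the current chain key with HMAC-SHA256,
    using different constant inputs so that the two values are independent.
    """
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


def derive_message_keys(initial_chain_key: bytes, count: int) -> list[bytes]:
    """Derive `count` successive message keys from an initial chain key."""
    keys = []
    chain_key = initial_chain_key
    for _ in range(count):
        chain_key, message_key = ratchet_step(chain_key)
        keys.append(message_key)
    return keys
```

Every message gets its own key, and because HMAC is one-way, a message key that leaks does not reveal the
chain key it came from or any other message key on the chain.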

2 LINDDUN Analysis
In this section we will highlight some threats found using the LINDDUN threat modelling system [11].

2.1 Linking
Linking is a potential privacy threat where the data points of a user are connected with other data sources to
retrieve information about the user. Signal uses phone numbers for logging in, creating a one-to-one link to a
specific phone number.

• Third party services and account registration: A third party verification service can link a phone
number and verification code to a specific user.
• IP addressing: Signal can relay all calls through its servers, but by default this is only done
when receiving calls from someone outside your contact list. For other calls, this makes it possible to
track the IP addresses of the people you are communicating with. IP addresses can, for example, be used
to infer a geolocation, an internet service provider, or membership of an organization.
• Messaging metadata: Signal is able to read metadata of messages, but this information is not stored.
If Signal were a malicious actor, this information could be used to deduce a social graph showing which
users are talking to each other. Avoiding metadata entirely would, however, be nearly impossible and
impractical in terms of sending messages.

2.2 Identifying
• Use of phone numbers: The threat of identification is associated with linking. The phone number
provided by the user is potentially identifiable data. This means the identity of the user may be revealed
if the phone number can be linked to an identity. Some phone numbers are searchable online, which makes
them identifiable. Even when a phone number is not searchable online, it is still relatively easy to gain
information about it.

2.3 Non-Repudiation
Non-repudiation is the threat of not being able to refute claims. Signal, like many other messaging apps,
offers read confirmations and typing indicators (a pop-up that shows when someone is typing).
• Read Confirmation: Signal shows read confirmations on messages that have been read by the user. This
may prevent a user from claiming plausible deniability if, for example, Signal messages are used
as evidence in a legal case.

2.4 Detecting
The threat of detection is similar to what is outlined under Linking and Identifying. As stated above, one can
infer that a person is a user of Signal.

• Contact Discovery: Signal provides a contact discovery feature to see if a personal contact’s phone
number is linked to a Signal account.

2.5 Data Disclosure


Signal is specifically designed to store as little user data as possible and the design and implementation choices
are made accordingly [10]. Phone numbers could be regarded as excessively exposed, both to other users and
the third party phone number verification service.

2.6 Unawareness & Un-intervenability


Signal makes a great effort towards allowing the user to control their own data flow and what information
is shared. They also provide significant amounts of reading material on their web pages that users can read
to gain insight into how Signal handles their data and what options a user has [10]. The main concerns
relate to the default privacy settings.

• Maximum security by default: Some features that provide additional or improved privacy
are not turned on by default. Examples of these are the registration lock and the "always relay calls"
option [12].
• Access to physical devices: Neither Face ID nor a password lock is turned on by default, meaning that
anyone who gets access to your phone also gets access to your Signal account.

2.7 Non Compliance


Signal as an organization is committed to adhering to privacy and security regulations such as the GDPR [10].

3 Risks
3.1 Technical risks
3.1.1 Phone Numbers & Privacy
The Signal Foundation has faced much criticism regarding the use of phone numbers for user authentication.
This is because a phone number is inherently a potentially personally identifiable piece of information. Phone
numbers are often leaked in data breaches [13], and in many countries, one must provide an identity to register
a phone number.
Some users may be uncomfortable sharing their phone number with all people they communicate with
through Signal. Currently, there is no option to hide your phone number from other users. Therefore, messages
cannot be sent without revealing this information.

3.1.2 Contact Discovery mechanism


While setting up the app, there is an option for the discovery of Signal users among personal contacts. Until
recently, this method relied on uploading hashes of your contacts’ phone numbers and comparing them on the
Signal servers. From the source code and the Signal blog, we can derive that the phone numbers were stored as
truncated SHA-256 hashes, while packets sent to the servers were additionally encrypted using AES encryption. [5]

Source: [14]
Figure 2: Old Contact Discovery

Figure 2 depicts the construction of a discovery request in the old contact discovery implementation. While
this hashing was intended to prevent the Signal service from seeing your contacts, the reversal of such hashes
has been shown to be trivial. One can leverage knowledge of the structure of the original data, the phone
number, and potentially additional information such as the country of origin. Marx et al. have demonstrated
this for phone numbers using a mask attack. A mask attack speeds up the brute-force reversal of a hash
by providing priors on the structure of the data to be recovered. Hagen et al. demonstrated this vulnerability
at scale, in the setting of contact discovery for multiple messaging apps. Methods include the creation of a
key-value lookup table, brute-forcing and rainbow tables. The efficiency of all these methods benefits heavily
from exploiting the fixed structure of phone numbers, rendering the hashes trivial to reverse. [15, 16]
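
The idea behind a mask attack can be illustrated with a short Python sketch. It is a toy example under stated
assumptions, not the tooling used by Marx et al. or Hagen et al.: the truncation length of 10 bytes, the
function names and the phone number (taken from a range reserved for fiction) are all chosen for illustration.
The attacker knows the prefix of the number and only has to enumerate the remaining digits.

```python
import hashlib

TRUNCATED_BYTES = 10  # assumption: illustrative truncation length


def hash_number(e164: str, length: int = TRUNCATED_BYTES) -> bytes:
    """Return a truncated SHA-256 hash of a phone number in E.164 notation."""
    return hashlib.sha256(e164.encode("ascii")).digest()[:length]


def mask_attack(target: bytes, prefix: str, unknown_digits: int):
    """Try every number of the form prefix + N digits and return the match.

    The mask (known prefix, fixed number of digits) shrinks the search
    space to 10**unknown_digits candidates. Returns None if nothing matches.
    """
    for n in range(10 ** unknown_digits):
        candidate = f"{prefix}{n:0{unknown_digits}d}"
        if hash_number(candidate) == target:
            return candidate
    return None


# Usage: recover a fictional number from its hash, knowing only the prefix.
leaked_hash = hash_number("+15550100123")
recovered = mask_attack(leaked_hash, prefix="+1555010", unknown_digits=4)
```

Even without a known prefix, a national number of ten digits leaves only 10^10 candidates, which is a small
search space for a fast hash function such as SHA-256.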
Signal has identified this shortcoming. In a more recent implementation, contact discovery is done through
a secure enclave using Intel® SGX. This secure enclave protects data while in use. The host operating system
cannot access, and therefore cannot tamper with, this environment. The Signal client can verify that the contact
discovery code that is running on the Signal server is the same as the published open source code. Furthermore,
Signal implemented oblivious RAM (ORAM) to prevent the operating system from inferring information from
memory access patterns. [17] While a secure enclave like Intel SGX is very secure in theory, in practice many
potential exploits have been found. [18, 19]

3.1.3 Contact Discovery crawling


The second problem that arises from this user discovery feature is the possibility of mass crawling of the
service, potentially providing the attacker with an index of who is a user of the service. Using this approach,
Hagen et al. were able to crawl all US registered phone numbers and discovered 2.5 million Signal users
in 25 days using just 100 accounts. They discovered that at the time of testing, Signal throttled the contact
discovery mechanism with a leaky bucket structure. This means that when requests are received, the "bucket"
is filled. When the bucket is full, new requests will be denied until enough space has been created again by
"leakage" from the bucket. The bucket size was found to be 50 000 phone numbers with a daily leakage of
200 000 numbers. [15]
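
The throttling mechanism described above can be sketched in a few lines of Python. This is a simplified
illustration, not Signal's server code: the class and function names are hypothetical, and only the bucket
size and leak rate are taken from the figures reported above. The second function computes an upper bound on
the number of lookups a crawler can make, assuming every account starts with an empty bucket and is used
continuously.

```python
class LeakyBucket:
    """Rate limiter that fills with each request and drains at a constant rate."""

    def __init__(self, capacity: int, leak_per_day: int):
        self.capacity = capacity
        self.leak_per_day = leak_per_day
        self.level = 0.0
        self.last_update = 0.0  # timestamp in seconds

    def allow(self, amount: int, now: float) -> bool:
        """Return True and fill the bucket if `amount` lookups still fit."""
        elapsed = now - self.last_update
        leaked = elapsed * self.leak_per_day / 86_400
        self.level = max(0.0, self.level - leaked)
        self.last_update = now
        if self.level + amount > self.capacity:
            return False
        self.level += amount
        return True


def max_crawl_lookups(accounts: int, days: int, capacity: int, leak_per_day: int) -> int:
    """Upper bound on lookups: one full bucket plus the leakage over the period, per account."""
    return accounts * (capacity + days * leak_per_day)


# With the reported parameters, 100 accounts and 25 days:
budget = max_crawl_lookups(accounts=100, days=25, capacity=50_000, leak_per_day=200_000)
```

Under these assumptions the budget comes to 505 million lookups, which shows that a per-account limit of this
size does little to stop a crawler that can register a modest number of accounts.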

3.1.4 Metadata capture on peer-to-peer calls
In March 2017, Signal implemented peer-to-peer calls as their default connection method for calls. The reason
for this was to reduce call latency and improve the user experience [12]. Whenever the user makes or receives
a call from someone in their contacts, Signal will attempt to establish the connection peer-to-peer. The issue
with this is that it exposes the IP address of the user, and consequently reveals the user’s rough location. In
4.2 we demonstrate how this information can be extracted.
The greatest risk regarding this issue is that users might not know that IP addresses are exposed when
calling and that they can be used to infer a user’s location. There is an option in the Signal settings to relay
all calls through the Signal servers; however, this is disabled by default and thus goes against the principle of
privacy by default. Users unaware of this could take it for granted that their location is completely undisclosed.

3.2 Legal risks


Signal as an organization is fundamentally committed to protecting user privacy and collecting as little data
as possible, while still maintaining the functionality of their platform [10]. The risk of them breaching any
laws regarding privacy can therefore be considered as negligible.

3.3 Ethical risks


A possible risk is that external forces may affect Signal’s decision-making. Repressive governments may have
an interest in insights of the content that is spread through digital messaging platforms. These governments
may try to force Signal to give up on their principles in order to be allowed service within that country. The
reasoning for this can, for example, be to prevent terrorism or avoid the spread of sexual content. These were
some of the main arguments for the UK’s controversial Online Safety Act [20], which would give the government
the power to ask companies to scan their users’ private messages for this type of content.
Signal’s response was that they would withdraw their services from the UK if this bill was passed. In
November 2023, Signal’s president Meredith Whittaker stated: “We would absolutely 100% walk rather than
ever undermine the trust that people place in us to provide a truly private means of communication”. [21]
Considering Signal’s track record for transparency about data collection and information security, and their
conduct regarding the Online Safety Act, we consider this risk to be minimal.

4 Testing
4.1 Contact Discovery Service
In order to test potential vulnerabilities in Signal’s Android app, we developed a unit test that simulates the call
to the Contact Discovery Service (CDS), a server-side component provided by Signal. This service establishes a
secure connection between the application and the server for contact discovery, ensuring end-to-end encryption
up to the secure enclave.
Our testing approach utilizes functionality from the getRegisteredUsersWithCdsi() method in
SignalServiceAccountManager.java, as detailed in Appendix A.1, to check for the possibility to exploit the CDS.
To create a valid request to the CDS, we must provide several input parameters, including a set of correctly
formatted telephone numbers. Following the construction of a valid request, we establish a connection to the
secure enclave by assuming the presence of a valid SignalServiceConfiguration and PushServiceSocket, allowing
us to obtain tokens for CDS calls.
The test concludes with the submission of a request to the CDS, parsing the response, and verifying that
no registered users are found. This verification ensures that the synthetic E.164 numbers used in the test do
not correspond to any registered users in the system. The complete test can be found in Appendix A.2.
It’s essential to note that to run this unit test, you need a valid instance of SignalServiceConfiguration
and PushServiceSocket, a condition that we failed to replicate fully in a unit test. To showcase our expected
results, we conducted scenario tests (Appendix A.3) and found that contact discovery crawling is possible. In
both scenario tests we were able to link arbitrary telephone numbers to active Signal accounts.
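
As a side note on the "correctly formatted telephone numbers" required for a request, the sketch below shows
one way to check that a test input is in E.164 notation: a plus sign followed by at most 15 digits, the first
of which is not zero. It is written in Python for brevity and is independent of the unit test in Appendix A.2.
The pattern and function names are our own and do not reflect the validation that Signal itself performs.

```python
import re

# E.164: "+" followed by 2 to 15 digits, the first digit being 1-9.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(number: str) -> bool:
    """Return True if `number` is syntactically valid E.164 notation."""
    return E164_PATTERN.fullmatch(number) is not None


def filter_e164(numbers):
    """Keep only the syntactically valid E.164 numbers from an iterable."""
    return {n for n in numbers if is_e164(n)}
```

A syntactic check like this only confirms the format. It says nothing about whether a number is assigned or
registered with Signal.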

4.2 Meta-Data Capture

Figure 3: Wireshark Capture Regular Call

Figure 3 shows how Wireshark can be used to expose the IP address of the caller. Utilizing the online tool IP
Tracker [22], we were able to deduce that the caller had Telenet in Belgium as their provider and that they
were located in Ghent during the call.
There are, however, ways to prevent this. Firstly, Signal offers the option to always relay calls via the
Signal servers [12]. The IP addresses of the users will then be hidden from each peer. Another mitigation
method is to use a VPN when calling with Signal.

Figure 4: Wireshark Capture VPN Call

Figure 4 shows the Wireshark capture of a call where the caller has set their VPN location to Canada while
calling from Ghent, Belgium. As expected, the location and service provider were skewed.

Figure 5: Wireshark Capture Proxy Call

In figure 5 the caller activated the option in their settings to "always relay calls". This relays the call via
the Signal servers, which in turn also skews the position of the caller. IP-Tracker showed that the user was
calling from Groningen in the Netherlands.

Figure 6: Wireshark Capture VPN & Proxy Call

With both VPN and call relay activated, we get the result displayed in figure 6. In this case, the call was
made from Leuven in Belgium, the caller’s VPN location was set to Norway, while the location deduced from
Wireshark and IP-Tracker showed that the call came from Finland.
Our testing shows that there is some correlation between the actual position and the skewed position of the
user when relaying their call through Signal. Even though it is very rough, it still reveals which part of the
world the user is located in. It is reasonable for Signal to do this for latency purposes, as this was already the
reason behind the peer-to-peer call implementation [12]. However, to be completely sure that their apparent
position is different, a user must utilize a VPN.

5 Recommendations
5.1 Improve Membership Privacy
An issue with Signal today is the fact that any adversary can use contact discovery to reveal whether a phone
number belongs to a registered user of the application. This can be done either by syncing the contact list
stored on the device, or by looking up a phone number manually in the "write new message" service. Other users
will also be notified if someone in their contact list registers a new account with Signal. Users have no
intervenability in this matter and all users are discoverable through this feature. It is clear that Signal has
intentionally sacrificed some privacy for practicality or ease of use. However, this can be a deal breaker for
someone who wants to send messages in full confidentiality.

5.1.1 Mutual Contact Discovery


Mutual contact discovery is a feature first suggested by Hoepman [23]. It restricts users to only being able to
discover users who also have them in their contacts. Users will also be able to mark which users in their list
they want to hide their account from. This approach also solves the issue of crawling mentioned in 3.1.3. All
users will only be allowed to discover users who "want" their discovery. The technical advantage of this approach
is that it is not computationally expensive, and does not require increased traffic and server load compared to
regular one-sided contact discovery. The only slight concern would be server storage [23, p. 20]. As Hoepman
comments, a minor security issue still remains: if Alice and Bob are both on Signal and both have each other
in their contact lists, Bob will know if Alice removes him from her contact list when he tries to sync his contact
list with Signal. Alice will in that case not show up for him in Signal. Yet, mutual contact discovery is
still a great improvement over today’s mechanism.
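
The matching rule of mutual contact discovery can be sketched as follows. This Python example only illustrates
the logic described above and is not Hoepman's protocol: it works on plain identifiers held in memory, whereas a
real design would operate on protected identifiers, and all names in it are hypothetical.

```python
def mutual_discover(requester, address_books, hidden_from):
    """Return the contacts the requester is allowed to discover.

    address_books maps each registered user to the set of users in their
    contact list. hidden_from maps a user to the set of users they want to
    hide from. A contact is visible only if the contact is registered, has
    the requester in their own contact list, and is not hiding from them.
    """
    visible = set()
    for contact in address_books.get(requester, set()):
        contact_book = address_books.get(contact)
        if contact_book is None:
            continue  # contact is not a registered user
        if requester not in contact_book:
            continue  # discovery is not mutual
        if requester in hidden_from.get(contact, set()):
            continue  # contact chose to hide from the requester
        visible.add(contact)
    return visible


# Usage: Alice and Bob list each other, Carol does not list Alice.
books = {"alice": {"bob", "carol"}, "bob": {"alice"}, "carol": set()}
alice_sees = mutual_discover("alice", books, hidden_from={})
```

The remaining issue noted by Hoepman is visible in this sketch as well: if Alice is removed from Bob's set, Bob
disappears from Alice's result on her next sync.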

5.1.2 Mitigation strategy for crawling


As described in 3.1.3, crawling the contact discovery service can be a serious privacy issue. In their paper,
Hagen et al. proposed splitting the discovery feature into two parts. In an initial contact discovery request,
hash comparison is done against the entire database of hashes. When updating the contact list of an existing
user, discovery requests are only compared with hashes of phone numbers that have recently registered. Each
part has a separate leaky bucket structure. This implementation would allow Signal to set more stringent rate
limits on contact discovery against the entire data set and a more lenient limit on discovery of the most
recently registered Signal users. [15]
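
A rough Python sketch of this split is given below. It follows the summary above rather than the design in the
paper by Hagen et al.: the class name and parameters are hypothetical, plain numbers stand in for hashes, and the
two leaky buckets are reduced to simple counters that a real service would refill over time.

```python
class TwoTierDiscovery:
    """Contact discovery split into a full lookup and an incremental lookup."""

    def __init__(self, all_registered, recently_registered, full_quota, incremental_quota):
        self.all_registered = set(all_registered)
        self.recently_registered = set(recently_registered)
        # Remaining lookups per tier: strict for full, lenient for incremental.
        self.quota = {"full": full_quota, "incremental": incremental_quota}

    def lookup(self, contacts, first_sync):
        """Match contacts against the full set on first sync, else the recent set."""
        tier = "full" if first_sync else "incremental"
        if len(contacts) > self.quota[tier]:
            raise PermissionError(f"{tier} discovery quota exceeded")
        self.quota[tier] -= len(contacts)
        pool = self.all_registered if first_sync else self.recently_registered
        return set(contacts) & pool
```

Because routine syncs only touch the small set of recent registrations, the limit on full-database lookups can
be made strict without affecting regular users.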

5.1.3 Optional Hidden Account


A simpler implementation could be to offer users the option to hide their account from contact discovery. Signal
already offers options for users who wish to be more restrained with their data, like optional profile name, call
relays and self-deleting messages. This feature would further extend the options for users to intervene in their
own privacy. In any case, Signal should be offering some support for users who prefer to stay hidden on their
platform.

5.2 Alternatives to phone numbers


The usage of phone numbers as authentication is suboptimal, but it is not redundant. First of all, the phone
number serves as a preventive measure against spam accounts, since creating them in bulk becomes both
cumbersome and expensive. Additionally, it allows a seamless onboarding process for users. The contact list
synchronization is undeniably an effortless way for users to establish contact with each other.
Therefore, we believe that completely removing phone number registration would introduce more challenges
for Signal than it would solve. An intermediate approach could be to allow users to display usernames instead
of phone numbers. This would allow the user to communicate with others just using the username and without
sharing a phone number, while still having the benefits of phone numbers, like spam prevention.

5.3 Upcoming Signal Implementations


While writing this paper, we discovered that Signal has a substantial update planned for early 2024 that will
implement some of these ideas. [24] The details are still unclear. However, from what we can gather, these
features will be implemented:

• Identifiers: Users can create usernames which will be paired and displayed along with a random
numeric identifier. The random number allows multiple users to have the same username. These identifiers
will be hashed and not stored in plain text on the server.
• Storage: The server stores a map of 16-byte UUIDs and encrypted usernames. The usernames are
encrypted using an entropy value, a 32-byte string of random characters. The encryption ensures that the
service is unable to track which links correspond to a username [25].

• Contact: Users will be able to hide their phone number from others, as well as hide their number from
contact discovery, which would make them searchable only by username [26].

The messaging app Briar omitted the phone number entirely, replacing it with a QR code or link that
needs to be exchanged by both parties. While this implementation is very secure, it adds much "friction" to the
onboarding process of the service, hampering the growth of the network. This does not align with Signal’s
goal of bringing private communication to the masses. [27]
Telegram has a feature to hide your phone number from other users. Additionally, it allows this in a granular
way, with the ability to block or allow certain users to see your phone number. In 2022, Telegram implemented
the possibility to sign up using "anonymous" phone numbers, which are tradeable on the blockchain platform
"Fragment".

6 Conclusion
In this work, we analyzed the privacy impact of Signal for Android, focusing on the use of phone numbers and
contact discovery.
Phone numbers are used as a means for registration and identification. While very convenient, phone
numbers can potentially be personally identifiable. As any contact between two users requires the exchange
of phone numbers, this is a concern for some. Other apps, like Briar, only allow you to contact the other party
when both parties have scanned a QR code or have a link from the other user. While secure, one immediately
recognises the friction that arises from such a feature. Implementing this would hamper the scaling of the
Signal network. As a trade-off, Signal is testing the implementation of usernames as an identification method.

We studied in detail the contact discovery feature of Signal. It enables the user to import their entire
social graph. We have shown, however, that such a contact discovery feature allows for large-scale crawling
of the service, uncovering the users of the service en masse. Furthermore, with the specific implementation of
the feature, potential attack vectors may arise. Given these concerns, other services have omitted this feature
entirely. Once again, in light of convenience and growing the user base, Signal has opted to keep this feature.
While Signal aims to provide an end-to-end encrypted service and a privacy-first approach, security features
often collide with the service’s usability and convenience. With the competition Signal faces, the foundation
may make these compromises.

References
[1] Signal, “Signal features.” [Online]. Available: https://support.signal.org/hc/en-us/sections/360001602792-Signal-Messenger-Features
[2] A. G. Jacob Appelbaum, “Inside the NSA’s war on internet security,” 2014. [Online]. Available: https://www.spiegel.de/international/germany/inside-the-nsa-s-war-on-internet-security-a-1010361.html
[3] Signal, “Signal terms & privacy policy.” [Online]. Available: https://signal.org/legal/
[4] Signal, “Encrypted profiles for Signal now in public beta,” 2017. [Online]. Available: https://signal.org/blog/signal-profiles-beta/
[5] Signal, “Technology preview: Private contact discovery for Signal,” 2017. [Online]. Available: https://signal.org/blog/private-contact-discovery/
[6] Signal, “Grand jury subpoena for Signal user data, Central District of California,” 2021. [Online]. Available: https://signal.org/bigbrother/central-california-grand-jury/
[7] Signal, “Storage management for Signal Android.” [Online]. Available: https://signal.org/blog/storage-management-for-android/
[8] Signal, “Backup and restore messages.” [Online]. Available: https://support.signal.org/hc/en-us/articles/360007059752-Backup-and-Restore-Messages
[9] Signal, “Signal-app documentation,” 2023. [Online]. Available: https://signal.org/docs/
[10] Signal, “Signal and the General Data Protection Regulation (GDPR).” [Online]. Available: https://support.signal.org/hc/en-us/articles/360007059412-Signal-and-the-General-Data-Protection-Regulation-GDPR-
[11] LINDDUN, “Privacy threat types.” [Online]. Available: https://linddun.org/threat-types/
[12] Signal, “Video calls for Signal out of beta,” 2017. [Online]. Available: https://signal.org/blog/signal-video-calls/
[13] J. Lapienytė, “WhatsApp data leaked - 500 million user records for sale online.” [Online]. Available: https://cybernews.com/news/whatsapp-data-leak/
[14] Signal, “Signal-app GitHub repository.” [Online]. Available: https://github.com/signalapp/libsignal
[15] C. Hagen, C. Weinert, C. Sendner, A. Dmitrienko, and T. Schneider, “All the Numbers are US: Large-
scale Abuse of Contact Discovery in Mobile Messengers,” in Proceedings 2021 Network and Distributed
System Security Symposium. Virtual: Internet Society, 2021.
[16] M. Marx, E. Zimmer, T. Mueller, M. Blochberger, and H. Federrath, “Hashing of personally identifiable
information is not sufficient,” 2018.
[17] “Technology preview: Private contact discovery for Signal,” Sep. 2017.
[18] D. Moghimi, “Downfall: Exploiting Speculative Data Gathering.”
[19] “Software Guard Extensions - Wikipedia — en.wikipedia.org,” https://en.wikipedia.org/wiki/Software
Guard Extensions.
[20] UK Parliament, “Online safety act 2023 - parliamentary bills - uk parliament,” Oct 2023. [Online].
Available: https://bills.parliament.uk/bills/3137
[21] E. Woollacott, “Signal threatens to pull out of uk,” 2023. [Online]. Available: https:
//cybernews.com/news/signal-to-leave-uk/
[22] “IP tracker.” [Online]. Available: http://ip-tracker.org
[23] J.-H. Hoepman, “Mutual contact discovery,” 05 2023. [Online]. Available: https://www.cs.ru.nl/∼jhh/
publications/mutual-contact-discovery.pdf
[24] “Public Username Testing (Staging Environment),” Nov. 2023. [Online]. Available: https:
//community.signalusers.org/t/public-username-testing-staging-environment/56866

13
[25] “Signal forum - username.” [Online]. Available: https://community.signalusers.org/t/
usernames-in-signal/9157/1300

[26] “Signal forum - contact hiding.” [Online]. Available: https://community.signalusers.org/t/


usernames-in-signal/9157/1307
[27] Briar, “Briar user manual.” [Online]. Available: https://briarproject.org/manual/

A Appendices
A.1 Signal Code snippet
SignalServiceAccountManager.java (line 367:419)
public CdsiV2Service.Response getRegisteredUsersWithCdsi(Set<String> previousE164s,
                                                         Set<String> newE164s,
                                                         Map<ServiceId, ProfileKey> serviceIds,
                                                         boolean requireAcis,
                                                         Optional<byte[]> token,
                                                         String mrEnclave,
                                                         Long timeoutMs,
                                                         Consumer<byte[]> tokenSaver)
    throws IOException
{
  CdsiAuthResponse auth = pushServiceSocket.getCdsiAuth();
  CdsiV2Service service = new CdsiV2Service(configuration, mrEnclave);
  CdsiV2Service.Request request = new CdsiV2Service.Request(previousE164s, newE164s, serviceIds, requireAcis, token);
  Single<ServiceResponse<CdsiV2Service.Response>> single =
      service.getRegisteredUsers(auth.getUsername(), auth.getPassword(), request, tokenSaver);

  ServiceResponse<CdsiV2Service.Response> serviceResponse;

  try {
    if (timeoutMs == null) {
      serviceResponse = single
          .blockingGet();
    } else {
      serviceResponse = single
          .timeout(timeoutMs, TimeUnit.MILLISECONDS)
          .blockingGet();
    }
  } catch (RuntimeException e) {
    Throwable cause = e.getCause();
    if (cause instanceof InterruptedException) {
      throw new IOException("Interrupted", cause);
    } else if (cause instanceof TimeoutException) {
      throw new IOException("Timed out");
    } else {
      throw e;
    }
  } catch (Exception e) {
    throw new RuntimeException("Unexpected exception when retrieving registered users!", e);
  }

  if (serviceResponse.getResult().isPresent()) {
    return serviceResponse.getResult().get();
  } else if (serviceResponse.getApplicationError().isPresent()) {
    if (serviceResponse.getApplicationError().get() instanceof IOException) {
      throw (IOException) serviceResponse.getApplicationError().get();
    } else {
      throw new IOException(serviceResponse.getApplicationError().get());
    }
  } else if (serviceResponse.getExecutionError().isPresent()) {
    throw new IOException(serviceResponse.getExecutionError().get());
  } else {
    throw new IOException("Missing result!");
  }
}
The code above is a snippet from the Android implementation of Signal. The complete implementation is
available on Signal's GitHub: https://github.com/signalapp/Signal-Android
A.2 Proposed unit test
CdsExploitTest.java
/*
 * Copyright 2023 Signal Messenger, LLC
 * SPDX-License-Identifier: AGPL-3.0-only
 */

package org.thoughtcrime.securesms.H00Y2A;

import org.junit.Test;
import org.signal.libsignal.zkgroup.profiles.ProfileKey;
import org.whispersystems.signalservice.api.push.ServiceId;
import org.whispersystems.signalservice.api.services.CdsiV2Service;
import org.whispersystems.signalservice.internal.ServiceResponse;
import org.whispersystems.signalservice.internal.configuration.SignalServiceConfiguration;
import org.whispersystems.signalservice.internal.push.CdsiAuthResponse;
import org.whispersystems.signalservice.internal.push.PushServiceSocket;

import java.io.IOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

import io.reactivex.rxjava3.core.Single;

import static org.junit.Assert.assertTrue;

public class CdsExploitTest {

  // Steps 1 and 2 of the test assume valid instances of these two objects.
  // They are not constructed here and must be supplied before the test is run.
  private SignalServiceConfiguration configuration;
  private PushServiceSocket          pushServiceSocket;

  /**
   * Generates a set of random E.164 phone numbers tailored to the US format.
   * To make the test fail consistently, add the number of a user who is known to be registered.
   * We have NOT done this, as we don't want to disclose a known registered user.
   * @param count the number of phone numbers to generate
   * @return a set of generated E.164 phone numbers
   */
  private static Set<String> generateRandomE164s(int count) {
    Random      rand  = new Random();
    Set<String> e164s = new HashSet<>();

    while (e164s.size() < count) {
      StringBuilder sb = new StringBuilder();

      // Add the US country code
      sb.append('+').append("1");

      // Generate a random area code (3 digits)
      sb.append(rand.nextInt(900) + 100);

      // Generate a random subscriber number (7 digits)
      sb.append(rand.nextInt(9000000) + 1000000);

      e164s.add(sb.toString());
    }
    // Uncomment and add a valid E.164 number to make the test fail consistently
    // e164s.add(<VALID E.164>);
    return e164s;
  }

  /**
   * Tests the CDS (Contact Discovery Service) functionality.
   * @throws IOException if an I/O error occurs
   */
  @Test
  public void testCds() throws IOException {
    /*
     The test works in the following way:
     1 Assume a valid SignalServiceConfiguration
     2 Assume a valid PushServiceSocket and obtain a valid authorization token
     3 Generate a CdsiV2Service to connect to the secure enclave
     4 Construct a CdsiV2Service.Request with synthetic phone numbers and send
       it to the Contact Discovery Service
     5 Parse the response from the Contact Discovery Service
     6 Check results
    */
    CdsiAuthResponse auth = pushServiceSocket.getCdsiAuth();

    // Define a service to connect to the secure enclave
    // The mrEnclave value is found in app/build.gradle.kts
    String mrEnclave = "\"0f6fd79cdfdaa5b2e6337f534d3baf999318b0c462a7ac1f41297a3e4b424a57\"";
    CdsiV2Service service = new CdsiV2Service(configuration, mrEnclave);

    // Generate synthetic E.164 numbers
    Set<String> syntheticE164s = generateRandomE164s(10);

    // Define required parameters for building a request
    Set<String>      previousE164s = new HashSet<>();
    boolean          requireAcis   = false;
    Optional<byte[]> token         = Optional.empty();
    // [Optional] Map of ACI and corresponding profile key. Can also be obtained from the
    // database after running an instance on an Android device and logging in.
    Map<ServiceId, ProfileKey> serviceIds = Collections.emptyMap();
    CdsiV2Service.Request request =
        new CdsiV2Service.Request(previousE164s, syntheticE164s, serviceIds, requireAcis, token);

    // Send request to the Contact Discovery Service
    Consumer<byte[]> tokenSaver = null;
    Single<ServiceResponse<CdsiV2Service.Response>> single =
        service.getRegisteredUsers(auth.getUsername(), auth.getPassword(), request, tokenSaver);

    // Set to null to ignore time restrictions; can be set to an arbitrary
    // time to streamline calls to the service
    Long timeoutMs = null;

    ServiceResponse<CdsiV2Service.Response> serviceResponse;

    // Handle possible exceptions during the request
    try {
      if (timeoutMs == null) {
        serviceResponse = single
            .blockingGet();
      } else {
        serviceResponse = single
            .timeout(timeoutMs, TimeUnit.MILLISECONDS)
            .blockingGet();
      }
    } catch (RuntimeException e) {
      Throwable cause = e.getCause();
      if (cause instanceof InterruptedException) {
        throw new IOException("Interrupted", cause);
      } else if (cause instanceof TimeoutException) {
        throw new IOException("Timed out");
      } else {
        throw e;
      }
    } catch (Exception e) {
      throw new RuntimeException("Unexpected exception when retrieving registered users!", e);
    }

    // Extract registered users from the response
    Set<String> registeredUsers = new HashSet<>();
    if (serviceResponse.getResult().isPresent()) {
      // This is the actual response, and what we want to extract.
      // The results are a Map<String, CdsiV2Service.ResponseItem>, keyed by E.164 phone number.
      // We can then check if the response for this number is present and not null.
      CdsiV2Service.Response response = serviceResponse.getResult().get();

      // Subsequently we parse the response and add phone numbers
      // corresponding to registered Signal users
      for (Map.Entry<String, CdsiV2Service.ResponseItem> entry : response.getResults().entrySet()) {
        if (entry.getValue() != null) {
          registeredUsers.add(entry.getKey());
        }
      }
    } else if (serviceResponse.getApplicationError().isPresent()) {
      if (serviceResponse.getApplicationError().get() instanceof IOException) {
        throw (IOException) serviceResponse.getApplicationError().get();
      } else {
        throw new IOException(serviceResponse.getApplicationError().get());
      }
    } else if (serviceResponse.getExecutionError().isPresent()) {
      throw new IOException(serviceResponse.getExecutionError().get());
    } else {
      throw new IOException("Missing result!");
    }

    // Assert that no registered users were found
    assertTrue(registeredUsers.isEmpty());
  }
}
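The generator in the test above draws the area code from 100 to 999 and the subscriber number from 1000000 to 9999999, so it can emit numbers that are not assignable under the North American Numbering Plan, where both the area code and the exchange code start with a digit from 2 to 9. The following is a minimal sketch of a stricter generator, not part of the Signal code base: the class name NanpNumberSketch and its methods are hypothetical, and the check is only a plausibility filter (it does not, for example, exclude reserved N11 codes).

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Random;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical helper: generates syntactically plausible US (NANP) numbers in E.164 form. */
final class NanpNumberSketch {

  // +1 followed by NXX-NXX-XXXX, where N is 2-9 and X is 0-9.
  private static final Pattern NANP_E164 = Pattern.compile("\\+1[2-9]\\d{2}[2-9]\\d{6}");

  /** Returns true if the string has the shape of a NANP number in E.164 notation. */
  static boolean isPlausibleNanp(String e164) {
    return NANP_E164.matcher(e164).matches();
  }

  /** Generates {@code count} distinct numbers whose area and exchange codes start with 2-9. */
  static Set<String> generate(int count, Random rand) {
    Set<String> e164s = new HashSet<>();
    while (e164s.size() < count) {
      int areaCode = 200 + rand.nextInt(800); // 200..999
      int exchange = 200 + rand.nextInt(800); // 200..999
      int line     = rand.nextInt(10_000);    // 0..9999, zero-padded below
      e164s.add(String.format(Locale.ROOT, "+1%03d%03d%04d", areaCode, exchange, line));
    }
    return e164s;
  }

  public static void main(String[] args) {
    for (String number : generate(5, new Random())) {
      System.out.println(number + " plausible=" + isPlausibleNanp(number));
    }
  }
}
```

Restricting the synthetic numbers in this way would make a positive lookup result easier to interpret, since every queried number could in principle belong to a real subscriber.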

A.3 Scenario tests
A.3.1 Arbitrary contact discovery
Test description: The test checks whether an Android user can add an arbitrary phone number to their contact
list and verify whether that number belongs to a Signal user.
Preconditions: The user has given the Signal application permission to retrieve contact information.

Figure 7: Scenario test 1

A.3.2 Sending a message to an arbitrary number


Test description: The test checks whether an Android user can verify the existence of another user (i.e. linking
a phone number to a Signal account) by sending a message to an arbitrary phone number.
Preconditions: None

Figure 8: Scenario test 2
