End-to-End Encrypted Chat with Signal

Signal is a messaging service developed by the Signal Technology Foundation that uses end-to-end encryption for its services. The software is free and open source and is available on Android and iOS, with desktop versions for Windows, Mac, and Linux. Signal messages are encrypted with the Signal Protocol, which combines the Double Ratchet algorithm, prekeys, and an Extended Triple Diffie–Hellman (X3DH) handshake.

Features

Signal uses standard cellular telephone numbers as identifiers and uses the Internet to send one-to-one and group messages, which can include files, voice notes, images, GIFs, and videos. Messages can also be set to disappear after a chosen interval, after which they are deleted from both the sender’s and the receiver’s devices. The interval can be set anywhere between five seconds and one week. The application also supports one-to-one and group voice and video calls and, unlike many other messaging apps, does not collect metadata [1]. Signal allows users to automatically blur the faces of people in photos to protect their identities. Moreover, the Signal applications on Android and iOS can be locked with the phone’s PIN, passphrase, or biometric authentication. Signal also provides the option of sending unencrypted messages, which makes it possible to communicate with contacts who do not have Signal.

History

Signal started off as TextSecure, an encrypted texting program, and RedPhone, an encrypted voice calling app. Both were first launched in May 2010 by Whisper Systems, a startup company co-founded by security researcher Moxie Marlinspike and roboticist Stuart Anderson. In 2011, Whisper Systems was acquired by Twitter. After the acquisition, TextSecure and RedPhone were released as open source software. In 2013, Moxie Marlinspike left Twitter and founded Open Whisper Systems to resume work on RedPhone and TextSecure. Towards the end of July 2014, Open Whisper Systems announced plans to merge the RedPhone and TextSecure applications as Signal. This was followed by the initial release of Signal as a RedPhone counterpart for iOS. In November 2015, the TextSecure and RedPhone applications on Android were merged to become Signal for Android. A month later, Open Whisper Systems announced Signal Desktop, which at that point could only be linked with the Android version of Signal. Later, in 2016, Signal Desktop could be linked with the iOS version as well [2].

Signal in Real World

Signal became popular during the George Floyd protests in 2020. Following the death of George Floyd, downloads soared from 78,000 to 183,000 as protests grew nationwide [3]. It served as a crucial tool for protestors, who could obscure faces in photos to protect their privacy and avoid being identified and possibly charged later on. Signal also became widely known through endorsements by Elon Musk and Edward Snowden, with the app being praised for its security and end-to-end encryption. WhatsApp has been sharing data with Facebook since 2016, but users were able to opt out and still use the application. However, in January 2021 WhatsApp announced that opting out would no longer be possible after February 8 and that users would not be able to use the application unless they agreed to the terms of the privacy policy. After the announcement, Signal saw a surge in users: it was downloaded 7.5 million times over a period of five days. All in all, installations of the application listed on Google Play grew from 10 million to 50 million, which put huge stress on Signal’s servers and eventually led to downtime for its users. Later, it was announced that the servers had been restored [4].

Security Features

Signal offers considerable security features in comparison to its rivals, WhatsApp and Telegram. Signal encrypts messages, calls, and chat metadata, and is designed to minimize the data that is retained about its users. It collects only a phone number from each user and keeps no record of any other data, not even data stored in the cloud. In 2016, the government issued a subpoena requiring Signal to provide information about two Signal users for a federal grand jury investigation. Signal replied that it does not collect any information apart from the date and time a user registered with Signal and the last date of a user’s connectivity to the Signal service [5], which demonstrates how little information Signal retains about its users.

WhatsApp, on the other hand, collects metadata such as whom you messaged, when, and for how long, and does not encrypt this information [6]. Telegram also collects data such as user ID, contact information, and phone number. Chats in Telegram are not end-to-end encrypted unless the user specifically opts for the secret chats feature. In addition, group chats are not end-to-end encrypted, and secret chats are not available for groups.

Several privacy features set Signal apart from other applications. The “Sealed Sender” feature protects user privacy by keeping secret the record of who is messaging whom. The application can be locked using a PIN or fingerprint. Signal offers two-factor authentication, which requires the user to enter an additional PIN when registering Signal on a new device. The Screen Security feature can disable screenshots on Android and prevents Signal previews from appearing in the list of recent apps. Faces in photos can be blurred before a message is sent [7]. Video calls are also end-to-end encrypted.

Signal provides private groups: the Signal service has no record of a user’s group memberships, group titles, group avatars, or group attributes. Overall, Signal’s cybersecurity features make it a reliable application, and perhaps the best when it comes to security and privacy.

Encryption Protocols

Signal messages are encrypted with the Signal Protocol, which combines the Double Ratchet algorithm, prekeys, and an Extended Triple Diffie–Hellman (X3DH) handshake. The Double Ratchet algorithm is used by two parties to exchange encrypted messages based on a shared secret key and is built on the core concepts of KDF chains, the Diffie-Hellman ratchet, and the symmetric-key ratchet. Each party keeps three KDF chains: a root chain, a sending chain, and a receiving chain. As new Diffie-Hellman public keys are exchanged, the Diffie-Hellman output secrets become the inputs to the root chain, and the output keys from the root chain become new KDF keys for the sending and receiving chains. This is called the Diffie-Hellman ratchet. As messages are sent and received, the sending and receiving chains advance, and their output keys are used to encrypt and decrypt messages. This process is called the symmetric-key ratchet [2]. The symmetric-key ratchet and the Diffie-Hellman ratchet are combined to form the Double Ratchet. Signal’s voice calls were encrypted with SRTP and the ZRTP key-agreement protocol [3].

X3DH

X3DH is designed for asynchronous settings where one user is offline but has published some information to a server. A second user wants to use that information to send encrypted data to the first user, and also establish a shared secret key for future communication. This protocol uses elliptic curve public keys in the form of Curve448 or Curve25519 [8]. The three parties used to describe this protocol are Alice, Bob, and a server:

Alice: wants to send Bob some initial data using encryption, and to establish a shared secret key.
Bob: wants to allow parties like Alice to establish a shared secret key with him and send encrypted data. However, Bob might be offline when Alice attempts to do this. To enable this, Bob has a relationship with some server.
Server: can store messages from Alice to Bob which Bob can later retrieve. It also lets Bob publish some data which the server will provide to parties like Alice.

X3DH has three phases:

Publishing keys: Bob publishes his identity key and prekeys to a server, which stores them as a bundle.
Sending the initial message: Alice fetches a prekey bundle from the server, and uses it to send an initial message to Bob.
Retrieving the initial message: Bob receives and deciphers Alice’s initial message.

To perform an X3DH key agreement with Bob, Alice contacts the server and fetches a prekey bundle. The server should provide one of Bob’s one-time prekeys if one exists, and then delete it. If all of Bob’s one-time prekeys on the server have been deleted, the bundle will not contain a one-time prekey [8].
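
The server behaviour described above can be sketched as follows. This is a minimal in-memory illustration, not Signal’s server code: the `PrekeyServer` and `PrekeyBundle` names, the method names, and the use of short byte strings as stand-ins for public keys are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PrekeyBundle:
    """What the server hands to Alice (hypothetical structure)."""
    identity_key: bytes
    signed_prekey: bytes
    prekey_signature: bytes
    one_time_prekey: Optional[bytes]  # None once Bob's supply is used up


@dataclass
class _PublishedKeys:
    identity_key: bytes
    signed_prekey: bytes
    prekey_signature: bytes
    one_time_prekeys: List[bytes] = field(default_factory=list)


class PrekeyServer:
    """Toy server: stores what users publish and serves prekey bundles."""

    def __init__(self) -> None:
        self._users: Dict[str, _PublishedKeys] = {}

    def publish(self, user: str, identity_key: bytes, signed_prekey: bytes,
                prekey_signature: bytes, one_time_prekeys: List[bytes]) -> None:
        self._users[user] = _PublishedKeys(
            identity_key, signed_prekey, prekey_signature, list(one_time_prekeys))

    def fetch_bundle(self, user: str) -> PrekeyBundle:
        record = self._users[user]
        # Serve one one-time prekey if any remain, and delete it so that
        # it is never handed out a second time.
        opk = record.one_time_prekeys.pop(0) if record.one_time_prekeys else None
        return PrekeyBundle(record.identity_key, record.signed_prekey,
                            record.prekey_signature, opk)


server = PrekeyServer()
server.publish("bob", b"IK_B", b"SPK_B", b"signature", [b"OPK_1", b"OPK_2"])
first = server.fetch_bundle("bob")
second = server.fetch_bundle("bob")
third = server.fetch_bundle("bob")   # one-time prekeys are exhausted by now
```

The third bundle still carries the identity key and signed prekey, so Alice can fall back to the three-DH variant of the key agreement.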

Notation:

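A = Alice
B = Bob
IK = Identity key
SPK = Signed prekey
OPK = One-time prekey
EK = Ephemeral key
DH = Diffie-Hellman
SK = Shared key
KDF = Key derivation function

A trailing A or B names the owner of a key: IKA is Alice’s identity key and SPKB is Bob’s signed prekey. The symbol || denotes concatenation.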

If the bundle does not contain a one-time prekey, Alice calculates:

DH1 = DH(IKA, SPKB)
DH2 = DH(EKA, IKB)
DH3 = DH(EKA, SPKB)
SK = KDF(DH1 || DH2 || DH3)

If the bundle does contain a one-time prekey, Alice calculates an additional DH and includes it in the shared key calculation:

DH4 = DH(EKA, OPKB)
SK = KDF(DH1 || DH2 || DH3 || DH4)

The following diagram shows the DH calculations between keys. DH1 and DH2 provide mutual authentication, while DH3 and DH4 provide forward secrecy [8].

[Figure: SK calculations]

After calculating SK, Alice deletes her ephemeral private key and the DH outputs, and calculates an associated data byte sequence AD that contains identity information for both parties: AD = Encode(IKA) || Encode(IKB)

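Alice then sends Bob an initial message containing her identity key IKA, her ephemeral key EKA, identifiers stating which of Bob’s prekeys she used, and an initial ciphertext encrypted with SK (or a key derived from it) using AD as associated data. SK itself is never transmitted. After Bob receives the initial message, he repeats the DH and KDF calculations performed by Alice, using his own private keys, to derive the same SK and AD, and then tries to decrypt the ciphertext. If decryption succeeds, he deletes the private key of any one-time prekey that was used, stores SK, and can use it for future communication with Alice [8].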
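
The key agreement above can be tried out with a short sketch in which Alice and Bob each run their side of the DH calculations and arrive at the same SK. Several things here are assumptions made to keep the example standard-library only: the DH function is toy modular exponentiation rather than X25519 or X448, the KDF is a plain HKDF-SHA256 with a made-up `info` string, and verification of the prekey signature is left out. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets

# TOY Diffie-Hellman group, for illustration only (X3DH uses X25519/X448).
P = 2**255 - 19
G = 2


def generate_dh():
    """Return a (private, public) toy key pair."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


def dh(private, public):
    """Toy DH output: G^(ab) mod P, encoded as 32 bytes."""
    return pow(public, private, P).to_bytes(32, "big")


def kdf(key_material, info=b"x3dh-sketch", length=32):
    """HKDF (RFC 5869) with SHA-256, built from the hmac module."""
    prk = hmac.new(b"\x00" * 32, key_material, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def alice_shared_key(ik_a_priv, ek_a_priv, ik_b_pub, spk_b_pub, opk_b_pub=None):
    material = (dh(ik_a_priv, spk_b_pub)      # DH1 = DH(IKA, SPKB)
                + dh(ek_a_priv, ik_b_pub)     # DH2 = DH(EKA, IKB)
                + dh(ek_a_priv, spk_b_pub))   # DH3 = DH(EKA, SPKB)
    if opk_b_pub is not None:
        material += dh(ek_a_priv, opk_b_pub)  # DH4 = DH(EKA, OPKB)
    return kdf(material)


def bob_shared_key(ik_b_priv, spk_b_priv, ik_a_pub, ek_a_pub, opk_b_priv=None):
    # Bob mirrors each calculation with his private keys and Alice's public keys.
    material = (dh(spk_b_priv, ik_a_pub)
                + dh(ik_b_priv, ek_a_pub)
                + dh(spk_b_priv, ek_a_pub))
    if opk_b_priv is not None:
        material += dh(opk_b_priv, ek_a_pub)
    return kdf(material)


ik_a, ek_a = generate_dh(), generate_dh()
ik_b, spk_b, opk_b = generate_dh(), generate_dh(), generate_dh()

sk_alice = alice_shared_key(ik_a[0], ek_a[0], ik_b[1], spk_b[1], opk_b[1])
sk_bob = bob_shared_key(ik_b[0], spk_b[0], ik_a[1], ek_a[1], opk_b[0])

# AD = Encode(IKA) || Encode(IKB); the encoding here is simply 32 big-endian bytes.
ad = ik_a[1].to_bytes(32, "big") + ik_b[1].to_bytes(32, "big")
```

Only public keys cross the wire in this sketch. Each side feeds its own private keys into `dh`, which is why SK never needs to be transmitted.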

Security Considerations

Authentication
- Before or after an X3DH key agreement, the parties may compare their identity public keys through some authenticated channel. If authentication is not performed, the parties receive no cryptographic guarantee as to who they are communicating with [8].
Protocol Replay
- If Alice’s initial message doesn’t use a one-time prekey, it may be replayed to Bob and he will accept it. This could cause Bob to think Alice had sent him the same message repeatedly. To solve this, a post-X3DH protocol may negotiate a new encryption key for Alice based on fresh random input from Bob. Bob could attempt other solutions, such as maintaining a blacklist of observed messages, or replacing old signed prekeys more rapidly. Analyzing these mitigations is beyond the scope of this document [8].
Replay and Key Reuse
- Another consequence of the replays discussed in the previous section is that a successfully replayed initial message would cause Bob to derive the same SK in different protocol runs [8].
Deniability
- X3DH doesn’t give either Alice or Bob a publishable cryptographic proof of the contents of their communication or the fact that they communicated [8].
Signatures
- It might be tempting to observe that mutual authentication and forward secrecy are achieved by the DH calculations, and omit the prekey signature. However, this would allow a “weak forward secrecy” attack: A malicious server could provide Alice a prekey bundle with forged prekeys, and later compromise Bob’s IKB to calculate SK. Alternatively, it might be tempting to replace the DH-based mutual authentication (i.e. DH1 and DH2) with signatures from the identity keys. However, this reduces deniability, increases the size of initial messages, and increases the damage done if ephemeral or prekey private keys are compromised, or if the signature scheme is broken [8].
Key compromise
- Compromise of a party’s identity private key allows impersonation of that party to others. Compromise of a party’s prekey private keys may affect the security of older or newer SK values, depending on many considerations [8].
Server trust
- A malicious server could cause communication between Alice and Bob to fail [8].
Identity binding
- Authentication does not necessarily prevent an identity misbinding or unknown key share attack [8].
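
One of the replay mitigations mentioned under Protocol Replay, keeping a record of observed initial messages, could look roughly like the sketch below. The class name, the choice of fields that go into the fingerprint, and the unbounded in-memory set are all assumptions for illustration. A real deployment would also need to bound or expire the stored entries.

```python
import hashlib


class InitialMessageReplayFilter:
    """Remembers fingerprints of X3DH initial messages already accepted."""

    def __init__(self):
        self._seen = set()

    def accept(self, ik_a: bytes, ek_a: bytes, ciphertext: bytes) -> bool:
        """Return True the first time a message is seen, False on a replay."""
        digest = hashlib.sha256()
        for part in (ik_a, ek_a, ciphertext):
            # Length-prefix each field so different splits of the same
            # bytes cannot produce the same fingerprint.
            digest.update(len(part).to_bytes(4, "big"))
            digest.update(part)
        fingerprint = digest.digest()
        if fingerprint in self._seen:
            return False
        self._seen.add(fingerprint)
        return True


replay_filter = InitialMessageReplayFilter()
first_delivery = replay_filter.accept(b"IK_A", b"EK_A", b"initial ciphertext")
replayed = replay_filter.accept(b"IK_A", b"EK_A", b"initial ciphertext")
```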

Double Ratchet

KDF Chains

A KDF (key derivation function) is a cryptographic function that takes a secret and random KDF key together with some input data and returns output data. The output is indistinguishable from random as long as the key is not known [9].

The term KDF chain is used when some of the output from a KDF is used as an output key and some is used as the KDF key for the next KDF invocation. The diagram below shows a KDF chain consisting of 3 functions [9].

A KDF chain has 3 properties:

Resilience: The output keys appear random to someone without knowledge of the KDF keys. This is true even if that person can control the KDF inputs [9].
Forward security: Output keys from the past appear random to someone who learns the KDF key at some point in time [9].
Break-in recovery: Future output keys appear random to someone who learns the KDF key at some point in time, as long as future inputs add sufficient entropy [9].
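
A KDF chain of the kind described above can be sketched in a few lines. The construction used here, HMAC-SHA512 with the 64-byte output split into a new KDF key and an output key, is an assumption chosen for brevity and is not necessarily what any Signal client uses. The names `kdf_step` and `run_chain` are hypothetical.

```python
import hashlib
import hmac


def kdf_step(kdf_key: bytes, input_data: bytes):
    """One link of the chain: returns (next_kdf_key, output_key)."""
    out = hmac.new(kdf_key, input_data, hashlib.sha512).digest()  # 64 bytes
    return out[:32], out[32:]


def run_chain(initial_key: bytes, inputs):
    """Feed each input through the chain, collecting one output key per input."""
    key, output_keys = initial_key, []
    for data in inputs:
        key, output_key = kdf_step(key, data)
        output_keys.append(output_key)
    return key, output_keys


initial_key = hashlib.sha256(b"example KDF key").digest()
final_key, output_keys = run_chain(initial_key, [b"input 1", b"input 2", b"input 3"])
```

Because each step replaces the KDF key, someone who learns `final_key` cannot run the chain backwards to recover the earlier output keys, which is the forward security property listed above.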

In a Double Ratchet session, each party stores 3 chains: root chain, sending chain, and receiving chain. The sending and receiving chains advance as each message is sent and received. Their output keys are used to encrypt and decrypt messages. This is called the symmetric-key ratchet [9].
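
The symmetric-key ratchet can be illustrated with Alice’s sending chain and Bob’s receiving chain advancing in lockstep. The sketch assumes HMAC-SHA256 with the single-byte constants 0x01 and 0x02 as inputs, which follows the recommendation in the Double Ratchet specification [9]. The starting chain key is a made-up value standing in for one produced by the root chain.

```python
import hashlib
import hmac


def kdf_ck(chain_key: bytes):
    """One symmetric-key ratchet step: returns (next_chain_key, message_key)."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


# Placeholder chain key; in a real session it comes out of the root chain.
shared_chain_key = hashlib.sha256(b"example chain key").digest()
alice_cks = shared_chain_key   # Alice's sending chain
bob_ckr = shared_chain_key     # Bob's matching receiving chain

alice_message_keys, bob_message_keys = [], []
for _ in range(3):
    alice_cks, mk = kdf_ck(alice_cks)    # Alice encrypts a message with mk
    alice_message_keys.append(mk)
    bob_ckr, mk = kdf_ck(bob_ckr)        # Bob derives the same mk to decrypt
    bob_message_keys.append(mk)
```

Every message gets its own key, and both parties can delete a message key as soon as it has been used.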

Diffie-Hellman Ratchet

To implement the DH ratchet, each party generates a DH key pair which becomes their current ratchet key pair. Every message from either party begins with a header which contains the sender’s current ratchet public key. When a new ratchet public key is received from the remote party, a DH ratchet step is performed which replaces the local party’s current ratchet key pair with a new key pair. This results in a “ping-pong” behavior as the parties take turns replacing ratchet key pairs [9].

The diagram below represents a Diffie-Hellman ratchet step, which consists of updating the root key twice and using the output keys from the KDF as new receiving and sending chain keys [9].

[Figure: DH Ratchet]
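
The “ping-pong” exchange of ratchet keys can be simulated with a small sketch. As assumptions for illustration, the DH function is toy modular exponentiation rather than an elliptic-curve function, the root-chain KDF is HMAC-SHA512 split in two, and the `Party` class with its method names is hypothetical.

```python
import hashlib
import hmac
import secrets

# TOY Diffie-Hellman group, for illustration only.
P = 2**255 - 19
G = 2


def generate_dh():
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


def dh(key_pair, public):
    return pow(public, key_pair[0], P).to_bytes(32, "big")


def kdf_rk(root_key, dh_output):
    """Root-chain step: returns (new_root_key, chain_key)."""
    out = hmac.new(root_key, dh_output, hashlib.sha512).digest()
    return out[:32], out[32:]


class Party:
    def __init__(self, root_key, key_pair):
        self.rk, self.dhs = root_key, key_pair
        self.dhr = self.cks = self.ckr = None

    def init_sending(self, remote_public):
        """Initiator only: derive the first sending chain key."""
        self.dhr = remote_public
        self.rk, self.cks = kdf_rk(self.rk, dh(self.dhs, self.dhr))

    def dh_ratchet(self, remote_public):
        """Run when a new ratchet public key arrives: two root-key updates."""
        self.dhr = remote_public
        self.rk, self.ckr = kdf_rk(self.rk, dh(self.dhs, self.dhr))  # receiving chain
        self.dhs = generate_dh()                                      # fresh key pair
        self.rk, self.cks = kdf_rk(self.rk, dh(self.dhs, self.dhr))  # sending chain


shared_secret = secrets.token_bytes(32)      # stands in for the X3DH output SK
bob = Party(shared_secret, generate_dh())
alice = Party(shared_secret, generate_dh())
alice.init_sending(bob.dhs[1])
alice_cks_1 = alice.cks

bob.dh_ratchet(alice.dhs[1])                 # Bob sees Alice's ratchet public key
bob_ckr_1, bob_cks_1 = bob.ckr, bob.cks

alice.dh_ratchet(bob.dhs[1])                 # Alice sees Bob's new public key
alice_ckr_1, alice_cks_2 = alice.ckr, alice.cks

bob.dh_ratchet(alice.dhs[1])                 # and so on, back and forth
bob_ckr_2 = bob.ckr
```

After each step, the sending chain key of one party equals the receiving chain key the other party derives next, even though only public keys are exchanged.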

Double Ratchet

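The Double Ratchet combines the symmetric-key and DH ratchets. When a message is sent or received, a symmetric-key ratchet step is applied to the sending or receiving chain to derive the message key. When a new ratchet public key is received, a DH ratchet step is performed before the symmetric-key ratchet step to replace the chain keys [9].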

The diagram below represents a full Double Ratchet run. When Alice sends her first message A1, she applies a symmetric-key ratchet step to her sending chain key, resulting in a new message key. The new chain key is stored, but the message key and old chain key can be deleted. If Alice next receives a response B1 from Bob, it will contain a new ratchet public key. Alice applies a DH ratchet step to derive new receiving and sending chain keys. She then applies a symmetric-key ratchet step to the receiving chain to get the message key for the received message [9].

[Figure: Double Ratchet]

The following code snippet is used to initialize Alice and Bob ratchets:

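# Pseudocode following the Double Ratchet specification [9]. The upper-case
# helpers (GENERATE_DH, DH, KDF_RK, KDF_CK, ENCRYPT, DECRYPT, HEADER, CONCAT)
# and the MAX_SKIP constant are placeholders the application must supply.

def RatchetInitAlice(state, SK, bob_dh_public_key):
    state.DHs = GENERATE_DH()          # Alice's sending ratchet key pair
    state.DHr = bob_dh_public_key      # Bob's ratchet public key
    # First root-chain step: new root key plus Alice's sending chain key.
    state.RK, state.CKs = KDF_RK(SK, DH(state.DHs, state.DHr))
    state.CKr = None                   # no receiving chain until Bob replies
    state.Ns = 0                       # message number in the sending chain
    state.Nr = 0                       # message number in the receiving chain
    state.PN = 0                       # length of the previous sending chain
    state.MKSKIPPED = {}               # skipped message keys, by (ratchet key, number)

def RatchetInitBob(state, SK, bob_dh_key_pair):
    state.DHs = bob_dh_key_pair        # key pair whose public half Alice used
    state.DHr = None                   # unknown until Alice's first message
    state.RK = SK                      # root key starts as the shared secret
    state.CKs = None
    state.CKr = None
    state.Ns = 0
    state.Nr = 0
    state.PN = 0
    state.MKSKIPPED = {}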

The following code snippet is used to encrypt messages:

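def RatchetEncrypt(state, plaintext, AD):
    # Symmetric-key ratchet step: advance the sending chain, take a message key.
    state.CKs, mk = KDF_CK(state.CKs)
    # The header carries the ratchet public key, PN, and the message number.
    header = HEADER(state.DHs, state.PN, state.Ns)
    state.Ns += 1
    # The header is authenticated together with AD as associated data.
    return header, ENCRYPT(mk, plaintext, CONCAT(AD, header))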

The following code snippet is used to decrypt messages:

def RatchetDecrypt(state, header, ciphertext, AD):
    # A delayed message may already have a stored key.
    plaintext = TrySkippedMessageKeys(state, header, ciphertext, AD)
    if plaintext is not None:
        return plaintext
    if header.dh != state.DHr:
        # New ratchet public key: store the remaining keys of the old
        # receiving chain, then perform a DH ratchet step.
        SkipMessageKeys(state, header.pn)
        DHRatchet(state, header)
    # Store keys for messages skipped in the current receiving chain.
    SkipMessageKeys(state, header.n)
    state.CKr, mk = KDF_CK(state.CKr)
    state.Nr += 1
    return DECRYPT(mk, ciphertext, CONCAT(AD, header))

def TrySkippedMessageKeys(state, header, ciphertext, AD):
    if (header.dh, header.n) in state.MKSKIPPED:
        mk = state.MKSKIPPED[header.dh, header.n]
        del state.MKSKIPPED[header.dh, header.n]   # each key is used only once
        return DECRYPT(mk, ciphertext, CONCAT(AD, header))
    else:
        return None

def SkipMessageKeys(state, until):
    # Refuse to derive an unbounded number of keys for a single message.
    if state.Nr + MAX_SKIP < until:
        raise RuntimeError("too many skipped message keys")
    if state.CKr is not None:
        while state.Nr < until:
            state.CKr, mk = KDF_CK(state.CKr)
            state.MKSKIPPED[state.DHr, state.Nr] = mk
            state.Nr += 1

def DHRatchet(state, header):
    state.PN = state.Ns                # remember the old sending chain length
    state.Ns = 0
    state.Nr = 0
    state.DHr = header.dh
    # First root-key update: new receiving chain.
    state.RK, state.CKr = KDF_RK(state.RK, DH(state.DHs, state.DHr))
    state.DHs = GENERATE_DH()
    # Second root-key update: new sending chain from the fresh key pair.
    state.RK, state.CKs = KDF_RK(state.RK, DH(state.DHs, state.DHr))

Out of Order Messages

The Double Ratchet handles lost or out-of-order messages by including in each message header the message’s number in the sending chain and the length of the previous sending chain. The diagram below shows an example in which B4 is received before B2 and B3. At that point Alice’s receiving chain has a length of 1, because she has only received B1, so she derives and stores the message keys for B2 and B3 so that those messages can be decrypted when they arrive later [9].

[Figure: Out of Order messages]
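
The B1, B4, B2, B3 scenario can be reproduced with a single receiving chain. This sketch leaves out the DH ratchet, so skipped keys are indexed by message number alone rather than by ratchet public key and number. The `ReceivingChain` class, the `MAX_SKIP` value of 10, and the HMAC-based chain step are assumptions made for the example.

```python
import hashlib
import hmac

MAX_SKIP = 10  # assumed limit on message keys stored for one gap


def kdf_ck(chain_key: bytes):
    """Symmetric-key ratchet step: returns (next_chain_key, message_key)."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


class ReceivingChain:
    def __init__(self, chain_key: bytes):
        self.ckr = chain_key
        self.nr = 0          # number of the next message expected in order
        self.skipped = {}    # message number -> stored message key

    def message_key_for(self, n: int) -> bytes:
        if n in self.skipped:                 # a late message: use the stored key
            return self.skipped.pop(n)
        if n < self.nr:
            raise ValueError("message key already used")
        if n - self.nr > MAX_SKIP:
            raise ValueError("too many skipped messages")
        while self.nr < n:                    # store keys for the gap
            self.ckr, mk = kdf_ck(self.ckr)
            self.skipped[self.nr] = mk
            self.nr += 1
        self.ckr, mk = kdf_ck(self.ckr)       # key for message n itself
        self.nr += 1
        return mk


start = hashlib.sha256(b"example receiving chain key").digest()

# Bob's sending chain produces the keys for B1..B4 (message numbers 0..3).
sender_ck, sender_keys = start, []
for _ in range(4):
    sender_ck, mk = kdf_ck(sender_ck)
    sender_keys.append(mk)

alice = ReceivingChain(start)
key_b1 = alice.message_key_for(0)
key_b4 = alice.message_key_for(3)          # B4 arrives before B2 and B3
stored_after_b4 = sorted(alice.skipped)    # numbers whose keys are being held
key_b2 = alice.message_key_for(1)
key_b3 = alice.message_key_for(2)
```

Once B2 and B3 have been decrypted, their stored keys are removed again, so the table of skipped keys only ever holds keys for messages that are still outstanding.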

Security Considerations

Secure deletion:
- The Double Ratchet algorithm is designed to provide security against an attacker who records encrypted messages and then compromises the sender or receiver at a later time. This security could be defeated if deleted plaintext or keys could be recovered by an attacker with low-level access to the compromised device [9].
Recovery from compromise:
- The DH ratchet is designed to recover security against a passive eavesdropper who observes encrypted messages after compromising one (or both) of the parties to a session. Despite this mitigation, a compromise of secret keys or of device integrity will have a devastating effect on the security of future communications [9].
Cryptanalysis and ratchet public keys:
- If weaknesses are discovered in any of the cryptographic algorithms a session relies upon, the session should be discarded and replaced with a new session using strong cryptography [9].
Deletion of skipped message keys:
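- Storing skipped message keys carries a risk: a malicious sender could induce recipients to store large numbers of skipped message keys, possibly causing denial-of-service by consuming storage space. For this reason the number of stored keys should be limited (the MAX_SKIP check in SkipMessageKeys) and skipped keys should be deleted after an appropriate interval [9].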
Deferring new ratchet key generation:
- During each DH ratchet step a new ratchet key pair and sending chain are generated. As the sending chain is not needed right away, these steps could be deferred until the party is about to send a new message. This would slightly increase security by shortening the lifetime of ratchet keys, at the cost of some complexity [9].
Truncating authentication tags:
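- The specification discusses truncating authentication tags to reduce message size for the ENCRYPT() constructions it recommends. If the ENCRYPT() function is implemented differently, then truncation might require a more complicated analysis and is not recommended [9].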
Implementation fingerprinting:
- If this protocol is used in settings with anonymous parties, care should be taken that implementations behave identically in all cases [9].

There are two more protocols used by the Signal application that are not covered by this presentation or its companion notes. If you are interested in learning more about this topic, you can take a look at the developer notes published on Signal’s website: https://signal.org/docs/

References

[1] https://en.wikipedia.org/wiki/Signal_(software)
[2] https://en.wikipedia.org/wiki/Signal_(software)#2010%E2%80%9313:_Origins
[3] https://www.nytimes.com/2020/06/11/style/signal-messaging-app-encryption-protests.html
[4] https://en.wikipedia.org/wiki/Signal_(software)#2010%E2%80%9313:_Origins
[5] https://signal.org/bigbrother/eastern-virginia-grand-jury/
[6] https://beebom.com/whatsapp-vs-telegram-vs-signal/
[7] https://gadgetstouse.com/blog/2021/01/12/whatsapp-vs-telegram-vs-signal-detailed-comparison-based-on-all-features/
[8] https://signal.org/docs/specifications/x3dh/
[9] https://signal.org/docs/specifications/doubleratchet/