CryptTalk Security Aspects

Due to the technical nature of the content presented on this page, the text has been kept in the original English.

Protecting Voice Calls

Voice calls are an important part of modern private and business life. Many of us typically share our private lives freely on the phone without considering who might be listening. For the average person this might be safe enough: our personal secrets are not that interesting to others. Those in the public eye or business leaders might have more to protect.

Despite the many cases of wiretapping portrayed in popular movies, most people still believe that public landline and mobile telephony networks are quite secure. Unfortunately, this is not true. Landline calls run over copper wiring, which means that trust in the infrastructure may be all that stands between your call and an eavesdropper. Because GSM communication travels over easily accessible radio signals it is encrypted, but this encryption is not strong enough and can be broken with the right equipment.

Operators are another weak point in voice-call security. Most telephony networks are designed to allow access to in-call voice streams. Government regulations in many countries mandate this capability to allow lawful interception for security reasons. Since the capability to tap in exists, the security of any communicated data is never better than the security of the operator itself. This is true even for VoIP services that provide strong encryption between the customer’s equipment and their central servers.

Strong encryption can provide unbreakable protection for voice calls, but third parties can only be excluded effectively if the encryption happens end-to-end, i.e. when the strongest line of defense is primarily and technologically within the devices of the call-making participants and not on the open network itself.

End-to-end Security

Advances in modern cryptography allow the protection of voice calls without relying on third parties and without exposure to man-in-the-middle attacks. End-to-end security relies only on the protection of the end devices, so control of the situation is in the hands of the end user.

Symmetric key encryption, when used properly, can provide practically unbreakable protection for data transmissions. The AES-256 cipher (approved by the US government for protecting top secret information) simply cannot be broken by brute force, even if we assume the most advanced transistor technology is ranged against us. Even futuristic solutions such as quantum computers would only reduce the effort of cracking an AES-256 key from impossible to practically impossible. Massive advances in cryptanalysis may one day offer ways to break high-level symmetric key encryption, but even this is improbable in the foreseeable future.
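To put the brute-force claim in numbers, the following back-of-the-envelope sketch estimates how long an exhaustive key search would take. The guess rate of 10^18 keys per second is a purely hypothetical assumption, chosen to be generous to the attacker. The quantum figure uses the common estimate that Grover's algorithm reduces the search to roughly the square root of the key space.

```python
# Rough brute-force estimate for a 256-bit key.
# ASSUMPTION: the attacker tests 10**18 keys per second (hypothetical figure).
SECONDS_PER_YEAR = 365 * 24 * 3600
GUESSES_PER_SECOND = 10**18

def years_to_search(operations: int) -> float:
    """Years needed to perform the given number of key trials."""
    return operations / GUESSES_PER_SECOND / SECONDS_PER_YEAR

classical_years = years_to_search(2**256)   # exhaustive classical search
quantum_years = years_to_search(2**128)     # Grover-style search: ~sqrt(2**256) steps

print(f"classical: about {classical_years:.1e} years")
print(f"quantum:   about {quantum_years:.1e} years")
```

Under these assumptions both figures exceed the age of the universe (roughly 1.4 x 10^10 years) by a wide margin.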

Symmetric key encryption provides top secret protection for voice data but, by definition, needs a shared secret between the two parties. In practice this would mean personally exchanging one or more 30-character passwords before making any calls. This could easily become inconvenient. Fortunately, Diffie-Hellman key exchange combined with RSA signatures makes it possible to exchange a shared secret securely during the establishment of the connection.

The Diffie-Hellman key exchange method allows the parties to establish the shared secret safely over any insecure channel. The shared secret is established without ever sending the secret itself. Both parties create random secrets that are never sent over the wire directly; only a derived value is actually transferred, one that is practically impossible to reverse-engineer. Each party then calculates the shared secret from the derived value received from the other party and its own secret value (basically a complex password). All the values are sent publicly over the insecure channel except the base secrets and the resulting shared secret.
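The exchange described above can be sketched in a few lines. This is a toy illustration, not CryptTalk's implementation: the prime P, the base G and the names alice and bob are assumptions made for the example, and the parameters are far too small for real use.

```python
import secrets

# ASSUMPTION: toy public parameters for illustration only. Real deployments
# use standardized groups with much larger primes, or elliptic curves.
P = 2**127 - 1   # a Mersenne prime, publicly known
G = 3            # public base, chosen arbitrarily for this sketch

def make_keypair():
    """Return (private, public): a random secret and its derived value."""
    private = secrets.randbelow(P - 3) + 2      # random secret in [2, P-2]
    public = pow(G, private, P)                 # derived value sent over the wire
    return private, public

def shared_secret(own_private: int, peer_public: int) -> int:
    """Combine our own secret with the derived value received from the peer."""
    return pow(peer_public, own_private, P)

alice_private, alice_public = make_keypair()
bob_private, bob_public = make_keypair()

# Only alice_public and bob_public ever cross the insecure channel.
alice_secret = shared_secret(alice_private, bob_public)
bob_secret = shared_secret(bob_private, alice_public)
print(alice_secret == bob_secret)
```

Both sides arrive at the same value because raising G to one secret and then to the other gives the same result in either order.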

Diffie-Hellman provides a perfect shared secret for symmetric encryption but it does not offer authentication in itself. The shared secret provides a secure channel but we cannot be sure who is at the other end. Public key cryptography provides a means of peer-to-peer authentication without relying on a central authority. Both parties generate secret private keys and their derived public counterparts. The public keys are exchanged in advance in a secure way. During the Diffie-Hellman key exchange, a digital signature is added to key parts of the exchange, which can be verified using the public key of the other party. This guarantees that the established shared secret provides the secure channel to the person we intended.
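A minimal sketch of the signing step follows. It uses textbook RSA with a tiny key and no padding, which is insecure and serves only to show the principle. The key values, the helper names and the example message are all assumptions made for illustration.

```python
import hashlib

# ASSUMPTION: tiny textbook RSA key built from two known Mersenne primes.
# It has no padding and is far too small to be secure; illustration only.
P1, P2 = 2**61 - 1, 2**89 - 1
N = P1 * P2                                   # public modulus
E = 65537                                     # public exponent
D = pow(E, -1, (P1 - 1) * (P2 - 1))           # private exponent (Python 3.8+)

def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Sign with the private key D."""
    return pow(digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Check a signature using only the public key (N, E)."""
    return pow(signature, E, N) == digest(message)

# Hypothetical Diffie-Hellman derived value that one party sends to the other.
dh_public_value = b"example-derived-value"
signature = sign(dh_public_value)

print(verify(dh_public_value, signature))          # genuine value
print(verify(b"attacker-substituted", signature))  # value swapped in transit
```

A man in the middle who replaces the derived value cannot produce a matching signature without the sender's private key, so the substitution is detected.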

CryptTalk Solution

During the design of the CryptTalk service, the aim of Arenim Technologies was to provide the very same feature set and user experience as is available broadly for normal calls and messaging. The only difference: CryptTalk is completely secure.

CryptTalk is a voice over IP (VoIP) solution that provides superior quality voice communication delivered over the Internet. It uses an industry-proven voice engine that provides very low delay even with encryption.

Voice calls are encrypted end-to-end using strong AES-256 symmetric encryption, which is based on ephemeral keys set up using Elliptic Curve Diffie-Hellman. This ensures that calls are accessible only to the intended parties of the calls and only during the active call time. Even if the devices are later stolen and their protections broken, past calls remain secure.

CryptTalk takes extra care to protect the sensitive data kept on the device. Unauthorized access to the device is made extremely difficult by a service provided by the CryptTalk server which makes brute force attacks impossible.

Besides strong security, CryptTalk aims to fulfill the needs of business life by also providing secure multi-party conferencing and instant messaging features.

Professional administration interface

Besides security issues, business requirements are satisfied by CryptTalk’s built-in contact management features and the professional administration interface designed to be of great benefit to managers and administrators of your organization.

Contact management features are enhanced with a presence (user is online) service. You can see whether your contacts are available, offline, in a call, or do not want to be disturbed.

Professional user management options include the possibility to add CryptTalk users from outside your organization in a trusted and secure way. Collaboration between companies and organizations can also be administered and managed centrally by defining access and visibility rules and credentials.

Technical Insight

Voice encryption

A call between two subscribing parties is set up using the SIP protocol via the servers provided by CryptTalk. The whole call setup is secured using TLS (i.e. transport level security is provided via the servers). The voice content of calls is protected with end-to-end security built up after the call connection has been established by both parties.

When a call is connected, the path for the voice media is set up using the Interactive Connectivity Establishment (ICE) protocol. The aim here is to find the simplest and shortest path between the two parties. This means that if both parties are on the same corporate or private network then the voice data won't leave that network; if they are on separate networks it will use a direct path on the Internet between the two networks. Only in cases where a direct path cannot be established (because of network address translation issues) will the ICE protocol select the use of CryptTalk servers to relay the media. Using the shortest possible path ensures that as few parties as possible are able to even see that a secure connection is being established.
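The preference for local and direct paths over relayed ones can be sketched with the candidate priority formula and the recommended type preference values from RFC 8445. This is a simplification: real ICE ranks pairs of candidates and runs connectivity checks on them, whereas the sketch below simply picks the best single candidate from a list assumed to be reachable. The Candidate class, the helper name and the addresses are hypothetical.

```python
from dataclasses import dataclass

# Type preferences recommended by RFC 8445 (higher means preferred).
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}

@dataclass
class Candidate:
    kind: str            # "host" (local), "srflx" (via NAT) or "relay" (via server)
    address: str         # hypothetical example address
    local_preference: int = 65535
    component_id: int = 1

    def priority(self) -> int:
        """Candidate priority as recommended by RFC 8445."""
        return ((TYPE_PREFERENCE[self.kind] << 24)
                + (self.local_preference << 8)
                + (256 - self.component_id))

def best_candidate(reachable):
    """Pick the highest-priority candidate among those assumed reachable."""
    return max(reachable, key=Candidate.priority)

candidates = [
    Candidate("relay", "198.51.100.7:3478"),   # relayed through a server
    Candidate("srflx", "203.0.113.5:40000"),   # public address behind NAT
    Candidate("host", "192.168.1.20:40000"),   # same local network
]
print(best_candidate(candidates).kind)       # local path wins when available
print(best_candidate(candidates[:1]).kind)   # relay is used only as a fallback
```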

Key exchange

The secure channel is set up along the shortest possible voice path. The key exchange procedure is started before any voice data is sent. The key exchange procedure is very similar to the procedures used in the industry standard Transport Layer Security (TLS) protocol, with a slight adjustment for the UDP transport protocol, including the bypassing of any unnecessary features.

TLS is a widely used protocol with many optional features and security levels. Until recently most web sites using TLS avoided the use of Diffie-Hellman key exchange because of the higher CPU costs associated with it. CryptTalk, as a top security solution, only uses the highest security level. Lower security levels were not an option for us.

The Diffie-Hellman algorithm allows you to establish a shared secret with another node without transmitting the secret across the network. When the call is finished the shared secret is erased and cannot be reproduced later even by the original parties of the call. This feature provides Perfect Forward Secrecy: There is no way to decrypt the content of the call later even if the devices of both parties are compromised after the call.

Since CryptTalk is a mobile product it is important to conserve resources. For this reason a variant of the Diffie-Hellman method is used, one that is based on elliptic curve cryptography. This relies on the elliptic curve discrete logarithm problem instead of the discrete logarithm problem over integers modulo a prime, and can provide the same security with much shorter key lengths. According to renowned cryptanalysts, a 384-bit EC key provides approximately the same security as a 7680-bit RSA key, but because of the smaller key size it requires significantly less processing power.
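The elliptic curve variant follows the same pattern as classic Diffie-Hellman, with point multiplication taking the place of modular exponentiation. The sketch below uses a tiny textbook curve that is trivially breakable and is chosen only so that the arithmetic is easy to follow. The curve parameters and the names alice and bob are assumptions for illustration, not the curve CryptTalk uses.

```python
import secrets

# ASSUMPTION: tiny textbook curve y^2 = x^3 + 2x + 2 over GF(17), whose
# base point (5, 1) generates a group of 19 points. Illustration only.
P, A = 17, 2
G = (5, 1)
ORDER = 19

def add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                                           # p2 is the inverse of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow(2 * y1, -1, P) % P    # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P     # chord line
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)

def multiply(k, point):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

alice_private = secrets.randbelow(ORDER - 1) + 1   # random secret in [1, 18]
bob_private = secrets.randbelow(ORDER - 1) + 1
alice_public = multiply(alice_private, G)          # sent over the wire
bob_public = multiply(bob_private, G)              # sent over the wire

print(multiply(alice_private, bob_public) == multiply(bob_private, alice_public))
```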

Since the Diffie-Hellman protocol does not provide authentication in itself, the commit messages during the exchange are signed using RSA-2048 keys. The RSA private key proves the authenticity of the user and allows the other party to ensure that he is not speaking with an impostor (man-in-the-middle protection). The RSA key is only used for signing and never for encryption, and this further improves security. You can read about how the private key is protected in the Device protection section below.

The implemented key exchange makes use of important features of TLS, and this also provides extra protection against replay and bid-down (downgrade) attacks.

Secure RTP

The actual voice data is sent only after the shared secret has been established by way of the key exchange. The voice data is first encoded with a modern, wide-band voice codec to conserve bandwidth and is then packed into the industry standard SRTP (Secure Real-time Transport Protocol) for transmission. SRTP is a profile of the RTP protocol that uses symmetric encryption to secure the transmitted voice data.

CryptTalk uses the most secure ciphering option: AES-256 in counter mode. As discussed earlier, this provides top secret level protection for the transmitted data.

SRTP uses key derivation methods to create multiple keys from the single master key (the shared secret established during the key exchange). To further improve security, separate keys are used to encrypt the voice data coming from each party. Also, a separate key is used for the integrity protection of the encrypted data. Integrity protection is an optional feature in SRTP because it slightly increases the bandwidth requirements, but it is worth using as it prevents attacks where a malicious entity captures a part of the encrypted stream and resends it at a later time, trying to confuse the listener. Such an attack is hard to exploit in any case, since the eavesdropper never learns what was said in the seconds of the call he is resending. Nevertheless, CryptTalk always uses integrity protection.
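The idea of deriving several purpose-specific keys from one master key, and of rejecting replayed or modified packets, can be sketched as follows. This is not the SRTP wire format: HMAC-SHA256 stands in for SRTP's own AES-based key derivation and authentication algorithms, and the labels, packet layout and Receiver class are assumptions made for the example.

```python
import hashlib
import hmac

TAG_LEN = 10   # 80-bit authentication tag

def derive_key(master_key: bytes, label: bytes) -> bytes:
    """Derive a purpose-specific key from the master key.
    ASSUMPTION: HMAC-SHA256 stands in for SRTP's AES-based derivation."""
    return hmac.new(master_key, label, hashlib.sha256).digest()

def protect(auth_key: bytes, seq: int, ciphertext: bytes) -> bytes:
    """Append a tag covering the sequence number and the encrypted payload."""
    body = seq.to_bytes(4, "big") + ciphertext
    return body + hmac.new(auth_key, body, hashlib.sha256).digest()[:TAG_LEN]

class Receiver:
    def __init__(self, auth_key: bytes):
        self.auth_key = auth_key
        self.seen = set()                 # sequence numbers already accepted

    def accept(self, packet: bytes) -> bool:
        """Reject packets with a bad tag or a repeated sequence number."""
        body, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
        expected = hmac.new(self.auth_key, body, hashlib.sha256).digest()[:TAG_LEN]
        seq = int.from_bytes(body[:4], "big")
        if not hmac.compare_digest(tag, expected) or seq in self.seen:
            return False
        self.seen.add(seq)
        return True

master = bytes(32)                        # placeholder master key (all zeros)
caller_key = derive_key(master, b"caller-encryption")
callee_key = derive_key(master, b"callee-encryption")
auth_key = derive_key(master, b"authentication")

receiver = Receiver(auth_key)
packet = protect(auth_key, 1, b"encrypted-voice-frame")
print(receiver.accept(packet))            # fresh packet: accepted
print(receiver.accept(packet))            # same packet replayed: rejected
tampered = packet[:-1] + bytes([packet[-1] ^ 1])
print(receiver.accept(tampered))          # modified in transit: rejected
```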

Device protection

The authentication of both parties prior to the call is based on the private key stored on each user’s mobile device (as discussed earlier). To ensure the safety of these private keys and personal data, CryptTalk uses multi-factor protection technology that is based on 3 pillars:

something that the user knows - a 6 digit PIN only known by the user
something that the user has - the unique mobile phone itself
external protection - CryptTalk servers contain part of the key

Traditionally, multi-factor authentication can also be based on "something that the user is". This additional protection is available on certain devices: the iPhone 5s, for example, provides biometric authentication.

The private key and personal data (this includes call history and settings too) are encrypted with a key that is derived from the above three pillars. To provide maximum security, SHA-512 is used for the key derivation and AES-256 is used for the encryption itself.
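How three independent factors might be combined into one AES-256 key can be sketched as below. The text does not describe the actual construction, so the length-prefixed concatenation, the function name and the example values are all assumptions. The sketch only shows that the resulting key changes if any one factor changes.

```python
import hashlib

def derive_storage_key(pin: str, device_secret: bytes, server_part: bytes) -> bytes:
    """Combine the three factors into one 256-bit key.
    ASSUMPTION: simple length-prefixed concatenation hashed with SHA-512;
    the actual CryptTalk construction is not described in the text."""
    material = b""
    for part in (pin.encode(), device_secret, server_part):
        material += len(part).to_bytes(2, "big") + part    # unambiguous framing
    return hashlib.sha512(material).digest()[:32]          # 32 bytes for AES-256

# Hypothetical example values for the three pillars.
key = derive_storage_key("123456", b"device-bound-secret", b"server-held-part")
other = derive_storage_key("123457", b"device-bound-secret", b"server-held-part")
print(len(key), key != other)
```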

Part of the key will be your PIN; this is something that only the user knows. For this reason your PIN is only temporarily stored by CryptTalk for the very short time while it is being used. The PIN itself never ‘exists’ outside the device and it is not stored by the server. This ensures that the PIN provides protection that is separate from the other pillars.

Your mobile device provides the ‘possession’ factor. Even if a potential hacker gets hold of your actual device and breaks its built-in protection to gain access, he would still only obtain a part of the key that is useless on its own. Modern mobile manufacturers consider security to be very important, and if phones are used with high security settings (like PIN access and encrypted file systems) it is almost impossible for an attacker to obtain the secure keys. (In the case of the iOS platform, part of the key is stored in the keychain, which is tied to the UID of the device and is also protected by the system PIN.)

The third part of the key is stored only on the CryptTalk servers. This thwarts any attempt at a brute force attack on the PIN, even if a hacker has somehow obtained the data protected by the device. The HMAC-based One-Time Password (HOTP) method detailed in RFC 4226 is used to authenticate the user to the server. The shared secret of the OTP is encrypted with a key derived from the PIN of the user. The encryption used is AES-256 in ECB mode. The decryption of the OTP key is protected against brute force attacks by the fact that the result of the decryption is a completely random shared secret, so the attacker cannot check its validity without contacting the server (the server will allow only a few tries before locking the account). On top of all this, the encryption key derivation is additionally bound to the device by using PBKDF2 with a salt value that is stored only on the device.
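The two standard building blocks named here, the RFC 4226 one-time password and PBKDF2, can be sketched with the Python standard library. The hotp function follows the RFC and is checked against its published test secret. The pin_key helper, its hash function, iteration count and salt are illustrative assumptions, and the AES encryption of the OTP secret is left out.

```python
import hashlib
import hmac
import struct

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password as specified in RFC 4226."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                                   # dynamic truncation
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)

def pin_key(pin: str, device_salt: bytes) -> bytes:
    """Stretch the PIN into a 256-bit key with PBKDF2.
    ASSUMPTION: hash function and iteration count are illustrative."""
    return hashlib.pbkdf2_hmac("sha512", pin.encode(), device_salt, 100_000, dklen=32)

# Test secret from RFC 4226, Appendix D.
print(hotp(b"12345678901234567890", 0))   # 755224
print(hotp(b"12345678901234567890", 1))   # 287082

# Hypothetical salt; per the text it would exist only on the device.
print(len(pin_key("123456", b"salt-stored-on-device")))
```

A 6 digit PIN has only one million possible values, which is why it matters that a guess can be checked only by contacting the server, where the number of attempts is limited.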

Using such multi-factor protection, a user’s private data can only be accessed after multiple successful attacks against the system’s very robust defenses. One possible attack would be if somebody somehow obtained both the PIN and the device from the user. Without the device, the PIN is worthless. Without the PIN, the attacker has to obtain the device, break all of its built-in protection layers and also get access to the protected data on the CryptTalk servers.

Conferencing security

Conferencing is an important feature for many users of CryptTalk and we believe that its implementation has to provide the same end-to-end security as is available for two party CryptTalk calls. Using an external mixer (conferencing server) would require the creation of trust relationships between the mixing service and all the participants of the conference. In many cases this is not practical.

Mixing the voice data on the phone of the participants provides a much clearer trust relationship: each participant has to trust the party who has invited him to the conference.

The invited participants of a conference are connected using regular call technology (with end-to-end CryptTalk security) and then the decrypted voice data is mixed and sent to all the active parties, who, themselves, have CryptTalk decryption technology.
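On-device mixing can be sketched as summing the decoded audio frames of the participants. The sketch assumes 16-bit PCM samples and assumes that each recipient receives a mix without his own voice; neither detail is stated in the text, and the function names and sample values are hypothetical. In a real call each mixed frame would then be encrypted again for its own end-to-end leg.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def mix(frames):
    """Sum equally long frames of 16-bit PCM samples, clipping on overflow."""
    mixed = []
    for samples in zip(*frames):
        total = sum(samples)
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed

def mix_for(recipient, decoded):
    """Mix every participant's frame except the recipient's own voice."""
    return mix([frame for name, frame in decoded.items() if name != recipient])

# Hypothetical decrypted frames held on the conference host's phone.
decoded = {
    "host":  [1000, -2000, 30000],
    "alice": [500, 500, 10000],
    "bob":   [-200, 100, -50],
}
print(mix_for("bob", decoded))   # [1500, -1500, 32767]
```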