Introduction

When working on a mobile application security assessment we noted an unusual traffic flow. This was a DTLS handshake coming from a remote server to the mobile application listener. As we always pay close attention to transport security implementation in the applications we test, we were about to verify if certificates are properly validated in the observed case. However, it was not that simple.

The desire to test client-side TLS certificate validation led us to develop custom tools, intercept and dig through various types of network traffic, read RFC documents and even patch binary files. We now understand that we were facing a DTLS handshake initiated to derive keys that secure media communications between the parties, following WebRTC recommendations.

In this article we describe what we did, how we did that and what obstacles we had to overcome. This exercise helped us to formulate a methodology to test some aspects of WebRTC-capable applications from a security standpoint.

TL;DR

Readers who want a short summary of this article can jump to the conclusion section. However, the most valuable part is the description itself, as it provides a methodology applicable to similar cases. We also developed some tools which helped us during this assessment. They can only be applied to similar exercises and are of little use as standalone software.

Twilio Platform

Preparing Custom Setup

We noted that the application we were initially testing uses Twilio platform to make voice calls. The Android build uses Twilio's Java module and corresponding native library. In order to avoid hitting our customer's infrastructure while performing our tests we decided to deploy our own solution. This way we can minimise the study surface and have full control over the software behaviour.

Twilio has great introductory articles: getting started for voice applications, getting started with Android client. They also let you create a trial account for ~€10 which you can use to make voice calls between test applications. After completing the "Getting started" guides your Twilio account will be properly configured, various tokens created and Google Firebase messaging configured. You will also have a working Android mobile application and a web server to route calls and to feed Twilio with your tokens. You can get an iOS application the same way, but in this article we cover Android only.

Note that we were not assessing the whole Twilio solution, we were just interested in voice calls implementation. In that particular case it was enough to get an example Android application configured to use in our setup and install it on two devices. We used the suggested Android quickstart example with the following minor updates:

TWILIO_ACCESS_TOKEN_SERVER_URL pointed to our web server (which was based on another example);

identity string was set to different values on each device we used for testing;

app/google-services.json file was updated after integration with Google Firebase service;

we enabled debug logging by adding Voice.setLogLevel(LogLevel.DEBUG); call to onCreate() method of app/src/main/java/com/twilio/voice/quickstart/VoiceActivity.java file.

As we had two mobile applications able to set up voice calls between each other, we started to investigate the network communications they produce.

Traffic Interception and Analysis

Information Gathering

We configured one of the test devices to connect via a WiFi access point set up on our test laptop. This way we observed all communications made by the Twilio example application when issuing a phone call.

The application was contacting the following domains:

eventgw.twilio.com

ers.twilio.com

our test web server holding Twilio tokens

global.stun.twilio.com

chunderm.gll.twilio.com

We noted that traffic towards eventgw.twilio.com, ers.twilio.com and the access token server follows Android proxy rules. Thus, we observed the corresponding communication using our Burp proxy instance. Its certificate authority was installed on our test device in advance.

ers.twilio.com was accessed using HTTPS to check and update the Twilio registration token. The application informed Twilio about events happening on the device by sending HTTPS POST requests to the eventgw.twilio.com server. There was no interesting data in these communication channels.

TLS-protected packets were exchanged with chunderm.gll.twilio.com server over TCP/443. This communication ignored global proxy settings.

Then the application sent STUN binding request to global.stun.twilio.com over UDP/3478.

After more exchanges with chunderm.gll.twilio.com the application started UDP STUN communication with a server in 3.122.181.0/24 subnet (approximately). The server IP address and port were different each time we started a call. After STUN registration and binding requests in the same channel the application received incoming DTLS Client Hello message. You can see this on the following screenshot from Wireshark capture:

DTLS handshake was quickly established, no data packets were sent and traffic with RTP data continued to flow in the very same UDP channel.

Our objective was to verify the proper validation of the server's certificate by the client side (Twilio server in this case). In this scenario the role of the server was played by our mobile client.

Searching the Internet with the traces we had already collected revealed that what we observed is the DTLS-SRTP protocol, defined in RFC5764. The idea here is to use the DTLS protocol to derive key material to secure RTP data. DTLS is already defined and implemented, it authenticates the remote party and secures sensitive data. When using key exchange algorithms based on elliptic curves, it works fast even on low-powered devices. The corresponding packet exchange is described in RFC5764 as follows:

Client                                               Server

ClientHello + use_srtp       -------->
                                     ServerHello + use_srtp
                                               Certificate*
                                         ServerKeyExchange*
                                        CertificateRequest*
                             <--------      ServerHelloDone
Certificate*
ClientKeyExchange
CertificateVerify*
[ChangeCipherSpec]
Finished                     -------->
                                         [ChangeCipherSpec]
                             <--------             Finished
SRTP packets                 <------->         SRTP packets

The SRTP keys are derived with the help of a specific DTLS protocol extension, use_srtp. The details here are not important to us, except that these keys stay inside the DTLS library implementation and are not transferred as Application Data. For this reason we need specific software to terminate such connections.

The security outcome here is that if DTLS certificates are not properly verified then a suitably positioned network-based attacker can obtain keys for encrypted media traffic.

The important remaining question is how such certificates can be verified. There is no particular "domain name" here. The client needs to rely on something else that we do not see yet. RFC5764 mentions this in the following passage:

A DTLS-SRTP session may be indicated by an external signaling protocol like SIP. When the signaling exchange is integrity-protected (e.g., when SIP Identity protection via digital signatures is used), DTLS-SRTP can leverage this integrity guarantee to provide complete security of the media stream. A description of how to indicate DTLS-SRTP sessions in SIP and SDP [RFC4566], and how to authenticate the endpoints using fingerprints can be found in [RFC5763].

Obscure, as often happens with such documents. But it gives us a hint that we have to look into other channels. And we have such a candidate: chunderm.gll.twilio.com. Nevertheless, we have to redirect the DTLS handshake to our DTLS server while keeping other protocols untouched. Let's solve this problem first.

DTLS Demultiplexing

From our observations and the RFC5764 specification, DTLS-SRTP traffic is a mix of STUN, DTLS and RTP protocols, all in the same UDP channel. Thus, we cannot just forward this traffic to some listener and handle only DTLS packets. This would break the application logic.

We need a proxy tool which normally just redirects traffic between two endpoints. When it catches DTLS packets it forwards them to our fake listener. The responses have to be delivered back to their intended destination. Thus, the remote side and the mobile application will exchange STUN and RTP packets as usual but DTLS channel will be handled by us.

Determining the packet type is simple, it is described in RFC5764:

            +----------------+
            | 127 < B < 192 -+--> forward to RTP
            |                |
packet -->  |  19 < B < 64  -+--> forward to DTLS
            |                |
            |       B < 2   -+--> forward to STUN
            +----------------+
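The rule above can be sketched as a small function that looks only at the first byte of a UDP payload. This is a minimal illustration in Python, not the code of our Go proxy; the function name and the "unknown" fallback are our own assumptions.

```python
def classify_packet(datagram: bytes) -> str:
    """Classify a UDP payload by its first byte, using the RFC 5764 ranges."""
    if not datagram:
        return "unknown"
    first = datagram[0]
    if first < 2:
        return "stun"
    if 19 < first < 64:
        return "dtls"
    if 127 < first < 192:
        return "rtp"
    # Anything else is outside the three ranges quoted above.
    return "unknown"
```

A DTLS handshake record starts with content type 22 (0x16), so it falls into the DTLS range, while RTP packets with version 2 start with a byte of 128 or above.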

The following describes the traffic flows our proxy has to handle:

            +---------------+
            |  demux tool   |
            |               |
mobile <--> +- STUN/RTP --*-+ <--> original
app         |   /           |      destination
            | DTLS          |
            |  |            |
            |  +------------+ <--> fake DTLS
            |               |      server
            +---------------+

This was implemented in quick-and-dirty style using Go and can be grabbed from this repo.

This proxy requires the following parameters:

Socket to listen on. This can be anything, we just redirect traffic to it using firewall.

Fake DTLS server address. We fully control this parameter.

Original destination. Here we have a problem: in our case it is different each time, so we have to figure out this parameter at runtime. This is covered in the next section.

Intercepting Traffic to chunderm.gll.twilio.com

From prior observations it became clear that essential information is transmitted inside TLS-protected data with chunderm.gll.twilio.com server (port 443). Unfortunately for us, the application properly verified TLS certificates. We were not able to bypass it with Xposed modules or Frida with objection framework.

The application uses the libtwilio_voice_android_so.so native library, which includes (but is not linked to) resiprocate and boringssl sources. Thus, we cannot just hook the relevant calls with Frida. In this situation we had to go the binary patching way.

We used Ghidra project for decompiling and patching. Binary patching is not well supported by Ghidra (yet?) but it is possible to overcome it with SavePatch external script.

We patched calls (five in total) to the SSL_CTX_set_verify function (address 0x0026786c), which takes the certificate verification mode as its second argument. To avoid verification this parameter has to be zero. You can see this in the following illustrations.

SSL_CTX_set_verify, the entry #1, original:

SSL_CTX_set_verify, the entry #1, patched:

SSL_CTX_set_verify, the entry #2, original:

SSL_CTX_set_verify, the entry #2, patched:

SSL_CTX_set_verify, the entry #3, original:

SSL_CTX_set_verify, the entry #3, patched:

SSL_CTX_set_verify, the entry #4, original:

SSL_CTX_set_verify, the entry #4, patched:

SSL_CTX_set_verify, the entry #5, original:

SSL_CTX_set_verify, the entry #5, patched:

SSL_CTX_set_verify (offset 0x0016163c) also sets a callback which always has to return 1 as a result of successful verification.

Verification callback, original:

Verification callback, patched:

Even if verification succeeds, the library validates if domain name matches the expected one.

Common name verification, decompiled:

Common name verification, source code:

Common name verification, patched:

Common name verification, patched, decompiled:

On top of that, the library checks if the common name ends with .twilio.com. It is of course an addition made to the resiprocate library. We left it untouched and just took this behaviour into account when configuring intercepting proxies.

Parent domain validation:

The final difference between the original file and the patched version:

$ diff <(xxd libtwilio_voice_android_so.so) <(xxd libtwilio_voice_android_so.so.patched)
85248c85248
< 0014cff0: 7a44 0121 0af1 3afc 6668 19a8 4946 7ff7  zD.!..:.fh..IF..
---
> 0014cff0: 7a44 0021 0af1 3afc 6668 19a8 4946 7ff7  zD.!..:.fh..IF..
85252c85252
< 0014d030: a068 dff8 3426 7a44 0121 0af1 17fc a668  .h..4&zD.!.....h
---
> 0014d030: a068 dff8 3426 7a44 0021 0af1 17fc a668  .h..4&zD.!.....h
86212c86212
< 00150c30: dff8 042a 4046 0121 7a44 06f1 17fe 0af1  ...*@F.!zD......
---
> 00150c30: dff8 042a 4046 0021 7a44 06f1 17fe 0af1  ...*@F.!zD......
86401c86401
< 00151800: 0830 1a90 4846 17f7 a6e9 5846 0df5 657d  .0..HF....XF..e}
---
> 00151800: 0830 1a90 4846 17f7 a6e9 0120 0df5 657d  .0..HF..... ..e}
122442c122442
< 001de490: c8f8 2c01 0968 0029 00f0 bc80 bb48 52ad  ..,..h.).....HR.
---
> 001de490: c8f8 2c01 0968 0229 40f0 bc80 bb48 52ad  ..,..h.)@....HR.
186554c186554
< 002d8b90: 334a 2046 0121 7a44 7ef7 68fe 2046 0421  3J F.!zD~.h. F.!
---
> 002d8b90: 334a 2046 0021 7a44 7ef7 68fe 2046 0421  3J F.!zD~.h. F.!
187441,187442c187441,187442
< 002dc300: edfb 38b3 95f8 6100 0321 0022 0028 08bf  ..8...a..!.".(..
< 002dc310: 0121 2046 7bf7 aafa 1349 2046 0022 7944  .! F{....I F."yD
---
> 002dc300: edfb 38b3 95f8 6100 0021 0022 0028 08bf  ..8...a..!.".(..
> 002dc310: 0021 2046 7bf7 aafa 1349 2046 0022 7944  .! F{....I F."yD
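Byte-level changes like these can also be applied with a short script instead of a GUI tool. The sketch below is an illustration in Python, not the way we produced the patched file (we used Ghidra). The helper name is hypothetical, and the single example entry is read off the first hunk of the diff, where the bytes 01 21 at offset 0x14cff2 (which appear to encode Thumb movs r1, #1) become 00 21.

```python
from typing import Dict, Tuple


def apply_patches(image: bytes, patches: Dict[int, Tuple[bytes, bytes]]) -> bytes:
    """Return a patched copy of image.

    patches maps a file offset to (expected original bytes, replacement bytes).
    The original bytes are verified first so that a wrong offset or a different
    library version is detected instead of silently corrupting the file.
    """
    patched = bytearray(image)
    for offset, (expected, replacement) in patches.items():
        if len(expected) != len(replacement):
            raise ValueError(f"patch at {offset:#x} changes the length")
        current = bytes(patched[offset:offset + len(expected)])
        if current != expected:
            raise ValueError(f"unexpected bytes at {offset:#x}: {current.hex()}")
        patched[offset:offset + len(expected)] = replacement
    return bytes(patched)


# Example entry taken from the first hunk of the diff above.
EXAMPLE_PATCHES = {0x14CFF2: (bytes.fromhex("0121"), bytes.fromhex("0021"))}
```

Checking the expected bytes before writing is a deliberate design choice: the offsets are only valid for the exact library build shown in the diff.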

This was still not enough. The library limits the list of trusted certificate authorities to the following list:

subject= /C=US/O=DigiCert Inc/OU=www.digicert.com/CN=DigiCert Global Root G2
subject= /C=US/O=DigiCert Inc/OU=www.digicert.com/CN=DigiCert High Assurance EV Root CA
subject= /C=US/O=DigiCert Inc/OU=www.digicert.com/CN=DigiCert Global Root CA
subject= /C=US/O=Amazon/CN=Amazon Root CA 4
subject= /C=US/O=Amazon/CN=Amazon Root CA 3
subject= /C=US/O=DigiCert Inc/OU=www.digicert.com/CN=DigiCert Assured ID Root G3
subject= /C=US/O=Starfield Technologies, Inc./OU=Starfield Class 2 Certification Authority
subject= /C=US/O=thawte, Inc./OU=Certification Services Division/OU=(c) 2006 thawte, Inc. - For authorized use only/CN=thawte Primary Root CA
subject= /C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./CN=Starfield Services Root Certificate Authority - G2
subject= /C=US/O=DigiCert Inc/OU=www.digicert.com/CN=DigiCert Global Root G3
subject= /C=US/O=thawte, Inc./OU=(c) 2007 thawte, Inc. - For authorized use only/CN=thawte Primary Root CA - G2

To overcome this we created our own CA certificate which, when formatted as PEM, matches the length of one of the certificates embedded in the library.

We chose to replace one of Amazon's root CA certificates:

$ openssl x509 -text -noout -in twilio_cert_chosen.txt
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 06:6c:9f:d2:96:35:86:9f:0a:0f:e5:86:78:f8:5b:26:bb:8a:37
        Signature Algorithm: sha384WithRSAEncryption
        Issuer: C = US, O = Amazon, CN = Amazon Root CA 2
        Validity
            Not Before: May 26 00:00:00 2015 GMT
            Not After : May 26 00:00:00 2040 GMT
        Subject: C = US, O = Amazon, CN = Amazon Root CA 2

Our certificate:

$ openssl x509 -text -noout -in ca_test1.crt
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 5957144336496509465 (0x52ac0736349ac219)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: O = xxxxxxx, CN = Gremwell
        Validity
            Not Before: Mar 13 15:45:00 2020 GMT
            Not After : Mar 13 15:45:00 2030 GMT
        Subject: O = xxxxxxx, CN = Gremwell
        Subject Public Key Info:

We used the organization field as padding to make the certificate sizes match. Then we just replaced one PEM string with the other in the binary.
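The replacement itself only works if the binary layout does not shift, which is why the lengths have to match. A minimal sketch of such a same-length substitution in Python follows; the function name is hypothetical and it assumes the original PEM string occurs exactly once in the library.

```python
def replace_same_length(image: bytes, old: bytes, new: bytes) -> bytes:
    """Replace one embedded byte string with another of identical length."""
    if len(old) != len(new):
        # A different length would shift every following offset in the binary.
        raise ValueError(f"length mismatch: {len(old)} != {len(new)}")
    occurrences = image.count(old)
    if occurrences != 1:
        raise ValueError(f"expected exactly one occurrence, found {occurrences}")
    return image.replace(old, new)
```

In practice old and new would be the PEM texts of the original root certificate and of the test CA, read as bytes, and image the content of the library file.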

Uploading the patched library to the mobile device:

$ adb push libtwilio_voice_android_so.so.patched_cert /sdcard/

Altering the existing library on the device (as superuser):

# cp /sdcard/libtwilio_voice_android_so.so.patched_cert /data/app/com.twilio.voice.quickstart-1/lib/arm/libtwilio_voice_android_so.so
# cp /sdcard/libtwilio_voice_android_so.so.patched_cert /data/data/com.twilio.voice.quickstart/app_lib/libtwilio_voice_android_so.so

With such alterations we were able to intercept the data transmitted between the mobile client and chunderm.gll.twilio.com.

Signalling Traffic

We are about to intercept the traffic flow between our test mobile application and chunderm.gll.twilio.com host. As chunderm.gll.twilio.com is load-balanced and deployed on Amazon services, it is better to fix its IP address to a single one. The easiest way to do this is to add the corresponding entry to /etc/hosts and to verify that the test device is using this configuration:

35.157.205.11 chunderm.gll.twilio.com

Traffic redirection can be configured with iptables :

sudo iptables -t nat -A PREROUTING -p tcp --dport 443 -s 192.168.12.0/24 -d 35.157.205.11 -i wlan0 -j DNAT --to-destination 127.0.0.1:9090

When using DNAT to localhost do not forget to enable local routing:

sudo sysctl -w net.ipv4.conf.all.route_localnet=1

The traffic we want to intercept is not HTTPS (as we quickly learned by trying to forward it to Burp proxy). Thus, we used a general-purpose proxy tool, viproxy. We used it like this:

$ ./bin/viproxy -l 9090 -r 35.157.205.11:443 --l-ssl --l-sslcert chunderm.gll.twilio.com.crt \
    --l-sslkey chunderm.gll.twilio.com.pem --r-ssl -f twilio_chunderm_1.log

Certificate chunderm.gll.twilio.com.crt was generated specifically for this purpose and signed by the CA that we previously added to the libtwilio_voice_android_so.so library. The common name in this certificate was set to chunderm.gll.twilio.com. Thus, the application trusted our proxy and we were finally able to get the traffic in clear-text.

The traffic that we observed contained SIP signaling. It was used to indicate initiated calls to the remote party, receive notifications regarding call status and negotiate media channels parameters. Exactly for this purpose the Twilio voice native library includes resiprocate source code. Let's briefly analyse the intercepted SIP messages.

Client (our test mobile app) sent the following SIP INVITE message:

INVITE sip:chunderm.gll.twilio.com:443;transport=tls SIP/2.0
Via: SIP/2.0/TLS 192.168.12.118;branch=z9hG4bK-524287-1---40ce49870111dfe0;rport
Max-Forwards: 70
Contact: <sip:VoiceSDK@192.168.12.118;ob;transport=tls>;+sip.instance="bd3b91dB35B543C6d83A09d877b0Dd5D"
To: <sip:chunderm.gll.twilio.com:443;transport=tls>
From: <sip:VoiceSDK@chunderm.gll.twilio.com>;tag=56541ce5
Call-ID: irMlCaSv9fso9ZI2vhCwOg..
CSeq: 1 INVITE
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE
Content-Type: application/sdp
Supported: outbound, path, gruu
User-Agent: VoiceSDK
X-Twilio-BridgeToken: eyJ6a....
X-Twilio-Client: %7B%22mob...
X-Twilio-ClientVersion: 5
Content-Length: 1131

v=0
o=- 6896153525930087223 2 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE audio
a=msid-semantic: WMS 4f3033cBBB87fac4f85efe4E4a8Ef2dd
m=audio 9 UDP/TLS/RTP/SAVPF 111 103 9 0 8 105 13 110 113 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:sdmi
a=ice-pwd:gw1kcxcee+d0P/MKRe2Bm5A7
a=ice-options:trickle
a=fingerprint:sha-256 1B:EA:BF:33:B8:11:26:6D:91:AD:1B:A0:16:FD:5D:60:59:33:F7:46:A3:BA:99:2A:1D:04:99:A6:F2:C6:2D:43
a=setup:actpass
...
a=ssrc:667633809 mslabel:4f3033cBBB87fac4f85efe4E4a8Ef2dd
a=ssrc:667633809 label:631b37accdEfD305ABcE805FdaDD25F5

Note the fingerprint parameter: sha-256 1B:EA:BF:33:B8:11:26:6D:91:AD:1B:A0:16:FD:5D:60:59:33:F7:46:A3:BA:99:2A:1D:04:99:A6:F2:C6:2D:43.

The server replied with the following:

SIP/2.0 200 OK
CSeq: 1 INVITE
Call-ID: irMlCaSv9fso9ZI2vhCwOg..
From: <sip:VoiceSDK@chunderm.gll.twilio.com>;tag=56541ce5
To: <sip:chunderm.gll.twilio.com:443;transport=tls>;tag=98073708_6772d868_46d541a4-5a12-4872-8aa4-31b1ca3e294b
Via: SIP/2.0/TLS 192.168.12.118;received=78.23.55.64;branch=z9hG4bK-524287-1---40ce49870111dfe0;rport=37922
Record-Route: <sip:172.21.5.43:10193;r2=on;transport=udp;ftag=56541ce5;lr>
Record-Route: <sip:35.157.205.11:443;r2=on;transport=tls;ftag=56541ce5;lr>
Server: Twilio
Contact: <sip:172.21.10.120:10193>
Allow: INVITE,ACK,CANCEL,OPTIONS,BYE
Content-Type: application/sdp
X-Twilio-CallSid: CA2fad432f2da6576c4735f03af99592af
Content-Length: 1039
X-Twilio-EdgeHost: ec2-35-157-205-11.eu-central-1.compute.amazonaws.com
X-Twilio-EdgeRegion: de1
X-Twilio-Zone: EU_FRANKFURT

v=0
o=root 219971503 219971503 IN IP4 172.21.25.231
s=Twilio Media Gateway
c=IN IP4 3.122.181.240
t=0 0
a=group:BUNDLE audio
a=ice-lite
m=audio 17612 RTP/SAVPF 111 0 126
a=rtpmap:111 opus/48000/2
a=rtpmap:0 PCMU/8000
a=rtpmap:126 telephone-event/8000
a=fmtp:126 0-16
a=ptime:20
a=maxptime:20
a=ice-ufrag:7d27112b7d583e977e998884781408ef
a=ice-pwd:5cfdaed470f11afc443384ae3a39434c
a=candidate:H37ab5f0 1 UDP 2130706431 3.122.181.240 17612 typ host
a=end-of-candidates
a=connection:new
a=setup:active
a=fingerprint:sha-256 33:4E:FE:3C:76:F2:04:B4:18:FC:95:85:56:3C:1C:A7:B0:87:39:15:3D:07:42:45:85:40:6C:2C:77:A9:80:76
a=mid:audio
...

Note the parameters c=IN IP4 3.122.181.240 and m=audio 17612 RTP/SAVPF 111 0 126. This sets the DTLS-SRTP remote endpoint to 3.122.181.240:17612.

There is another fingerprint parameter in the server's reply: sha-256 33:4E:FE:3C:76:F2:04:B4:18:FC:95:85:56:3C:1C:A7:B0:87:39:15:3D:07:42:45:85:40:6C:2C:77:A9:80:76.

If we look into the corresponding traffic capture we will see the UDP channel established between our application and 3.122.181.240:17612. During the DTLS handshake the mobile app presented the following certificate to the DTLS client (the Twilio server):

$ openssl x509 -text -noout -in incoming1.crt
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: d3:83:6b:a2:6c:9e:c6:a3
        Signature Algorithm: ecdsa-with-SHA256
        Issuer: CN = WebRTC
        Validity
            Not Before: Mar 15 15:29:49 2020 GMT
            Not After : Apr 15 15:29:49 2020 GMT
        Subject: CN = WebRTC
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (256 bit)
                pub:
                    04:d1:b2:19:57:ed:ec:17:70:05:02:0c:ea:4a:57:
                    2a:dc:f8:c2:c1:d2:83:b5:cf:38:dd:09:af:c3:b8:
                    d5:fe:e2:ac:c2:1e:e1:0f:d5:f2:b3:94:ba:e0:d5:
                    d0:a2:df:36:a3:f1:e7:a3:ca:c3:30:4c:8e:8b:78:
                    eb:b5:25:a9:2a
                ASN1 OID: prime256v1
                NIST CURVE: P-256
    Signature Algorithm: ecdsa-with-SHA256
         30:44:02:20:05:da:f7:2e:d4:01:a9:0f:dd:70:70:33:f5:1c:
         8e:f2:2e:51:d6:71:c9:07:d0:ef:1c:e1:4e:76:b1:f0:1f:e1:
         02:20:7a:cf:e0:49:a0:07:58:c6:b7:f5:8f:fe:2b:c3:91:ff:
         17:ea:72:62:0b:f0:22:80:7b:09:8e:4c:8a:83:a8:11
$ openssl x509 -noout -fingerprint -sha256 -inform pem -in incoming1.crt
SHA256 Fingerprint=1B:EA:BF:33:B8:11:26:6D:91:AD:1B:A0:16:FD:5D:60:59:33:F7:46:A3:BA:99:2A:1D:04:99:A6:F2:C6:2D:43

This is exactly the same fingerprint that the client sent in the SIP INVITE message described earlier.
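The same value can be computed without openssl: the fingerprint is the SHA-256 digest of the DER-encoded certificate, printed as colon-separated uppercase hex. The following Python sketch uses only the standard library; the function names are ours, and the "sha-256 " prefix simply mirrors the a=fingerprint attribute seen in the SDP bodies.

```python
import hashlib
import ssl


def format_fingerprint(digest: bytes) -> str:
    """Format raw digest bytes as colon-separated uppercase hex."""
    return ":".join(f"{byte:02X}" for byte in digest)


def sdp_fingerprint_from_pem(pem_text: str) -> str:
    """Compute the value of an SDP a=fingerprint attribute for a PEM certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_text)  # strips the PEM armour and base64-decodes
    return "sha-256 " + format_fingerprint(hashlib.sha256(der).digest())
```

Comparing this string with the fingerprint line captured from the signalling channel is presumably what the receiving side does after the DTLS handshake.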

We saw that the SIP signalling exchange with chunderm.gll.twilio.com negotiated the DTLS certificate fingerprints. They can be used to validate the client. We can assume that the client certificate can contain arbitrary information, as long as its fingerprint matches the one transmitted in the signalling channel. However, this remains to be verified.

Another parameter that was negotiated in this signalling channel is the remote endpoint to establish DTLS-SRTP channel:

c=IN IP4 3.122.181.240
...
m=audio 17612 RTP/SAVPF 111 0 126

In this case it is 3.122.181.240:17612.
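Extracting these values from an SDP body takes a few regular expressions. The sketch below is an illustrative Python version with hypothetical names; it handles only the single-audio-stream layout we observed (one c= line, one m=audio line, an optional a=fingerprint line) and is not a general SDP parser.

```python
import re
from typing import NamedTuple, Optional


class MediaEndpoint(NamedTuple):
    ip: str
    port: int
    fingerprint: Optional[str]


def parse_sdp(sdp: str) -> MediaEndpoint:
    """Extract the media endpoint and certificate fingerprint from an SDP body."""
    ip = re.search(r"^c=IN IP4 (\S+)", sdp, re.MULTILINE)
    port = re.search(r"^m=audio (\d+) ", sdp, re.MULTILINE)
    fingerprint = re.search(r"^a=fingerprint:(\S+ \S+)", sdp, re.MULTILINE)
    if ip is None or port is None:
        raise ValueError("no c= or m=audio line found")
    return MediaEndpoint(
        ip=ip.group(1),
        port=int(port.group(1)),
        fingerprint=fingerprint.group(1) if fingerprint else None,
    )
```

The patterns use \S+ so that a trailing carriage return from CRLF line endings is not captured.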

This format for transferring media parameters is called Session Description Protocol (SDP). As we later learned, it is one of the recommended approaches for negotiating media channels in WebRTC. The SDP Anatomy article describes the most common parameters.

One important outcome is that the remote party in this case is a Twilio server. This means that it has full control over voice communications and can (in theory) intercept the traffic, decrypt it, record it and even modify it. This is what the Twilio website says on the subject:

Media shared in Peer-to-Peer Rooms is encrypted end-to-end and can never be accessed by Twilio. Each Participant in a Peer-to-Peer Room negotiates a separate DTLS/SRTP connection to every other participant. All media published to or subscribed from the Room is sent over these secure connections, and is encrypted only at the sender and decrypted only at the receiver. Network Traversal Service TURN cannot decrypt media: TURN only routes the packet between peers.

However, what we observed before could be just Twilio's "demo account" feature (as we used demo subscription). Indeed, Twilio server somehow has to insert "thank you for using Twilio demo account" voice message when making outgoing call.

Anyway, the idea of this exercise is to test resilience against man-in-the-middle attacker, not if Twilio servers sniff communications. Thus, the next step is to intercept DTLS-SRTP.

Intercepting DTLS-SRTP

Now we have our demultiplexing proxy tool ready and we know how to determine the remote endpoint it needs to connect to. We can automate it with "replace" scripts feature of viproxy tool. For that we launch it like this:

$ ./bin/viproxy -l 9090 -r 35.157.205.11:443 --l-ssl --l-sslcert chunderm.gll.twilio.com.crt \
    --l-sslkey chunderm.gll.twilio.com.pem --r-ssl \
    --resp-replace replace_run-stunproxy.rb -f twilio_chunderm_2.log

Where replace_run-stunproxy.rb contains the following:

# c=IN IP4 3.122.181.246
# m=audio 19516 RTP/SAVPF 111 0 126
if self.index('c=IN IP4 ')
  m = self.match('^c=IN IP4 (3\.122\.181\.[0-9]{1,3})')
  ip = m[1]
  m = self.match('^m=audio (\d+) RTP/SAVPF .+')
  port = m[1].to_i
  puts "DTLS-SRTP endpoint found (#{ip}:#{port}), launching proxy..."
  system("dtls-srtp-demux -H #{ip} -P #{port} -D 127.0.0.1 -d 8443 -h 127.0.0.1 -p 6001 &")
  :ok
end

We still need to forward DTLS-SRTP traffic from the mobile app to the intercepting proxy. This can be achieved with the following iptables command:

$ sudo iptables -t nat -A PREROUTING -p udp --dport 0:65535 -s 192.168.12.0/24 -d 3.122.181.0/24 -i wlan0 -j DNAT --to-destination 127.0.0.1:6001

We are almost ready, the remaining step is to prepare DTLS server.

Terminating DTLS with SRTP Extension

As we learned from RFC5764, DTLS handshake in DTLS-SRTP protocol relies on use_srtp extension. Thus, we need a TLS library which supports it. We chose GnuTLS as it has a useful API and plenty of examples for many cases.

We used mini-dtls-srtp server as an example and combined it with the corresponding documentation. The source code is published as dtls-srtp-server.

The DTLS server has to present some certificate to the remote client. We assumed that there are no specific requirements for it (at the time of writing we did not have any evidence to the contrary). Thus, we tried to make it close to the ones we observed, with the validity period extended:

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 5550018979492260217 (0x4d05a0db494fe979)
        Signature Algorithm: ecdsa-with-SHA256
        Issuer: CN = WebRTC
        Validity
            Not Before: Mar 11 15:14:00 2020 GMT
            Not After : Mar 11 15:14:00 2030 GMT
        Subject: CN = WebRTC
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (256 bit)
                pub:
                    04:6d:49:e5:56:72:f8:f3:13:34:94:ae:a2:0e:11:
                    dc:a9:a1:6e:62:1e:8c:e6:59:80:f3:6d:c6:42:5c:
                    22:e6:c9:20:83:ba:f2:49:1d:18:ad:38:a6:5f:d1:
                    7a:2b:d9:03:08:9b:bf:ef:39:96:94:f8:b2:6f:fd:
                    44:15:61:2d:93
                ASN1 OID: prime256v1
                NIST CURVE: P-256
    Signature Algorithm: ecdsa-with-SHA256
         30:44:02:20:1e:5d:99:c5:9b:c9:9e:b1:ee:bb:64:fd:30:86:
         c2:70:be:72:61:8e:fe:6d:bf:23:8d:da:87:91:e8:7e:c9:ad:
         02:20:77:f6:27:ce:ec:87:8e:e6:28:ab:df:e7:13:70:12:d9:
         8b:31:7c:84:3e:a5:37:88:f5:32:fd:1c:52:55:3f:7a

However, we noted that there is a requirement for private key used along with the certificate. It has to be generated with prime256v1 elliptic curve. This was a practical observation and the default curve used by GnuTLS.

For now we left the certificate and the corresponding private key hardcoded in our DTLS server. If we spot cases when client verifies certificate parameters other than fingerprint, we will add the corresponding test to our qsslcaudit tool.

Putting It All Together

As we have all the components ready, we can now test if Twilio server properly verifies client's DTLS certificate.

When our test application received the incoming voice call, our viproxy instance intercepted several SIP messages and determined the remote DTLS-SRTP endpoint. Knowing its address, the proxy launched the DTLS-SRTP demultiplexing tool. This tool transparently forwarded all packets, except for DTLS. The latter was redirected to our DTLS server which terminated the connection.

Communications diagram:

Surprisingly our DTLS server returned negotiated SRTP keys:

Client key: 20b55cc952afc019b88e6bf2b938f465
Client salt: 52537c2d1142be963ceeb89f3917
Server key: 7a0c83f7bc03832ca37ba97bd6e0e111
Server salt: 9d1a8e282f6e66a83a77e25512d0
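For context, RFC 5764 describes the SRTP keying material as a single exported block that is cut into client key, server key, client salt and server salt, in that order. The Python sketch below shows that split for a profile with 128-bit keys and 112-bit salts, which matches the lengths printed above. The names are ours, and our DTLS server relies on the TLS library to perform this step, so this is only an illustration of the layout.

```python
from typing import NamedTuple

KEY_LEN = 16   # bytes, 128-bit SRTP master key
SALT_LEN = 14  # bytes, 112-bit SRTP master salt


class SrtpKeys(NamedTuple):
    client_key: bytes
    server_key: bytes
    client_salt: bytes
    server_salt: bytes


def split_keying_material(material: bytes) -> SrtpKeys:
    """Split exported DTLS-SRTP keying material in the RFC 5764 order."""
    expected = 2 * (KEY_LEN + SALT_LEN)
    if len(material) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(material)}")
    salts_start = 2 * KEY_LEN
    return SrtpKeys(
        client_key=material[:KEY_LEN],
        server_key=material[KEY_LEN:salts_start],
        client_salt=material[salts_start:salts_start + SALT_LEN],
        server_salt=material[salts_start + SALT_LEN:],
    )
```

Here "client" and "server" refer to the DTLS roles, so in our setup the client values belong to the Twilio side, which initiated the handshake.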

Wireshark captured the following DTLS handshake:

As can be seen in the screenshot, there is no DTLS alert message. Did we successfully spoof the client? (Un)fortunately, no: immediately after this handshake, the server sent a SIP BYE message:

BYE sip:VoiceSDK@78.23.55.64:38166;ob;transport=tls SIP/2.0
CSeq: 1 BYE
From: <sip:chunderm.gll.twilio.com:443;transport=tls>;tag=32934414_6772d868_c358fa1f-bffa-4081-8470-9fa6ff46176d
To: <sip:VoiceSDK@chunderm.gll.twilio.com>;tag=01ed5987
Call-ID: oc1h_oktbATMC4pHJ-kkpQ..
Max-Forwards: 69
User-Agent: Twilio
Via: SIP/2.0/TLS 35.157.205.11:443;branch=z9hG4bK6c4.566d9a05.0
Via: SIP/2.0/UDP 172.21.10.120:10193;branch=z9hG4bKc358fa1f-bffa-4081-8470-9fa6ff46176d_6772d868_440-16643670738841742609
X-Twilio-CallSid: CA2a489088afb6cb198c47fa45f038ff45
Content-Length: 0

It is the same message that was sent during normal voice call termination, no explicit error message provided. From user experience point of view, the application just stops the call.

The corresponding error message in the application's debug log (over ADB) is not verbose at all:

03-17 09:15:13.181 12647-12647/com.twilio.voice.quickstart D/VoiceActivity: Connect failure
03-17 09:15:13.181 12647-12647/com.twilio.voice.quickstart E/VoiceActivity: Call Error: 31005, Connection error

Previously enabled debug logging did not help either. However, it produced quite a lot of output, including SIP signalling messages.

The "Call Error" is returned by the following hook in its Java source code:

public void onConnectFailure(@NonNull Call call, @NonNull CallException error) {
    setAudioFocus(false);
    if (BuildConfig.playCustomRingback) {
        SoundPoolManager.getInstance(VoiceActivity.this).stopRinging();
    }
    Log.d(TAG, "Connect failure");
    String message = String.format(
            Locale.US,
            "Call Error: %d, %s",
            error.getErrorCode(),
            error.getMessage());
    Log.e(TAG, message);
    Snackbar.make(coordinatorLayout, message, Snackbar.LENGTH_LONG).show();
    resetUI();
}

We did not identify the exact source code responsible for handling the case that emits onConnectFailure() event.

In order to be confident that we did all things properly, we have to simulate the valid client's behaviour.

Spoofing DTLS Certificate Fingerprint

We configured our intercepting proxy to substitute mobile application's DTLS certificate fingerprint with the one that we use. This was accomplished by the following script for viproxy tool:

# replace client's certificate fingerprint
# (we do not care if we substitute server's certificate too)
# a=fingerprint:sha-256 1C:02:25:E....
#
if self.index('fingerprint:sha-256')
  fprint = "37:BE:BB:AA:0B:14:D5:0B:A5:A5:8D:A3:7C:25:00:E9:BE:FE:89:07:C0:35:66:9F:D0:54:20:BF:48:D4:E0:0F"
  self.gsub!(/^a=fingerprint:sha-256 .*$/, "a=fingerprint:sha-256 #{fprint}")
  puts "client's certificate fingerprint replaced"
  # send packet further
  :ok
end

The corresponding viproxy commandline:

$ ./bin/viproxy -l 9090 -r 35.157.205.11:443 --l-ssl --l-sslcert chunderm.gll.twilio.com.crt \
    --l-sslkey chunderm.gll.twilio.com.pem --r-ssl \
    --resp-replace replace_run-stunproxy.rb --req-replace replace_dtls-fingerprint.rb

With this setup in place, we intercepted the network traffic again. This time the mobile client did not display any errors and we observed SRTP traffic:

There is no DTLS handshake in this excerpt, as we intercepted it before it reached the mobile client.

This means that our setup was correct. Twilio server indeed verifies the mobile client's DTLS certificate fingerprint.

Our DTLS server negotiated the following SRTP keys:

Client key: ef8122e36bdaeef5ed84723168ea8666
Client salt: 0cd646cd7ff704c309f77155f3dc
Server key: 11b6d02f9be14e79a248ecdce22fc2d0
Server salt: 99d5daf0d315869ec880d8b72ad8

We then checked whether the intercepted SRTP traffic can be decrypted with these keys. To avoid writing our own SRTP decrypt-assemble-play implementation, we tried the tools from the ffmpeg project.

Preparing the key in ffmpeg format:

# client --> server key
$ echo "11b6d02f9be14e79a248ecdce22fc2d099d5daf0d315869ec880d8b72ad8" | xxd -r -p | base64
EbbQL5vhTnmiSOzc4i/C0JnV2vDTFYaeyIDYtyrY
# server --> client key
$ echo "ef8122e36bdaeef5ed84723168ea86660cd646cd7ff704c309f77155f3dc" | xxd -r -p | base64
74Ei42va7vXthHIxaOqGZgzWRs1/9wTDCfdxVfPc
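The same conversion can be done without xxd. The sketch below concatenates a master key and its salt and Base64-encodes the result, which is the form passed to `-srtp_in_params` in this article. The helper name `srtp_params` is hypothetical.

```python
import base64


def srtp_params(key_hex: str, salt_hex: str) -> str:
    """Base64 of master key followed by master salt (30 bytes for AES_CM_128)."""
    raw = bytes.fromhex(key_hex + salt_hex)
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    print(srtp_params("11b6d02f9be14e79a248ecdce22fc2d0", "99d5daf0d315869ec880d8b72ad8"))
    print(srtp_params("ef8122e36bdaeef5ed84723168ea8666", "0cd646cd7ff704c309f77155f3dc"))
```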

Running ffplay:

$ ffplay -srtp_in_suite AES_CM_128_HMAC_SHA1_80 -srtp_in_params 74Ei42va7vXthHIxaOqGZgzWRs1/9wTDCfdxVfPc srtp://192.168.0.176:8888

We replayed SRTP traffic that was originally sent by server to our mobile application:

# tcpreplay-edit --fixcsum --srcipmap=3.122.181.212:192.168.0.222 --dstipmap=192.168.12.118:192.168.0.176 \
    --portmap=50000-55000:8888 --enet-smac=00:....:9e --enet-dmac=10:...:20 --intf1=wlan0 incoming3_srtp.pcapng

For some reason, ffplay did not produce any output or sound. However, it printed the following errors when we used the incorrect key:

$ ffplay -srtp_in_suite AES_CM_128_HMAC_SHA1_80 -srtp_in_params EbbQL5vhTnmiSOzc4i/C0JnV2vDTFYaeyIDYtyrY srtp://192.168.0.176:8888
ffplay version 4.2.2-alt1 Copyright (c) 2003-2019 the FFmpeg developers
...
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
HMAC mismatch 0.000 fd= 0 aq= 0KB vq= 0KB sq= 0B f=0/0
Last message repeated 1 times
HMAC mismatch 0.000 fd= 0 aq= 0KB vq= 0KB sq= 0B f=0/0
HMAC mismatch 0.000 fd= 0 aq= 0KB vq= 0KB sq= 0B f=0/0
HMAC mismatch 0.000 fd= 0 aq= 0KB vq= 0KB sq= 0B f=0/0
Last message repeated 1 times

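With the correct key no HMAC mismatch errors were printed, which means the SRTP packets passed authentication. This confirms that the keys we obtained can indeed be used to decrypt voice traffic.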

Conclusion

With this exercise we learned a bit about the DTLS-SRTP protocol (defined in RFC 5764) and how it is used by the Twilio platform for voice communications between Android mobile clients.

We were able to prepare a setup which allowed us to intercept various communication channels. The most important of them were the aforementioned DTLS-SRTP and SIP signalling channels. This required advanced usage of a transparent TCP proxy to parse signalling data and the development of the following tools: a DTLS server and an SRTP-DTLS demultiplexor.

We confirmed that Twilio properly verifies the fingerprint of the mobile client's DTLS certificate. Together with proper validation of SIP signalling channel security, this prevents man-in-the-middle attacks against voice communications.

The described activity encouraged us to test other products which use the DTLS-SRTP protocol. We describe this in the next section.

Wire

After looking into the Twilio platform described above, we decided to assess other applications which use the DTLS-SRTP protocol. Our first victim was the Wire platform/application. This was a lucky choice, as it was a pleasure to work with an open source client which is well designed and written in a clean style.

We started this assessment with the tooling and approach developed while working on the Twilio platform. However, it turned out that some alterations were required. As a result, the description you have read in the previous chapter was slightly rewritten to comply with the updated software.

Our objective was not to fully assess the Wire application but to improve our understanding of media communications, our testing approach and the corresponding software. Thus, we focused on intercepting and analysing network traffic during phone calls, leaving the rest to other researchers.

Setup

The application was installed on two different Android devices from Google Play store. Two new accounts were created. For brevity, let's call one device and application app1 and another one app2.

The application's version at the moment was 3.46.890:

$ adb shell dumpsys package com.wire | grep -B1 versionName
    versionCode=890 targetSdk=28
    versionName=3.46.890

Traffic Analysis

We captured the traffic the application produces when making the phone call. To capture DNS requests we started the capture process before the application launch.

We used the following setup:

app1 was running on the device with IP address 192.168.12.118

app2 was running on another device having IP address 192.168.0.221

the first device was connected to 192.168.0.0/24 network via test laptop, serving as a gateway, with IP address 192.168.12.1

If we exclude TLS-encrypted traffic, the application starts with STUN communications with two servers:

turn04.de.prod.wire.com, 116.203.131.31:3478 (both UDP and TCP)

turn03.de.prod.wire.com, 116.203.137.142:3478 (both UDP and TCP)

Both STUN channels were used to determine the peer addresses for the peer-to-peer setup (probably for redundancy). In this case the negotiated peers were 192.168.0.221:55831 and 192.168.12.118:33880. The endpoints were negotiated in the XOR-PEER-ADDRESS STUN attribute. Note that these are the IP addresses of app1 and app2. This allowed media traffic to be transmitted on the local network, without crossing the Internet.
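To illustrate how such endpoints can be recovered from a capture, here is a minimal Python sketch that extracts IPv4 XOR-PEER-ADDRESS values from a single STUN/TURN message. It relies on the standard encoding: attribute type 0x0012, port XORed with the upper 16 bits of the magic cookie 0x2112A442, address XORed with the whole cookie. The function name `xor_peer_addresses` is hypothetical, and the sketch handles neither IPv6 nor reassembly of messages from a TCP stream.

```python
import socket
import struct
from typing import List, Tuple

MAGIC_COOKIE = 0x2112A442
XOR_PEER_ADDRESS = 0x0012
HEADER_LEN = 20  # type (2) + length (2) + magic cookie (4) + transaction ID (12)


def xor_peer_addresses(msg: bytes) -> List[Tuple[str, int]]:
    """Return (ip, port) pairs for every IPv4 XOR-PEER-ADDRESS attribute."""
    if len(msg) < HEADER_LEN:
        return []
    _msg_type, msg_len, cookie = struct.unpack_from("!HHI", msg, 0)
    if cookie != MAGIC_COOKIE:
        return []
    peers = []
    pos = HEADER_LEN
    end = min(len(msg), HEADER_LEN + msg_len)
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack_from("!HH", msg, pos)
        value = msg[pos + 4:pos + 4 + attr_len]
        # value layout: reserved (1), family (1), X-Port (2), X-Address (4 for IPv4)
        if attr_type == XOR_PEER_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack_from("!HI", value, 2)
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            peers.append((ip, port))
        pos += 4 + ((attr_len + 3) & ~3)  # attribute values are padded to 4 bytes
    return peers
```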

After defining the peer channel, the applications establish DTLS-SRTP communication with each other over 192.168.0.221:55831 / 192.168.12.118:33880. In this channel, after several STUN messages, app2 starts the DTLS handshake with app1:

Certificate presented by DTLS server:

$ openssl x509 -text -noout -in wire_outgoing-call1_app1.cert
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            d3:50:43:ba:f1:bd:2f:e4
        Signature Algorithm: ecdsa-with-SHA256
        Issuer: CN = WebRTC
        Validity
            Not Before: Mar 17 09:18:28 2020 GMT
            Not After : Apr 17 09:18:28 2020 GMT
        Subject: CN = WebRTC
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (256 bit)
                pub:
                    04:d7:26:d1:fe:00:f2:28:f0:95:44:f4:b5:f6:eb:
                    62:95:49:66:e6:57:83:3a:a5:76:5c:c7:22:b9:93:
                    a5:f3:cc:79:f9:68:c6:62:30:3e:f6:3e:33:63:68:
                    fb:aa:ec:4c:2f:83:b4:85:1e:f0:78:19:72:fc:39:
                    9c:18:8d:73:3e
                ASN1 OID: prime256v1
                NIST CURVE: P-256
    Signature Algorithm: ecdsa-with-SHA256
         30:46:02:21:00:a6:33:62:fe:2e:32:63:3f:47:33:ec:2f:85:
         0c:1e:94:f4:24:36:07:f1:70:d7:e9:01:7a:e5:d0:96:99:ed:
         db:02:21:00:bc:de:98:a3:88:f6:a9:bf:55:75:a3:70:9c:5c:
         27:f3:c2:25:ca:8f:64:a2:a7:10:47:35:59:90:63:a7:90:fb

We also noted that after the initial handshake there were several data packets transmitted inside the DTLS channel. Then the SRTP flow starts with the first packet from app2:

When we initiated the similar call once more after relaunching the application, we noted that initial STUN communications were established with the same servers. The DTLS certificate presented by app1 was different:

$ openssl x509 -text -noout -in wire_outgoing-call2_app1.cert
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            94:db:c9:6b:ef:06:88:d2
        Signature Algorithm: ecdsa-with-SHA256
        Issuer: CN = WebRTC
        Validity
            Not Before: Mar 17 10:21:47 2020 GMT
            Not After : Apr 17 10:21:47 2020 GMT
        Subject: CN = WebRTC
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (256 bit)
                pub:
                    04:99:6b:45:17:c4:2c:b0:c3:82:95:c6:43:c0:8a:
                    65:74:e0:5e:75:c3:82:4b:2f:01:aa:e4:09:d5:67:
                    4e:b3:46:bf:83:da:8b:70:db:d2:79:8e:16:a8:13:
                    bc:89:29:36:60:e1:4b:a1:24:d0:93:83:fe:72:47:
                    f3:25:e3:5e:71
                ASN1 OID: prime256v1
                NIST CURVE: P-256
    Signature Algorithm: ecdsa-with-SHA256
         30:45:02:21:00:81:fa:e2:f7:57:18:cf:40:a7:0c:53:59:51:
         63:fb:35:4f:81:19:2d:6f:ed:2b:bd:5a:38:84:83:ac:e3:7f:
         42:02:20:21:3c:6c:20:bd:b9:ec:7f:09:4d:dd:df:5f:eb:1a:
         65:4b:10:98:ca:88:34:77:6e:0b:1a:42:de:89:95:c2:74

TLS Certificates Validation

We performed client-side TLS implementation test using our qsslcaudit tool. We checked all communication channels that we noted:

prod-nginz-https.wire.com

prod-nginz-ssl.wire.com

turn03.de.prod.wire.com

turn04.de.prod.wire.com

The results were perfectly fine for prod-nginz-https.wire.com and prod-nginz-ssl.wire.com, see the following excerpt:

$ qsslcaudit -l 0.0.0.0 -p 8443 --user-cert untrusted5.gremwell.com_cert+chain.pem --user-key untrusted5.gremwell.com.key --user-cn prod-nginz-https.wire.com
...
tests results summary table:
+----|------------------------------------|------------|-----------------------------+
| ## | Test Name                          | Result     | Comment                     |
+----|------------------------------------|------------|-----------------------------+
|  1 | custom certificate trust           | PASSED     |                             |
|  2 | self-signed certificate for target | PASSED     |                             |
|    | domain trust                       |            |                             |
|  3 | self-signed certificate for invali | PASSED     |                             |
|    | d domain trust                     |            |                             |
|  4 | custom certificate for target doma | PASSED     |                             |
|    | in trust                           |            |                             |
|  5 | custom certificate for invalid dom | PASSED     |                             |
|    | ain trust                          |            |                             |
|  8 | SSLv2 protocol support             | PASSED     |                             |
|  9 | SSLv3 protocol support             | PASSED     |                             |
| 10 | SSLv3 protocol and EXPORT grade ci | PASSED     |                             |
|    | phers support                      |            |                             |
| 11 | SSLv3 protocol and LOW grade ciphe | PASSED     |                             |
|    | rs support                         |            |                             |
| 12 | SSLv3 protocol and MEDIUM grade ci | PASSED     |                             |
|    | phers support                      |            |                             |
| 13 | TLS 1.0 protocol support           | PASSED     |                             |
| 14 | TLS 1.0 protocol and EXPORT grade  | PASSED     |                             |
|    | ciphers support                    |            |                             |
| 15 | TLS 1.0 protocol and LOW grade cip | PASSED     |                             |
|    | hers support                       |            |                             |
| 16 | TLS 1.0 protocol and MEDIUM grade  | PASSED     |                             |
|    | ciphers support                    |            |                             |
| 17 | TLS 1.1 protocol and EXPORT grade  | PASSED     |                             |
|    | ciphers support                    |            |                             |
| 18 | TLS 1.1 protocol and LOW grade cip | PASSED     |                             |
|    | hers support                       |            |                             |
| 19 | TLS 1.1 protocol and MEDIUM grade  | PASSED     |                             |
|    | ciphers support                    |            |                             |
| 20 | TLS 1.2 protocol and EXPORT grade  | PASSED     |                             |
|    | ciphers support                    |            |                             |
| 21 | TLS 1.2 protocol and LOW grade cip | PASSED     |                             |
|    | hers support                       |            |                             |
| 22 | TLS 1.2 protocol and MEDIUM grade  | PASSED     |                             |
|    | ciphers support                    |            |                             |
+----|------------------------------------|------------|-----------------------------+
most likely all connections were established by the same client
the first connection details:
source host: 192.168.12.118
dtls?: false
ssl errors: The TLS/SSL connection has been closed The remote host closed the connection
ssl conn established?: true
socket errors ids: 1 1
received data, bytes: 317
transmitted data, bytes: 4231
protocol: TLSv1.2
accepted ciphers: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256:TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384:TLS_EMPTY_RENEGOTIATION_INFO_SCSV
SNI: prod-nginz-https.wire.com
ALPN: h2, http/1.1
qsslcaudit version: 0.8.1

However, the client trusts any certificate when connecting to TURN servers:

$ qsslcaudit -l 0.0.0.0 -p 8443 --user-cert untrusted5.gremwell.com_cert+chain.pem --user-key untrusted5.gremwell.com.key --user-cn turn03.de.prod.wire.com
...
tests results summary table:
+----|------------------------------------|------------|-----------------------------+
| ## | Test Name                          | Result     | Comment                     |
+----|------------------------------------|------------|-----------------------------+
|  1 | custom certificate trust           | FAILED !!! | mitm possible               |
|  2 | self-signed certificate for target | FAILED !!! | -//-                        |
|    | domain trust                       |            |                             |
|  3 | self-signed certificate for invali | FAILED !!! | -//-                        |
|    | d domain trust                     |            |                             |
|  4 | custom certificate for target doma | FAILED !!! | -//-                        |
|    | in trust                           |            |                             |
|  5 | custom certificate for invalid dom | FAILED !!! | -//-                        |
|    | ain trust                          |            |                             |
|  8 | SSLv2 protocol support             | PASSED     |                             |
|  9 | SSLv3 protocol support             | PASSED     |                             |
| 10 | SSLv3 protocol and EXPORT grade ci | PASSED     |                             |
|    | phers support                      |            |                             |
| 11 | SSLv3 protocol and LOW grade ciphe | PASSED     |                             |
|    | rs support                         |            |                             |
| 12 | SSLv3 protocol and MEDIUM grade ci | PASSED     |                             |
|    | phers support                      |            |                             |
| 13 | TLS 1.0 protocol support           | FAILED !!! |                             |
| 14 | TLS 1.0 protocol and EXPORT grade  | PASSED     |                             |
|    | ciphers support                    |            |                             |
| 15 | TLS 1.0 protocol and LOW grade cip | PASSED     |                             |
|    | hers support                       |            |                             |
| 16 | TLS 1.0 protocol and MEDIUM grade  | FAILED !!! |                             |
|    | ciphers support                    |            |                             |
| 17 | TLS 1.1 protocol and EXPORT grade  | PASSED     |                             |
|    | ciphers support                    |            |                             |
| 18 | TLS 1.1 protocol and LOW grade cip | PASSED     |                             |
|    | hers support                       |            |                             |
| 19 | TLS 1.1 protocol and MEDIUM grade  | FAILED !!! |                             |
|    | ciphers support                    |            |                             |
| 20 | TLS 1.2 protocol and EXPORT grade  | PASSED     |                             |
|    | ciphers support                    |            |                             |
| 21 | TLS 1.2 protocol and LOW grade cip | PASSED     |                             |
|    | hers support                       |            |                             |
| 22 | TLS 1.2 protocol and MEDIUM grade  | FAILED !!! |                             |
|    | ciphers support                    |            |                             |
+----|------------------------------------|------------|-----------------------------+
most likely all connections were established by the same client
the first connection details:
source host: 192.168.12.118
dtls?: false
ssl errors:
ssl conn established?: true
intercepted data:
qsslcaudit version: 0.8.1

This is an odd result, but we cannot consider it a security issue. The application transmits the same data simultaneously in plain text over UDP and TCP channels, and this communication channel is not used to transfer any sensitive data.

Media Flow Analysis

As briefly mentioned above, the STUN protocol is used to negotiate peers. We have to parse the protocol at runtime and extract the peer addresses. For this purpose we developed the stunpeersniff tool. It searches for the XOR-PEER-ADDRESS attribute and launches the UDP demultiplexor with the corresponding parameters. The UDP demux tool was described earlier: it searches for DTLS packets and forwards them to our DTLS server instance.
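The core decision of such a demultiplexor can be sketched in a few lines of Python. This is not the code of the demux tool mentioned above, only an illustration under the assumption that packets sharing one UDP port are told apart by their first byte, using the ranges from RFC 7983 (which updates RFC 5764, section 5.1.2). The function name `classify` is hypothetical.

```python
def classify(datagram: bytes) -> str:
    """Guess the protocol of a UDP payload from its first byte (RFC 7983 ranges)."""
    if not datagram:
        return "unknown"
    first = datagram[0]
    if first <= 3:
        return "stun"
    if 20 <= first <= 63:
        return "dtls"          # e.g. 22 = handshake, 21 = alert
    if 64 <= first <= 79:
        return "turn-channel"  # TURN ChannelData
    if 128 <= first <= 191:
        return "rtp"           # RTP/RTCP, including SRTP/SRTCP
    return "unknown"
```

A demultiplexor built on this check would forward "dtls" datagrams to the DTLS server and pass everything else through untouched.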

We forwarded the STUN traffic sent by app1 towards our TCP proxy with the following iptables command:

$ sudo iptables -t nat -A PREROUTING -p tcp --dport 3478 -s 192.168.12.0/24 -d 116.203.131.31 -i wlan0 -j DNAT --to-destination 127.0.0.1:6000

iptables command to forward DTLS-SRTP traffic to UDP demultiplexor:

$ sudo iptables -t nat -A PREROUTING -p udp --dport 10000:65535 -s 192.168.12.118 -d 192.168.0.221 -i wlan0 -j DNAT --to-destination 127.0.0.1:6001

STUN proxy command:

$ stunpeersniff -H 116.203.131.31 -P 3478 -h 127.0.0.1 -p 6000 -t dtls-srtp-demux

This setup intercepted the communications we need and forwarded DTLS handshake to our DTLS server.

Our DTLS server received an Alert (Bad Certificate) DTLS error immediately after presenting its certificate. This confirms that the remote party (another mobile client in our case) validates the DTLS certificate. This is more robust behaviour than we observed with Twilio: the DTLS handshake does not even reach the point where SRTP encryption keys are derived.
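Such an alert is easy to recognise on the wire because it is sent before encryption is enabled (epoch 0). A minimal Python sketch, assuming the standard 13-byte DTLS record header followed by a two-byte alert body, is shown below. The function name `parse_dtls_alert` and the abbreviated description table are illustrative only.

```python
import struct
from typing import Optional, Tuple

CONTENT_TYPE_ALERT = 21
RECORD_HEADER_LEN = 13  # type (1) + version (2) + epoch (2) + sequence (6) + length (2)
ALERT_LEVELS = {1: "warning", 2: "fatal"}
ALERT_DESCRIPTIONS = {  # abbreviated list of TLS alert codes
    0: "close_notify",
    40: "handshake_failure",
    42: "bad_certificate",
    43: "unsupported_certificate",
    46: "certificate_unknown",
    48: "unknown_ca",
}


def parse_dtls_alert(record: bytes) -> Optional[Tuple[str, str]]:
    """Return (level, description) for an unencrypted DTLS alert record, else None."""
    if len(record) < RECORD_HEADER_LEN + 2:
        return None
    content_type, version, epoch = struct.unpack_from("!BHH", record, 0)
    (length,) = struct.unpack_from("!H", record, 11)
    if content_type != CONTENT_TYPE_ALERT or version >> 8 != 0xFE:
        return None
    if epoch != 0 or length != 2:
        return None  # encrypted or malformed alert, cannot be read directly
    level, description = record[13], record[14]
    return (
        ALERT_LEVELS.get(level, str(level)),
        ALERT_DESCRIPTIONS.get(description, str(description)),
    )
```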

We noted that if we redirect the DTLS handshake to nowhere, effectively dropping it, the call is still established successfully and both parties can talk. The captured traffic reveals that SRTP data is then transmitted using a TURN channel. This traffic goes between the mobile application and the remote TURN server. To decode it with Wireshark we followed this advice.

You can see the result in the following illustration:

We tried to identify how (and if) the DTLS handshake is established now. Apparently, the corresponding data is transmitted inside the TURN channel in this case. You can see this in the following screenshot from Wireshark. The certificate presence is revealed by the WebRTC common name string and the expiration date 200418 encoded as a string:

This data exchange continues using TURN channel data messages:
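For completeness, here is a hedged Python sketch of how a TURN ChannelData message can be unwrapped to reach the DTLS or SRTP payload inside. It assumes the framing from RFC 5766: a 2-byte channel number in the range 0x4000 to 0x7FFF, a 2-byte payload length, then the payload (padded to a multiple of four bytes when sent over TCP). The function name `unwrap_channel_data` is hypothetical.

```python
import struct
from typing import Optional, Tuple


def unwrap_channel_data(msg: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (channel number, payload) for a TURN ChannelData message, else None."""
    if len(msg) < 4:
        return None
    channel, length = struct.unpack_from("!HH", msg, 0)
    if not 0x4000 <= channel <= 0x7FFF:
        return None  # not ChannelData, e.g. a regular STUN message
    if len(msg) < 4 + length:
        return None  # truncated message
    return channel, msg[4:4 + length]  # any trailing padding is ignored
```

The first byte of the returned payload can then be inspected in the same way as for direct peer-to-peer traffic to tell DTLS records from SRTP packets.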

For media communications, the Wire application relies on the Wire signaling library. This library uses Google's WebRTC reference implementation to handle the communications we observed.

WebRTC library implements a single module that handles DTLS handshake and fingerprint verification ( p2p/base/dtlstransport.cc ). The DTLS transport implementation does not depend on a particular communication channel and is fed with data by the rest of the library. This allows us to assume that the single DTLS validation test we performed earlier is enough to be sure that parties are properly verified in other cases too.

We also tried to figure out how signalling data is transmitted. The WebRTC standard does not explicitly define how this should be implemented. According to the src/econn/README.md file of the AVS Wire library source code, SDP data is sent as a SETUP message via Backend. Backend in this case can be an arbitrary communication channel. Mobile clients receive remote signalling via the WebSocket protocol, which transmits end-to-end encrypted messages. In the upstream direction, encrypted messages are sent as POST requests to prod-nginz-https.wire.com. WebSocket message example:

{
  "payload": [
    {
      "conversation": "fd2c8985-cb03-4a0c-9291-513a02355693",
      "time": "2020-03-19T14:22:01.638Z",
      "data": {
        "text": "owABAaEAWCAFs5h5Cft0kXBPKdIWvtlIbuNUuU7vNLeO23zzAGM7ngJZDMYBp ... nNab4HfDHOZZVDNXDu+ymAmpsuKSW7Q=",
        "sender": "cbc19847050b5b9d",
        "recipient": "141e979399699446"
      },
      "from": "4707e7e4-25a4-4afe-ad4d-13eb565b7fab",
      "type": "conversation.otr-message-add"
    }
  ],
  "transient": false,
  "id": "0000aaa8-69ed-11ea-8075-22000a23ecad"
}

According to src/peerflow/peerflow.cpp source of AVS Wire library, some messages use WebRTC data channel. Data channel interface is defined in WebRTC's api/datachannelinterface.{hh,cc} . It is encapsulated in SCTP protocol which goes inside DTLS channel and which is transmitted over STUN/TURN messages.

Conclusion

Working with Wire allowed us to observe some WebRTC internals. Compared to Twilio platform, Wire uses vanilla WebRTC approach to establish peer-to-peer multimedia communications. Due to its open source nature we were able to check our observations against the implementation.

Overall security of media data transmitted by Wire mobile application follows WebRTC guidelines:

RTP media data is secured as SRTP.

Keys for SRTP are derived by DTLS handshake.

DTLS handshake fails if peer fingerprint does not match the announced one.

Peer fingerprint is transmitted as end-to-end encrypted data inside WebSocket, secured with TLS.

Certificates of critical TLS servers are properly validated by the Android client.

In order to intercept Wire media traffic, the same tools and firewall configuration are needed as in the Twilio case. Additionally, we wrote a STUN sniffer tool, stunpeersniff, which is required to determine peers on the fly and configure the DTLS-SRTP proxy accordingly.