A recent comment on Leo's standalone VoIP service post brought up a topic always worth exploring: SIP security. Here's the quote:

The big thing I am concerned with is will [Google] implement [SIP service] securely??? Will it utilize encryption, for both signaling and media? I could see malicious folks attempting to utilize SIPVicious and other sip cracking tools to try and take over accounts. Also, this could lead to toll fraud, in my opinion. If you have a balance on your account or automatic billing, someone takes over your account (which is easier now because all they have to do is brute force your sip client) then they can start making calls as you on your dime. I think there are several security risks involved with this..... Nick Grant secvoip.com

This comment has some good points that I want to explore because it's important to understand what we mean when we say secure SIP. When talking about securing your VoIP traffic, you can really break this down into three large categories:

Securing passwords
Signaling encryption
Media encryption

Securing Passwords

First, let's take a look at password security. In our minds, this is the most important of your VoIP (or any other web service) security layers. In OnSIP, your VoIP password is what we use to authenticate your phone when it attempts to perform certain actions. For example, we require authentication for registration and for dialing toll-based services like the PSTN. With this kind of reliance on passwords, an insecure password can directly lead to hacked accounts and lost money through toll fraud. In this case, the key to preventing a hacked account is to choose secure passwords that are neither easily guessed nor so short that they're likely to be in someone's rainbow table. This is one reason we only allow randomly generated passwords for the VoIP password. All too often we hear about PSTN gateway customers who run their own PBXs and choose their own passwords (or, even worse, never change the default passwords) and end up using passwords that are easily guessed by hackers and fraudsters.
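As a rough illustration of what "randomly generated" can mean in practice, here is a minimal Python sketch built on the standard `secrets` module. The alphabet, the default length, and the function name `generate_sip_password` are assumptions made for this example; the post does not describe how OnSIP actually generates its passwords.

```python
import secrets
import string

# Assumed alphabet: letters and digits only, so the password is easy to
# paste into a phone's web configuration page. Not taken from the post.
ALPHABET = string.ascii_letters + string.digits


def generate_sip_password(length: int = 16) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    if length < 12:
        # Short passwords are exactly what dictionary and rainbow-table
        # attacks are good at, so this sketch refuses to produce them.
        raise ValueError("refusing to generate a short, guessable password")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_sip_password())
```

With 62 possible characters per position, a 16 character password has roughly 95 bits of entropy, which is far outside the reach of precomputed tables.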

Once we have a strong password configured on the server and in the phone, we need a way to let the phone send its password to the server without communicating it to anyone who may be snooping around the network. Further, we need a way to do this when the whole channel is unencrypted. Enter digest authentication and one-way hashing. Let's look at a SIP registration back and forth to examine how SIP uses digest authentication and what that means for transmitting your password.


Client (C) Sends Initial REGISTER request

With this packet, my phone is requesting a new registration at junctionnetworks.com. The phone sends this initial request without any password information in it.

REGISTER sip:junctionnetworks.com SIP/2.0
Via: SIP/2.0/UDP 192.168.1.140:31984;branch=z9hG4bK-d8754z-f4bed10eca76384a-1---d8754z-;rport
Max-Forwards: 70
Contact: <sip:erick@192.168.1.140:31984;rinstance=c0e4881bb90b9fb3>
To: <sip:erick@junctionnetworks.com>
From: <sip:erick@junctionnetworks.com>;tag=4879c74e
Call-ID: YmE1YjNlMzBhOWQ0Y2Q0NzAwZWVhMDQ1NjY1ZmU0Y2Q.
CSeq: 1 REGISTER
Expires: 200
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY, MESSAGE, SUBSCRIBE, INFO
User-Agent: eyeBeam release 1104g stamp 54685
Content-Length: 0

Server (S) sends 401 Response with WWW-Authenticate Header

The server rejects the first REGISTER request with this 401 Unauthorized response. The server is indicating to the client that the REGISTER request was received, but the client needs to authenticate with the server before the registration will be allowed. In order to authenticate securely, the client must use the values from the WWW-Authenticate header below to create a new registration request... more on that to come.

SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP 192.168.1.140:31984;received=173.210.1.18;branch=z9hG4bK-d8754z-f4bed10eca76384a-1---d8754z-;rport=2258
To: <sip:erick@junctionnetworks.com>;tag=75d09fb22ceadb40012c6e771a69dc74.93c4
From: <sip:erick@junctionnetworks.com>;tag=4879c74e
Call-ID: YmE1YjNlMzBhOWQ0Y2Q0NzAwZWVhMDQ1NjY1ZmU0Y2Q.
CSeq: 1 REGISTER
WWW-Authenticate: Digest realm="jnctn.net", nonce="4d7a8a66000154b28dab7029039c013476fc27a8cd89234f", qop="auth"
Server: OpenSIPS (1.5.3-notls (x86_64/linux))
Content-Length: 0

C rerequests with Authorization Header

The client uses the parameters in the WWW-Authenticate header to resubmit the request, but this time with the Authorization header in this packet. The hashed up password is in the response parameter of the Authorization header.

REGISTER sip:junctionnetworks.com SIP/2.0
Via: SIP/2.0/UDP 192.168.1.140:31984;branch=z9hG4bK-d8754z-25b3a63f83a0cb3d-1---d8754z-;rport
Max-Forwards: 70
Contact: <sip:erick@192.168.1.140:31984;rinstance=c0e4881bb90b9fb3>
To: <sip:erick@junctionnetworks.com>
From: <sip:erick@junctionnetworks.com>;tag=4879c74e
Call-ID: YmE1YjNlMzBhOWQ0Y2Q0NzAwZWVhMDQ1NjY1ZmU0Y2Q.
CSeq: 2 REGISTER
Expires: 200
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY, MESSAGE, SUBSCRIBE, INFO
User-Agent: eyeBeam release 1104g stamp 54685
Authorization: Digest username="junction_erick",realm="jnctn.net",nonce="4d7a8a66000154b28dab7029039c013476fc27a8cd89234f",uri="sip:junctionnetworks.com",response="dae97ee06023d15556901d1c1c5e260e",cnonce="7b06cbf1bc67cde5fd2cb6bde6fa2eda",nc=00000001,qop=auth,algorithm=MD5
Content-Length: 0

S Sends Accepted Response

The server has accepted the registration request.

SIP/2.0 200 OK
Via: SIP/2.0/UDP 192.168.1.140:31984;received=173.210.1.18;branch=z9hG4bK-d8754z-25b3a63f83a0cb3d-1---d8754z-;rport=2258
To: <sip:erick@junctionnetworks.com>;tag=75d09fb22ceadb40012c6e771a69dc74.bead
From: <sip:erick@junctionnetworks.com>;tag=4879c74e
Call-ID: YmE1YjNlMzBhOWQ0Y2Q0NzAwZWVhMDQ1NjY1ZmU0Y2Q.
CSeq: 2 REGISTER
Contact: <sip:erick@71.249.175.83:1427>;expires=2844, <sip:erick@192.168.1.140:31984;rinstance=c0e4881bb90b9fb3>;expires=200;received="sip:173.210.1.18:2258"
Server: OpenSIPS (1.5.3-notls (x86_64/linux))
Content-Length: 0

My password is in that Authorization header, but you can't tell me what it is. Even though the packets may be transmitted in plain text via UDP, TCP, or this blog post, the password is one-way hashed via the "quality of protection" (qop) auth mechanism defined for digest authentication. From RFC 2617:

The Digest Access Authentication scheme is not intended to be a complete answer to the need for security in the World Wide Web. This scheme provides no encryption of message content. The intent is simply to create an access authentication method that avoids the most serious flaws of Basic authentication.

Computing Digest Authentication Responses

Each piece of the digest authentication is designed as a mechanism to make the authentication stage stronger against various attack vectors. Here is a short definition list of the more obscure parts of digest authentication with a quick note about each parameter's purpose.

nonce
A server side generated seed for securing the digest response. The nonce may be unique for each client and used only for a limited time so as to protect the server against replay attacks.

cnonce (client nonce)
Like the server nonce, the cnonce is a client side generated seed. Instead of being designed to protect the server, though, the cnonce is designed to protect the client against nefarious proxies, man in the middle (MITM) attacks, precomputed dictionary attacks, and other attack vectors. The cnonce is an optional mechanism in the original spec, but it's required in the case of qop being defined.

For instance, let's say badserver.org is out on the interwebs sending INVITEs to random SIP clients 24x7. When unluckybob@unsuspecting.org answers his phone, the badserver.org folks just wait for unluckybob to hang up the phone and send the BYE. At this point, badserver.org rejects the BYE with a 407 Proxy Authentication Required response. Poor unluckybob@unsuspecting.org may now have been tricked into sending his (precomputed) hashed up authentication credentials to badserver.org. If Bob's user-agent responds to authentication schemes that don't require a cnonce, then he may have just given away his username and password. (Thanks to Sjur Usken for that case.) If the client seeds its response back to the server with a shared cnonce, it will effectively make badserver.org's large, precomputed table of common password responses useless.

qop
The "Quality of Protection" field, effectively the digest auth specification, allows for alternate computational algorithms to be defined. The qop field allows the server to communicate to the client which computations it can accept.

nc (nonce count)
Nonce count is an incrementing integer, controlled by the client and tracked on the server, indicating the numbered sequential requests that have been sent by the client with this nonce/cnonce pair. This allows servers to detect replay attacks and "act appropriately" if so... like shut down and run for cover.
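To make the nonce count idea concrete, here is a minimal Python sketch of how a server might track nc values and spot a replayed request. The class name `NonceCountTracker` and its in-memory dictionary are hypothetical, chosen for illustration; a real SIP server would also expire old nonces, and nothing here describes how any particular server implements this.

```python
from typing import Dict, Tuple


class NonceCountTracker:
    """Remember the highest nc seen for each nonce/cnonce pair (hypothetical helper)."""

    def __init__(self) -> None:
        self._last_seen: Dict[Tuple[str, str], int] = {}

    def accept(self, nonce: str, cnonce: str, nc_hex: str) -> bool:
        """Return True only if nc is higher than every nc seen so far for this pair."""
        nc = int(nc_hex, 16)  # nc travels as 8 hex digits, e.g. 00000001
        key = (nonce, cnonce)
        if nc <= self._last_seen.get(key, 0):
            # A repeated or older count suggests a captured request is being replayed.
            return False
        self._last_seen[key] = nc
        return True


if __name__ == "__main__":
    tracker = NonceCountTracker()
    print(tracker.accept("easy-nonce", "client-nonce", "00000001"))  # first use
    print(tracker.accept("easy-nonce", "client-nonce", "00000001"))  # same count again
```

The first call is accepted and the second is rejected, because the same count arriving twice for one nonce/cnonce pair is exactly what a replayed packet looks like.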

To view a sample implementation of the qop auth algorithm, I've coded up this small Ruby utility to calculate the digest auth for you; check it out here. Feel free to use it off the command line like so:

$ ./digest_auth.rb unluckybob lazypassword sip:unsuspecting.org REGISTER badserver.org easy-nonce auth client-nonce-protects-bob
Here's your digest:
d0fe5c04c64f4b1bbf6e6d35d684046d

Feel free to use this utility however you see fit; however, keep in mind its only intended purpose is as a demonstration of digest auth for the current discussion.

I hope this explains how digest authentication is used to avoid plain text password transmission even when SIP signaling is transmitted in the clear. As long as you are choosing strong passwords and using SIP clients that will only respond to strong authentication requests, your password should stay safe.

Signaling and Media Encryption

I hope it's obvious now that it is unnecessary to encrypt your traffic to secure your password. However, let's talk about encrypting your signaling and media traffic. For this discussion, I will only take a look at the SIP/SDP/(S)RTP protocol stack. (Any protocols and standards defined outside of the RFCs referenced in this post are outside the context of this discussion (is that a tautology?). This post is only meant as an overview of common industry practices.) Just to remind us, let's keep in mind the OSI (7 layer burrito) model and where we are at each point of this discussion.

Securing the Signal

Unlike a hashed password, VoIP signaling and media must be decryptable so that the remote endpoint can make usable messages out of them. Various methods exist for securing the signaling; however, TLS remains the most common, complete, and recommended solution to securing SIP. Even more than simply sending an encrypted SIP packet across one hop, what a user truly wants in SIP security is an encrypted end to end stream.

RFC 3261 recommends SIPS to be delivered over TLS on every segment of the call, providing end to end security. If a remote endpoint can't be reached via secure channels, then insecure SIP URIs may be attempted. But, when calling a SIPS URI, the call signaling is supposed to be encrypted end to end. SIP encryption via IPSec is a more ad hoc approach and does not guarantee end to end encryption. By definition, IPSec only provides hop to hop encryption, and no alternate security URI is defined to identify SIP over IPSec like there is the SIPS URI. Other suggestions also exist for DTLS in the case of UDP; however, finding either client or server implementations is difficult, at best, and really just non standard in practice.

Because SIPS over TLS is the SIP standard for security (see RFC 3261), there is a slightly higher barrier to entry for providing secure SIP service than for providing connectionless, message oriented SIP service (i.e. UDP). I will stop here. It is out of this article's scope to explain why maintaining streams consumes more physical resources than maintaining datagram based connections in standard select/poll based architectures... (but let's hope the new OpenSIPS 2.0 architecture is making great strides here).

Securing the Media

When discussing signaling and media encryption, we must encrypt signaling before worrying about encrypting the media. Why is this? Many people think of encrypting VoIP calls as simply obfuscating the media stream. That viewpoint forgets the important detail that the signaling always controls the media stream. SRTP is set up through the SDP payloads of SIP packets, which carry a set of clear text cryptography parameters describing the encryption algorithm and the keys that allow decryption of the associated RTP. If those keys are sent in the clear, then any unauthorized party would have all necessary information ahead of time to decrypt the encrypted media, rendering the encryption useless. From RFC 4568:

...IT IS REQUIRED that encryption of the encapsulating payload be used whenever a master key parameter (inline) appears in the message. Failure to encrypt the SDP message containing an inline SRTP master key renders the SRTP authentication or encryption service useless in practically all circumstances...

Most of the following discussion can be found in section 4 of RFC 4568. The SRTP encryption parameters are described through the crypto SDP attribute:

a=crypto:<tag> <crypto-suite> <key-params> [<session-params>]

In practice, this ends up looking like the following:

...sdp attributes...
a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:PS1uQCVeeCFCanVmcjkpPywjNWhcYD0mXXtxaVBR|2^20|1:32
...more sdp...

I'll leave the description of the crypto attribute as an exercise for you to read about in the rest of RFC 4568. The point here is that the sender must transmit to the receiver all necessary information to decrypt the SRTP packet in order to make that packet useful at the receiver end. As this information is described in the signaling layer, we must encrypt the signaling in order to encrypt the media in any meaningful way.
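To show just how much an eavesdropper gets from one clear text line of SDP, here is a minimal Python sketch that pulls the crypto attribute apart. The names `CryptoAttribute` and `parse_crypto_attribute` are hypothetical, only the first key parameter is handled, session parameters are ignored, and the 16 byte key plus 14 byte salt split is an assumption that holds for the AES_CM_128 suites such as the one in the example above.

```python
import base64
from dataclasses import dataclass
from typing import Optional


@dataclass
class CryptoAttribute:
    tag: int
    suite: str
    key_method: str
    master_key: bytes
    master_salt: bytes
    lifetime: Optional[str]
    mki: Optional[str]


def parse_crypto_attribute(line: str) -> CryptoAttribute:
    """Parse 'a=crypto:<tag> <crypto-suite> <key-params> [<session-params>]'."""
    prefix = "a=crypto:"
    if not line.startswith(prefix):
        raise ValueError("not a crypto attribute line")
    fields = line[len(prefix):].split()
    if len(fields) < 3:
        raise ValueError("crypto attribute needs a tag, a suite, and key params")
    tag, suite, key_params = fields[0], fields[1], fields[2]

    # Several keys may be listed, separated by ';'. This sketch reads the first.
    first_key = key_params.split(";")[0]
    key_method, _, key_info = first_key.partition(":")

    # key-info is the base64 key||salt, then an optional lifetime and optional MKI.
    parts = key_info.split("|")
    raw = base64.b64decode(parts[0])
    lifetime, mki = None, None
    for extra in parts[1:]:
        if ":" in extra:
            mki = extra        # MKI is written as value:length, e.g. 1:32
        else:
            lifetime = extra   # lifetime such as 2^20

    # Assumption: AES_CM_128 suites use a 16 byte master key and 14 byte salt.
    return CryptoAttribute(int(tag), suite, key_method, raw[:16], raw[16:], lifetime, mki)


if __name__ == "__main__":
    attr = parse_crypto_attribute(
        "a=crypto:1 AES_CM_128_HMAC_SHA1_80 "
        "inline:PS1uQCVeeCFCanVmcjkpPywjNWhcYD0mXXtxaVBR|2^20|1:32"
    )
    print(attr.suite, len(attr.master_key), len(attr.master_salt))
```

Anyone who can read that one line has the suite, the master key, and the master salt, which is the practical reason the signaling has to be encrypted first.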

Google Voice DNS Records

OK... so what's my point and how does this at all involve Google?

Since we've recently been on a tangent about a possible SIP avenue into Google Voice provided numbers, and given the comment at the top of this article, I thought I would simply enumerate the SRV records that we can find for Google's SIP service in the hope that they give us some insight into what they're planning.

In short, it looks as though Google will only roll out UDP service to start, unless they have some other domain names that I don't know about, which is entirely possible. Anyway, let's assume for this discussion that sip.voice.google.com is the only domain we're considering here. Because RFC 3263 tells us how to find a SIP server, we know which DNS records to look for when snooping around Google's SIP service. Let's start with some NAPTR records.


$ dig google.com NAPTR
; <<>> DiG 9.6.0-APPLE-P2 <<>> google.com NAPTR
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62733
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;google.com. IN NAPTR
;; Query time: 35 msec
;; SERVER: 10.0.1.1#53(10.0.1.1)
;; WHEN: Sat Mar 12 02:17:17 2011
;; MSG SIZE rcvd: 28

$ dig voice.google.com NAPTR
; <<>> DiG 9.6.0-APPLE-P2 <<>> voice.google.com NAPTR
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50795
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;voice.google.com. IN NAPTR
;; ANSWER SECTION:
voice.google.com. 776 IN CNAME voice.l.google.com.
;; Query time: 60 msec
;; SERVER: 10.0.1.1#53(10.0.1.1)
;; WHEN: Sat Mar 12 02:17:22 2011
;; MSG SIZE rcvd: 56

$ dig voice.l.google.com NAPTR
; <<>> DiG 9.6.0-APPLE-P2 <<>> voice.l.google.com NAPTR
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28575
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;voice.l.google.com. IN NAPTR
;; Query time: 6 msec
;; SERVER: 10.0.1.1#53(10.0.1.1)
;; WHEN: Sat Mar 12 02:17:26 2011
;; MSG SIZE rcvd: 36

$ dig sip.voice.google.com NAPTR
; <<>> DiG 9.6.0-APPLE-P2 <<>> sip.voice.google.com NAPTR
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46385
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;sip.voice.google.com. IN NAPTR
;; ANSWER SECTION:
sip.voice.google.com. 430580 IN CNAME voice-sip.l.google.com.
;; Query time: 8 msec
;; SERVER: 10.0.1.1#53(10.0.1.1)
;; WHEN: Sat Mar 12 02:17:30 2011
;; MSG SIZE rcvd: 64

$ dig voice-sip.l.google.com NAPTR
; <<>> DiG 9.6.0-APPLE-P2 <<>> voice-sip.l.google.com NAPTR
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11097
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;voice-sip.l.google.com. IN NAPTR
;; Query time: 16 msec
;; SERVER: 10.0.1.1#53(10.0.1.1)
;; WHEN: Sat Mar 12 02:17:36 2011
;; MSG SIZE rcvd: 40

Well, unfortunately there is not much to go on there for naming authority pointer (NAPTR) records, although we did learn about a few new domain names. Let's try doing SRV record lookups on sip.voice.google.com...


Google Voice UDP Record

$ dig _sip._udp.sip.voice.google.com SRV
; <<>> DiG 9.6.0-APPLE-P2 <<>> _sip._udp.sip.voice.google.com SRV
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8463
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 4, ADDITIONAL: 9
;; QUESTION SECTION:
;_sip._udp.sip.voice.google.com. IN SRV
;; ANSWER SECTION:
_sip._udp.sip.voice.google.com. 86400 IN SRV 20 1 5060 alt1.voice-sip.l.google.com.
_sip._udp.sip.voice.google.com. 86400 IN SRV 10 1 5060 voice-sip.l.google.com.
_sip._udp.sip.voice.google.com. 86400 IN SRV 50 1 5060 alt4.voice-sip.l.google.com.
_sip._udp.sip.voice.google.com. 86400 IN SRV 30 1 5060 alt2.voice-sip.l.google.com.
_sip._udp.sip.voice.google.com. 86400 IN SRV 40 1 5060 alt3.voice-sip.l.google.com.
;; AUTHORITY SECTION:
google.com. 146471 IN NS ns3.google.com.
google.com. 146471 IN NS ns2.google.com.
google.com. 146471 IN NS ns1.google.com.
google.com. 146471 IN NS ns4.google.com.
;; ADDITIONAL SECTION:
alt1.voice-sip.l.google.com. 300 IN A 74.125.95.192
voice-sip.l.google.com. 300 IN A 74.125.95.192
alt4.voice-sip.l.google.com. 300 IN A 74.125.95.192
alt2.voice-sip.l.google.com. 300 IN A 74.125.95.192
alt3.voice-sip.l.google.com. 300 IN A 74.125.95.192
ns1.google.com. 342957 IN A 216.239.32.10
ns2.google.com. 342957 IN A 216.239.34.10
ns3.google.com. 319271 IN A 216.239.36.10
ns4.google.com. 342957 IN A 216.239.38.10
;; Query time: 18 msec
;; SERVER: 207.172.3.8#53(207.172.3.8)
;; WHEN: Fri Mar 11 18:01:49 2011
;; MSG SIZE rcvd: 494

No TCP nor SIPS Record

$ dig _sip._tcp.sip.voice.google.com SRV
; <<>> DiG 9.6.0-APPLE-P2 <<>> _sip._tcp.sip.voice.google.com SRV
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 19256
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
;_sip._tcp.sip.voice.google.com. IN SRV
;; AUTHORITY SECTION:
google.com. 60 IN SOA ns1.google.com. dns-admin.google.com. 1444316 7200 1800 1209600 300
;; Query time: 17 msec
;; SERVER: 207.172.3.8#53(207.172.3.8)
;; WHEN: Fri Mar 11 18:01:53 2011
;; MSG SIZE rcvd: 98

$ dig _sips._tcp.sip.voice.google.com SRV
; <<>> DiG 9.6.0-APPLE-P2 <<>> _sips._tcp.sip.voice.google.com SRV
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 40628
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;_sips._tcp.sip.voice.google.com. IN SRV
;; Query time: 63 msec
;; SERVER: 10.0.1.1#53(10.0.1.1)
;; WHEN: Sat Mar 12 02:21:13 2011
;; MSG SIZE rcvd: 49

OK, so here at least we see something interesting. The UDP SRV lookup returns a list of a few servers providing UDP SIP service for sip.voice.google.com, but nothing is listed for the TCP and SIPS (TLS) records. Infer what you will from this, but I will take it as a sign that there is no near term plan to offer TCP or SIPS service (although I believe Gizmo phone did have a TCP offering: _sip._tcp.gizmo5.com. 8000 IN SRV 10 10 5060 proxy01.sipphone.com. ). If that is true (about no TCP), then I think we can reasonably assume that any near term Google Voice SIP offering will probably have no encryption services available, as hoped for in the original comment. That being said, if Google wanted to roll out a SIP phone registration service (which I doubt), they would of course still be able to do this over UDP while keeping passwords secure... hey, I know a company that does that!
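For anyone wondering how a SIP client would actually use those UDP records, here is a minimal Python sketch that parses the five SRV answer lines from the lookup above and orders them the way a client would try them, lowest priority value first. The names `SrvRecord`, `parse_srv_answer`, and `order_by_priority` are made up for this example. Because every record here has a distinct priority, the weighted random selection that RFC 2782 describes for records sharing a priority is left out.

```python
from typing import List, NamedTuple

# The ANSWER SECTION lines copied from the _sip._udp lookup above.
ANSWER_SECTION = """\
_sip._udp.sip.voice.google.com. 86400 IN SRV 20 1 5060 alt1.voice-sip.l.google.com.
_sip._udp.sip.voice.google.com. 86400 IN SRV 10 1 5060 voice-sip.l.google.com.
_sip._udp.sip.voice.google.com. 86400 IN SRV 50 1 5060 alt4.voice-sip.l.google.com.
_sip._udp.sip.voice.google.com. 86400 IN SRV 30 1 5060 alt2.voice-sip.l.google.com.
_sip._udp.sip.voice.google.com. 86400 IN SRV 40 1 5060 alt3.voice-sip.l.google.com.
"""


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_answer(line: str) -> SrvRecord:
    """Parse one dig answer line: name ttl class SRV priority weight port target."""
    fields = line.split()
    if len(fields) != 8 or fields[3] != "SRV":
        raise ValueError(f"not an SRV answer line: {line!r}")
    priority, weight, port = (int(value) for value in fields[4:7])
    return SrvRecord(priority, weight, port, fields[7].rstrip("."))


def order_by_priority(records: List[SrvRecord]) -> List[SrvRecord]:
    """Lower priority values are contacted first; ties are not handled here."""
    return sorted(records, key=lambda record: record.priority)


if __name__ == "__main__":
    records = [parse_srv_answer(line) for line in ANSWER_SECTION.splitlines()]
    for record in order_by_priority(records):
        print(record.priority, f"{record.target}:{record.port}")
```

The ordering puts voice-sip.l.google.com first and the alt1 through alt4 hosts after it as fallbacks, all on port 5060.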
