Update: We are in the final stretch of the 2019 winter holidays, which means that most Ars home office phones can stay quiet for a few more days. As such, we are resurfacing some classics from the archives. The latest is this look at how SIP (Session Initiation Protocol) once won the VoIP protocol wars. This story first appeared on December 8, 2009, and appears unchanged below.

As an industry grows, it is quite common to find multiple solutions that attempt to address similar requirements. These proposed standards then go through a selection stage, and over time some become more dominant than others. Today, the Session Initiation Protocol (SIP) is clearly one of the dominant VoIP protocols, but that did not happen overnight. In this article, the first in a series of in-depth articles exploring SIP and VoIP, we will look at the main factors that led to this result.

A brief history of VoIP

Let's go back to 1995, to the days before Google, IM and even broadband. Cell phones were large and bulky, Microsoft had developed a new Windows interface with a "Start" button and Netscape had the most popular web browser. The growth of the Internet and data networks led many to realize that the new networks could carry our voice communication and substantially reduce its cost. The first commercial Internet VoIP solution came from a company called VocalTec; its software allowed two people to talk to each other over the Internet. One would make a local call to an ISP through a 28.8K or 33.6K modem and could then talk to friends even if they lived far away. I remember trying this software, and the sound was definitely below acceptable quality. (It often sounded as if the other person was trying to talk while underwater in a pool.) Still, the software successfully connected two people and delivered a real-time voice conversation over a limited-bandwidth network.

For the first VoIP implementers, it was immediately apparent that there are several differences between the telephone network and the data network. One of them is the design of message exchange. The telephone system is circuit-switched, where a circuit is the complete route between two endpoints. It is therefore possible to guarantee a single route for all messages in a single communication. The data network is packet-switched: several hops along the way help route the packets to their final destination, and this route can change from one packet to the next. Because of this structure, the data network cannot guarantee that the packets of a single session travel along the same route. VoIP therefore required some innovations before it could really take off.

To initiate a call, you need a VoIP signaling protocol. The term "signaling" comes from the world of circuit-switched telephone communication. In that system, signals are sent from one end to the other to set up the communication that lets us talk over great distances. The function of a signaling protocol is to define how these messages are structured and the rules that allow us to start, configure and end the conversation. It is worth noting that signaling messages do not include the voice that is heard (the media of the call). The signaling protocol may carry information about the media streams and their attributes, but the speech itself in a voice call is not a signaling message. If you are looking for a very high-level explanation, think of signaling as the messages that a device sends when you dial or hang up the phone.

So the race to create a new signaling protocol began. Some of these protocol specifications were open for all to implement, and others were proprietary vendor solutions. That race is not over yet, as we constantly see new proposals that try to convince everyone that there is a better way to do things. A VoIP signaling protocol must show how it integrates with the data network; this includes aspects such as defining a method for locating communication devices, specifying server behavior, introducing new services and designing security.

SIP protocol design

SIP is an Internet Engineering Task Force (IETF) protocol and, as such, was designed to be an open Internet protocol. Its first version was published in 1999 as RFC 2543, though its first drafts date from 1996. RFC 3261 revised some of its definitions in 2002.

Let's look at a simple SIP request:

INVITE sip:[email protected] SIP/2.0

Via: SIP/2.0/UDP home.mynetwork.org;branch=z9hG4bK8uf35f

To: Jon Stokes

From: Gilad ;tag=n23ycs

Call-ID: [email protected]

CSeq: 59164 INVITE

Contact: sip:[email protected]

Max-Forwards: 70

SIP is text-based. Note that the addresses look very much like email addresses. Although SIP can support phone numbers, the basic idea is that addresses don't have to be phone numbers, just as you wouldn't expect your email address to look like your home or work address. For comparison, a SIP message looks similar to the following (partial) HTTP request:

GET /reviews/ HTTP/1.1

Host: arstechnica.com

User-Agent: Gecko/Firefox/3.5.5

As you can see, SIP is quite similar to HTTP. The first line is the request line, which contains the type of request (GET in HTTP and INVITE in SIP in these examples) and the target address, while the subsequent lines are headers carrying additional information. Naturally, responses in SIP also closely resemble HTTP responses. The idea is to borrow the structure of one of the most popular Internet protocols and make it easier for software developers and network administrators to work with SIP.
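To make the shared structure concrete, here is a minimal Python sketch that splits either kind of request into its request line and headers with the same code. The function name `parse_request` and the `example.com` address are hypothetical, chosen only for illustration. The sketch deliberately ignores details a real parser must handle, such as repeated headers, compact header names and line folding.

```python
def parse_request(raw: str) -> dict:
    """Split a SIP or HTTP request into request line, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: METHOD, target address, protocol version.
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Header lines are "Name: value"; split on the first colon only.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "target": target, "version": version,
            "headers": headers, "body": body}


# Hypothetical SIP request (the address is a placeholder).
sip_request = (
    "INVITE sip:callee@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP home.mynetwork.org;branch=z9hG4bK8uf35f\r\n"
    "CSeq: 59164 INVITE\r\n"
    "Max-Forwards: 70\r\n"
    "\r\n"
)
http_request = "GET /reviews/ HTTP/1.1\r\nHost: arstechnica.com\r\n\r\n"

sip = parse_request(sip_request)
http = parse_request(http_request)
print(sip["method"], sip["target"], sip["version"])
print(http["method"], http["target"], http["version"])
```

The only difference the parser sees is the content of the fields: the method name, the form of the address and the version string.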

These attempts to make SIP as easy as HTTP worked to some extent, but the requirements that SIP addresses are more complex than those of HTTP, so the protocol is more complex too. For example, two-way symmetric communication is a basic requirement in SIP, while a typical HTTP scenario is a client that makes requests to a server, which then sends a response. Still, even without prior knowledge of HTTP, learning this message structure is a very easy task.

For those wondering, the SIP example above is the first packet that might be sent when calling Ars Technica Assistant Director Jon Stokes from a SIP phone. I will refrain from going into the technical details of the message content for now, since that is a subject for a separate article.

Reuse and keep it simple

Another important factor in the design of SIP was the decision to reuse other existing Internet standards as much as possible. Address resolution uses DNS, user authentication uses HTTP digest authentication, call media stream configuration uses the Session Description Protocol (SDP), encryption uses TLS and, where applicable, users send each other information as XML. This integration helped establish SIP as part of the Internet protocol world, and vendors were able to reuse existing implementations in their SIP applications. On the other hand, in some cases the IETF had to add definitions to other protocols to meet the needs of SIP.
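As one illustration of this reuse, the digest computation a SIP client performs is the same one an HTTP client performs. The Python sketch below follows the basic form defined in RFC 2617 (without the optional `qop` parameter); the user name, realm, password and nonce are hypothetical values, and the function names are ours rather than part of any library.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Compute an RFC 2617 digest response (basic form, no qop)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who is asking
    ha2 = md5_hex(f"{method}:{uri}")                 # what is being asked
    return md5_hex(f"{ha1}:{nonce}:{ha2}")           # bound to the server's nonce


# Hypothetical credentials and server challenge, for illustration only.
response = digest_response(
    username="gilad",
    realm="example.com",
    password="not-a-real-password",
    method="REGISTER",
    uri="sip:example.com",
    nonce="hypothetical-nonce-123",
)
print(response)
```

Only the method and URI look different from the HTTP case; the password itself never travels over the wire in either protocol.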

Keeping the complexity of servers along the call route, especially proxy servers, to a minimum is another emphasis of SIP's design. SIP proxy servers route messages between the callers. The proxies defined in the standard are not aware of call state; they operate at the transaction level and may even be stateless. This helps with scalability, because fewer devices can handle more calls. To achieve that, the protocol itself is separated into several distinct layers, a common practice that programmers use to break down a complex system. This design further simplifies SIP and makes it easier to implement. Keeping state to a minimum sometimes imposed limitations (and later required some changes to the protocol), but these side effects were kept small.
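A rough Python sketch of what "stateless" means in practice: the proxy below remembers nothing between messages. It only decrements Max-Forwards, places its own Via header above the existing ones so that responses can find their way back, and passes the request on. The function name, the host `proxy.example.net` and the branch value are hypothetical, and a real proxy would also choose the next hop, derive the branch from the request and send a 483 (Too Many Hops) response instead of silently returning nothing.

```python
from typing import Optional


def forward_statelessly(raw: str, proxy_host: str, branch: str) -> Optional[str]:
    """Return the request as a stateless proxy would forward it.

    Returns None when Max-Forwards is exhausted (a real proxy would
    answer with 483 Too Many Hops).
    """
    head, sep, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    request_line, headers = lines[0], lines[1:]

    forwarded = []
    for line in headers:
        name, _, value = line.partition(":")
        if name.strip().lower() == "max-forwards":
            remaining = int(value)
            if remaining <= 0:
                return None
            line = f"Max-Forwards: {remaining - 1}"
        forwarded.append(line)

    # The proxy's own Via goes on top; the branch parameter must start
    # with the "z9hG4bK" marker defined in RFC 3261.
    via = f"Via: SIP/2.0/UDP {proxy_host};branch={branch}"
    return "\r\n".join([request_line, via, *forwarded]) + sep + body


# Hypothetical request; the address is a placeholder.
request = (
    "INVITE sip:callee@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP home.mynetwork.org;branch=z9hG4bK8uf35f\r\n"
    "Max-Forwards: 70\r\n"
    "CSeq: 59164 INVITE\r\n"
    "\r\n"
)
forwarded = forward_statelessly(request, "proxy.example.net", "z9hG4bKhypothetical1")
print(forwarded)
```

Because everything the proxy needs is carried in the message itself, any number of such proxies can sit behind a load balancer without sharing call state.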

Finally, and perhaps most importantly, SIP was not created solely as a replacement for the telephone system. It allows extensions and relies on them to provide additional services beyond simple calls. For example, you can use SIP to maintain user status information in an IM client, as well as to set up IM sessions. Another extension allows you to transfer a call to a third party, something that was simply not defined by the basic SIP specification. This is possible because SIP provides the necessary basic constructs while restricting them only where necessary. SIP defines the concept of a "dialog", which is a two-way communication, but does not limit dialogs to calls. Two-way communication also includes setting your IM status and receiving updates from your IM contacts. Extensions can also easily define new request or response types and new headers when needed.
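The extension model can be pictured as a table of request methods that grows over time. The toy Python sketch below is our own illustration, not how any particular SIP stack is built: the class `ToyUserAgent` and its handlers are hypothetical. It starts with core methods only and later learns REFER, the method that the call transfer extension (RFC 3515) adds.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


class ToyUserAgent:
    """A toy endpoint that dispatches requests by method name."""

    def __init__(self) -> None:
        # Core methods from the base specification (simplified replies).
        self._handlers: Dict[str, Handler] = {
            "INVITE": lambda target: f"180 Ringing ({target})",
            "BYE": lambda target: f"200 OK ({target})",
        }

    def register_extension(self, method: str, handler: Handler) -> None:
        """Teach the endpoint a method defined by an extension."""
        self._handlers[method] = handler

    def handle(self, method: str, target: str) -> str:
        handler = self._handlers.get(method)
        if handler is None:
            # Unknown methods are rejected rather than breaking the endpoint.
            return "501 Not Implemented"
        return handler(target)


ua = ToyUserAgent()
before = ua.handle("REFER", "sip:third-party@example.com")

# Add call transfer without touching the core methods.
ua.register_extension("REFER", lambda target: f"202 Accepted (transfer to {target})")
after = ua.handle("REFER", "sip:third-party@example.com")

print(before)
print(after)
```

An endpoint that has never heard of an extension keeps working and simply declines the request, which is what lets new services roll out gradually.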