In real-time communications, signaling has four main roles:
1) Negotiation of media capabilities and settings
2) Identification and authentication of participants in a session
3) Controlling the media session, indicating progress, changing and terminating the session
4) Glare resolution, when both sides of a session try to establish or change a session at the same time.
The next sections will examine these functions in detail, showing how 1) is essential and standardized, while 2), 3), and 4) are optional or are just part of the web application in WebRTC.
4.1.1 Why Signaling is Not Standardized
Signaling is not standardized in WebRTC because it does not need to be standardized to enable interoperability between browsers. Signaling is effectively a matter between the web browser and the web server. The web server can ensure that both browsers utilize the same signaling protocol, using downloaded JavaScript code.
In the web model, only the minimum components are standardized, leaving web developers the freedom to choose and design all other aspects of web pages and applications. In practice, this means that only transport (HTTP), markup (HTML), and media (WebRTC) need to be standardized. As shown in Figure 4.1, the server selects the signaling protocol and ensures that users of the web application or site support the same protocol. Web servers A, B, and C do not need to use the same signaling protocol, but in each case the browsers are able to establish media sessions.
Figure 4.1 Web Server Chooses Signaling Protocol
Compare this situation to a general VoIP or video system, where there is no way for a signaling or control server to push signaling code into the end devices. As a result, the only way interoperability can be achieved is for both endpoints to use the same standardized signaling protocol, such as SIP or Jingle, which are introduced in Sections 4.3.6 and 4.3.7.
For a federated or trapezoid architecture, such as that shown in Figure 1.5, both web domains need to agree on a signaling protocol in order to interoperate. However, this signaling protocol does not necessarily need to be the same signaling protocol used in each of the browsers. In other words, just because two web domains use SIP to communicate, this doesn’t mean that SIP must be used in both browsers.
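To make the point of this section concrete, the following is a minimal sketch of a signaling format that a web application might define for itself and ship to both browsers in its downloaded JavaScript. The message shape, the field names (kind, from, to), the use of JSON, and the functions encodeSignal and decodeSignal are all assumptions made for illustration; none of them is part of WebRTC, and a different site could choose something entirely different.

```typescript
// Hypothetical, application-defined signaling messages. None of these
// names come from a standard; each web application is free to pick its own.
type SignalMessage =
  | { kind: "offer" | "answer"; from: string; to: string; sdp: string }
  | { kind: "candidate"; from: string; to: string; candidate: string };

// Serialize a message for whatever transport the site uses
// (WebSocket, HTTP polling, ...). JSON is an assumption of this sketch.
function encodeSignal(msg: SignalMessage): string {
  return JSON.stringify(msg);
}

// Parse and validate an incoming message. Returns null when the input
// does not match the application's own format.
function decodeSignal(raw: string): SignalMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { kind, from, to, sdp, candidate } = data as Record<string, unknown>;
  if (typeof from !== "string" || typeof to !== "string") return null;
  if (kind === "offer" && typeof sdp === "string") {
    return { kind: "offer", from, to, sdp };
  }
  if (kind === "answer" && typeof sdp === "string") {
    return { kind: "answer", from, to, sdp };
  }
  if (kind === "candidate" && typeof candidate === "string") {
    return { kind: "candidate", from, to, candidate };
  }
  return null;
}
```

Because both browsers receive the same code from the same server, they agree on this format by construction, which is why no standard is needed for it.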
4.1.2 Media Negotiation
The WebRTC specifications include requirements for the “signaling channel.” The most important function of signaling is the exchange of information contained in the Session Description Protocol (SDP) objects between the browsers involved in the Peer Connection. SDP as used by JSEP contains all the information necessary for the RTP media stack on the browser to configure the media session, including the types of media (audio, video, data), codecs used (Opus, G.711, etc.), any parameters or settings for the codecs, and information about the bandwidth. Also, the signaling channel is used to exchange candidate addresses for ICE hole punching. The candidate addresses represent the IP addresses and UDP ports where media packets could potentially be received by the browser. Candidates can also be sent and received outside of SDP in the signaling channel. Keying material for SRTP must also be exchanged in the signaling channel.
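As an illustration of the kind of information SDP carries, the sketch below lists the media types and codecs found in an SDP blob by reading its m= lines and a=rtpmap attributes. The function name summarizeSdp and the MediaSummary shape are hypothetical, and the parsing is deliberately simplified; a real application would normally hand the SDP to the browser rather than parse it.

```typescript
// One media section ("m=" line) of an SDP blob, together with the codecs
// named in the "a=rtpmap" attributes that follow it.
interface MediaSummary {
  kind: string;     // "audio", "video", "application", ...
  codecs: string[]; // e.g. "opus/48000/2"
}

// Simplified reader: only m= and a=rtpmap lines are examined.
function summarizeSdp(sdp: string): MediaSummary[] {
  const sections: MediaSummary[] = [];
  let current: MediaSummary | null = null;
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith("m=")) {
      // m=<media> <port> <proto> <fmt> ...
      current = { kind: line.slice(2).split(" ")[0] ?? "", codecs: [] };
      sections.push(current);
    } else if (current !== null && line.startsWith("a=rtpmap:")) {
      // a=rtpmap:<payload type> <encoding name>/<clock rate>[/<parameters>]
      const codec = line.slice("a=rtpmap:".length).split(" ")[1];
      if (codec !== undefined) current.codecs.push(codec);
    }
  }
  return sections;
}

// A small hand-written SDP fragment used only for demonstration.
const exampleSdp = [
  "v=0",
  "o=- 0 0 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111 0",
  "a=rtpmap:111 opus/48000/2",
  "a=rtpmap:0 PCMU/8000",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");
```

Calling summarizeSdp(exampleSdp) yields one audio section listing Opus and PCMU (G.711 mu-law) and one video section listing VP8.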
ICE hole punching, described in Section 3.4, cannot begin until the candidate addresses have been exchanged over the signaling channel, so without this signaling function, there can be no establishment of a Peer Connection.
4.1.3 Identification and Authentication
When a standard signaling protocol such as SIP or Jingle is used to initiate real-time communication, the signaling channel provides the identity of the participants and, optionally, authentication. In WebRTC, there are two non-signaling sources of identity. One is the context provided by the web application. For instance, a user of a WebRTC website might sign on with a particular screen name. When this user wishes to establish a session with another user, the web application presents the screen name to the other user as the identity. The other user can only trust the website that this identity is accurate. This is very similar to caller identity in the PSTN (Public Switched Telephone Network): a PSTN user must trust their service provider that the caller ID presented to their telephone is in fact the caller, because they have no other way of independently determining it. Alternatively, an identifier might be passed in a URL, which could contain a random token. The parties in a WebRTC session established this way would be those who knew the token.
WebRTC defines an alternative approach to identity through the media channel, which does not rely on trusting information from the website. The notion of media path identity was first proposed by the ZRTP [RFC6189] media path keying protocol, in which caller identity and authentication were provided in the media path without relying on the signaling channel at all. WebRTC uses DTLS-SRTP [RFC5763] to provide media path identity. This is done using the fingerprint of the public key used during the DTLS handshake. This fingerprint can be authenticated by the use of an Identity Provider, described in Section 10.4. The signaling channel is used to transport the fingerprint and the identity assertion but is not otherwise involved in the generation or validation of the identity assertion.
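The fingerprint travels in the SDP as an a=fingerprint attribute. The sketch below shows, under simplifying assumptions, how a signaled fingerprint could be compared with the fingerprint of the certificate actually seen in the DTLS handshake. The browser performs this check internally; the function names extractFingerprints and fingerprintMatches are hypothetical and exist only to illustrate the comparison.

```typescript
interface Fingerprint {
  algorithm: string; // hash function name, lower-cased, e.g. "sha-256"
  value: string;     // colon-separated hex bytes, upper-cased
}

// Collect every "a=fingerprint:<hash-func> <hex bytes>" attribute in the SDP.
function extractFingerprints(sdp: string): Fingerprint[] {
  const found: Fingerprint[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    const match = /^a=fingerprint:(\S+) ([0-9A-Fa-f:]+)$/.exec(line.trim());
    if (match) {
      found.push({
        algorithm: (match[1] ?? "").toLowerCase(),
        value: (match[2] ?? "").toUpperCase(),
      });
    }
  }
  return found;
}

// True when the fingerprint observed in the media path (computed from the
// certificate presented during the handshake) equals one that was signaled.
function fingerprintMatches(sdp: string, algorithm: string, observedHex: string): boolean {
  const wantedAlgorithm = algorithm.toLowerCase();
  const wantedValue = observedHex.toUpperCase();
  return extractFingerprints(sdp).some(
    (f) => f.algorithm === wantedAlgorithm && f.value === wantedValue
  );
}
```

A mismatch means the peer on the media path does not hold the key that was announced in signaling, so the session should not be trusted.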
4.1.4 Call Control
A conventional multimedia signaling protocol, such as SIP or Jingle, or some proprietary protocol, can provide call control of the session. The proprietary protocol could be extremely simple, such as the example in the demo application of Chapter 7. However, in WebRTC, while signaling is required to initiate or change a media session, signaling is not needed to indicate status or to terminate a session. Instead, the ICE state machine in the browser can provide this information. For example, as candidate addresses are being checked, this can provide progress information about the session. Once a session is established, if ICE continuing consent checks fail, this is an indication that the session has been terminated.
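As a rough sketch of how an application could derive call progress from ICE alone, the function below maps the ICE connection states exposed by the browser to user-visible status labels. The IceState values are those of RTCIceConnectionState in the WebRTC API; the CallStatus labels and the function name callStatusFromIce are assumptions chosen for this example.

```typescript
// The values of RTCIceConnectionState defined by the WebRTC API.
type IceState =
  | "new" | "checking" | "connected" | "completed"
  | "disconnected" | "failed" | "closed";

// Hypothetical status labels an application might display.
type CallStatus = "idle" | "connecting" | "in-call" | "interrupted" | "ended";

// Derive call status purely from the ICE state, with no signaling messages.
function callStatusFromIce(state: IceState): CallStatus {
  switch (state) {
    case "new":
      return "idle";
    case "checking":
      return "connecting"; // candidate pairs are being checked
    case "connected":
    case "completed":
      return "in-call";
    case "disconnected":
      return "interrupted"; // checks are failing, possibly only temporarily
    case "failed":
    case "closed":
      return "ended";
  }
}
```

In a browser, this could be driven from the Peer Connection's iceconnectionstatechange event by passing pc.iceConnectionState to callStatusFromIce, where pc is the application's RTCPeerConnection.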
4.1.5 Glare Resolution
Signaling protocols such as SIP have built-in glare resolution. Glare occurs when both sides of a communication session attempt to set up or change a session at the same time. It is a race condition that could result in a nondeterministic state for the session. Some of the approaches for using SDP eliminate many glare conditions, and if those approaches are incorporated in WebRTC, the requirements on glare could be greatly reduced. For example, if adding a new media source to a session can be done without a new offer/answer exchange, then this common source of glare can be eliminated.
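When glare does have to be handled by the web application, one possible approach is to assign each side a fixed role so that exactly one of them backs off. The sketch below follows the idea of the "perfect negotiation" example in the W3C WebRTC specification, which is a different technique from the SDP-level approaches mentioned above. The names GlareInput, GlareDecision, and decideOnRemoteOffer are hypothetical; the signaling state values are those of RTCSignalingState.

```typescript
// The values of RTCSignalingState defined by the WebRTC API.
type SignalingState =
  | "stable" | "have-local-offer" | "have-remote-offer"
  | "have-local-pranswer" | "have-remote-pranswer" | "closed";

interface GlareInput {
  polite: boolean;      // fixed role, agreed by the application in advance
  makingOffer: boolean; // true while a local offer is being created or sent
  signalingState: SignalingState;
}

type GlareDecision = "accept" | "ignore" | "rollback-then-accept";

// Decide what to do with an offer that has just arrived from the remote side.
function decideOnRemoteOffer(input: GlareInput): GlareDecision {
  // Glare: a remote offer arrives while a local offer is still outstanding.
  const collision = input.makingOffer || input.signalingState !== "stable";
  if (!collision) return "accept";
  // The polite side abandons its own offer; the impolite side ignores the
  // incoming one and waits for an answer to its own offer.
  return input.polite ? "rollback-then-accept" : "ignore";
}
```

Because the two sides hold opposite roles, a collision always resolves the same way: the impolite side's offer wins, and the session state stays deterministic.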