How WebRTC Works: An Ultimate Guide

10 Apr
Stan Reshetnyk
Background

There is hardly anyone who has not heard of Skype. It has become synonymous with real-time video calls. To use this or a similar communication solution, you need to install certain software on your device. But what if you could avoid loading your computer or phone with yet another program? What if you could connect with someone on the other side of the earth over a fast, high-quality connection right from your browser? Sounds great, right?

There is such a solution, and it is already being used by the largest companies, including Skype itself, YouTube and others. It is called WebRTC. And even if you are aware of this technology, you may be wondering how it is possible: how does WebRTC work? We are sure this article will help you find the answer to this question.


What is WebRTC?

WebRTC (Web Real-Time Communication) is a technology and a set of protocols and APIs for transferring real-time data directly between browsers or applications over a peer-to-peer connection. This means that with this solution users can communicate via text messages, video or voice without routing that traffic through a third-party server. To establish a connection, you only need a browser and access to the Internet.

 

The largest companies such as Google, Amazon and Facebook are using WebRTC technology to develop video chat applications, providing them with a better and more reliable connection.


How Does WebRTC Work?

We said above that WebRTC establishes peer-to-peer communication without the use of a third-party server. In fact, a server does play a small part in the process, which is why WebRTC is not fully P2P: before a direct connection can be established, some data must be passed between the clients.

The data required to initialize the connection (e.g. media encoding method, number and types of streams) is formed into an SDP description. The process of initializing a connection is often referred to as the offer-answer message flow.

The connection initiator sends its SDP, the “offer”, to the other call participants through the signaling server. They, in turn, generate their own SDP descriptions based on the information received and send them back to the initiator as the “answer”.

That covers the exchange of media information. But peers must also exchange network connection data, and for this, the Interactive Connectivity Establishment (ICE) protocol is used. The participants exchange ICE candidates, from which the best available connection route is chosen, and, finally, bi-directional data transfer is established.
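The order of the offer-answer steps can be sketched as a small state machine. This is a simplified illustration, not the browser's implementation: the state names follow the `signalingState` values of `RTCPeerConnection`, while `nextState` and its `Action` type are hypothetical helpers, and provisional answers, rollback, and re-applying an offer are left out.

```typescript
type SignalingState = "stable" | "have-local-offer" | "have-remote-offer";
type Action = "setLocal" | "setRemote"; // hypothetical names for the two "set description" steps
type SdpType = "offer" | "answer";

// Returns the state a peer is in after applying a local or remote description.
// Throws when the step is out of order (browsers reject such calls as well).
function nextState(state: SignalingState, action: Action, sdp: SdpType): SignalingState {
  if (state === "stable" && sdp === "offer") {
    // The initiator applies its own offer; the other peer applies the offer it received.
    return action === "setLocal" ? "have-local-offer" : "have-remote-offer";
  }
  if (state === "have-local-offer" && action === "setRemote" && sdp === "answer") {
    return "stable"; // the initiator received the answer
  }
  if (state === "have-remote-offer" && action === "setLocal" && sdp === "answer") {
    return "stable"; // the responder applied the answer it generated
  }
  throw new Error("cannot " + action + " " + sdp + " in state " + state);
}
```

Both peers start and end in `stable`: the initiator passes through `have-local-offer`, the responder through `have-remote-offer`.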

 

Note that if the participants in a WebRTC-enabled video call are on different networks, the connection has to pass through intermediate network devices (routers/gateways), and extra helper servers are needed to establish it. We will talk about them in the WebRTC Specifications & Components section. Now let’s see what is so special about WebRTC technology that such efforts are being made to bypass the need to use a third-party server.

Advantages of WebRTC

  • No software installation is required. 

WebRTC technology works great in all major browsers such as Chrome, Firefox, Safari, and Edge without the need to install additional applications.

  • Minimal delay (latency).

With a typical latency of less than 0.5 seconds, WebRTC is one of the fastest means of real-time data transfer.

  • High voice and video quality.

High-quality communication is ensured by built-in noise and echo suppression systems, as well as the flexibility of the media data stream, which can adapt to various communication conditions.

  • High-security level.

All connections are secure and encrypted according to the DTLS and SRTP protocols.

  • Cross-platform.

You can use WebRTC-based applications or browser extensions on various devices and operating systems.

  • Open-source.

This means that WebRTC is available for implementation in your product for online communication.

To get the most out of the benefits of peer-to-peer communication, you need to understand how to make it work effectively for you. This technology does not offer a one-size-fits-all solution, but a whole set of solutions, and a WebRTC development company will help you choose what will be best for you.


WebRTC Specifications & Components

ICE

ICE stands for Interactive Connectivity Establishment and is used to find all the ways two computers can communicate with each other. The ICE candidates contain all the details about the available communication methods: a direct connection for peers that are on the same network, a public address discovered with the help of a STUN server for peers behind NAT, and a relay through a TURN server when no direct route works.

When ICE candidates generated by the WebRTC framework are exchanged between peers via a signaling server, the best possible connection route is obtained.
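To make the idea of a candidate more concrete, here is a minimal sketch that reads the main fields of a candidate line as it appears in SDP. The `parseCandidate` function and the `IceCandidateInfo` type are hypothetical names for this illustration, and optional trailing fields such as `raddr` and `rport` are ignored.

```typescript
interface IceCandidateInfo {
  foundation: string;
  component: number; // 1 = RTP, 2 = RTCP
  protocol: string;  // "udp" or "tcp"
  priority: number;
  address: string;
  port: number;
  type: string;      // "host", "srflx", "prflx" or "relay"
}

// Parses "candidate:<foundation> <component> <protocol> <priority> <address> <port> typ <type> ..."
function parseCandidate(line: string): IceCandidateInfo {
  const parts = line
    .trim()
    .replace(/^a=/, "")
    .replace(/^candidate:/, "")
    .split(/\s+/);
  if (parts.length < 8 || parts[6] !== "typ") {
    throw new Error("not a valid ICE candidate line");
  }
  return {
    foundation: parts[0],
    component: Number(parts[1]),
    protocol: parts[2].toLowerCase(),
    priority: Number(parts[3]),
    address: parts[4],
    port: Number(parts[5]),
    type: parts[7],
  };
}
```

A `host` candidate is a local address, `srflx` (server reflexive) is a public address learned from a STUN server, and `relay` is an address allocated on a TURN server.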

TURN

TURN stands for Traversal Using Relays around NAT, and helps in traversing NAT (Network Address Translation) or firewalls. Why should they be traversed? The fact is that once SDP has been transferred through the signaling server, the peers try to connect to each other across their NATs. But the catch is that the private addresses assigned to computers in a private network are not reachable from outside, so they are not suitable for WebRTC-enabled video calls.

As a result, NAT and firewalls make it difficult for peers to communicate with each other. When a direct route cannot be found, these barriers are bypassed by relaying the data through an intermediate TURN server, which gives the peer a public relay address.

STUN

STUN is an abbreviated name of Session Traversal Utilities for NAT. Unlike TURN, it does not relay any data: it helps peers find their public IP addresses and ports, which they then exchange through a signaling server to establish a direct connection.
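For readers curious about what a STUN exchange looks like on the wire, the sketch below builds the 20-byte header of a Binding Request and decodes the IPv4 address a server would return in an XOR-MAPPED-ADDRESS attribute. The function names are hypothetical, only IPv4 is handled, and actually sending the packet over UDP is left out.

```typescript
const STUN_BINDING_REQUEST = 0x0001;
const STUN_MAGIC_COOKIE = [0x21, 0x12, 0xa4, 0x42]; // fixed value 0x2112A442

// Builds a Binding Request with no attributes: type, length, magic cookie, transaction ID.
function buildBindingRequest(transactionId: Uint8Array): Uint8Array {
  if (transactionId.length !== 12) {
    throw new Error("transaction ID must be 12 bytes");
  }
  const packet = new Uint8Array(20);
  packet[0] = STUN_BINDING_REQUEST >> 8;
  packet[1] = STUN_BINDING_REQUEST & 0xff;
  // Bytes 2-3 hold the message length; they stay zero because there are no attributes.
  packet.set(STUN_MAGIC_COOKIE, 4);
  packet.set(transactionId, 8);
  return packet;
}

// Decodes the 8-byte value of an IPv4 XOR-MAPPED-ADDRESS attribute.
// The port is XORed with the first two cookie bytes, the address with all four.
function decodeXorMappedAddress(value: Uint8Array): { address: string; port: number } {
  if (value.length !== 8 || value[1] !== 0x01) {
    throw new Error("expected an IPv4 XOR-MAPPED-ADDRESS value");
  }
  const port = ((value[2] ^ STUN_MAGIC_COOKIE[0]) << 8) | (value[3] ^ STUN_MAGIC_COOKIE[1]);
  const octets: number[] = [];
  for (let i = 0; i < 4; i++) {
    octets.push(value[4 + i] ^ STUN_MAGIC_COOKIE[i]);
  }
  return { address: octets.join("."), port: port };
}
```

The decoded address and port are what the peer then advertises as a server reflexive ICE candidate.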

RTP

RTP, short for Real-time Transport Protocol, defines a standard packet format for delivering audio and video over the Internet. RTP is used in conjunction with the RTP Control Protocol (RTCP). While RTP carries media streams (such as audio and video), RTCP is used to monitor transmission statistics and Quality of Service (QoS) and help keep multiple streams synchronized. Traditionally, RTP uses an even port number and the corresponding RTCP communication uses the next higher odd port number. In WebRTC, both are normally multiplexed on a single port.
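The packet format mentioned above starts with a 12-byte fixed header. As a minimal sketch, the hypothetical `parseRtpHeader` function below reads those fields from a raw packet. Header extensions, CSRC entries, and the payload itself are not parsed.

```typescript
interface RtpHeader {
  version: number;        // always 2 for current RTP
  padding: boolean;
  extension: boolean;
  csrcCount: number;
  marker: boolean;
  payloadType: number;    // identifies the codec, as negotiated in SDP
  sequenceNumber: number; // increases by one per packet, used to detect loss
  timestamp: number;      // sampling instant, used for playback timing
  ssrc: number;           // identifies the stream source
}

// Reads the 12-byte fixed RTP header (all fields are big-endian).
function parseRtpHeader(packet: Uint8Array): RtpHeader {
  if (packet.length < 12) {
    throw new Error("RTP packet is shorter than the fixed header");
  }
  const view = new DataView(packet.buffer, packet.byteOffset, packet.byteLength);
  const first = view.getUint8(0);
  const second = view.getUint8(1);
  return {
    version: first >> 6,
    padding: (first & 0x20) !== 0,
    extension: (first & 0x10) !== 0,
    csrcCount: first & 0x0f,
    marker: (second & 0x80) !== 0,
    payloadType: second & 0x7f,
    sequenceNumber: view.getUint16(2),
    timestamp: view.getUint32(4),
    ssrc: view.getUint32(8),
  };
}
```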

Signaling

The signaling server enables the exchange of metadata (SDP and ICE) between peers. As already stated above, the offer-answer message flow passes through it and the connection is established. After fulfilling its role, the signaling server is no longer involved in real-time streaming. Then there is real peer-to-peer communication without the participation of third-party servers.

SDP

SDP stands for Session Description Protocol. SDP is an important part of WebRTC. Earlier we mentioned that SDP describes the multimedia session parameters (media types, codecs, etc.). The information necessary to establish a connection is presented as plain text, which is sent through the signaling server.

This is necessary so that all routes and parameters are consistent, otherwise, it will not be possible to set up a connection.
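Because SDP is plain text, it is easy to inspect. The sketch below pulls the media sections and their codec names out of a session description. `parseSdpMedia` and `MediaSection` are hypothetical names, and only the `m=` and `a=rtpmap:` lines are read, so this is an illustration rather than a complete SDP parser.

```typescript
interface MediaSection {
  kind: string;     // "audio", "video" or "application"
  port: number;
  protocol: string; // e.g. "UDP/TLS/RTP/SAVPF"
  codecs: string[];
}

// Collects one entry per "m=" line and attaches the codecs listed in "a=rtpmap:" lines.
function parseSdpMedia(sdp: string): MediaSection[] {
  const sections: MediaSection[] = [];
  const lines = sdp.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const media = /^m=(\S+) (\d+) (\S+)/.exec(lines[i]);
    if (media) {
      sections.push({ kind: media[1], port: Number(media[2]), protocol: media[3], codecs: [] });
      continue;
    }
    // a=rtpmap:<payload type> <codec name>/<clock rate>[/<channels>]
    const rtpmap = /^a=rtpmap:\d+ ([^\/\s]+)\//.exec(lines[i]);
    if (rtpmap && sections.length > 0) {
      sections[sections.length - 1].codecs.push(rtpmap[1]);
    }
  }
  return sections;
}
```

For an offer with one Opus audio track and one VP8 video track, the result is two sections whose `codecs` arrays are `["opus"]` and `["VP8"]`.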

WebRTC APIs

MediaStream API

Before starting a video or voice call, the user must grant access to the webcam or microphone. Since this is a privacy issue, the browser asks for this permission either every time you start using the WebRTC video call application, or once for each domain.

The MediaStream API is an interface that provides a way to access your device’s camera and microphone. The API represents media streams (audio and video tracks), offers methods for managing them (for example, turning on an audio or video capture device, or a screen sharing function), and exposes information about the available media devices.
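In the browser, access is requested with `navigator.mediaDevices.getUserMedia(constraints)`, which resolves to a stream once the user agrees. The sketch below only shows how such a constraints object might be put together. `buildConstraints` and `CallOptions` are hypothetical helpers, the 1280x720 default is an assumption, and the local `Constraints` type mirrors just the few constraint fields used here.

```typescript
interface CallOptions {
  video: boolean;
  width?: number;
  height?: number;
}

// Mirrors the subset of the browser's media constraints used in this sketch.
interface Constraints {
  audio: { echoCancellation: boolean; noiseSuppression: boolean };
  video: false | { width: { ideal: number }; height: { ideal: number } };
}

// Builds constraints for an audio call, optionally with video at a preferred resolution.
function buildConstraints(options: CallOptions): Constraints {
  return {
    audio: { echoCancellation: true, noiseSuppression: true },
    video: options.video
      ? {
          width: { ideal: options.width ?? 1280 },   // assumed default
          height: { ideal: options.height ?? 720 },  // assumed default
        }
      : false,
  };
}

// In a browser, the result would be passed on like this:
//   const stream = await navigator.mediaDevices.getUserMedia(buildConstraints({ video: true }));
```

Using `ideal` rather than an exact value lets the browser fall back to the closest resolution the camera supports.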

RTCPeerConnection API

A peer-to-peer connection is the basis of WebRTC technology. This makes it unique and different from other streaming solutions that create connections through intermediate servers. The RTCPeerConnection API provides methods for establishing, maintaining, and monitoring a connection between peers. Its operation may not be obvious to users: it handles SDP negotiation, NAT traversal, codec handling, packet loss, and bandwidth management.

RTCDataChannel API

While the RTCPeerConnection and MediaStream APIs handle the network connection and media transfer, the RTCDataChannel API supports any type of data exchange (gaming, chat, and file transfer). This data channel is similar to the WebSocket API, but usually has lower latency, since the communication between peers is direct.


WebRTC Security

There are several dangers that are commonly associated with the use of applications or plug-ins for real-time communication:

  1. Interception of unencrypted data on the way to an intermediate server or browser.

  2. Installing malware or viruses with an application or plug-in for video communication.

  3. Video or sound recording and its distribution without the user’s knowledge.

WebRTC technology protects against these dangers in the following ways:

  1. All WebRTC media and data streams are encrypted, and data confidentiality is ensured by the DTLS and SRTP protocols.

  2. Using WebRTC in a browser does not require software installation, so this route for malware or viruses to get onto your device is closed.

  3. Access to the webcam or microphone is provided by users so that media resources cannot be activated without permission. And even if a camera or a microphone is used, it will be visible on the client’s user interface (for example, the microphone icon will light up).

WebRTC Architectures

Although the peer-to-peer architecture is the core of the WebRTC standard, it is not well suited for some use cases. Therefore, there are several topologies with their pros and cons, which can be successfully used in various applications.

Peer-to-peer Architecture

This is the best connection type for simple applications with a small number of users (no more than 2-3 conference participants). Peer-to-peer (P2P) topology does not require the participation of external servers (besides signaling and STUN/TURN), which makes data transfer faster. On the other hand, this topology is not suitable for advanced conference applications with large numbers of users. Since every participant has to send its stream to every other participant, the connection becomes unstable under such a load. In addition, this type of communication is not suitable for recording, which needs a central server.

Selective Forwarding Architecture

Selective Forwarding (SFU) Architecture is considered the golden mean among all topologies. It provides more participants (from 4 to 10) with high-quality communication with minimal delay.

In this type of connection, each session participant sends a stream of data to the server, which forwards it to other participants.

The disadvantage of the SFU topology is the need for additional server CPU power.

Multipoint Control Architecture

The Multipoint Control (MCU) architecture is well suited for advanced wide-ranging applications. In this type of topology, each participant sends media data to the MCU, which decodes and mixes the audio and video streams into one and sends the result to each participant. By reducing the download bandwidth each participant needs, the MCU is suitable for operation in poor network conditions even with a large number of users.

The downside is that decoding and mixing are CPU-intensive, and this load is moved to the server provider.
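The difference between the three topologies is easiest to see by counting streams. The sketch below gives a rough count of how many media streams each participant sends and receives. `streamsPerParticipant` is a hypothetical helper, and the model assumes one stream per participant with no simulcast or other optimizations.

```typescript
type Topology = "p2p" | "sfu" | "mcu";

interface StreamLoad {
  uploads: number;   // streams each participant sends
  downloads: number; // streams each participant receives
}

// Rough per-participant stream count for a call with the given number of participants.
function streamsPerParticipant(topology: Topology, participants: number): StreamLoad {
  if (participants < 2) {
    throw new Error("a call needs at least two participants");
  }
  const others = participants - 1;
  switch (topology) {
    case "p2p":
      // Every participant connects to every other participant directly.
      return { uploads: others, downloads: others };
    case "sfu":
      // One upload to the server, which forwards everyone else's streams.
      return { uploads: 1, downloads: others };
    case "mcu":
      // One upload to the server, one mixed stream back.
      return { uploads: 1, downloads: 1 };
  }
}
```

For a five-person call this gives 4 uploads and 4 downloads per participant in P2P, 1 and 4 with an SFU, and 1 and 1 with an MCU, which is why P2P stops scaling first.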

Hybrid Architecture

Hybrid architecture is a mixture of architectures, which can be chosen depending on priorities and needs. If the participation of a large number of people is necessary, it is preferable to use the MCU architecture. If recording is essential, SFU is the best option, and if you’re on a tight budget, you can start with a P2P architecture with the potential to expand as needed.


WebRTC Servers

WebRTC Application Servers 

WebRTC app servers are servers that host applications. When you open an application, the server serves up the web page, including HTML, CSS, JS, and images.

WebRTC Signaling Servers

The WebRTC signaling server acts as an intermediary in the transfer of metadata (ICE candidates and SDP). It is this server that the peers rely on when negotiating, establishing, and managing the connection between them.
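WebRTC does not prescribe how signaling is implemented, so the following is only one possible shape. It is an in-memory sketch of the relay logic, with the network transport (for example WebSocket) left out. `SignalingHub`, `SignalMessage`, and the peer IDs are hypothetical names for this illustration.

```typescript
interface SignalMessage {
  from: string;
  to: string;
  kind: "offer" | "answer" | "candidate";
  payload: string; // SDP text or an ICE candidate line
}

type Handler = (message: SignalMessage) => void;

// Forwards messages between connected peers without inspecting their content.
class SignalingHub {
  private handlers: Record<string, Handler> = {};

  join(peerId: string, handler: Handler): void {
    this.handlers[peerId] = handler;
  }

  leave(peerId: string): void {
    delete this.handlers[peerId];
  }

  // Returns false when the recipient is not connected.
  send(message: SignalMessage): boolean {
    if (!Object.prototype.hasOwnProperty.call(this.handlers, message.to)) {
      return false;
    }
    this.handlers[message.to](message);
    return true;
  }
}
```

The hub never touches the media itself: once the peers have exchanged their offers, answers, and candidates, it is no longer needed for the call.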

NAT Traversal Servers For WebRTC

As described earlier, NAT interferes with the correct operation of WebRTC sessions, so it should be bypassed using special servers. There are two such servers, STUN and TURN, which often go together.

STUN helps a device find its public IP address and share it with another peer for direct media transfer.

The TURN server is used to relay media through it and is invoked when the user is unable to contact other participants in the session directly.
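In practice, both servers are listed in the configuration passed to the peer connection. The sketch below builds such a list. The host names and credentials are placeholders, `buildIceServers` and `TurnAccount` are hypothetical helpers, and the ports are the default ones for STUN/TURN (3478) and TURN over TLS (5349).

```typescript
// Mirrors the fields of one entry in the browser's `iceServers` configuration.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface TurnAccount {
  host: string;
  username: string;
  credential: string;
}

// Always includes a STUN server; adds TURN (plain and over TLS) when an account is given.
function buildIceServers(stunHost: string, turn?: TurnAccount): IceServer[] {
  const servers: IceServer[] = [{ urls: "stun:" + stunHost + ":3478" }];
  if (turn) {
    servers.push({
      urls: ["turn:" + turn.host + ":3478", "turns:" + turn.host + ":5349"],
      username: turn.username,
      credential: turn.credential,
    });
  }
  return servers;
}

// In a browser, the result would be used like this:
//   new RTCPeerConnection({ iceServers: buildIceServers("stun.example.org") });
```

ICE tries the direct and STUN-derived routes first and falls back to the TURN relay only when they fail.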

WebRTC Media Servers

To perform complex tasks, media servers act as WebRTC clients, while working on the server side. You will need a media server if you need to make group calls, record, live broadcast or stream, and perform other non-trivial tasks.

Media servers come in various types. For example, MCU and SFU, which have already been mentioned in the WebRTC Architectures section.


When to Choose WebRTC?

WebRTC in Video Streaming

Ultra-low latency, or real-time, streaming allows viewers to be more involved and creates a more realistic experience. No wonder many major companies have switched to using WebRTC in their products. Among the big names are YouTube, Google, Snapchat, Slack and others. You can implement peer-to-peer streaming conferencing technology into a finished product or create a streaming platform from scratch. The difference will be in cost, timing, customization and how easy the product is to extend and update.

WebRTC for Corporate Video Chat Platforms

According to statistics in recent years, approximately 40% of Europeans work remotely. The dramatic transition to telework has led to the need for a sufficient number of corporate video chat platforms.

Reputable companies decided to develop their own custom, branded applications using WebRTC technology to coordinate workflow, eliminate communication barriers and ensure secure and fast file sharing.

Virtual conferences and meetings have proven to be as effective as real ones. Moreover, features such as recording, screen sharing, dashboard, the ability to communicate and share data via chat during the session, automatic subtitles and their translation into other languages create an environment for more productive work.

WebRTC in Multiplayer Games

The introduction of peer-to-peer video and audio technology into the gaming industry brings benefits to both developers and users.

Developers were able to take advantage of the ease of implementation and adaptability of WebRTC to the desired product. If we are talking about online games, you will agree that it is very convenient to have such a solution built right into the user’s browser, without the need to create a separate plug-in or software.

For the player, the quality of sound and the minimum delay in media transmission are very important. Gaming is often about speed and drive. It is not without reason that studies assign such merits to virtual sports as improved hand-eye coordination, quick reaction, and sharpened mental abilities. 

Now imagine that you are warning another player of danger or asking to cover you, but the sound is delayed! Bad scene!

It’s awesome that there is a WebRTC technology with a latency of less than 0.5 seconds that can save virtual lives.

WebRTC for Websites

Developers prefer WebRTC for its flexibility, which makes it easy for them to embed it into any website. So, if you already have a website, but you want to expand its capabilities by allowing users to communicate with each other or with website operators, the implementation of real-time peer-to-peer communication will be the right decision.

This is especially true for websites of various institutions (government, financial, medical, etc.), as well as online stores. Users do not need to resort to additional means of communication (mobile phone or email) to receive advice. Everything they need is already in their browser. It’s comfortable and timely. Using video/voice calls or chat is a guarantee of real-time assistance.

WebRTC in File Transfer Apps

We exchange information almost every day without even thinking about it. We can send a photo to a friend, share a song that caught our ears, or send an article right before a deadline. The files we send have different formats, sizes and importance levels. However, in any case, it is important for us that these files reach our addressee as quickly as possible and are not compromised on their way. Some people deal with documents of the highest importance, and they would not like to involve third-party servers or cloud services in their transfer.

With the WebRTC Data Channel API, files can be sent directly between user browsers. Data in any format and volume is transferred quickly through a peer-to-peer connection. And the most essential thing is WebRTC security, which is ensured by mandatory encryption of all data.
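Large files are usually not sent over a data channel in one piece. A common approach is to split them into small messages and reassemble them on the other side, as in the sketch below. The function names are hypothetical, and the 16 KiB chunk size is an assumption chosen as a conservative value. Message ordering and end-of-file signaling are left out.

```typescript
const CHUNK_SIZE = 16 * 1024; // assumed conservative message size

// Splits a file's bytes into chunks; each chunk would be passed to channel.send(chunk).
function splitIntoChunks(data: Uint8Array, chunkSize: number = CHUNK_SIZE): Uint8Array[] {
  if (chunkSize <= 0) {
    throw new Error("chunk size must be positive");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // subarray creates a view without copying; the end index is clamped to the data length.
    chunks.push(data.subarray(offset, offset + chunkSize));
  }
  return chunks;
}

// Reassembles the chunks on the receiving side, in the order they arrived.
function joinChunks(chunks: Uint8Array[]): Uint8Array {
  let total = 0;
  for (let i = 0; i < chunks.length; i++) {
    total += chunks[i].length;
  }
  const result = new Uint8Array(total);
  let offset = 0;
  for (let i = 0; i < chunks.length; i++) {
    result.set(chunks[i], offset);
    offset += chunks[i].length;
  }
  return result;
}
```

A reliable, ordered data channel (the default) delivers the chunks in the order they were sent, so the receiver can simply append them.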

The WebRTC Data Channel can be used as the main ingredient in developing applications solely for sending files. It can also extend the functionality of any video chat app or streaming platform, where users can send files to each other during or independently of a session.

And as a bonus, the Data Channel can be embedded into applications for remote control. For example, a smartphone app could use an RTCDataChannel to control your Smart TV.

WebRTC in Telemedicine

Telemedicine is an area that is making the most of the benefits of WebRTC peer-to-peer communication. It once again testifies to the reliability and efficiency of this technology. After all, who would risk the health and well-being of their patients?

The use of high-quality video communication allows doctors to consult patients in a cozy virtual office. The WebRTC Data Channel API is used to transfer e-prescriptions, health diaries, Electronic Health Records (EHR), and wearable health device data. This data must be transmitted over the most secure channel, and WebRTC provides such protection. 

Many telemedicine applications are actively deploying AI-powered chatbots to help with initial symptom analysis and suggest the next steps.

The use of WebRTC opens up new opportunities for better health care delivery. In addition, developers make sure that telemedicine applications are certified and comply with the applicable regulations, such as GDPR and HIPAA.

Conclusion

WebRTC is truly one of the most important communication solutions of our century. It can be applied in completely different industries and use cases. Knowing how it works will help you understand how you can get the most out of it in your particular case. And a WebRTC development company will help you implement your project using the most reliable technology.