Creating Your First WebRTC Application


This tutorial details creating a WebRTC app from start to finish, linking each feature to technical highlights for using the API.



Internet-based communication has long been integral to medium-sized and large enterprises. A fully equipped communication platform handles video, audio, text, and email in one place. Traditionally, each channel required a different technology, and video and audio in particular often demanded dedicated plug-ins. The WebRTC API changes this by bringing real-time communication into the browser itself.

Let’s walk through creating our first HTML WebRTC application that supports both audio and video communication.

Before we start, note that WebRTC is a web browser API, so your browser must support it. Both Mozilla Firefox and Google Chrome support WebRTC.


Rumors Spread

Some application platforms claim to be ‘WebRTC enabled’ when in reality they support only getUserMedia and none of the other RTC components.

Hence, check the platform carefully before you start development.

Technical Highlights of WebRTC

Let’s take a close look at what a WebRTC application is expected to do:

  1. Stream audio, video, or other data.

  2. Get network information such as IP addresses and ports, and exchange it with other WebRTC clients (peers) to enable connections, even through NATs and firewalls.

  3. Coordinate signaling communication to report errors and to initiate or close sessions.

  4. Communicate streaming video, audio, or data.

  5. Exchange information about media and client capabilities, such as resolution and codecs.

WebRTC implements these functions through three main APIs:

  1. MediaStream (aka getUserMedia)

  2. RTCPeerConnection

  3. RTCDataChannel

Let’s take a close look at these APIs.

1. MediaStream: MediaStream (also known as getUserMedia) provides access to data streams, for example from the user’s camera and microphone. It is available in Chrome, Firefox, and Opera. The MediaStream API represents synchronized streams of media: for example, a stream taken from camera and microphone input has synchronized video and audio tracks. Each MediaStream has an input, which might be a MediaStream generated by navigator.getUserMedia(), and an output, which might be passed to a video element or an RTCPeerConnection.

The getUserMedia() method takes three parameters:

    1. A constraints object.

    2. Success callback which, if called, is passed a MediaStream.

    3. Failure callback which, if called, is passed an error object.
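As a rough illustration of the three-parameter call described above, the sketch below wraps it so the same success and failure callbacks work whether the browser exposes the older callback-style method (possibly vendor-prefixed) or the newer promise-based navigator.mediaDevices.getUserMedia(). The helper name requestMedia and the idea of passing the navigator object in as an argument are assumptions made for this example, not part of the WebRTC API.

```javascript
// Sketch: call getUserMedia through whichever API the browser exposes.
// `nav` is normally window.navigator; it is a parameter here (an assumption
// of this example) so the function can be exercised with a stand-in object.
function requestMedia(nav, constraints, onSuccess, onError) {
  // Newer, promise-based API.
  if (nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function') {
    nav.mediaDevices.getUserMedia(constraints).then(onSuccess, onError);
    return 'promise';
  }
  // Older, callback-based API, possibly vendor-prefixed.
  var legacy = nav.getUserMedia || nav.webkitGetUserMedia || nav.mozGetUserMedia;
  if (typeof legacy === 'function') {
    legacy.call(nav, constraints, onSuccess, onError);
    return 'callback';
  }
  // Neither API is available.
  onError(new Error('getUserMedia is not supported'));
  return 'unsupported';
}

// In a browser, usage might look like this:
// requestMedia(navigator, {video: true},
//   function (stream) { console.log('Got stream', stream); },
//   function (error) { console.log('getUserMedia error: ', error); });
```

The return value only reports which path was taken, which can be handy when debugging across browsers.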

Each MediaStream has a label, such as ’Xk7EuLhsuHKbnjLWkW4yYGNJJ8ONsgwHBvLQ’. An array of MediaStreamTracks is returned by the getAudioTracks() and getVideoTracks() methods.

For the simpl.info/gum example, stream.getAudioTracks() returns an empty array (because there’s no audio) and, assuming a working webcam is connected, stream.getVideoTracks() returns an array of one MediaStreamTrack representing the stream from the webcam. Each MediaStreamTrack has a kind (‘video’ or ‘audio’) and a label (like ‘FaceTime HD Camera (Built-in)’), and represents one or more channels of either audio or video. In this case, there is only one video track and no audio, but you can easily imagine use cases where there are more: for example, a chat application that gets streams from the front camera, rear camera, microphone, and a ‘screenshared’ application.
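To make the track model concrete, here is a minimal sketch that lists the kind and label of every track in a stream. The function name describeTracks and the stand-in stream object are assumptions for illustration; in a browser you would pass the MediaStream handed to your success callback.

```javascript
// Sketch: summarize the audio and video tracks of a MediaStream-like object.
function describeTracks(stream) {
  var tracks = stream.getAudioTracks().concat(stream.getVideoTracks());
  return tracks.map(function (track) {
    return track.kind + ': ' + track.label;
  });
}

// Stand-in for a video-only stream (hypothetical object, not a real MediaStream).
var exampleStream = {
  getAudioTracks: function () { return []; },
  getVideoTracks: function () {
    return [{ kind: 'video', label: 'FaceTime HD Camera (Built-in)' }];
  }
};

console.log(describeTracks(exampleStream));
```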

Note that a stream from getUserMedia can also be used as an input node for the Web Audio API:

function gotStream(stream) {
    window.AudioContext = window.AudioContext || window.webkitAudioContext;
    var audioContext = new AudioContext();

    // Create an AudioNode from the stream
    var mediaStreamSource = audioContext.createMediaStreamSource(stream);

    // Connect it to destination to hear yourself
    // or any other node for processing!
    mediaStreamSource.connect(audioContext.destination);
}

navigator.getUserMedia({audio:true}, gotStream);

getUserMedia can also be used in Chromium-based apps and extensions. Adding audioCapture and/or videoCapture permissions to the manifest allows permission to be requested and granted only once, on installation. Thereafter, the user is not asked for camera or microphone access.

Similarly, on pages served over HTTPS, permission for getUserMedia() only has to be granted once. The first time, an Always Allow button is displayed in the browser’s infobar.

The intention is to enable a MediaStream for any streaming data source, not just a camera or microphone. This would allow streaming from disc or from arbitrary data sources such as other inputs and sensors.

Note that getUserMedia() must be used on a server, not the local file system, otherwise a PERMISSION_DENIED: 1 error will be thrown.
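A quick pre-flight check can save some head-scratching here. The sketch below inspects a location-like object and guesses whether camera access is worth attempting. Current browsers are generally stricter than the note above and expose camera and microphone access only in secure contexts, meaning HTTPS or localhost. The helper name canRequestMedia and its exact rules are assumptions for illustration, not part of any WebRTC API.

```javascript
// Sketch: guess whether the page origin allows camera/microphone requests.
// `loc` is normally window.location; it is a parameter so the check can be
// tried with plain objects.
function canRequestMedia(loc) {
  if (loc.protocol === 'https:') {
    return true;
  }
  if (loc.protocol === 'http:') {
    // Plain HTTP is typically acceptable only for local development.
    return loc.hostname === 'localhost' || loc.hostname === '127.0.0.1';
  }
  // file: and other schemes are rejected.
  return false;
}

// In a browser: if (!canRequestMedia(window.location)) { /* warn the user */ }
```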

2. RTCPeerConnection: audio or video calling, with facilities for encryption and bandwidth management. It is supported in Chrome (desktop and Android), Opera (desktop and Android), and Firefox. RTCPeerConnection is implemented by Chrome and Opera as webkitRTCPeerConnection and by Firefox as mozRTCPeerConnection. There’s an ultra-simple demo of Chromium’s RTCPeerConnection implementation at simpl.info/pc and a great video chat application at apprtc.appspot.com. That app uses adapter.js, a JavaScript shim maintained by Google that abstracts away browser differences and spec changes.

RTCPeerConnection is the WebRTC component that handles stable and efficient communication of streaming data between peers.

Let’s take a close look at the WebRTC architecture diagram, which shows the role of RTCPeerConnection.

Looking at the diagram from a JavaScript perspective, RTCPeerConnection shields web developers from a myriad of complexities. The codecs and protocols used by WebRTC do a huge amount of work to make real-time communication possible, even over unreliable networks:

  • Packet loss concealment

  • Echo cancellation

  • Bandwidth adaptivity

  • Dynamic jitter buffering

  • Automatic gain control

  • Noise reduction and suppression

  • Image ‘cleaning.’

Let’s discuss signaling:

2 (a) Signaling: session control, network and media information

WebRTC uses RTCPeerConnection to communicate streaming data between browsers, but it also needs a mechanism to coordinate communication and to send control messages. This process is known as signaling. Signaling is not part of the RTCPeerConnection API: WebRTC app developers can choose whatever messaging protocol they prefer, such as SIP or XMPP, and any appropriate duplex (two-way) communication channel.

In summary, a WebRTC session involves four exchanges: users discover each other and exchange ‘real world’ details such as names; WebRTC client applications exchange network information; peers exchange data about media, such as video format and resolution; and client applications traverse NAT gateways and firewalls.
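Because the signaling protocol is left to the application, it helps to pin down the message vocabulary early. The sketch below classifies incoming signaling messages using the message shapes that appear later in this tutorial (the strings 'got user media' and 'bye', and objects with a type of 'offer', 'answer', or 'candidate'). The function name classifySignal and the returned labels are assumptions chosen for this example.

```javascript
// Sketch: classify a signaling message so the caller can decide what to do.
function classifySignal(message) {
  if (message === 'got user media') {
    return 'peer-ready';
  }
  if (message === 'bye') {
    return 'hangup';
  }
  if (message !== null && typeof message === 'object') {
    if (message.type === 'offer' ||
        message.type === 'answer' ||
        message.type === 'candidate') {
      return message.type;
    }
  }
  return 'unknown';
}

console.log(classifySignal({ type: 'offer', sdp: '...' }));
console.log(classifySignal('bye'));
```

Keeping the classification separate from the side effects (creating answers, adding candidates) makes the message handling easier to test without a browser.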

This also means that WebRTC needs four types of server-side functionality:

    • User discovery and communication.

    • Signaling.

    • NAT/firewall traversal.

    • Relay servers in case peer-to-peer communication fails.
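NAT traversal and relaying are usually configured by handing RTCPeerConnection a list of STUN and TURN servers. The sketch below builds such a configuration object. It uses the urls key from the current specification, whereas the older code in this tutorial uses url. The helper name buildIceConfig is an assumption, and the TURN address and credentials shown in the comment are placeholders, not real servers.

```javascript
// Sketch: build an RTCPeerConnection configuration with one STUN server
// and an optional TURN relay.
function buildIceConfig(stunUrl, turn) {
  var iceServers = [{ urls: stunUrl }];
  if (turn) {
    iceServers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential
    });
  }
  return { iceServers: iceServers };
}

var stunOnly = buildIceConfig('stun:stun.l.google.com:19302');
console.log(JSON.stringify(stunOnly));

// With a relay (placeholder values):
// buildIceConfig('stun:stun.l.google.com:19302',
//   { url: 'turn:turn.example.org:3478', username: 'user', credential: 'secret' });
// In a browser: var pc = new RTCPeerConnection(stunOnly);
```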

3. RTCDataChannel: peer-to-peer communication of generic data. The API is supported in Chrome 25, Opera 18, and Firefox 22 and later.
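A data channel can only carry data once its readyState is 'open', so a small guard around send() avoids exceptions while the connection is still being set up. The following sketch shows one way to write that guard. The function name safeSend and the stand-in channel object are assumptions for illustration.

```javascript
// Sketch: send text over a data channel only when it is open.
// Returns true if the data was handed to the channel, false otherwise.
function safeSend(channel, text) {
  if (!channel || channel.readyState !== 'open') {
    return false;
  }
  channel.send(text);
  return true;
}

// Stand-in channel (hypothetical object) that records what was sent.
var sentMessages = [];
var exampleChannel = {
  readyState: 'connecting',
  send: function (data) { sentMessages.push(data); }
};

console.log(safeSend(exampleChannel, 'too early')); // channel not open yet
exampleChannel.readyState = 'open';
console.log(safeSend(exampleChannel, 'hello'));
```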

Let the Coding BEGIN!

This part assumes you have a good working knowledge of JavaScript, HTML, CSS, and Node.js.

Let’s get started with the coding part.

Step 1: Create a blank HTML5 document

Create a bare-bones HTML document.

<script type="text/javascript" src="js/lib/adapter.js"></script>
<script type="text/javascript">
  // Your application code goes here.
</script>

Step 2: Get video from your webcam

  1. Add a video element to your page (for example, <video autoplay></video>).

  2. Add the JavaScript below to the script element on your page, so that getUserMedia() sets the source of the video from the webcam.

  3. Test it out locally.

var constraints = {video: true};

function successCallback(localMediaStream) {
  window.stream = localMediaStream; // stream available to console
  var video = document.querySelector("video");
  // In current browsers, use video.srcObject = localMediaStream instead.
  video.src = window.URL.createObjectURL(localMediaStream);
}

function errorCallback(error){
  console.log("navigator.getUserMedia error: ", error);
}

navigator.getUserMedia(constraints, successCallback, errorCallback);

Have a close look at what exactly we did here. We first called getUserMedia using:

navigator.getUserMedia(constraints, successCallback, errorCallback);

The constraints argument allows us to specify the media to get, in this case video only:

var constraints = {"video": true}
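Constraints can ask for more than a plain video stream. As a hedged sketch, the helper below builds a constraints object with optional audio and a preferred resolution. It uses the constraint syntax of the current promise-based API (ideal values); older implementations, like the one this tutorial targets, used a different mandatory/optional format, so treat the exact shape as an assumption to verify against your target browser. The helper name buildConstraints is hypothetical.

```javascript
// Sketch: build a getUserMedia constraints object from simple options.
function buildConstraints(options) {
  options = options || {};
  var video = true; // default: any video the device offers
  if (options.width || options.height) {
    video = {};
    if (options.width) {
      video.width = { ideal: options.width };
    }
    if (options.height) {
      video.height = { ideal: options.height };
    }
  }
  return { audio: Boolean(options.audio), video: video };
}

console.log(JSON.stringify(buildConstraints()));
console.log(JSON.stringify(buildConstraints({ audio: true, width: 1280, height: 720 })));
```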

If successful, the video stream from the webcam is set as the source of the video element:

function successCallback(localMediaStream) {
  window.stream = localMediaStream; // stream available to console
  var video = document.querySelector("video");
  // In current browsers, use video.srcObject = localMediaStream instead.
  video.src = window.URL.createObjectURL(localMediaStream);
}

Step 3: Set up a signaling server and start exchanging messages

Real-world applications do not have the sender and receiver RTCPeerConnections on the same page, so we need a way for them to communicate metadata.

For this we need a signaling server: a server that can exchange messages between a WebRTC app (client) running in one browser and a client in another browser. The actual messages are stringified JavaScript objects.
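To show what "stringified JavaScript objects" means in practice, here is a minimal sketch of encoding and decoding signaling messages as JSON. socket.io can serialize objects for you, so this matters mainly if you swap in another transport. The function names encodeSignal and decodeSignal and the validation rules are assumptions for this example, and the candidate string is made up.

```javascript
// Sketch: turn a signaling message into text and back.
function encodeSignal(message) {
  return JSON.stringify(message);
}

// Returns the decoded message, or null if the text is not a usable message.
function decodeSignal(text) {
  var message;
  try {
    message = JSON.parse(text);
  } catch (e) {
    return null; // not valid JSON
  }
  var isControlString = typeof message === 'string';
  var isTypedObject = message !== null &&
      typeof message === 'object' &&
      typeof message.type === 'string';
  return (isControlString || isTypedObject) ? message : null;
}

var wire = encodeSignal({
  type: 'candidate',
  label: 0,
  id: 'audio',
  candidate: 'candidate:0 1 UDP 2122252543 192.0.2.1 54321 typ host' // made-up value
});
console.log(wire);
console.log(decodeSignal(wire).type);
```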

In this step we’ll build a simple Node.js signaling server, using the socket.io Node module and JavaScript library for messaging. Experience with Node.js and socket.io will be useful but is not crucial, since the messaging components are very simple. In this example, the server (the Node app) is server.js and the client (the web app) is index.html.

The Node server application in this step has two tasks.

    1. To act as a messaging intermediary:

    socket.on('message', function (message) {
      log('Got message: ', message);
      socket.broadcast.emit('message', message);
    });

    2. To manage WebRTC video chat ‘rooms’:

if (numClients == 0){
  socket.join(room);
  socket.emit('created', room);
} else if (numClients == 1) {
  io.sockets.in(room).emit('join', room);
  socket.join(room);
  socket.emit('joined', room);
} else { // max two clients
  socket.emit('full', room);
}
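The room logic is easier to reason about when separated from socket.io. The sketch below mirrors the branching above as a pure function and adds a tiny in-memory room table to try it out. The names decideRoomAction and joinRoom and the rooms object are assumptions for illustration and are not part of socket.io.

```javascript
// Sketch: decide what happens when a client asks to create or join a room,
// given how many clients are already in it (maximum of two per room).
function decideRoomAction(numClients) {
  if (numClients === 0) {
    return 'created';
  }
  if (numClients === 1) {
    return 'joined';
  }
  return 'full';
}

// Minimal in-memory bookkeeping to exercise the rule.
function joinRoom(rooms, room, clientId) {
  var members = rooms[room] || (rooms[room] = []);
  var action = decideRoomAction(members.length);
  if (action !== 'full') {
    members.push(clientId);
  }
  return action;
}

var rooms = {};
console.log(joinRoom(rooms, 'foo', 'alice'));
console.log(joinRoom(rooms, 'foo', 'bob'));
console.log(joinRoom(rooms, 'foo', 'carol'));
```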

Our simple WebRTC application will only permit a maximum of two peers to share a room. Ensure you have Node, socket.io and node-static installed. To install socket.io and node-static, run Node Package Manager from a terminal in your application directory:

npm install socket.io
npm install node-static

You should now have three separate files: index.html, server.js, and your main JavaScript file, main.js. They should look something like this.

1. server.js

var static = require('node-static');
var http = require('http');
var file = new(static.Server)();
// Serve the static files; port 2013 matches the socket.io URL in index.html.
var app = http.createServer(function (req, res) {
  file.serve(req, res);
}).listen(2013);

var io = require('socket.io').listen(app);

io.sockets.on('connection', function (socket){

  // convenience function to log server messages on the client
  function log(){
    var array = [">>> Message from server: "];
    for (var i = 0; i < arguments.length; i++) {
      array.push(arguments[i]);
    }
    socket.emit('log', array);
  }

  socket.on('message', function (message) {
    log('Got message:', message);
    // for a real app, would be room only (not broadcast)
    socket.broadcast.emit('message', message);
  });

  socket.on('create or join', function (room) {
    // io.sockets.clients(room) is the socket.io 0.9 API; later versions replaced it.
    var numClients = io.sockets.clients(room).length;

    log('Room ' + room + ' has ' + numClients + ' client(s)');
    log('Request to create or join room ' + room);

    if (numClients === 0){
      socket.join(room);
      socket.emit('created', room);
    } else if (numClients === 1) {
      io.sockets.in(room).emit('join', room);
      socket.join(room);
      socket.emit('joined', room);
    } else { // max two clients
      socket.emit('full', room);
    }
    socket.emit('emit(): client ' + socket.id + ' joined room ' + room);
    socket.broadcast.emit('broadcast(): client ' + socket.id + ' joined room ' + room);
  });

});



2. index.html

<!DOCTYPE html>
<html>
<head><title>WebRTC client</title></head>
<body>
<script type="text/javascript" src="http://localhost:2013/socket.io/socket.io.js"></script>
<script type="text/javascript" src="js/lib/adapter.js"></script>
<script type="text/javascript" src="js/main.js"></script>
</body>
</html>

3. main.js

var isInitiator;

var room = prompt("Enter room name:");

var socket = io.connect();

if (room !== "") {
  console.log('Joining room ' + room);
  socket.emit('create or join', room);
}

socket.on('full', function (room){
  console.log('Room ' + room + ' is full');
});

// The server emits 'created' to the first client in an empty room.
socket.on('created', function (room){
  isInitiator = true;
  console.log('Room ' + room + ' is empty');
});

socket.on('join', function (room){
  console.log('Making request to join room ' + room);
  console.log('You are the initiator!');
});

socket.on('log', function (array){
  console.log.apply(console, array);
});

To start the server, run the following command from a terminal in your application directory:

node server.js

Step 4: Connecting everything

Now we are going to connect everything. We will add signaling to the video client created in Step 2, and we will also add RTCPeerConnection and RTCDataChannel to our app.

1. Add the following code to your main HTML file:

<div id="container">
  <div id="videos">
    <video id="localVideo" autoplay muted></video>
    <video id="remoteVideo" autoplay></video>
  </div>
  <div id="textareas">
    <textarea id="dataChannelSend" disabled="disabled"></textarea>
    <textarea id="dataChannelReceive" disabled="disabled"></textarea>
  </div>
  <button id="sendButton" disabled="disabled">Send</button>
</div>

2. Your main JavaScript file, main.js, would look something like this:

'use strict';

var sendChannel;
var sendButton = document.getElementById("sendButton");
var sendTextarea = document.getElementById("dataChannelSend");
var receiveTextarea = document.getElementById("dataChannelReceive");

sendButton.onclick = sendData;

var isChannelReady;
var isInitiator;
var isStarted;
var localStream;
var pc;
var remoteStream;
var turnReady;

var pc_config = webrtcDetectedBrowser === 'firefox' ?
  {'iceServers':[{'url':'stun:'}]} : // number IP
  {'iceServers': [{'url': 'stun:stun.l.google.com:19302'}]};

var pc_constraints = {
  'optional': [
    {'DtlsSrtpKeyAgreement': true},
    {'RtpDataChannels': true}
  ]};

// Set up audio and video regardless of what devices are present.
var sdpConstraints = {'mandatory': {
  'OfferToReceiveAudio':true,
  'OfferToReceiveVideo':true }};


var room = location.pathname.substring(1);
if (room === '') {
//  room = prompt('Enter room name:');
  room = 'foo';
}

var socket = io.connect();

if (room !== '') {
  console.log('Create or join room', room);
  socket.emit('create or join', room);
}

socket.on('created', function (room){
  console.log('Created room ' + room);
  isInitiator = true;
});

socket.on('full', function (room){
  console.log('Room ' + room + ' is full');
});

socket.on('join', function (room){
  console.log('Another peer made a request to join room ' + room);
  console.log('This peer is the initiator of room ' + room + '!');
  isChannelReady = true;
});

socket.on('joined', function (room){
  console.log('This peer has joined room ' + room);
  isChannelReady = true;
});

socket.on('log', function (array){
  console.log.apply(console, array);
});


function sendMessage(message){
  console.log('Sending message: ', message);
  socket.emit('message', message);
}

socket.on('message', function (message){
  console.log('Received message:', message);
  if (message === 'got user media') {
    maybeStart();
  } else if (message.type === 'offer') {
    if (!isInitiator && !isStarted) {
      maybeStart();
    }
    pc.setRemoteDescription(new RTCSessionDescription(message));
    doAnswer();
  } else if (message.type === 'answer' && isStarted) {
    pc.setRemoteDescription(new RTCSessionDescription(message));
  } else if (message.type === 'candidate' && isStarted) {
    var candidate = new RTCIceCandidate({sdpMLineIndex:message.label,
      candidate:message.candidate});
    pc.addIceCandidate(candidate);
  } else if (message === 'bye' && isStarted) {
    handleRemoteHangup();
  }
});


var localVideo = document.querySelector('#localVideo');
var remoteVideo = document.querySelector('#remoteVideo');

function handleUserMedia(stream) {
  localStream = stream;
  attachMediaStream(localVideo, stream);
  console.log('Adding local stream.');
  sendMessage('got user media');
  if (isInitiator) {
    maybeStart();
  }
}

function handleUserMediaError(error){
  console.log('getUserMedia error: ', error);
}

var constraints = {video: true};

getUserMedia(constraints, handleUserMedia, handleUserMediaError);
console.log('Getting user media with constraints', constraints);

if (location.hostname != "localhost") {
  // Off localhost, call requestTurn() here with the URL of your TURN credential service.
}

function maybeStart() {
  if (!isStarted && localStream && isChannelReady) {
    createPeerConnection();
    pc.addStream(localStream);
    isStarted = true;
    if (isInitiator) {
      doCall();
    }
  }
}

window.onbeforeunload = function(e){
  sendMessage('bye');
};


function createPeerConnection() {
  try {
    pc = new RTCPeerConnection(pc_config, pc_constraints);
    pc.onicecandidate = handleIceCandidate;
    console.log('Created RTCPeerConnnection with:\n' +
      '  config: \'' + JSON.stringify(pc_config) + '\';\n' +
      '  constraints: \'' + JSON.stringify(pc_constraints) + '\'.');
  } catch (e) {
    console.log('Failed to create PeerConnection, exception: ' + e.message);
    alert('Cannot create RTCPeerConnection object.');
    return;
  }
  pc.onaddstream = handleRemoteStreamAdded;
  pc.onremovestream = handleRemoteStreamRemoved;

  if (isInitiator) {
    try {
      // Reliable Data Channels not yet supported in Chrome
      sendChannel = pc.createDataChannel("sendDataChannel",
        {reliable: false});
      sendChannel.onmessage = handleMessage;
      trace('Created send data channel');
    } catch (e) {
      alert('Failed to create data channel. ' +
            'You need Chrome M25 or later with RtpDataChannel enabled');
      trace('createDataChannel() failed with exception: ' + e.message);
      return;
    }
    sendChannel.onopen = handleSendChannelStateChange;
    sendChannel.onclose = handleSendChannelStateChange;
  } else {
    pc.ondatachannel = gotReceiveChannel;
  }
}

function sendData() {
  var data = sendTextarea.value;
  sendChannel.send(data);
  trace('Sent data: ' + data);
}

function gotReceiveChannel(event) {
  trace('Receive Channel Callback');
  sendChannel = event.channel;
  sendChannel.onmessage = handleMessage;
  sendChannel.onopen = handleReceiveChannelStateChange;
  sendChannel.onclose = handleReceiveChannelStateChange;
}

function handleMessage(event) {
  trace('Received message: ' + event.data);
  receiveTextarea.value = event.data;
}

function handleSendChannelStateChange() {
  var readyState = sendChannel.readyState;
  trace('Send channel state is: ' + readyState);
  enableMessageInterface(readyState == "open");
}

function handleReceiveChannelStateChange() {
  var readyState = sendChannel.readyState;
  trace('Receive channel state is: ' + readyState);
  enableMessageInterface(readyState == "open");
}

// Enable or disable the text area and Send button together.
function enableMessageInterface(shouldEnable) {
  if (shouldEnable) {
    sendTextarea.disabled = false;
    sendTextarea.placeholder = "";
    sendButton.disabled = false;
  } else {
    sendTextarea.disabled = true;
    sendButton.disabled = true;
  }
}

function handleIceCandidate(event) {
  console.log('handleIceCandidate event: ', event);
  if (event.candidate) {
    // Forward each local ICE candidate to the other peer via signaling.
    sendMessage({
      type: 'candidate',
      label: event.candidate.sdpMLineIndex,
      id: event.candidate.sdpMid,
      candidate: event.candidate.candidate});
  } else {
    console.log('End of candidates.');
  }
}

function doCall() {
  var constraints = {'optional': [], 'mandatory': {'MozDontOfferDataChannel': true}};
  // temporary measure to remove Moz* constraints in Chrome
  if (webrtcDetectedBrowser === 'chrome') {
    for (var prop in constraints.mandatory) {
      if (prop.indexOf('Moz') !== -1) {
        delete constraints.mandatory[prop];
      }
    }
  }
  constraints = mergeConstraints(constraints, sdpConstraints);
  console.log('Sending offer to peer, with constraints: \n' +
    '  \'' + JSON.stringify(constraints) + '\'.');
  pc.createOffer(setLocalAndSendMessage, null, constraints);
}

function doAnswer() {
  console.log('Sending answer to peer.');
  pc.createAnswer(setLocalAndSendMessage, null, sdpConstraints);
}

function mergeConstraints(cons1, cons2) {
  var merged = cons1;
  for (var name in cons2.mandatory) {
    merged.mandatory[name] = cons2.mandatory[name];
  }
  return merged;
}

function setLocalAndSendMessage(sessionDescription) {
  // Set Opus as the preferred codec in SDP if Opus is present.
  sessionDescription.sdp = preferOpus(sessionDescription.sdp);
  pc.setLocalDescription(sessionDescription);
  sendMessage(sessionDescription);
}

function requestTurn(turn_url) {
  var turnExists = false;
  for (var i in pc_config.iceServers) {
    if (pc_config.iceServers[i].url.substr(0, 5) === 'turn:') {
      turnExists = true;
      turnReady = true;
      break;
    }
  }
  if (!turnExists) {
    console.log('Getting TURN server from ', turn_url);
    // No TURN server. Get one from computeengineondemand.appspot.com:
    var xhr = new XMLHttpRequest();
    xhr.onreadystatechange = function(){
      if (xhr.readyState === 4 && xhr.status === 200) {
        var turnServer = JSON.parse(xhr.responseText);
        console.log('Got TURN server: ', turnServer);
        pc_config.iceServers.push({
          'url': 'turn:' + turnServer.username + '@' + turnServer.turn,
          'credential': turnServer.password
        });
        turnReady = true;
      }
    };
    xhr.open('GET', turn_url, true);
    xhr.send();
  }
}

function handleRemoteStreamAdded(event) {
  console.log('Remote stream added.');
  // reattachMediaStream(miniVideo, localVideo);
  attachMediaStream(remoteVideo, event.stream);
  remoteStream = event.stream;
  // waitForRemoteVideo();
}

function handleRemoteStreamRemoved(event) {
  console.log('Remote stream removed. Event: ', event);
}

function hangup() {
  console.log('Hanging up.');
  stop();
  sendMessage('bye');
}

function handleRemoteHangup() {
  console.log('Session terminated.');
  stop();
  isInitiator = false;
}

function stop() {
  isStarted = false;
  // isAudioMuted = false;
  // isVideoMuted = false;
  pc.close();
  pc = null;
}


// Set Opus as the default audio codec if it's present.
function preferOpus(sdp) {
  var sdpLines = sdp.split('\r\n');
  var mLineIndex = null;
  // Search for m line.
  for (var i = 0; i < sdpLines.length; i++) {
    if (sdpLines[i].search('m=audio') !== -1) {
      mLineIndex = i;
      break;
    }
  }
  if (mLineIndex === null) {
    return sdp;
  }

  // If Opus is available, set it as the default in m line.
  for (i = 0; i < sdpLines.length; i++) {
    if (sdpLines[i].search('opus/48000') !== -1) {
      var opusPayload = extractSdp(sdpLines[i], /:(\d+) opus\/48000/i);
      if (opusPayload) {
        sdpLines[mLineIndex] = setDefaultCodec(sdpLines[mLineIndex], opusPayload);
      }
      break;
    }
  }

  // Remove CN in m line and sdp.
  sdpLines = removeCN(sdpLines, mLineIndex);

  sdp = sdpLines.join('\r\n');
  return sdp;
}

function extractSdp(sdpLine, pattern) {
  var result = sdpLine.match(pattern);
  return result && result.length === 2 ? result[1] : null;
}

// Set the selected codec to the first in m line.
function setDefaultCodec(mLine, payload) {
  var elements = mLine.split(' ');
  var newLine = [];
  var index = 0;
  for (var i = 0; i < elements.length; i++) {
    if (index === 3) {
      // Format of media starts from the fourth.
      // Put target payload to the first.
      newLine[index++] = payload;
    }
    if (elements[i] !== payload) {
      newLine[index++] = elements[i];
    }
  }
  return newLine.join(' ');
}

// Strip CN from sdp before CN constraints is ready.
function removeCN(sdpLines, mLineIndex) {
  var mLineElements = sdpLines[mLineIndex].split(' ');
  // Scan from end for the convenience of removing an item.
  for (var i = sdpLines.length-1; i >= 0; i--) {
    var payload = extractSdp(sdpLines[i], /a=rtpmap:(\d+) CN\/\d+/i);
    if (payload) {
      var cnPos = mLineElements.indexOf(payload);
      if (cnPos !== -1) {
        // Remove CN payload from m line.
        mLineElements.splice(cnPos, 1);
      }
      // Remove CN line in sdp
      sdpLines.splice(i, 1);
    }
  }

  sdpLines[mLineIndex] = mLineElements.join(' ');
  return sdpLines;
}
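
The SDP rewriting at the end of main.js is the hardest part of the listing to follow, so here is a small standalone sketch of the core idea: find the payload number that the SDP assigns to Opus, then move that number to the front of the payload list on the m=audio line. The function names findOpusPayload and moveCodecToFront and the sample SDP lines are assumptions made for this example, and it is a simplified stand-in for the listing's helpers rather than a copy of them.

```javascript
// Sketch: look up the payload type that the SDP maps to Opus.
function findOpusPayload(sdp) {
  var match = /a=rtpmap:(\d+) opus\/48000/i.exec(sdp);
  return match ? match[1] : null;
}

// Sketch: make `payload` the first codec listed on an m-line.
// An m-line looks like "m=audio <port> <protocol> <payload> <payload> ...".
function moveCodecToFront(mLine, payload) {
  var parts = mLine.split(' ');
  var header = parts.slice(0, 3); // media type, port, protocol
  var others = parts.slice(3).filter(function (p) {
    return p !== payload;
  });
  return header.concat([payload], others).join(' ');
}

// Made-up sample SDP fragment for illustration.
var sampleSdp = [
  'm=audio 9 RTP/SAVPF 0 8 111',
  'a=rtpmap:0 PCMU/8000',
  'a=rtpmap:8 PCMA/8000',
  'a=rtpmap:111 opus/48000/2'
].join('\r\n');

var opus = findOpusPayload(sampleSdp);
console.log(moveCodecToFront('m=audio 9 RTP/SAVPF 0 8 111', opus));
```

Browsers generally prefer codecs in the order they are listed on the m-line, which is why reordering it is enough to express a preference.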

You are Done With Your First WebRTC Application

Bravo! You did it. You are now the proud owner of a brand new WebRTC video-chat app. It may seem basic, but it has all the essentials of a solid app base. You can now add CSS to make the page more appealing. Right now the app supports only one-to-one video chat; from here you can add conferencing and, if you like, go on to launch tons of mobile applications from it, like we did. :-)
