30/01/2014

From VoIP Client to Server-Side MCU

Whenever people ask me about VoIP client vendors (you know, those writing software-based IP phones), my gut feeling says that they are more than challenged by WebRTC. This is why the chat I had with Emil Ivov, CEO of BlueJimp, the company behind the Jitsi open source VoIP client, was so interesting to me. Jitsi stands out because they shifted gears towards WebRTC and a backend video bridge solution. If you want to learn more about the transition BlueJimp and Jitsi are making, check out Emil’s answers.

What is Jitsi Videobridge all about?

The IETF term is a Selective Forwarding Unit (SFU). More commonly, however, such servers are known as “video routers” or MCUs. It is pretty much the same piece of tech as the one Google uses for Hangouts. In other words, in a multi-party video conference, Jitsi Videobridge is the server that receives video from every participant and then relays it to everyone else. Relaying the video, as opposed to mixing it, makes for a very lightweight solution that could easily have hundreds of participants on a single VM. It also allows for much better quality and lower latency. This wasn’t such a common approach until a few years ago, when most conferencing services used to mix content. As bandwidths rise, however, video relaying is becoming the most reasonable choice. An interesting note here is that Jitsi Videobridge is not a service. It is an open source (LGPL) server application that you can download and run in your own data center / cloud / premises.
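
To make the relaying idea concrete, here is a minimal Python sketch of what “receive from every participant and relay to everyone else” means. It is an illustration only, not how Jitsi Videobridge is implemented: the `RelayBridge` class and the `relay_stream_counts` helper are hypothetical names, and a real bridge also deals with RTP/SRTP, ICE and bandwidth constraints that are left out here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RelayBridge:
    """Toy forwarding unit (hypothetical): relays packets, never decodes or mixes."""

    participants: set[str] = field(default_factory=set)

    def join(self, participant_id: str) -> None:
        self.participants.add(participant_id)

    def forward(self, sender: str, packet: bytes) -> dict[str, bytes]:
        """Return the unmodified packet for every participant except the sender."""
        if sender not in self.participants:
            raise KeyError(f"unknown participant: {sender}")
        return {p: packet for p in self.participants if p != sender}


def relay_stream_counts(n: int) -> tuple[int, int]:
    """(incoming, outgoing) video streams at the bridge when all n participants send."""
    return n, n * (n - 1)


if __name__ == "__main__":
    bridge = RelayBridge()
    for name in ("alice", "bob", "carol"):
        bridge.join(name)
    # The same bytes go out to bob and carol; nothing is re-encoded.
    print(sorted(bridge.forward("alice", b"video-frame")))
    print(relay_stream_counts(4))
```

The bridge never touches the payload, which is why relaying is cheap on CPU. The trade-off shows up in the stream count: outgoing streams grow roughly with the square of the number of participants, so the cost moves from processing to bandwidth.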

What about meet.jit.si then? That’s a service.

Right, meet.jit.si is a service that uses Jitsi Videobridge. It runs the JitMeet application. Philipp Hancke from Estos, who is known to stop by this blog on occasion, wrote a JS frontend for the bridge at the end of last year. It was then contributed to the community, became JitMeet and we have been working on it together ever since. The power of open source 😉 Still, none of this is a closed service. Everyone can download, deploy and run the different components as they see fit.

How about Jitsi? What is it and how does it fit?

Jitsi is where we come from. It’s an open source video communicator client with a lot of features and a significant focus on security and privacy. It supports things like ZRTP and OTR for call and chat encryption. This has made it rather popular after the Prism affair last year. For a number of years (more than ten actually) we have been evolving Jitsi in a number of ways. We got to a point where our media stack was quite feature rich: we had ICE, SRTP, audio mixing and video routing. So we thought that we should make the core available for others to use. This is how libjitsi was born. It’s basically an alternative to the webrtc.org media stack and Jitsi Videobridge uses it to provide its video routing features.

So, you started from a VoIP software client, and now offer an MCU of sorts. Why?

Great question! Often, while working on the client, we were frustrated by the lack of advanced video features in existing media servers. The thing is that Jitsi could already host conferences even as a client but doing it on an end-user’s laptop is rarely a good thing. You run into all of these bandwidth and reliability issues. So, we thought, why don’t we push all this to the network? We did and this is how Jitsi Videobridge was born. We did our first release of the bridge in January 2013. Then, we started adding WebRTC compatibility. Luckily our friend Tim Panton and the Bouncy Castle team had already taken care of a DTLS implementation for Java and we used that. We also added ICE support through our ice4j lib. Then Philipp Hancke came along and started playing with all this and eventually came up with what later became JitMeet. Exciting times!

Being an open source vendor – how do you make a living out of Jitsi?

We sell development. We’ve been doing this for quite a while now. We founded BlueJimp, the company I work for, in 2009 and we started this model with Jitsi. It would always go the same way: we’d come up with a tech and then people would come by asking us to either extend it or adapt it to their infrastructure. Right now it seems that the same is also working out pretty nicely for Jitsi Videobridge. People have different ideas and different requirements for their services. Open source is great in that it gives you a head start for free, but sometimes you turn out to be missing the last 5% of functionality that you need. That’s where we come in with BlueJimp. We are always considering other options of course, and we like changing and evolving. So far, however, this model is working out just fine.

What excites you about working in WebRTC?

I love the energy that it creates in the RTC diaspora. We’ve had a number of technologies for years without ever really using them: ICE, Trickle ICE, Bundle, rtcp-mux, TURN/TCP, SCTP, FEC, RTX … We’ve had all these techs for ages but they were only sporadically employed here and there. In some cases, like Trickle ICE or Bundle, they were actually strongly opposed at the IETF. Now everything has changed. Everyone is implementing ICE, everyone is doing SRTP and the IETF is fervently working on specifying the missing pieces. For some of the techs, the people who were once opposing them are now pushing or even chairing the efforts. It’s quite interesting to watch! Obviously end users are the big winners here, as quality video communication is now much more accessible than before and they are beginning to see the benefits in places where such integration would have been unthinkable a few years ago. A second thing that I love about WebRTC, this time more as a Jitsi developer, is how it makes our lives easier in so many ways. It adds a very easy way for us to provide applications compatible with Jitsi Videobridge. You now only need to go to meet.jit.si and you can try it out. You can still end up installing Jitsi for a number of reasons (privacy for one) but the entry-level barrier is much lower.

What signaling have you decided to integrate on top of WebRTC?

I know that this is a favorite topic of yours :). Having been wedded to and divorced from signalling protocols twice in the past, we are now very pragmatic about these things. Jitsi Videobridge’s architecture is very protocol agnostic. It only cares about RTP and media. This is very important because people are likely to want to integrate the bridge very differently depending on their service requirements. JitMeet on the other hand, the JS app that runs in the browser and that we currently use to demo the bridge, uses XMPP and the COLIBRI protocol. Why? It makes our lives easier. Here’s an example: after Philipp created the first version of the JS app, adding instant messaging was one of the most obvious feature requests. Well, XMPP made this a matter of literally minutes for him. In the future, things like sharing nicknames, showing participant rosters, adding moderation and sharing slides will all be made very easy with XMPP. We are glad we went that way. This doesn’t mean that we consider XMPP the right choice for every situation. We don’t and I don’t believe anyone does. Signalling protocols are just a tool. They solve a bunch of problems, so if you are on a path that would put you up against such problems it only makes sense not to reinvent the wheel. Also, it’s not a decision that is taken once and for all. You can always change if things don’t work out. Still, it is much easier to begin with a tool and then rebuild it your way based on experience than it is to start designing from scratch while you don’t even know where you are headed.

Backend. What technologies and architecture are you using there?

Jitsi Videobridge is almost entirely home-grown. We do use the Opus reference implementation, but everything else is libjitsi and ice4j. As for the rest, our meet.jit.si deployment currently runs Prosody and Nginx. The rest of the logic lives in the JS.

Where do you see WebRTC going in 2-5 years?

I see it actually taking off at that point. People really don’t know about it yet. I’ve been quite surprised to find this even when talking to VoIP professionals. I guess that’s partially due to lack of support from some of the top 5 browser vendors and partially due to the stability issues that still float around. We are clearly still in the early stages. Standard multi-stream support is still to come and the differences between Firefox and Chrome are still substantial. Obviously we’ll get through all this because WebRTC is worth it, but it won’t happen fast because there’s a lot of complexity. My expectation is that by the end of 2014 or early 2015 we should see the other two major browsers join the parade. Interop issues will probably persist through most of 2015. I suspect that by 2016 we should see a significant surge in adoption and WebRTC will be on its way to becoming a mainstream tech.

If you had one piece of advice for those thinking of adopting WebRTC, what would it be?

Well … that would depend on who they are. I definitely wouldn’t want a surgeon to operate on me through a WebRTC app. To all those who are thinking of adding non-critical RTC features to applications: this should really be the first thing you look at. Oh and of course: use trickle ICE! 😉

Given the opportunity, what would you change in WebRTC?

I think we’d be better off without SDP. I was among those who thought it a good idea early on, but that was before we realized how unsuitable it was for all the variety that WebRTC has to offer. We are no longer in a caller/callee model and Offer/Answer doesn’t fit that well. Obviously we can’t just dump it now. We are too far along for that, but I think what the ORTC community (ORCA) are looking at, a lower level API that would allow the existing one to be implemented on top of it, is a very interesting approach. So I am looking forward to a better WebRTC 2.0.

What’s next for Jitsi?

We have a bunch of interesting problems to solve with Jitsi Videobridge: optimizations for conferences with hundreds of participants, improved mobile support, SVC and simulcasting. We’ll be tackling these in the future. As for Jitsi, we are about to start building a new HTML5 based graphical user interface that would allow us to share a number of GUI elements between JitMeet and Jitsi. This should allow us some very interesting combinations between rich and web RTC clients. We are looking forward to making it happen!

– The interviews are intended to give different viewpoints than my own – you can read more WebRTC interviews.