Today I will talk about a common discussion that comes up while implementing live video streaming solutions: should we choose a CPaaS, or pick an open-source media server and build on top of it?
Let’s look at the pros and cons of these architecture choices.
There are three ways to build a WebRTC application:
- WebRTC Native
- Using Open-Source Media servers
- CPaaS (Communications Platform as a Service)
WebRTC Native: This means you develop everything from scratch using the WebRTC standards and libraries. You must build and handle the following yourself:
- STUN / TURN servers
- Application Signalling
- Recording implementation
- Choosing video codecs
- Choosing the right topology for 1:1 or 1:M or M:M video chat
- Browser and Mobile support
- Any add-on features, e.g. some kind of AI processing on your video
- Infrastructure management and updates of related modules [H/W updates, WebRTC updates, S/W updates, etc.]
- Scaling complexity
Because you are developing everything from scratch, this approach takes the most time and carries the most complexity. For that reason it is the least commonly used of the three.
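To make the topology choice from the list above a bit more concrete, here is a minimal sketch of how many media streams each participant sends and receives under the three usual topologies (peer-to-peer mesh, SFU, MCU). The names `StreamLoad` and `per_participant_load` are made up for this illustration, and the model assumes one published stream per participant with everyone subscribed to everyone else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamLoad:
    """Media streams a single participant sends and receives."""
    uploads: int
    downloads: int


def per_participant_load(topology: str, participants: int) -> StreamLoad:
    """Return the per-participant stream count for a call of the given size.

    Assumption: each participant publishes one stream and watches all others.
    """
    if participants < 2:
        raise ValueError("a call needs at least two participants")
    others = participants - 1
    if topology == "mesh":
        # Peer-to-peer: send a copy to, and receive one from, every other peer.
        return StreamLoad(uploads=others, downloads=others)
    if topology == "sfu":
        # Selective Forwarding Unit: upload once, the server forwards it on.
        return StreamLoad(uploads=1, downloads=others)
    if topology == "mcu":
        # Multipoint Control Unit: the server mixes everything into one stream.
        return StreamLoad(uploads=1, downloads=1)
    raise ValueError(f"unknown topology: {topology}")


if __name__ == "__main__":
    for name in ("mesh", "sfu", "mcu"):
        print(name, per_participant_load(name, participants=6))
```

For a six-person call this gives 5 uploads and 5 downloads per participant in a mesh, 1 upload and 5 downloads with an SFU, and 1 of each with an MCU. That is roughly why a mesh is usually only considered for very small calls, while the MCU moves the cost to server-side mixing instead.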
Open-Source Media servers: These are freely available solutions that abstract away much of the complexity you face when building on native WebRTC, and give you wrappers and additional functionality around the WebRTC standards. For example, a media server can handle the following tasks for you:
- Recording features with different layouts
- Signalling features
- Transcoding
- Group chat
- Scaling capabilities
- STUN or TURN [some open-source servers include this in their core, while with others you will have to run it externally]
- Video Transcript
- Built-in topologies such as SFU or MCU
- Some kind of AI, like FaceOverlay, PointerDetection, CrowdDetection, etc.

So open-source media servers take much of the development burden, and some of the scaling burden, off our hands, but scaling and infrastructure management remain an overhead. We still have to manage all of our own infrastructure and hosting and pay the infrastructure cost for it, on top of the additional scaling challenges of serving hundreds of participants in a single meeting or running many concurrent group video calls.
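To get a feel for that scaling overhead, here is a rough back-of-the-envelope sketch of the outbound bandwidth an SFU-based deployment would have to serve. The function name `sfu_egress_mbps` and the 500 kbps bitrate in the example are assumptions for illustration only, and the model is deliberately naive: every participant publishes one stream and receives every other participant's stream at full bitrate (no simulcast, no limit on visible streams).

```python
def sfu_egress_mbps(participants: int, stream_kbps: float, meetings: int = 1) -> float:
    """Estimate the total outbound bandwidth (in Mbps) across all SFU meetings.

    Assumption: each participant receives one stream from every other
    participant, all at the same bitrate.
    """
    if participants < 2:
        raise ValueError("a meeting needs at least two participants")
    if meetings < 1 or stream_kbps <= 0:
        raise ValueError("meetings and stream_kbps must be positive")
    # The server forwards each participant's stream to all the others.
    forwarded_streams = participants * (participants - 1)
    return meetings * forwarded_streams * stream_kbps / 1000


if __name__ == "__main__":
    # Hypothetical 500 kbps video stream per participant.
    print(sfu_egress_mbps(participants=4, stream_kbps=500))
    print(sfu_egress_mbps(participants=20, stream_kbps=500))
    print(sfu_egress_mbps(participants=4, stream_kbps=500, meetings=10))
```

The forwarded stream count grows with the square of the meeting size, so under these assumptions going from 4 to 20 participants multiplies the egress by more than 30. Real deployments reduce this with techniques such as simulcast, but the capacity planning is still yours to do.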
There are many open-source media servers available, but I have picked 3 of the most popular:
- Kurento
- Jitsi
- Janus
In summary, we still have to handle a lot of the complexity of scaling the application, but this is one step LESS COMPLEX, with many more features, than building directly on the WebRTC standard itself.
CPaaS: These are commercial products that require very little time to build your live streaming app, and pricing is based on usage. They give you more advanced features and abstract away many of the complexities you face with open-source media servers. It is as simple as integrating a few APIs and your app is ready.
In summary, a CPaaS handles or reduces all the complexity we face with “WebRTC native” and “open-source media servers”, and adds features like:
- Analytics
- Latest WebRTC updates and support
- Support for hundreds or thousands of participants in a single meeting [1:M]
- Group video call support [M:M]; the limit varies from provider to provider, ranging from maybe 4 to 20
- Additional features like VoIP support, etc.

There are many CPaaS providers available, but I have picked 3 of the most popular:
- OpenTok [Vonage]
- Twilio
- Agora
So let’s look at the final comparison table, which may help you make a decision.

It all depends on what matters most to you. For example, if you are concerned about upfront cost and want to get your app live ASAP, then I think CPaaS is the perfect choice.
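If cost is the criterion you care about, one way to reason about it is a simple break-even estimate. The sketch below assumes a very simplified model that is not taken from any provider's price list: CPaaS is billed purely per participant-minute, while a self-hosted media server has a fixed monthly cost plus a smaller per-minute cost. All function names and all numbers are hypothetical placeholders, so plug in your own quotes.

```python
from typing import Optional


def cpaas_monthly_cost(participant_minutes: float, price_per_minute: float) -> float:
    """Usage-based bill: pay only for the minutes consumed."""
    return participant_minutes * price_per_minute


def self_hosted_monthly_cost(
    participant_minutes: float, fixed_cost: float, cost_per_minute: float
) -> float:
    """Fixed infrastructure and operations cost plus a per-minute cost."""
    return fixed_cost + participant_minutes * cost_per_minute


def break_even_minutes(
    price_per_minute: float, fixed_cost: float, cost_per_minute: float
) -> Optional[float]:
    """Participant-minutes per month above which self-hosting becomes cheaper.

    Returns None when CPaaS is never more expensive under this model.
    """
    margin = price_per_minute - cost_per_minute
    if margin <= 0:
        return None
    return fixed_cost / margin


if __name__ == "__main__":
    # Placeholder figures only; replace them with real quotes.
    minutes = break_even_minutes(
        price_per_minute=0.004, fixed_cost=2000.0, cost_per_minute=0.001
    )
    print(f"break-even at about {minutes:,.0f} participant-minutes per month")
```

Below the break-even volume the usage-based bill is the smaller one, which matches the intuition that CPaaS suits a new app with little traffic and no upfront budget.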
Similarly, you can pick any other measurable criterion and decide which option suits you best.