Problems in real-time communications that artificial intelligence may help solve fall into two categories: service level and infrastructure level. Service level applications are mainly customer-facing and feature-based, while infrastructure level applications are mainly developer-facing and optimization-based. Neither category is more important or influential than the other; each advances real-time communications in its own way.

Service level applications are the features customers see and use directly, such as speech analytics, image recognition, and background blur. They tend to be very exciting for the end user and may garner a significant amount of media attention. These applications are especially prevalent in new, emerging use cases of real-time communications, and are often associated with a moment of awe that says: the future is now.

Service Level Artificial Intelligence Applications in Real-time Communications

These are but a few of the key service level applications being developed today.

Speech Analytics: Real-time speech analytics gives organizations deep insight into customer interactions and helps them steer conversations as they happen, so they can be as productive and helpful as possible. NICE and other companies are selling this service to call centers to improve customer experience.

Transcription: Transcription uses speech recognition and natural language processing to convert spoken conversations into text. Companies like Otter are using transcription to give organizations valuable insights into their meetings, so employees can always be aware of what was said and when.

Translation: Translation uses natural language processing to identify and subsequently convert one language into another. Companies including Waverly Labs are using translation to allow organizations and individual employees to communicate across language barriers without the need for a translator.

Image Recognition: Image recognition is being used to identify objects in video through video analysis. This software is being offered as an API or service by companies like Google, with its Video Intelligence API, and Amazon, with its Amazon Rekognition Video.

Emotion Detection: Emotion detection lets businesses gauge how their customers feel by analyzing cues such as facial expressions or tone of voice. Cogito, for example, performs in-call voice analysis to identify how customers are feeling, which gives call centers an edge in responding to them.

Conference Zoom: Conference zoom can be used to identify speakers in the room and zoom in on their faces when they are talking. Google’s Hangouts Meet Hardware Kit is designed for just that, so users can interact on a more personal level despite being in different geographic locations.

Background Blur: Background blur uses artificial intelligence to blur out the background of a user's video feed to provide some privacy. This useful feature was recently introduced by Microsoft Teams.
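To make the idea concrete, here is a minimal sketch of the compositing step behind background blur. It assumes a person mask is already available (in a real product that mask would come from a trained segmentation model, which is not shown here), and it uses a simple box blur on a tiny grayscale frame. The names `box_blur`, `blur_background`, and `person_mask` are hypothetical and chosen for illustration; this is not how Microsoft Teams or any other product implements the feature.

```python
from typing import List

Frame = List[List[float]]  # grayscale pixel intensities, row-major
Mask = List[List[int]]     # 1 = person (keep sharp), 0 = background (blur)


def box_blur(frame: Frame, radius: int = 1) -> Frame:
    """Blur a frame by averaging each pixel with its in-bounds neighbours."""
    height, width = len(frame), len(frame[0])
    blurred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += frame[ny][nx]
                    count += 1
            blurred[y][x] = total / count
    return blurred


def blur_background(frame: Frame, person_mask: Mask, radius: int = 1) -> Frame:
    """Keep masked (person) pixels sharp; use blurred pixels everywhere else."""
    blurred = box_blur(frame, radius)
    return [
        [frame[y][x] if person_mask[y][x] else blurred[y][x]
         for x in range(len(frame[0]))]
        for y in range(len(frame))
    ]


# Toy 3x3 frame: a bright "person" pixel in the centre, one bright background corner.
frame = [
    [0.0, 0.0, 9.0],
    [0.0, 9.0, 0.0],
    [0.0, 0.0, 0.0],
]
# Assumed output of a segmentation model: only the centre pixel is the person.
person_mask = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
result = blur_background(frame, person_mask)
```

In this toy example the centre pixel keeps its original value while every background pixel is smoothed into its neighbours. The hard part in practice is producing an accurate mask for every frame in real time, which is where the machine learning comes in.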

Audio Quality: Audio quality improvements use machine learning to separate speech from background noise and then remove the noise. Mozilla’s RNNoise project ("RNNoise: Learning Noise Suppression") distinguishes voice from noise in order to suppress the noise.
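As a rough illustration of the "separate, then suppress" idea, the sketch below splits audio into short frames and turns down the frames that look like noise. It deliberately swaps the machine learning for a fixed energy threshold, so it is a plain noise gate rather than a learned suppressor like RNNoise; the function names, frame size, and threshold are assumptions made for the example.

```python
from typing import List


def frame_energy(samples: List[float]) -> float:
    """Mean squared amplitude of one audio frame."""
    return sum(s * s for s in samples) / len(samples)


def suppress_noise(
    signal: List[float],
    frame_size: int,
    threshold: float,
    attenuation: float = 0.1,
) -> List[float]:
    """Attenuate frames whose energy falls below the threshold.

    The fixed threshold stands in for the voice/noise decision that a
    trained model would make in a learned noise suppressor.
    """
    cleaned: List[float] = []
    for start in range(0, len(signal), frame_size):
        frame = signal[start:start + frame_size]
        # Full gain for frames that look like speech, reduced gain otherwise.
        gain = 1.0 if frame_energy(frame) >= threshold else attenuation
        cleaned.extend(s * gain for s in frame)
    return cleaned


# Toy signal: one quiet, noise-like frame followed by one loud, speech-like frame.
signal = [0.01, -0.02, 0.01, -0.01,
          0.5, -0.6, 0.7, -0.5]
cleaned = suppress_noise(signal, frame_size=4, threshold=0.01)
```

Here the quiet frame is scaled down to a tenth of its amplitude and the loud frame passes through unchanged. A learned approach replaces the single threshold with gains predicted from the audio itself, which lets it handle noise that is as loud as the speech.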

There are of course more service level applications in development today. These are but a few important ones changing the real-time communications industry.