Monday, 5 October 2020

Better videos with new NVIDIA cloud-AI videostreaming platform

Source: NVIDIA. NVIDIA Maxine will allow better video compression; higher-resolution videos; auto-centring of the subject onscreen; face/gaze alignment; AI-driven avatars; instant translation; muting of the background; and virtual backgrounds for videos.

NVIDIA has announced NVIDIA Maxine, a cloud-based suite of GPU-accelerated artificial intelligence (AI) videoconferencing software for developers that the company says will change how we stream video, the Internet’s No. 1 source of traffic.

NVIDIA Maxine is a cloud-native streaming video AI platform that makes it possible for service providers to bring new AI-powered capabilities to the more than 30 million web meetings estimated to take place every day. Videoconference service providers running the platform on NVIDIA GPUs in the cloud can offer users new AI effects. As the data is processed in the cloud rather than on local devices, users will not need specialised hardware. For developers, Maxine's modular design means they can easily select which AI capabilities to integrate into their videoconferencing solutions.

“Videoconferencing is now a part of everyday life, helping millions of people work, learn and play, and even see the doctor,” said Ian Buck, VP and GM, Accelerated Computing at NVIDIA.

“NVIDIA Maxine integrates our most advanced video, audio and conversational AI capabilities to bring breakthrough efficiency and new capabilities to the platforms that are keeping us all connected.”

The Maxine platform reduces how much bandwidth is required for video calls. Instead of streaming the entire screen of pixels, the AI software sends just what's changed, intelligently re-animating the face in the video on the other side. This makes it possible to stream video with far less data flowing back and forth across the Internet. Using this new AI-based video compression technology running on NVIDIA GPUs, developers can reduce video bandwidth consumption down to one-tenth of the requirements of the H.264 streaming video compression standard. This cuts costs for providers and delivers a smoother videoconferencing experience for end users, who can enjoy more AI-powered services while streaming less data on their computers, tablets and phones.
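To put the one-tenth figure in perspective, here is a rough back-of-the-envelope sketch in Python. The 1,500 kbps H.264 bitrate and the one-hour call length are illustrative assumptions, not NVIDIA figures, and the function names are made up for this example; only the factor of ten comes from the announcement.

```python
def reduced_bitrate_kbps(h264_kbps: float, reduction_factor: float = 10.0) -> float:
    """Bitrate if the AI codec needs 1/reduction_factor of the H.264 bitrate.

    The default factor of 10 reflects the "one-tenth" claim in the article.
    """
    if h264_kbps < 0 or reduction_factor <= 0:
        raise ValueError("bitrate must be >= 0 and reduction factor > 0")
    return h264_kbps / reduction_factor


def data_per_call_mb(bitrate_kbps: float, minutes: float) -> float:
    """Total data for one video stream, in megabytes (1 MB = 1000 kB)."""
    seconds = minutes * 60
    kilobits = bitrate_kbps * seconds
    return kilobits / 8 / 1000  # kilobits -> kilobytes -> megabytes


# Assumed values for illustration only.
H264_KBPS = 1500.0   # hypothetical H.264 video-call bitrate
CALL_MINUTES = 60.0  # hypothetical call length

if __name__ == "__main__":
    ai_kbps = reduced_bitrate_kbps(H264_KBPS)
    print(f"H.264:    {data_per_call_mb(H264_KBPS, CALL_MINUTES):.1f} MB per hour")
    print(f"AI codec: {data_per_call_mb(ai_kbps, CALL_MINUTES):.1f} MB per hour")
```

Under these assumed numbers, an hour-long stream drops from 675 MB to 67.5 MB. Actual savings would depend on the content and on the bitrate a given service uses for H.264.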

NVIDIA said breakthrough technology from its researchers, to be included in Maxine, will make videoconferencing feel more like face-to-face conversation. Videoconference service providers will be able to take advantage of NVIDIA research in generative adversarial networks (GANs) to offer new features. For example, face alignment enables faces to be automatically adjusted so that people appear to be facing each other during the call, while gaze correction helps simulate eye contact, even if the camera isn’t aligned with the user's screen. With videoconferencing growing by 10x since the beginning of the year, these features help people stay engaged in the conversation rather than looking at their camera.

Developers can also add features that let call participants choose their own avatars, with realistic animation driven automatically by their voice and emotional tone in real time. An auto frame option allows the video feed to follow the speaker even if they move away from the screen.

Using conversational AI features powered by the NVIDIA Jarvis SDK, developers can integrate virtual assistants that use state-of-the-art AI language models for speech recognition, language understanding and speech generation. The virtual assistants can take notes, set action items and answer questions in human-like voices. Additional conversational AI services such as translation, closed captioning* and transcriptions help ensure participants can understand what’s being discussed on the call.

To handle unpredictable demands for videoconferencing, NVIDIA Maxine takes advantage of AI microservices running in Kubernetes container clusters on NVIDIA GPUs to help developers scale their services according to real-time demands. Users can run multiple AI features simultaneously while remaining within application latency requirements.
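The announcement does not say which scaling mechanism Maxine uses. As a minimal sketch of what demand-based scaling can look like, the Python below applies the replica formula documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The function name, the load numbers and the replica cap are hypothetical.

```python
import math


def desired_replicas(current_replicas: int,
                     current_load: float,
                     target_load: float,
                     max_replicas: int) -> int:
    """Replica count from the Horizontal Pod Autoscaler formula.

    current_load and target_load are per-replica metric values in the same
    unit (for example, concurrent video streams per GPU-backed pod).
    """
    if current_replicas < 1 or max_replicas < 1:
        raise ValueError("replica counts must be at least 1")
    if target_load <= 0 or current_load < 0:
        raise ValueError("target load must be > 0 and current load >= 0")
    wanted = math.ceil(current_replicas * current_load / target_load)
    # Keep at least one replica and never exceed the configured cap.
    return max(1, min(wanted, max_replicas))


if __name__ == "__main__":
    # Hypothetical: 4 pods each handling 90 streams, target is 60 per pod.
    print(desired_replicas(4, 90, 60, max_replicas=10))
```

With these made-up numbers the service would scale from four to six replicas; when load falls below the target, the same formula scales it back down.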

The Maxine platform integrates technology from several NVIDIA AI software development kits (SDKs) and application programming interfaces (APIs). In addition to NVIDIA Jarvis, the Maxine platform leverages the NVIDIA DeepStream high-throughput audio and video streaming SDK and the NVIDIA TensorRT SDK for high-performance deep learning inference.

The AI audio, video and natural language capabilities provided in the NVIDIA SDKs used in the Maxine platform were developed through hundreds of thousands of training hours on NVIDIA DGX systems, the leading platform for training, inference and data science workloads.

Details:

Computer vision AI developers, software partners, startups and computer manufacturers creating audio and video apps and services can apply for early access to the NVIDIA Maxine platform.

Explore:

Watch how Maxine's AI capabilities change video

*Closed captions are subtitles that can be turned on and off, as opposed to open captions, which are a permanent part of the video.
