Wednesday, 14 August 2019

NVIDIA makes breakthroughs in real-time conversational AI

NVIDIA has announced breakthroughs in language understanding technology that allow businesses to engage more naturally with customers using real-time conversational artificial intelligence (AI).

NVIDIA's AI platform is the first to train one of the most advanced AI language models – BERT – in less than an hour and complete AI inference in just over two milliseconds. This groundbreaking level of performance makes it possible for developers to use language understanding in their applications.

Limited conversational AI services have existed for several years, says NVIDIA, but they have not been able to operate with human-level comprehension due to the inability to deploy extremely large AI models in real time. NVIDIA has addressed this problem by adding key optimisations to its AI platform – achieving speed records in AI training and inference and building the largest language model of its kind to date.

"Large language models are revolutionising AI for natural language," said Bryan Catanzaro, VP, Applied Deep Learning Research at NVIDIA. 

"They are helping us solve exceptionally difficult language problems, bringing us closer to the goal of truly conversational AI. NVIDIA's groundbreaking work accelerating these models allows organisations to create new, state-of-the-art services that can assist and delight their customers in ways never before imagined."

AI services powered by natural language understanding are expected to grow exponentially in the coming years. Digital voice assistants alone are anticipated to climb from 2.5 billion to 8 billion within the next four years, according to Juniper Research. Additionally, Gartner predicts that by 2021, 15% of all customer service interactions will be completely handled by AI, an increase of 400% from 2017.

NVIDIA's optimisations to its AI platform have resulted in three new natural language understanding performance records:

• Fastest training: Running the large version of one of the world's most advanced AI language models – Bidirectional Encoder Representations from Transformers (BERT) – an NVIDIA DGX SuperPOD using 92 NVIDIA DGX-2H systems running 1,472 NVIDIA V100 GPUs slashed the typical training time for BERT-Large from several days to just 53 minutes. 

Additionally, NVIDIA trained BERT-Large on just one NVIDIA DGX-2 system in 2.8 days – demonstrating NVIDIA GPUs' scalability for conversational AI.
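Taken together, the two training figures allow a rough strong-scaling estimate. The arithmetic below uses only the numbers quoted above – 2.8 days on one 16-GPU DGX-2 versus 53 minutes on 1,472 GPUs – and assumes the two runs are otherwise directly comparable:

```python
# Back-of-the-envelope scaling check using the figures quoted above.
# Assumes the single-node and SuperPOD runs trained the same model on
# the same data, so wall-clock time is the only variable.

minutes_per_day = 24 * 60

single_node_gpus = 16                          # one DGX-2 holds 16 V100 GPUs
single_node_minutes = 2.8 * minutes_per_day    # 2.8 days -> 4,032 minutes

superpod_gpus = 1472                           # 92 DGX-2H systems x 16 GPUs each
superpod_minutes = 53

speedup = single_node_minutes / superpod_minutes   # ~76x faster wall-clock
gpu_ratio = superpod_gpus / single_node_gpus       # 92x more GPUs
scaling_efficiency = speedup / gpu_ratio           # ~0.83

print(f"speedup: {speedup:.0f}x, efficiency: {scaling_efficiency:.0%}")
# -> speedup: 76x, efficiency: 83%
```

In other words, the SuperPOD run retains roughly 83% of ideal linear scaling despite using 92 times as many GPUs – the kind of efficiency that makes sub-hour training times practical.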

• Fastest inference: Using NVIDIA T4 GPUs running NVIDIA TensorRT, NVIDIA performed inference on BERT-Base using the SQuAD dataset in 2.2 milliseconds – well under the 10-millisecond processing threshold for many real-time applications, and a sharp improvement over the 40 milliseconds achievable with highly optimised CPU code.
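The 2.2-millisecond figure comes from TensorRT running on T4 GPUs; as a minimal sketch of how such a real-time latency budget might be checked in practice, the harness below times repeated calls and compares the tail latency against a 10 ms budget. The `dummy_inference` function is a trivial stand-in, not NVIDIA's actual model call:

```python
import time
import statistics

REAL_TIME_BUDGET_MS = 10.0  # processing threshold cited for many real-time apps


def dummy_inference(query: str) -> str:
    """Stand-in for a real model call (e.g. a BERT question-answering pass)."""
    return query.upper()  # placeholder work; swap in your model here


def measure_latency_ms(fn, arg, warmup=10, iters=100):
    """Return (median, p99) latency in milliseconds for fn(arg)."""
    for _ in range(warmup):      # warm-up runs let caches settle before timing
        fn(arg)
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn(arg)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    p50 = statistics.median(samples)
    p99 = samples[int(0.99 * (len(samples) - 1))]  # tails matter for real time
    return p50, p99


p50, p99 = measure_latency_ms(dummy_inference, "what is conversational ai?")
print(f"p50={p50:.3f} ms, p99={p99:.3f} ms, "
      f"within budget: {p99 < REAL_TIME_BUDGET_MS}")
```

Reporting a tail percentile (p99) alongside the median reflects how real-time services are usually judged: an interactive application misses its deadline whenever any single request exceeds the budget, not just on average.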

• Largest model: With a focus on developers' ever-increasing need for larger models, NVIDIA Research built and trained the world's largest language model based on Transformers, the technology building block used for BERT and a growing number of other natural language AI models. 

NVIDIA's custom model, with 8.3 billion parameters, is 24 times the size of BERT-Large.
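The "24 times" claim can be sanity-checked against BERT-Large's published size. The ~340 million parameter count below is taken from the original BERT paper, not from this article:

```python
# Sanity-check of the "24x" size claim.
# Assumption: BERT-Large has ~340M parameters (figure from the BERT paper).
bert_large_params = 340_000_000
custom_model_params = 8_300_000_000  # NVIDIA's 8.3-billion-parameter model

ratio = custom_model_params / bert_large_params
print(f"{ratio:.1f}x")  # -> 24.4x
```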

Hundreds of developers worldwide are already using NVIDIA's AI platform to advance their own language understanding research and create new services.

Microsoft Bing is using the power of its Azure AI platform and NVIDIA technology to run BERT and drive more accurate search results.

"Microsoft Bing relies on the most advanced AI models and computing platform to deliver the best global search experience possible for our customers," said Rangan Majumder, Group Program Manager, Microsoft Bing. 

"In close collaboration with NVIDIA, Bing further optimised the inferencing of the popular natural language model BERT using NVIDIA GPUs, part of Azure AI infrastructure, which led to the largest improvement in ranking search quality Bing deployed in the last year. 

"We achieved two times the latency reduction and five times throughput improvement during inference using Azure NVIDIA GPUs compared with a CPU-based platform, enabling Bing to offer a more relevant, cost-effective, real-time search experience for all our customers globally."

Several startups in NVIDIA's Inception programme, including Clinc, Passage AI and Recordsure, are also using NVIDIA's AI platform to build conversational AI services.

Clinc has made NVIDIA GPU-enabled conversational AI solutions accessible to more than 30 million people globally through a customer roster that includes car manufacturers, healthcare organisations and some of the world's leading financial institutions, including Turkey's largest bank, Isbank.

"Clinc's leading AI platform understands complex questions and transforms them into powerful, actionable insights for the world's leading brands," said Jason Mars, CEO of AI startup Clinc.

"The breakthrough performance that NVIDIA's AI platform provides has allowed us to push the boundaries of conversational AI and deliver revolutionary services that help our customers use technology to engage with their customers in powerful, more meaningful ways."
