Toshiba Develops World’s Most Accurate Highly Versatile Visual Question Answering AI

Toshiba has developed the world’s most accurate highly versatile Visual Question Answering (VQA) AI, which recognizes not only people and objects but also colors, shapes, appearances and background details in images. The AI overcomes the long-standing difficulty of answering questions about the position and appearance of people and objects, and it can learn the information required to handle a wide range of questions and answers. As a result, it can be applied to many purposes without any need for customization.

In experiments using a public dataset comprising a large volume of images and text data, the VQA AI correctly answered 66.25% of questions without any pre-learning and 74.57% with pre-learning. For example, the AI can determine whether a worker is standing in a designated place when asked a question such as “Is the person on a black mat?”, which requires recognizing the individual, position, shape and color. Applying it to safety monitoring systems at production sites is expected to help improve safety and reduce the workload of onsite supervisors. It can also be used to identify specific scenes in broadcast content and surveillance video footage.
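
The safety-monitoring use described above can be illustrated with a minimal Python sketch. It assumes a hypothetical `ask` function that takes an image and a question and returns a short text answer; this is not Toshiba's interface, and the stub below only stands in for a real VQA model. The names `flag_frames` and `stub_ask` and the dictionary-based "images" are likewise illustrative assumptions.

```python
from typing import Any, Callable, Iterable, List, Tuple


def flag_frames(
    frames: Iterable[Tuple[int, Any]],
    question: str,
    ask: Callable[[Any, str], str],
    expected: str = "yes",
) -> List[int]:
    """Return the IDs of frames whose answer differs from the expected one."""
    flagged = []
    for frame_id, image in frames:
        # Normalize the answer so "Yes " and "yes" compare equal.
        answer = ask(image, question).strip().lower()
        if answer != expected:
            flagged.append(frame_id)
    return flagged


def stub_ask(image: Any, question: str) -> str:
    """Hypothetical stand-in for a VQA model; reads a precomputed flag."""
    return "yes" if image.get("on_mat") else "no"


# Toy "frames": each image is a dict instead of real pixel data.
frames = [(0, {"on_mat": True}), (1, {"on_mat": False}), (2, {"on_mat": True})]
flagged = flag_frames(frames, "Is the person on a black mat?", stub_ask)
print(flagged)
```

Because the check is driven by the question text, changing what is monitored in this sketch means changing only the `question` string, not the surrounding code.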

Toshiba’s new AI meets the need for flexibility while delivering the world’s highest accuracy in answering questions, and its questions can be changed or added quickly. Its ability to recognize image backgrounds as well as people and objects, together with the extensive database at its disposal, allows it to quickly process image features and pre-learned questions to derive the correct answer.

After learning from a large set of images, questions and answers that cover the presence of people and objects, along with information such as their location and status, the AI selects an appropriate answer to a question from approximately 3,000 answer patterns. The AI is highly flexible: it can be updated by adding inspection items, or changed to handle a different situation, through a simple “Image and Question” process of adding new question sentences.
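
Choosing one answer from a fixed set of answer patterns is commonly framed as classification. The Python sketch below shows that general idea under stated assumptions: a model is assumed to output one score per candidate answer, and the highest-scoring answer is returned. The source does not describe Toshiba's actual selection method, and the four-item vocabulary and the scores here are made up for illustration (the real set is described as roughly 3,000 patterns).

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_answer(scores: Sequence[float], answers: Sequence[str]) -> Tuple[str, float]:
    """Return the highest-scoring answer and its probability."""
    if len(scores) != len(answers):
        raise ValueError("need exactly one score per candidate answer")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return answers[best], probs[best]


# Toy answer vocabulary and hypothetical model scores.
answers = ["yes", "no", "black", "2"]
scores = [2.0, 0.5, -1.0, 0.1]
answer, confidence = pick_answer(scores, answers)
print(answer, round(confidence, 3))
```

In this framing, the answer set is fixed at training time, so new questions can be posed freely as long as their answers fall within the existing patterns.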

The global AI market, including software, hardware, and services, is forecast to grow 16.4% year over year in 2021 to $327.5 billion and is expected to reach $554.3 billion by 2024.