NEC develops technology for video-to-text generative AI

Japanese electronics giant NEC has developed a generative AI technology that can analyze video footage and explain what it sees in text.

NEC’s newly developed AI first recognizes faces and objects in the video and describes them in words.

Generative AI then refines that information into coherent text.

Possible applications include producing accident reports by studying vehicle dash cam footage, or creating work logs by analyzing construction site videos.

For example, when asked to analyze dash cam footage that shows a motorcycle falling over, the new AI first produces something of a word salad describing what happened.

Then generative AI cleans up the wording to produce a clearer description.

Here’s the result.
“It is believed that the motorcycle crashed into the black car without noticing that it had stopped.”

Generative AI programs usually excel at analyzing text and images, but are said to be less competent when it comes to dealing with video footage.

NEC’s new AI technology aims to offset that shortcoming.
