Multimodal Solutions

Gaze and Speech

If they are not already, (most of) the cars your company builds will soon be equipped with a driver monitoring camera, whose main purpose is to meet the Euro NCAP requirements for drowsiness and distraction detection (Euro NCAP roadmap, Driver Monitoring (2020) p.7). These cameras can be set up so that, beyond detecting distraction, they also indicate the direction of the driver’s gaze.

Many people are trying to bring gesture interaction into the car. While there are some products on the market, gesture recognition suffers from the limited opening angle of the camera, which makes recognition weak outside a ‘sweet spot’. Further, there are no established gestures that allow for ‘intuitive’ interaction: users have to learn to perform system-defined gestures in a ‘reliable’ way.

The latter is the background for my patent on ‘iconic’ gestures (submitted 2010, granted in the EU in 2015 and in the US in 2016). The basic idea is a straightforward combination of speech and gesture: if a gesture is recognized with a confidence below a threshold, a speech dialogue requests clarification or confirmation from the user. The notion of ‘gesture’ in the patent document includes head and gaze gestures, so the claims of this patent also cover the following case: a gaze in some direction, as detected by the driver monitoring camera, can be interpreted as a gesture indicating the wish to perform some action in that particular area (or Region of Interest, RoI), just as a finger-pointing gesture would. A speech command uttered in conjunction with the gaze or the pointing can then trigger the execution of a command pertaining to a function or a control element located in that RoI, as is described in more detail in the patent disclosure.
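To make the combination of gaze, speech and a confidence threshold more concrete, here is a minimal Python sketch. It is an illustration only, not the implementation from the patent: the names `GazeEvent`, `SpeechCommand` and `fuse`, the threshold of 0.7 and the time window of 1.5 seconds are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# All names and numeric values below are hypothetical, chosen for illustration.
CONFIDENCE_THRESHOLD = 0.7   # below this, ask the user to confirm
FUSION_WINDOW_S = 1.5        # max. gap between gaze and speech, in seconds


@dataclass
class GazeEvent:
    roi: str           # Region of Interest the gaze landed in
    confidence: float  # recognizer confidence in [0, 1]
    timestamp: float   # seconds


@dataclass
class SpeechCommand:
    action: str        # e.g. "open"
    timestamp: float   # seconds


def fuse(gaze: Optional[GazeEvent], speech: SpeechCommand) -> Tuple[str, str]:
    """Return (dialogue_move, payload) for one spoken command."""
    # No gaze, or gaze and speech too far apart in time: ask what is meant.
    if gaze is None or abs(speech.timestamp - gaze.timestamp) > FUSION_WINDOW_S:
        return ("clarify", f"What would you like to {speech.action}?")
    target = gaze.roi.replace("_", " ")
    # Gaze recognized, but with low confidence: request confirmation.
    if gaze.confidence < CONFIDENCE_THRESHOLD:
        return ("confirm", f"Do you want to {speech.action} the {target}?")
    # Confident gaze within the time window: execute on the RoI.
    return ("execute", f"{speech.action} {gaze.roi}")
```

In this sketch, a confident gaze at the passenger window followed by ‘open’ within the time window is executed directly, a low-confidence gaze leads to a confirmation question, and a missing or stale gaze leads to a clarification question.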

Of course, you need to get the timing right, and it also helps to have a good feedback dialogue. Let me show you how this is done!

Haptics and Gaze

If you have, or want to keep, haptic devices dedicated to controlling a specific function, such as opening or closing windows, roofs etc., the multimodal exploitation of gaze still allows you to reduce the number of these buttons: the driver’s attention, as expressed through the gaze target area, sets the context that determines which device the function applies to. You can, for example, keep one push-open / pull-close button within the driver’s reach and enable multimodal interaction with speech commands like ‘open all’, where the button then lets the driver fine-adjust the exact opening position. Or, exploiting gaze, you can make this button work only on the window the driver is looking at.
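The single-button idea can be sketched in a few lines of Python. This is an illustration under assumptions, not a product design: the class `WindowButton`, the window names, the 10 % step size and the fallback to a default window when the driver is not looking at any window are all hypothetical.

```python
from typing import Dict, Iterable, Optional


class WindowButton:
    """One push-open / pull-close button shared by all windows (hypothetical)."""

    def __init__(self, windows: Iterable[str], default: str) -> None:
        # Opening position per window in percent: 0 = closed, 100 = fully open.
        self.position: Dict[str, int] = {name: 0 for name in windows}
        self.default = default  # assumed fallback target, must be in `windows`
        self.gaze_roi: Optional[str] = None

    def on_gaze(self, roi: Optional[str]) -> None:
        """Store the gaze target only if it is one of the known windows."""
        self.gaze_roi = roi if roi in self.position else None

    def on_button(self, direction: int, step: int = 10) -> str:
        """direction: +1 = push (open), -1 = pull (close). Returns the window moved."""
        target = self.gaze_roi or self.default
        new_pos = self.position[target] + direction * step
        # Clamp to the physical range of the window.
        self.position[target] = max(0, min(100, new_pos))
        return target
```

A short usage example: after `button = WindowButton(["driver", "passenger"], default="driver")` and `button.on_gaze("passenger")`, a call to `button.on_button(+1)` moves only the passenger window.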

This combination of modalities also allows for the control of many more functions and devices. Let me discuss this with you!

Speech and Haptics

Speech is not good for analog settings. Every input method based on pattern recognition has an inherent offset, a delay, between the actual end of the signal and the detection of its endpoint. This is why, with speech or free gestures, you never get the fine adjustments just right. For analog settings, you should preferably use a device where the end of the activation immediately terminates the actuation: a haptic device.
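The effect of the endpoint delay can be put into numbers with a back-of-the-envelope calculation: while the system is still deciding that the input has ended, the actuator keeps moving. The figures below (a window moving at 10 cm/s, a 0.5 s delay for speech and a 0.02 s delay for releasing a button) are assumed values for illustration, not measurements.

```python
def overshoot(rate: float, stop_delay_s: float) -> float:
    """Distance the actuator keeps travelling after the user wanted it to stop.

    rate:          actuator speed, e.g. in cm per second
    stop_delay_s:  time between the real end of the input and its detection
    """
    return rate * stop_delay_s


# Assumed example values, not measured data.
speech_overshoot = overshoot(rate=10.0, stop_delay_s=0.5)   # spoken "stop"
haptic_overshoot = overshoot(rate=10.0, stop_delay_s=0.02)  # button released
```

With these assumed numbers, the spoken command overshoots by several centimetres, while releasing the button overshoots by a small fraction of that.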

Your car probably has plenty of them already: dials, rolling knobs, touch surfaces, different things for different functions in different positions. The good news is: if you do it right, you can get rid of as many of them as you like. Multimodal interaction allows the user to allocate a function control to a haptic device. Using good dialogic feedback, we can make sure that the function allocation is clear to the user.

Take, as just one example, the vertical and longitudinal setting of the steering wheel. Using multimodal interaction, the driver can allocate the function controls to rolling knobs on the wheel itself, such as are otherwise used for volume control and cruise control speed setting. The driver can now adjust the wheel to exactly the position she or he needs, while leaving both hands on the wheel and keeping the same posture. So, rather than forcing the driver to reach out to a remote device like a touchscreen, multimodality brings the function control within reach of the driver.
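The allocation of a function to a rolling knob, together with the spoken feedback that keeps the allocation clear, might look roughly like this in Python. The class `KnobAllocator`, the function names, the feedback wording and the step of one unit per click are assumptions made for this sketch.

```python
from typing import Dict


class KnobAllocator:
    """Lets a spoken request re-assign what a rolling knob controls (hypothetical)."""

    STEP = 1.0  # assumed change per knob click, in the unit of the setting

    def __init__(self, default: str, settings: Dict[str, float]) -> None:
        self.default = default    # function the knob normally controls
        self.active = default     # function the knob controls right now
        self.settings = settings  # current value of each adjustable function

    def allocate(self, function: str) -> str:
        """Handle a spoken request; the returned text is the dialogue feedback."""
        if function not in self.settings:
            return f"Sorry, I cannot put {function} on the knob."
        self.active = function
        return f"The knob now adjusts {function}."

    def release(self) -> str:
        """Give the knob back to its default function."""
        self.active = self.default
        return f"The knob adjusts {self.default} again."

    def rotate(self, clicks: int) -> float:
        """Apply knob movement to the active function and return its new value."""
        self.settings[self.active] += clicks * self.STEP
        return self.settings[self.active]
```

In this sketch, saying something like ‘steering wheel height’ calls `allocate("steering wheel height")`, after which `rotate` changes the wheel position instead of the volume until `release` is called. Because the knob stops the moment the driver stops turning it, the fine adjustment does not suffer from an endpoint delay.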

Today, many see the touch screen as the one solution for reducing the number of buttons. However, buttons do not really go away if you just bury them in the menu hierarchy of a touch screen device.

Let me show you how to really get rid of them!

What else can I help you with?

I'm happy to answer questions! Please contact me!

Contact

Heisterkamp Consulting

Max-Johann-Str. 11
89155 Erbach
Germany

Phone: +49 (0) 176 1052 4719