2024 IEEE International Conference on Artificial Intelligence and eXtended and Virtual Reality (AIxVR)

Abstract

Social XR applications usually require advanced tracking equipment to control one's own avatar. We explore whether AI-based co-speech gesture generation techniques can compensate for the tracking hardware that many users lack. One main challenge is achieving convincing behavior quality without introducing too much latency. Previous work has shown that both depend, in opposite ways, on the length of the audio chunk from which gestures are generated: the gesture quality of existing models declines at smaller chunk sizes, yet latency still does not become low enough to enable fluent interaction. In this paper we present an approach that generates continuous gesture trajectories frame by frame, minimizing latency and yielding delays well below the buffer sizes of voice communication systems or video calls. A project page with videos of the generated gestures is available at https://nkrome.github.io/FrameCAGE.html.
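To make the latency argument concrete, below is a minimal sketch of frame-by-frame streaming gesture generation; it is not the paper's actual model, and all names, dimensions, and the recurrent architecture are illustrative assumptions. The point it demonstrates is the abstract's core idea: when each incoming audio frame immediately yields one pose frame, the generation delay is bounded by a single frame period rather than by a multi-second audio chunk.

```python
# Sketch only: a hypothetical autoregressive generator that maps one
# audio feature frame to one pose frame per step. Dimensions and model
# choice (a GRU cell) are assumptions, not the paper's architecture.
import torch
import torch.nn as nn

AUDIO_DIM = 64   # assumed per-frame audio feature size (e.g. mel bands)
POSE_DIM = 57    # assumed pose size (e.g. 19 joints x 3 rotations)
HIDDEN = 256

class StreamingGestureModel(nn.Module):
    """One audio frame in, one pose frame out, with recurrent state."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRUCell(AUDIO_DIM + POSE_DIM, HIDDEN)
        self.head = nn.Linear(HIDDEN, POSE_DIM)

    def step(self, audio_frame, prev_pose, hidden):
        # Condition on the current audio frame and the previously emitted
        # pose so consecutive frames form a continuous trajectory.
        x = torch.cat([audio_frame, prev_pose], dim=-1)
        hidden = self.rnn(x, hidden)
        pose = self.head(hidden)
        return pose, hidden

def stream_gestures(model, audio_frames):
    """Consume audio features frame by frame and yield poses immediately."""
    hidden = torch.zeros(1, HIDDEN)
    pose = torch.zeros(1, POSE_DIM)   # rest pose as the initial condition
    with torch.no_grad():
        for frame in audio_frames:    # frame: shape (1, AUDIO_DIM)
            pose, hidden = model.step(frame, pose, hidden)
            yield pose                # available after one frame of delay

# Example: drive an avatar from a live audio feature stream.
model = StreamingGestureModel()
live_audio = (torch.randn(1, AUDIO_DIM) for _ in range(90))  # stand-in
for pose in stream_gestures(model, live_audio):
    pass  # apply `pose` to the avatar rig each frame
```

In this streaming pattern, the worst-case added delay is one frame (e.g. about 33 ms at 30 fps), which is consistent with the abstract's claim of delays below the buffer sizes typical of voice or video communication systems.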
