
Dr. Ramin Hedeshy

Researcher, AI, Analytic Computing
Germany

Ramin completed his PhD at the University of Stuttgart’s Institute for Artificial Intelligence, where he conducted research under the supervision of Prof. Dr. Steffen Staab. His work focuses on human–computer interaction and AI, particularly multimodal interaction techniques that combine eye tracking with touch or non‑lexical voice input.

During his time at the university, he contributed to the EXIST‑funded project Semanux, which aims to make digital interaction more inclusive by enabling people with disabilities to control computers using their individual capabilities.

His research has been published at leading conferences such as ACM CHI, ACM ETRA, and INTERSPEECH, spanning novel methods for eye typing as well as machine‑learning approaches to classifying non‑verbal voice expressions, including a 2023 INTERSPEECH paper on deep‑learning methods for recognizing humming and other non‑lexical vocal inputs. He also taught courses in Human–Computer Interaction, Information Retrieval, and Machine Learning, and supervised student theses.

Before and after his doctoral studies, he worked in industry, including roles at Bliksund (Norway) and Union Betriebs‑GmbH (Bonn), contributing to a range of IT projects such as a rules repository system for the CDU and the personal homepage of Angela Merkel. He continues to apply his expertise in multimodal interaction and accessible computing in his current industry position, including ongoing work on Tiltility, a research‑driven system for camera‑based interaction.

His dissertation is available through the University of Stuttgart library:
Spatiotemporal fusion of nonverbal voice & eye gaze for human-computer interactions

  1. Hedeshy, R., Menges, R., & Staab, S. (2023). CNVVE: Dataset and Benchmark for Classifying Non-verbal Voice Expressions. Interspeech 2023, August 20–24, 2023, Dublin, Ireland.
  2. Hedeshy, R., Kumar, C., Lauer, M., & Staab, S. (2022). All Birds Must Fly: The Experience of Multimodal Hands-free Gaming with Gaze and Nonverbal Voice Synchronization. International Conference on Multimodal Interaction (ICMI ’22), November 7–11, 2022, Bengaluru, India. https://doi.org/10.1145/3536221.3556593
  3. Hedeshy, R., Kumar, C., Menges, R., & Staab, S. (2021). Hummer: Text Entry by Gaze and Hum. CHI Conference on Human Factors in Computing Systems (CHI ’21), May 8–13, 2021, Yokohama, Japan. https://doi.org/10.1145/3411764.3445501
  4. Hedeshy, R., Kumar, C., Menges, R., & Staab, S. (2020). GIUPlayer: A Gaze Immersive YouTube Player Enabling Eye Control and Attention Analysis. ETRA ’20 Adjunct: 2020 Symposium on Eye Tracking Research and Applications, Stuttgart, Germany, June 2-5, 2020, Adjunct Volume, 1:1–1:3. https://doi.org/10.1145/3379157.3391984
  5. Kumar, C., Hedeshy, R., MacKenzie, I. S., & Staab, S. (2020). TAGSwipe: Touch Assisted Gaze Swipe for Text Entry. CHI ’20: CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, April 25-30, 2020, 1–12. https://doi.org/10.1145/3313831.3376317
  • HCIIR SS2021
  • Machine Learning Tutorial SS2020
  • Semanux
    Semanux develops technologies that allow a computer to be operated through a combination of different input modalities, largely eliminating the need for a mouse and keyboard. More info at www.semanux.com

  • MICME
    The MICME project aims to combine gesture recognition, eye tracking, voice control, and AR/VR technology into a single system that can be used in the operating room.