Alkis Koudounas

PhD Candidate & ML Researcher | Turin, IT

I am a half-Greek, half-Italian PhD candidate at Politecnico di Torino, Italy, specializing in speech processing and responsible AI. My current research focuses on aligning speech language models with human feedback. I love exploring problems and envisioning unseen solutions.

Email Google Scholar GitHub Twitter LinkedIn CV

Selected Publications

voc2vec: A Foundation Model for Non-Verbal Vocalization

Alkis Koudounas, Moreno La Quatra, et al.

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025

Paper Code Models

A Contrastive Learning Approach to Mitigate Bias in Speech Models

Alkis Koudounas, Flavio Giobergia, et al.

Interspeech (Best Student Paper Award), 2024

Paper Code

Benchmarking Representations for Speech, Music, and Acoustic Events

Moreno La Quatra, Alkis Koudounas, et al.

IEEE International Conference on Acoustics, Speech and Signal Processing Workshops (ICASSPW), 2024

Paper Code Models

Open-Source Projects

voc2vec suite

📥 6,594 downloads

voc2vec is the first universal representation model designed for non-verbal human vocalization tasks.

HuggingFace → GitHub →

AudioSet-pretrained suite

📥 19,589 downloads

SSL models (HuBERT, wav2vec 2.0) pre-trained on AudioSet for general-purpose audio representation learning.

HuggingFace → GitHub →

DeepDialogue

📥 38,464 downloads

DeepDialogue-orpheus is a large-scale multimodal dataset containing 40,150 high-quality multi-turn dialogues spanning 41 domains and incorporating 20 distinct emotions with coherent emotional progressions.

HuggingFace (XTTS-v2 variant) → HuggingFace (Orpheus variant) →

Experience

Applied Research Intern

Amazon AGI - Speech & Audio Foundation, Sunnyvale, CA

Aug - Nov 2025

Worked on reward modeling for speech language models. The model is currently in production.

Applied Research Intern

Amazon AGI - Speech Understanding, Aachen, DE

Jun - Dec 2024

Worked on reward and confidence modeling for automatic speech recognition and speech translation systems.

Research Intern

Thales Alenia Space, Turin, IT

May - Nov 2021

Worked on object detection and 6DoF pose estimation with ToF cameras.

Research Intern

Tokyo University of Agriculture and Technology, Tokyo, JP

Mar - May 2019

Worked on gradient-based learning methods extended to manifolds.

Education

PhD in Computer Engineering

Politecnico di Torino, Turin, IT

2022 - Present

Thesis: Toward Robust, Responsible and Trustworthy Speech Foundation Models

MSc in Computer Engineering

Politecnico di Torino, Turin, IT

2019 - 2021

Thesis: Object detection and 6DoF pose estimation with ToF cameras

BSc in Computer Engineering

Università Politecnica delle Marche, Ancona, IT

2016 - 2019

Thesis: Gradient-based learning methods extended to manifolds

Other

Teaching

Teaching Assistant

Politecnico di Torino, 2023 - Present

Assist in teaching graduate courses on data science and machine learning fundamentals. Conduct lab sessions and grade assignments.

Guest Lecturer - Deep Learning for Speech Processing

Politecnico di Torino, 2023 - Present

Deliver lectures on transformer architectures and their applications in speech, audio and music processing and generation.

Awards & Honors

Outstanding PhD Award

Politecnico di Torino, 2025

Best Student Paper Award

Interspeech, 2024

Travel Grant Recipient

Interspeech, ICASSP, KDD 2024

Service

Organizer

SPADE Workshop at ICASSP, 2024

Online Chair & Head of Volunteers

ECML-PKDD, 2023

Reviewer

Interspeech, ICASSP, SLT/ASRU, KDD

Volunteering & Extracurriculars

Tutor & Supervisor

MALTO student team, 2023 - Present

AYA Italian Language Ambassador

Cohere for AI, 2024