I’m a speech researcher at Samsung Research, working across speech recognition and spoken keyword spotting. My work includes:
- Developing text-audio representation learning for spoken keyword detection with text-only enrollment using tiny models.
- Estimating internal language models learned by speech models to improve inference-time integration with external language models.
- Full-stack ASR engineering for Samsung Bixby, including large-scale training, model compression, and inference-time biasing.
Beyond speech processing, I am deeply interested in mechanistic interpretability, driven by inconsistencies I’ve observed between common beliefs about models and how they actually function. This curiosity motivates me to study these questions through representation analysis and causal abstraction.
Prior to Samsung Research, I received my B.S. in CSE from Seoul National University.
News!
(09/24) I will be presenting my paper on spoken keyword detection at INTERSPEECH 2024!