Junhyeok Kim

Hello! 👋 I'm a Ph.D. student in the MICV Lab at Yonsei University, advised by Prof. Seong Jae Hwang. I'm currently interested in mechanistic interpretability, vision-language models, and a bit of medical imaging!

  • Recently, my first first-author paper, which interprets ViT features and uncovers circuits to understand the model’s underlying mechanisms, has been accepted to NeurIPS ’25.
  • I am actively looking for internship opportunities. Please feel free to reach out!

Email    Scholar    LinkedIn    GitHub


Research

My primary research interest currently lies in Mechanistic Interpretability (MI). By leveraging MI, we can understand which capabilities an AI model specializes in and which capabilities it still needs. I believe that this deep understanding of AI models will ultimately serve as a major foundation for advancing toward Artificial General Intelligence (AGI).

Interpreting Attention Heads for Image-to-Text Information Flow in Large Vision-Language Models
Jinyeong Kim, Seil Kang, Jiwoo Park, Junhyeok Kim, Seong Jae Hwang
NeurIPS MechInterp Workshop 2025 Spotlight!

Image-to-text information transfer in LVLMs depends on specialized groups of attention heads that are determined by the semantic content of the image.

Interpreting vision transformers via residual replacement model
Junhyeok Kim*, Jinyeong Kim*, Yumin Shim, Joohyeok Kim, Sunyoung Jung, Seong Jae Hwang
NeurIPS 2025

The residual replacement model explains the end-to-end decision-making process of vision transformers at a human-understandable scale.

Backbone Augmented Training for Adaptations
Jae Wan Park, Junhyeok Kim, Youngjun Jun, Hyunah Ko, Seong Jae Hwang
arXiv 2025

To address the scarcity of adaptation data, the backbone model's pre-training data can be selectively utilized to augment the adaptation dataset.

Your large vision-language model only needs a few attention heads for visual grounding
Seil Kang, Jinyeong Kim, Junhyeok Kim, Seong Jae Hwang
CVPR 2025 Highlight!

A few attention heads in frozen LVLMs demonstrate strong visual grounding capabilities. These "localization heads" immediately enable training-free detection and segmentation.

See what you are told: Visual attention sink in large multimodal models
Seil Kang*, Jinyeong Kim*, Junhyeok Kim, Seong Jae Hwang
ICLR 2025

Large multimodal models consistently attend to irrelevant parts of the input. Our work demystifies this phenomenon, dubbed the "visual attention sink," and proposes a simple mitigation strategy.

WoLF: Wide-scope Large Language Model Framework for CXR Understanding
Seil Kang, Junhyeok Kim, Donghyun Kim, Hyo Kyung Lee, Seong Jae Hwang
arXiv 2024

WoLF is a novel Large Language Model framework for chest X-ray understanding that integrates patient Electronic Health Records (EHRs).

Miscellaneous

  • Need another Junhyeok Kim? He's just one click away!
    (He is my mate as well as my namesake.)
  • I'm the drummer at the MICCAI 2025 Gala Dinner!
  • I love music (2023).
  • I really love music (2024).

This website is based on Jon Barron's personal website.