Heejun Lee

AI Researcher in Seoul, South Korea

ainl@kaist.ac.kr

About

As an AI Research Engineer at DeepAuto.ai and a combined M.S./Ph.D. candidate at KAIST, I have pioneered scalable, efficient attention mechanisms for large language models, including the award-winning HiP and Delta Attention algorithms. My work has reduced serving costs by roughly 52% and enabled million-token context inference on a single GPU, and it has produced multiple first-author publications at top-tier conferences such as ICLR. My strength lies in bridging theoretical innovation with practical system optimization to deliver real-world impact in AI deployment. Looking ahead, I am committed to advancing efficient AI systems and making state-of-the-art language models more accessible and sustainable at scale.

Work Experience

DeepAuto.ai

Seoul, South Korea

CAIO

Dec. 2023 - Present

  • CAIO since Aug. 2025, leading agentic AI engineering projects.

  • Developed ScaleServe, a cost-efficient LLM serving framework that reduces end-to-end serving costs by approximately 52% by integrating novel, training-free attention mechanisms.

  • Invented HiP Attention (ICLR 2025), a training-free attention algorithm that speeds up long-context inference by 50% and enables serving million-token contexts on a single GPU via KV cache offloading.

  • Designed Delta Attention, a novel correction algorithm that boosts sparse attention accuracy by 20-30% on the RULER benchmark with only a marginal (<10%) latency overhead.

  • Engineered and integrated custom attention modules into serving frameworks such as vLLM and SGLang, reducing the computational complexity of long-context attention from quadratic, O(n^2), to near-linear, roughly O(n).

Education

Korea Advanced Institute of Science and Technology

Artificial Intelligence

Sep. 2024 - Feb. 2030

  • Combined M.S./Ph.D. program

Korea Advanced Institute of Science and Technology

Computer Science

Mar. 2020 - Aug. 2024

  • GPA: 3.97/4.3

  • College of Engineering Dean's List (Spring 2022)

  • College of Engineering Leadership Award on Research Excellence (Spring 2022, Spring 2023)

Skills

Languages

  • Python

  • C++

  • C#

Frameworks & Libraries

  • PyTorch

  • Hugging Face

  • vLLM

  • SGLang

  • OpenAI Triton

  • .NET

Awards

College of Engineering Dean's List

KAIST

  • Recognized for outstanding academic achievement in Spring 2022

Jun. 2022

College of Engineering Leadership Award on Research Excellence

KAIST

  • Spring 2022

Jun. 2022

College of Engineering Leadership Award on Research Excellence

KAIST

  • Spring 2023

Jun. 2023

Publications

  • Authors: Jeffrey Willette, Heejun Lee, Sung Ju Hwang

  • arXiv Preprint

  • Authors: Heejun Lee*, Geon Park*, Jaduk Suh*, Sung Ju Hwang

  • arXiv Preprint

  • Authors: Heejun Lee*, Geon Park*, Youngwan Lee*, Jaduk Suh*, et al.

  • ICLR 2025

  • Authors: Jeffrey Willette, Heejun Lee, Youngwan Lee, Myeongjae Jeon, Sung Ju Hwang

  • ICLR 2025

  • Authors: Heejun Lee, Jina Kim, Jeffrey Willette, Sung Ju Hwang

  • ICLR 2024

  • Authors: Heejun Lee, Minki Kang, Youngwan Lee, Sung Ju Hwang

  • ICLR 2023

Contacts

Email

ainl@kaist.ac.kr

Phone

+82 10-7757-5176

GitHub

https://github.com/gmlwns2000