Multimodal Agents
Designing systems that can search, perceive, and reason over heterogeneous digital environments, with emphasis on grounded multimodal interaction.
Master's student in Artificial Intelligence (MSAI) at Nanyang Technological University, working on multimodal and agent-related research with Associate Professor Ziwei Liu at MMLab@NTU.
My current research focuses on multimodal learning, agent systems, and personalized AI. I previously worked on LLM memorization, data privacy, and tokenization-related security risks with Prof. Michael R. Lyu at CUHK.
I am currently pursuing the Master of Science in Artificial Intelligence (MSAI) at Nanyang Technological University. Since August 2025, I have been working with Associate Professor Ziwei Liu at MMLab@NTU on research projects, with a current focus on multimodal systems and agentic intelligence.
Before NTU, I received my B.Sc. in Computer Science from The Chinese University of Hong Kong, graduating with a CGPA of 3.785/4.0. My earlier work centered on understanding memorization in large language models, with emphasis on data compressibility, tokenization, and security risks in code LLMs.
Building and evaluating systems that recover user-level context from long-horizon, file-system-scale behavioral and multimodal traces.
Studying how data properties and tokenization affect memorization, leakage risk, and privacy vulnerabilities in large language models.
The four papers listed below are recent and ongoing work, all currently under review.
A benchmark for evaluating contextual agents over realistic multimodal personal file systems, with a focus on search, evidence perception, and multi-step reasoning.
Explore the interactive website to see the three personal computers, and visit the project page for an overview of the benchmark.
A framework for grounding agent memory and personalization in file-system behavioral traces, spanning data generation, benchmarking, and memory architecture.
This work studies how data compressibility relates to memorization in LLMs and proposes a quantitative perspective on memorization behavior.
An investigation into how BPE tokenization contributes to secret memorization and leakage risks in code LLMs through what we term gibberish bias.
Presented at AINIT 2024. Earlier work on document-level information extraction.
Working on multimodal and agent-oriented research projects while pursuing the M.S. in AI at NTU.
Focused on LLM memorization, entropy-based characterization, dataset inference, and tokenization-related security risks in code LLMs.
Worked on automated data acquisition, knowledge graph modeling, fine-tuning, and graph-based analysis pipelines.
Evaluated memorization difficulty in large language models through entropy, perplexity, and memorization-rate analysis on open-source models.
Studied adversarial attacks on gender recognition systems and analyzed robustness and fairness issues in facial-recognition APIs.
Top 1% in Computer Science.
Top 10% in the Faculty of Engineering, CUHK.
Top 1% in the Faculty of Engineering, CUHK.
If you would like to discuss multimodal learning, agent systems, personalized AI, or LLM memorization and security, feel free to reach out.