GEPO: Group Expectation Policy Optimization for Heterogeneous Reinforcement Learning. Han Zhang (张晗), Ruibin Zheng, Zexuan Yi, et al. International Conference on Learning Representations (ICLR), 2026.
LANCET: Correcting Large Language Model Behavior via Influence Function. Han Zhang (张晗), Zhuo Zhang, Yi Zhang, et al. AAAI Conference on Artificial Intelligence (AAAI), 2025. (Oral)
COPR: Continual Human Preference Learning via Optimal Policy Regularization. Han Zhang (张晗), Lin Gui, Yu Lei, et al. Association for Computational Linguistics (ACL), 2025.
CPPO: Continual Learning for Reinforcement Learning with Human Feedback. Han Zhang (张晗), Yu Lei, Lin Gui, et al. International Conference on Learning Representations (ICLR), 2024.
CLLE: A benchmark for continual language learning evaluation in multilingual machine translation. Han Zhang (张晗), Sheng Zhang, Yang Xiang, et al. Empirical Methods in Natural Language Processing (EMNLP), 2022.
Incremental pre-training from smaller language models. Han Zhang (张晗), Hui Wang, Ruifeng Xu. Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10), 2025.
An Orthogonality-based Dual-memory Framework for Continual Text Classification. Han Zhang (张晗), Yu Lei, Bin Liang, et al. IEEE Transactions on Audio, Speech and Language Processing (TASLP), 2025.
Prompt-based prototypical framework for continual relation extraction. Han Zhang (张晗), Bin Liang, Min Yang, et al. IEEE Transactions on Audio, Speech and Language Processing (TASLP), 2022.
Han Zhang (张晗, Hanlard)
Email: zhangh04@pcl.ac.cn
Google Scholar: link