Zi Liang (Research Page)
Table of Contents
1. Introduction
1.1. Introducing Myself
My name is Zi Liang, now a PhD student in the Astaple Group of Hong Kong Polytechnic University (PolyU). My supervisor is Prof. Haibo Hu. I begin my research in Xi'an Jiaotong University, under the supervision of Prof. Pinghui Wang and Ruofei (Bruce) Zhang. I also work closely with Dr. Nuo Xu, Shuo Zhang, Yaxin Xiao, Xinwei Zhang, and Yanyun Wang.
I focus on analyzing (pursue) the potential risks contained in current transformer-based large language models and dive into the analysis of why and how neural networks work and raise a vulnerability. My research can be divided into the following two categories:
- Revealing new threats or defence against existing attacks. To provide a comprehensive evaluation on popular AI services or techniques incorporated with in-depth theoretical analysis or a series of intuitive explorations, such as my recent study in prompt extraction attacks (link);
- Better understanding to models and learning. To explain why and how safety issues produced and what such a problem means during the training/inference of models.
Also, I am very familiar with natural language processing since 2020, especially in constructing conversational AIs. From 2024 I am also excited to the future of AI when employing reinforcement learning.
1.2. Introducing My Recent Research
1.2.1. Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models [Arxiv'24]
This paper uncover the threat of prompt leakage on customized prompt-based services, such as OpenAI's GPTs. It aims to answer three questions:
- Can LLM's alignments defend against prompt extraction attacks?
- How do LLMs leak their prompts?
- Which factors of prompts and LLMs lead to such leakage?
We provide a comprehensive and systemic evaluation to answer question 1 and 3, and propose two hypothesis with experimental validation for question 2. We also propose several easy-to-adpot defending strategy based on our discovery.
Click here if you are also interested in this research.
1.2.2. MERGE: Fast Private Text Generation [AAAI'24]
This paper propose a new privacy-preserving inference framework for current transformer-based generative language models based on Secret Sharing and Multi-party Security Computation (MPC). It is also the first private inference framework specifically designed for NLG models. 10x of speedup are provided via our propose method.
If you are curious about how cryptography protect the privacy of user contents and models and how we optimize the inference procedure, click here for more details.
2. Experiences
- 2016.09-2020.06: Bachelor Degree, in Northeastern University, on cybernetics (Control Theory);
- 2020.09-2023.06: Master Degree, in the iMiss Group of Xi'an Jiaotong University, on software engineer and research for Conversational AI and NLP Security;
- 2023.11-now: PhD Student, in the The Hong Kong Polytechnic University in Hong Kong. Research of interests: AI security and Natural Language Processing.
3. Publications
- Alignment-Aware Model Extraction Attacks on Large Language Models Zi Liang, Qingqing Ye, Yanyun Wang, Sen Zhang, Yaxin Xiao, Ronghua Li, Jianliang Xu, Haibo Hu - arXiv preprint arXiv:2409.02718, 2024 [Paper] [Code]
- PAIR: Pre-denosing Augmented Image Retrieval Model for Defending Adversarial Patches Z Zhou, P Wang, Z Liang, R Zhang, H Bai - MM 2024
- Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models Z Liang, H Hu, Q Ye, Y Xiao, H Li - arXiv preprint arXiv:2408.02416, 2024 [Paper][Code]
- TSFool: Crafting Highly-Imperceptible Adversarial Time Series through Multi-Objective Attack Yanyun Wang, Dehui Du, Haibo Hu, Zi Liang, Yuanhao Liu - ECAI, 2024
- Merge: Fast private text generation Z Liang, P Wang, R Zhang, N Xu, S Zhang, L Xing… - AAAI, 2024 [Paper] [Code]
- "Healing Unsafe Dialogue Responses with Weak Supervision Signals." Z Liang, … - arXiv preprint arXiv:2305.15757 (2023). [Paper][Code]
- Multi-action dialog policy learning from logged user feedback S Zhang, J Zhao, P Wang, T Wang, Z Liang, J Tao… - AAAI, 2023
4. Contact Me
- GitHub: https://github.com/liangzid
- MAIL: zi1415926.liang@connect.polyu.hk
- Wechat: paperacceptplease
- Google Scholar: HERE