Zi Liang (Research Page)

Table of Contents

danjin.jpg

1. Introduction

1.1. Introducing Myself

My name is Zi Liang, now a PhD student in the Astaple Group of Hong Kong Polytechnic University (PolyU). My supervisor is Prof. Haibo Hu. I begin my research in Xi'an Jiaotong University, under the supervision of Prof. Pinghui Wang and Ruofei (Bruce) Zhang. I also work closely with Dr. Nuo Xu, Shuo Zhang, Yaxin Xiao, Xinwei Zhang, and Yanyun Wang.

I specialize in analyzing the potential risks inherent in language models, with a focus on understanding why and how neural networks function and identifying vulnerabilities within them. My research is driven by a deep curiosity to uncover the mechanisms behind these models and to address the security challenges they present.

My work can be categorized into two main areas:

  • Uncovering New Threats and Developing Defenses: I conduct comprehensive evaluations of popular AI services and techniques, combining in-depth theoretical analysis with practical experimentation, such as our study in prompt extraction attacks (link);
  • Enhancing Understanding of Models and Learning Processes: I aim to explain the root causes of safety issues in AI systems, examining how these problems arise during model training and inference, and what they imply for the broader field of machine learning.

In addition to my research, I have extensive experience in natural language processing (NLP), particularly in building conversational AI systems, which I have been actively involved in since 2020. More recently, starting in 2024, I have developed a strong interest in the future of AI, particularly in the application of reinforcement learning to advance the capabilities and safety of intelligent systems.

1.2. Introducing My Recent Research

1.2.1. Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models [Arxiv'24]

This paper uncover the threat of prompt leakage on customized prompt-based services, such as OpenAI's GPTs. It aims to answer three questions:

  1. Can LLM's alignments defend against prompt extraction attacks?
  2. How do LLMs leak their prompts?
  3. Which factors of prompts and LLMs lead to such leakage?

We provide a comprehensive and systemic evaluation to answer question 1 and 3, and propose two hypothesis with experimental validation for question 2. We also propose several easy-to-adpot defending strategy based on our discovery.

Click here if you are also interested in this research.

1.2.2. MERGE: Fast Private Text Generation [AAAI'24]

This paper propose a new privacy-preserving inference framework for current transformer-based generative language models based on Secret Sharing and Multi-party Security Computation (MPC). It is also the first private inference framework specifically designed for NLG models. 10x of speedup are provided via our propose method.

If you are curious about how cryptography protect the privacy of user contents and models and how we optimize the inference procedure, click here for more details.

2. Experiences

  1. 2016.09-2020.06: Bachelor Degree, in Northeastern University, on cybernetics (Control Theory);
  2. 2020.09-2023.06: Master Degree, in the iMiss Group of Xi'an Jiaotong University, on software engineer and research for Conversational AI and NLP Security;
  3. 2023.11-now: PhD Student, in the The Hong Kong Polytechnic University in Hong Kong. Research of interests: AI security and Natural Language Processing.

3. Publications

  • Exporing Intrisic Alignments within Text Corpus Zi Liang, Pinghui Wang, Ruofei Zhang, Haibo Hu, … - AAAI, 2025
  • Alignment-Aware Model Extraction Attacks on Large Language Models Zi Liang, Qingqing Ye, Yanyun Wang, Sen Zhang, Yaxin Xiao, Ronghua Li, Jianliang Xu, Haibo Hu - arXiv preprint arXiv:2409.02718, 2024 [Paper] [Code]
  • PAIR: Pre-denosing Augmented Image Retrieval Model for Defending Adversarial Patches Z Zhou, P Wang, Z Liang, R Zhang, H Bai - MM 2024
  • Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models Z Liang, H Hu, Q Ye, Y Xiao, H Li - arXiv preprint arXiv:2408.02416, 2024 [Paper][Code]
  • TSFool: Crafting Highly-Imperceptible Adversarial Time Series through Multi-Objective Attack Yanyun Wang, Dehui Du, Haibo Hu, Zi Liang, Yuanhao Liu - ECAI, 2024
  • Merge: Fast private text generation Z Liang, P Wang, R Zhang, N Xu, S Zhang, L Xing… - AAAI, 2024 [Paper] [Code]
  • "Healing Unsafe Dialogue Responses with Weak Supervision Signals." Z Liang, … - arXiv preprint arXiv:2305.15757 (2023). [Paper][Code]
  • Multi-action dialog policy learning from logged user feedback S Zhang, J Zhao, P Wang, T Wang, Z Liang, J Tao… - AAAI, 2023

4. Contact Me

  1. GitHub: https://github.com/liangzid
  2. MAIL: zi1415926.liang@connect.polyu.hk
  3. Google Scholar: HERE

Author: liangzid (2273067585@qq.com) Create Date: Fri Sep 10 14:23:46 2021 Last modified: 2025-02-06 Thu 16:17 Creator: Emacs 29.2 (Org mode 9.6.28)