lzy37ld

May 2025

Check out our latest work, RedTeamCUA, a controlled, realistic, interactive sandbox environment for adversarial testing of Computer-Use Agents (CUAs). Unfortunately, realistic end-to-end risks are not hypothetical but real: the latest Claude 4 Opus, the most advanced CUA from Anthropic, shows an ASR as high as 48% on our RTC-Bench! There is still a long way to go before widespread real-world deployment of CUAs.

May 2025

AdvAgent has been accepted to ICML 2025. Congrats and thanks to my great collaborators!

Jan 2025

Excited to share that 3 papers (EIA, AIHF, SAB) have been accepted to ICLR 2025. Thanks to all my great collaborators!

Jan 2025

Honored to learn that EIA has been incorporated into the UCSD CSE 291 (LLM Security) course materials!

Sep 2024

Excited to announce that our investigation into long-tail knowledge has been accepted at EMNLP 2024!

Sep 2024

Our new preprint on Environmental Injection Attacks (EIA) is available here. This work explores the privacy risks associated with web agents. EIA is a form of indirect prompt injection that specifically targets the environment where state-changing actions happen. We design two injection strategies tailored to web environments and explore different positions within the webpage to identify vulnerable regions. More importantly, we discuss what level of human supervision is needed to balance the trade-off between autonomy and security, and we examine different defensive approaches for both the pre- and post-deployment stages of a website, along with their limitations. Feel free to check out the X post here as well.

Aug 2024

Released the raw datasets of AmpleGCG-plus, containing millions of optimized suffixes with their corresponding evaluation results. Check out more details here. They should be very useful if you'd like to build something on top of GCG.

Aug 2024

Released the AmpleGCG-plus series of models, trained on data of higher quality and quantity. Check them out here. Highlights are 1) higher ASR in fewer attempts under stringent evaluators; 2) pushing the ASR on GPT-4 to 22%. Find the Twitter post here.

July 2024

Thrilled to announce that AmpleGCG has been accepted at COLM 2024.

June 2024

Don't waste your demonstration data; use it for joint preference learning. Check out our preprint "Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback" here. This is my first time delving into the field of RL, and I believe I will have more chances to dig deeper into it in the future.

April 2024

Very proud to share my first first-author paper, AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs. I really learned a lot from the journey and could not have done it without the help of my advisor. Check out the Twitter post here.

March 2024

RAG is a popular technique for reducing hallucination and providing up-to-date knowledge to static parametric memory. Wondering how hard it is for an LLM to attribute its generation back to the provided references? Check out AttributionBench here.

Jan 2024

Agents are spreading like viruses. But how can we ensure their safety? Check out our paper "A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents", which investigates the security of agents by mapping adversarial attacks from LLMs to agents.

Nov 2023

Finally, our paper "In Search of the Long-Tail: Systematic Generation of Long-Tail Knowledge via Logical Rule Guided Search" is on arXiv. One takeaway: always play around with long-tail data to examine the true capability of models. I feel incredibly fortunate to have the opportunity to collaborate with renowned advisors like Yejin Choi and Xiang Ren, especially considering that I have been studying NLP for less than a year.

Aug 2023

Started my PhD journey @ OSU, guided by Prof. Huan Sun. IDK what will happen, and honestly I am a bit nervous but also excited, as I have little to no experience in NLP or computer science. But who knows, right? Let's see.

Hi, I'm Zeyi Liao.

preprint, 2025 PDF, Page

International Conference on Learning Representations (ICLR), 2025 PDF

Selected as course materials for UCSD CSE 291 (LLM Security)

Conference on Language Modeling (COLM), 2024 PDF

and PDF about AmpleGCG-plus

Discussed by AI safety news

International Conference on Learning Representations (ICLR), 2025 PDF

Annual Conference of the Association for Computational Linguistics (Findings of ACL), 2024 PDF

Preprint on Arxiv, 2024 PDF

Empirical Methods in Natural Language Processing (EMNLP), 2024 PDF

ACM International Conference on Information and Knowledge Management (CIKM), 2023 PDF

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 PDF

Email: [last name].629@osu.edu or [acronym of "liao ze yi"]37ld@gmail.com

Feel free to contact me if you are interested in my research or want to discuss related research topics :>