Date
Apr 2, 2025, 1:30 pm - 2:30 pm
Location
Friend 006

Details

Event Description

Webinar stream and in-person viewing at FC 006

Abstract: Adversarial machine learning as a field has existed for over twenty years. But the last few years have seen significant changes to the types of problems we are studying as a field. Instead of evaluating the robustness of simple image or malware classifiers, we now consider a much broader set of problems in a much broader set of domains. This talk argues that in the era of LLMs, the field of adversarial ML studies problems that are (1) less clearly defined, (2) harder to solve, and (3) even more challenging to evaluate. I first comment on the challenges in this field broadly and then focus specifically on the problem of constructing “unfinetunable” models, a task that we have shown, by breaking two state-of-the-art defenses, is considerably harder than first believed. I conclude by discussing how this difficulty of evaluation will impact the future.

Nicholas Carlini

Sponsor
Event organized by PLI