I’m an undergraduate at UC Berkeley doing what I can to make AI go well!
In the past, I’ve worked on multi-turn jailbreaks, singular learning theory, and interpretability.
I use this form for anonymous feedback/messages about how I can be better, and really appreciate people taking the time to fill it out.
I’m always up to chat. If you’re seeing this, you should totally email me or reach out on Twitter!
I also sometimes post on LessWrong :)
Research Publications
Also see my Google Scholar.
- Emerging Vulnerabilities in Frontier Models: Multi-Turn Jailbreak Attacks. Tom Gibbs*, Ethan Kosak-Hine*, George Ingebretsen*, Jason Zhang, Julius Broomfield, Sara Pieri, Reihaneh Iranmanesh, Reihaneh Rabbany, Kellin Pelrine. arXiv preprint, 2024.
- Approximating the Local Learning Coefficient in Neural Networks: A Comparative Analysis of Power Series Expansion Orders. Sid Baines, Ayush Bharadwaj, George Ingebretsen, Hernan Iriarte, Maria Matveev, Lucius Bushnaq. Draft, 2024.
Posts
Probably Not A Ghost Story
Evals Research Ideas
Making Little Simz Gorilla Interactive Music Video
Computer Apps I Recommend
subscribe via RSS