Research

My research lies at the intersection of the philosophy of cognitive science and AI. I am also interested in scientific communication, methodology, and the history and theory of science.

The aim of my dissertation is to select, examine, and transfer tools from the study of human cognition to the study of the cognitive properties of LLMs. Ultimately, my goal is to help model capabilities, such as strategic deception, that would make these systems difficult to control.

An abridged version of Chapter 1, What Can We Learn from the Behavioral Study of Large Language Models, is available here.

An abridged version of Chapter 2, Three Levels for Large Language Model Cognition, is available here.

I also presented a preliminary version of this chapter at the Philosophy of Science Association (PSA) Conference 2024 in New Orleans, LA.


Recent Papers

  • A Problem to Solve Before Building a Deception Detector (Alignment Forum)
  • Three Levels for Large Language Model Cognition (Under Review)
  • What Can We Learn from the Behavioral Study of Large Language Models (Under Review)
  • Artificial Intelligence Safety as an Emerging Paradigm (Forthcoming)
  • A Problem for the Triggering Account of Innateness (R&R)