Symposium
Artificial Intelligence and Technology-based Interventions
Christina Desage, M.S. (she/her/hers)
Graduate Student
Palo Alto University
Oakland, California, United States
Tyrique Patterson, BA
PhD Candidate
Palo Alto University
Palo Alto, California, United States
YingHua Wu, MS
Student
Palo Alto University
Palo Alto, California, United States
Alexis Bechtel, MS
Student
Palo Alto University
Palo Alto, California, United States
Arjun Bharat, BS
Student
Palo Alto
Palo Alto, California, United States
Daniella Vaclavik, M.A., M.S., Ph.D. (she/her/hers)
Research Scientist
Florida International University
Miami, Florida, United States
Eduardo Bunge, Dr., Ph.D.
1791 Arastradero Road
Palo Alto University
Mountain View, California, United States
Background: Generative Artificial Intelligence (AI)-based conversational agents (CAs) hold promise to expand access to mental health care through the delivery of structured interventions. However, rigorous evaluation of treatment fidelity and the handling of common therapeutic factors remains limited.
Aims: This study illustrates the implementation of the Thera Turing Test (TTT) to compare the quality of Parent Management Training (PMT) conversations delivered by a human therapist versus an AI CA, Pat.
Methods: Two PMT modules (Psychoeducation and Special Time) were delivered by Pat and by a human therapist. Five graduate psychology students trained in PMT rated treatment fidelity and common therapeutic factors using the TTT framework. Pat’s performance was assessed across four simulated parenting styles (authoritative, authoritarian, permissive, uninvolved), whereas the human therapist was assessed with an authoritative parent only. Treatment fidelity was scored by rating whether each treatment step was delivered. Common factors were rated on a 10-point Likert scale. Inter-rater reliability was examined, and ratings for Pat and the human therapist were compared.
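As an illustration of the scoring described in the Methods, the following minimal Python sketch shows one way per-step fidelity ratings and 10-point common-factor ratings could be turned into percentages. The function names, the averaging across raters, the conversion of Likert ratings to a percentage of the scale maximum, and all ratings shown are assumptions made for illustration; they are not the study's actual procedure or data.

```python
from statistics import mean


def fidelity_percent(step_ratings):
    """Average, across raters, of the percentage of treatment steps rated as delivered.

    step_ratings: one list per rater, holding True/False for each treatment step.
    Averaging across raters is an assumption, not the study's stated method.
    """
    per_rater = [100 * sum(rater) / len(rater) for rater in step_ratings]
    return mean(per_rater)


def common_factor_percent(scores, scale_max=10):
    """Mean Likert rating expressed as a percentage of the scale maximum (assumed conversion)."""
    return 100 * mean(scores) / scale_max


# Hypothetical ratings from two raters for a five-step module (not study data).
ratings = [
    [True, True, True, False, True],
    [True, True, True, True, True],
]
print(fidelity_percent(ratings))
print(common_factor_percent([9, 8, 10]))
```

Under these assumptions, a rater who marks four of five steps as delivered contributes 80% to the module's fidelity score, and the module score is the mean of the raters' percentages.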
Results: Both Pat and the human therapist achieved high treatment fidelity across PMT modules. For Psychoeducation sessions, treatment fidelity scores ranged from 86% to 98% for Pat and were 88.9% for the human therapist. For Special Time sessions, scores ranged from 84% to 100% for Pat and were 86% for the human therapist. Common factor ratings were highest for the human therapist (90.8%) and slightly lower across all Pat conversations. Pat delivered structured treatment content across different simulated parenting styles and demonstrated adaptability to varying parent characteristics.
Conclusion: Pat shows potential to deliver structured parent training consistently across diverse parenting styles. Although the human therapist scored higher on common factors, Pat’s strong treatment fidelity suggests that AI CAs could be a valuable complement to traditional therapy. The TTT provides a framework for iteratively improving AI CAs, helping to ensure they are safe, effective, and ready to support real-world mental health care.