When Machines Testify: AI Companion Chat Logs and the Limits of Evidence Law

By Meeah Willig

Law enforcement increasingly encounters a new category of digital evidence: conversational records between individuals and artificial intelligence (AI) systems. Recent investigations have reportedly drawn on suspects’ interaction histories with large language models, prompting prosecutors to consider whether such exchanges illuminate intent, motive, or planning.^[1] The evidentiary stakes are especially high with AI companion platforms, such as Replika or Character.AI, which are designed to simulate emotional understanding and long-term relational continuity.^[2] Unlike ordinary messaging tools, these systems actively shape dialogue through adaptive prompts and affective responses. The resulting transcripts are co-constructed artifacts: part human disclosure, part machine-generated language.^[3]

This analytical piece argues that AI companion transcripts expose a structural gap in evidence law. While courts can formally admit these records under existing rules, uncritical application risks obscuring a deeper mismatch between human-centered evidentiary assumptions and machine-shaped dialogue. A more candid approach, grounded in heightened gatekeeping and reliability scrutiny, is necessary to prevent algorithmic design from artificially inflating the probative force of AI-mediated statements.

The Rise of AI Companions

The tendency to disclose to machines is not new.^[4] Users of early programs such as ELIZA responded as if the system understood them, despite knowing it did not. Contemporary AI companions operationalize this phenomenon at scale.^[5] These platforms are engineered for sustained engagement and emotional warmth, using responsive language calibrated to mirror empathy, continuity, and attentiveness.^[6] Many retain memory of prior disclosures, reference past conversations, and employ conversational cues that mimic human interaction.

The result is accelerated intimacy. Users may share sensitive thoughts, fantasies, or confessions more quickly than they would in human relationships.^[7] Critically, these exchanges are preserved as durable digital records held by third-party providers. What feels exploratory or therapeutic to the user is, from a legal standpoint, structured data subject to subpoena and use in court.

From Intimate Dialogue to Evidence

Courts routinely admit digital communications—emails, texts, and social media posts—to establish knowledge or intent.^[8] Those media, however, function primarily as passive conduits for human expression. AI companions differ because the system itself participates in shaping the conversation.^[9] Model-generated prompts may steer topics, frame hypotheticals, or reinforce particular narratives. When prosecutors later offer a defendant’s statements as admissions, the surrounding AI-generated language may be essential to understanding why those statements were made at all.

This co-construction complicates attribution. An apparent confession may reflect iterative prompting rather than spontaneous disclosure. Statements expressing intent may arise within speculative or role-playing exchanges encouraged by the system. Treating such records as straightforward admissions risks over-crediting language shaped by algorithmic design.

Evidentiary Analysis Under the Federal Rules

Because constitutional protections are limited once records are lawfully obtained from a provider, the Federal Rules of Evidence do most of the doctrinal work.

Relevance (Rule 401). AI companion transcripts will often clear the low relevance threshold. Statements about motive, fear, or planning can make consequential facts more or less probable. The harder question is why they appear probative. If incriminating language emerges only after repeated model-driven prompts, relevance becomes conditional in the sense of Rule 104(b): it depends on whether a predicate fact exists. Courts should treat probative force as dependent on a preliminary showing that the statement reflects the defendant’s own understanding rather than the system’s conversational influence.

Authentication (Rule 901). Authenticating ordinary digital messages typically requires proof of account control. With AI companions, authorship is divided. Even if the defendant typed particular words, their meaning may depend on the machine’s prior prompts. Moreover, when evidentiary weight depends on system behavior, certification that a record was copied from a server establishes genuineness, not reliability. Courts should demand a more robust foundation where probative force turns on how the model generates language.

Hearsay (Rule 801). A defendant’s own statements, when offered against the defendant, fall outside the hearsay definition as party-opponent statements under Rule 801(d)(2)(A). The AI’s side of the dialogue is more problematic. Although machine-generated text is not a “statement by a person,” it often functions as an assertion shaping meaning. A cautious approach is to admit AI responses solely for context, with limiting instructions that they are not offered for their truth. Courts should resist adoptive-admission theories under Rule 801(d)(2)(B) that treat algorithmic language as assertive content capable of adoption.

Prejudice (Rule 403). AI transcripts present heightened risks of unfair prejudice and juror confusion. Jurors may anthropomorphize the system or accord undue weight to polished, time-stamped logs that appear objective despite opaque generation processes. Rule 403 empowers courts to mitigate these risks through redaction, expert explanation, or exclusion where probative value is substantially outweighed by the danger of unfair prejudice, confusion of the issues, or misleading the jury.

Conclusion

AI companion chat logs sit uneasily within an evidence system designed for human speakers. Although existing rules can accommodate them in form, uncritical admission risks granting algorithmically shaped language evidentiary weight it does not deserve. Courts need not exclude these records categorically. Instead, acknowledging the hybrid nature of human–machine dialogue and applying heightened scrutiny to reliability, attribution, and prejudice preserves core evidentiary values while preventing emerging technologies from reshaping what it means to prove a fact in court.

[1] Ana Faguy & Nardine Saad, ChatGPT Image Snares Suspect in Deadly Pacific Palisades Fire, BBC NEWS (Oct. 8, 2025), https://www.bbc.com/news/articles/c8exz5yg14ko.

[2] Cathy Mengying Fang et al., How AI and Human Behaviors Shape Psychosocial Effects of Chatbot Use: A Longitudinal Randomized Controlled Study (preprint, Mar. 25, 2025), https://www.media.mit.edu/publications/how-ai-and-human-behaviors-shape-psychosocial-effects-of-chatbot-use-a-longitudinal-controlled-study/.

[3] Julian De Freitas, Zeliha Oğuz Uğuralp & Ahmet Kaan Uğuralp, Emotional Manipulation by AI Companions, HBS WORKING PAPER (2025), https://www.hbs.edu/faculty/Pages/item.aspx?num=67750.

[4] See Dave Bergmann, ELIZA Effect at Work: Avoiding Emotional Attachment to AI Coworkers, IBM, https://www.ibm.com/think/insights/eliza-effect-avoiding-emotional-attachment-to-ai.

[5] Marita Skjuve et al., My Chatbot Companion – A Study of Human–Chatbot Relationships, INT’L J. HUM.-COMPUT. STUD. (2021).

[6] Ayelet Gordon-Tapiero, A Liability Framework for AI Companions, GEO. WASH. J.L. & TECH. (forthcoming 2025), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5172386.

[7] Id.

[8] Jeffrey Lane & Fanny A. Ramirez, Social Media as Criminal Evidence: New Possibilities, Problems, AMERICAN SOCIOLOGICAL ASSOCIATION (2024), https://www.asanet.org/footnotes-article/social-media-criminal-evidence-new-possibilities-problems/.

[9] Skjuve et al., supra note 5.