AstroAI
Developing Artificial Intelligence to Solve the Mysteries of the Universe

AstroAI Workshop 2025


Rocco Di Tella

Building Honest Agents Through Introspection: Probe-driven Generation of Confidence Scores

Presenter: Rocco Di Tella

Title: Building Honest Agents Through Introspection: Probe-driven Generation of Confidence Scores

Date/Time: Monday, July 7th, 3:30 - 5:00 PM

Abstract: Large language models (LLMs) are the engine behind agentic AI, providing language processing, planning, and reasoning capabilities. Unfortunately, current LLMs do not directly provide a measure of confidence for the responses they produce. This poses a serious problem for high-risk applications, where only responses that are very likely to be correct should be accepted. We propose to explore supervised approaches for computing confidence measures for answers provided by LLMs. To this end, we will develop models that probe the LLM’s internal representations to predict whether an answer is correct or not, focusing on structured architectures with strong inductive biases to facilitate generalization to unseen tasks. We will train and evaluate our models on a variety of NLP datasets, using proper scoring rules to assess performance of the produced scores.
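
As a rough illustration of the idea in the abstract, the sketch below trains a probe on hidden representations to predict answer correctness and scores the resulting confidences with two proper scoring rules (the Brier score and the log score). Everything in it is an assumption made for illustration: the function names are hypothetical, the "hidden states" are synthetic random vectors rather than real LLM activations, and a plain logistic-regression probe stands in for the structured architectures the abstract proposes.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping probe logits to confidences in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def brier_score(confidence, correct):
    """Proper scoring rule: mean squared gap between confidence and 0/1 label."""
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=float)
    return float(np.mean((confidence - correct) ** 2))


def log_score(confidence, correct, eps=1e-12):
    """Proper scoring rule: mean negative log-likelihood (lower is better)."""
    p = np.clip(np.asarray(confidence, dtype=float), eps, 1.0 - eps)
    y = np.asarray(correct, dtype=float)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


def train_linear_probe(hidden, correct, lr=0.1, steps=500):
    """Fit a logistic-regression probe with full-batch gradient descent.

    hidden:  (n, d) array, one representation per answer (synthetic here).
    correct: (n,) array of 0/1 labels, 1 if the answer was correct.
    """
    n, d = hidden.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        residual = sigmoid(hidden @ w + b) - correct  # gradient of log loss w.r.t. logits
        w -= lr * (hidden.T @ residual) / n
        b -= lr * residual.mean()
    return w, b


def probe_confidence(hidden, w, b):
    """Confidence that each answer is correct, according to the probe."""
    return sigmoid(hidden @ w + b)


def make_synthetic_data(n=400, d=8, seed=0):
    """Hypothetical stand-in for (hidden state, correctness) pairs."""
    rng = np.random.default_rng(seed)
    hidden = rng.normal(size=(n, d))
    direction = rng.normal(size=d)  # assumed "correctness direction"
    correct = (rng.random(n) < sigmoid(hidden @ direction)).astype(float)
    return hidden, correct


if __name__ == "__main__":
    hidden, correct = make_synthetic_data()
    train_x, test_x = hidden[:300], hidden[300:]
    train_y, test_y = correct[:300], correct[300:]

    w, b = train_linear_probe(train_x, train_y)
    conf = probe_confidence(test_x, w, b)

    print("Brier score (probe):", brier_score(conf, test_y))
    print("Log score (probe):  ", log_score(conf, test_y))
    # Baseline: always answering 0.5 gives a Brier score of exactly 0.25.
    print("Brier score (constant 0.5):", brier_score(np.full_like(test_y, 0.5), test_y))
```

In a high-risk setting, such a confidence could be compared against a threshold so that only answers the probe rates as very likely correct are accepted; the threshold itself would have to be chosen for the application.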


