Speaker 1: David Vartanyan (Carnegie Observatory)

Title: Avenues into, and Prospects for, ML Applications in Supernovae

Abstract: Core-collapse supernovae (CCSNe) have been known to explode. Recent theoretical success in reproducing these explosions permits the luxury of asking why. The computational expense and the stochasticity of the problem make ML a valuable, and perhaps viable, asset in resolving the longstanding CCSNe problem. I will present a simple, intuitive metric, enabled by early forays of ML into the CCSNe context, that can predict explosion outcomes ab initio with 90% fidelity. These results are corroborated by recent work using a CNN classifier. I will conclude with prospects for transitioning from simple classification to predictive utility across the broad range of multi-scale CCSNe diagnostics that couple the neutrino-driven central engine of CCSNe with their vibrant displays as remnants.

Speaker 2: Congyue Deng (Stanford University)

Title: ”The reality of the universe is geometrical.” – E. A. Burtt, The Metaphysical Foundations of Modern Physical Science

Abstract: Deep learning frameworks, whether supervised or unsupervised, have achieved remarkable success on a wide variety of problems in astrophysics. However, despite their ability to extract high-level information from data, they often struggle to capture exact geometric relationships. Even in the simplest cases, point-cloud networks trained on well-aligned objects (e.g. chairs in an upright position) can fail when tested on objects in arbitrary poses (e.g. chairs in random orientations under an SE(3) transformation). This highlights the networks’ lack of geometric understanding of pose changes and, more broadly, of group actions and geometric relations. These limitations are common across many learning frameworks, impacting their robustness and generalizability – particularly in real-world applications where explainability and trustworthiness are critical, such as processing data from scientific experiments. On the other hand, geometry is a language widely adopted for describing physical laws. Incorporating and enforcing geometric relations in neural networks paves the way toward deep learning systems that can understand and follow physical laws. In this talk, I will demonstrate how naively constructed neural networks fail to understand geometric transformations in a variety of scenarios. I will then introduce a series of works on incorporating geometric operators into the latent spaces of neural networks, enabling them to expressively represent different classes of geometric transformations, from the simplest linear transformations to more complex multi-body movements and continuous diffeomorphisms. In the end, I will briefly discuss possible future directions for applying geometry-aware deep learning to astrophysical problems.
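As a toy illustration of the failure mode described above (a minimal numpy sketch, not taken from the speaker's work): a feature built from raw coordinates through a random linear layer changes under rotation, while a pairwise-distance feature is invariant under rigid SE(3) motions.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy point cloud (10 points in 3D) and a random 3D rotation.
points = rng.normal(size=(10, 3))
Q, R = np.linalg.qr(rng.normal(size=(3, 3)))
Q *= np.sign(np.diag(R))          # make the QR factorization unique
if np.linalg.det(Q) < 0:          # force a proper rotation (det = +1)
    Q[:, 0] *= -1

rotated = points @ Q.T

# "Naive" feature: a random linear layer applied to raw coordinates.
W = rng.normal(size=(3, 4))
naive = lambda P: P @ W

# Geometric feature: the matrix of pairwise distances, unchanged by rotation.
def pairwise_distances(P):
    diff = P[:, None, :] - P[None, :, :]
    return np.linalg.norm(diff, axis=-1)

print(np.allclose(naive(points), naive(rotated)))                            # False
print(np.allclose(pairwise_distances(points), pairwise_distances(rotated)))  # True
```

The naive feature is pose-dependent, so a network built on it must see every orientation during training; the distance feature encodes the geometric relation directly.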

Watch the talk below!

Speaker 1: Kangning Diao

Title: synax: A Differentiable and GPU-accelerated Synchrotron Simulation Package

Abstract: We introduce synax, a novel library for automatically differentiable simulation of Galactic synchrotron emission. Built on the JAX framework, synax leverages JAX’s capabilities, including batch acceleration, just-in-time compilation, and hardware-specific optimizations (CPU, GPU, TPU). Crucially, synax uses JAX’s automatic differentiation (AD) mechanism, enabling precise computation of derivatives with respect to any model parameters. This feature facilitates powerful inference algorithms, such as Hamiltonian Monte Carlo (HMC) and gradient-based optimization, which enable inference over models that would otherwise be computationally prohibitive. In its initial release, synax supports synchrotron intensity and polarization calculations down to GHz frequencies, alongside several models of the Galactic magnetic field (GMF), cosmic-ray (CR) spectra, and thermal electron density fields. We demonstrate the transformative potential of AD for tasks involving full posterior inference using gradient-based techniques or Maximum Likelihood Estimation (MLE) optimization. Notably, we show that GPU acceleration brings a twenty-fold gain in efficiency, while HMC achieves a two-fold improvement over standard random-walk Metropolis-Hastings (RWMH) when performing inference on a four-parameter test model; HMC still converges on a more complex, 16-parameter model, where RWMH fails. Additionally, we showcase the application of synax in optimizing the GMF based on the Haslam 408 MHz map, achieving residuals with a standard deviation below 1 K.
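synax itself obtains gradients through JAX's AD; the standalone numpy sketch below uses hand-written derivatives as a stand-in for AD, and a hypothetical two-parameter power-law spectrum I(ν) = A ν^(−α) rather than synax's actual API or field models. It illustrates the kind of gradient-based MLE fitting that differentiability enables.

```python
import numpy as np

# Toy spectrum: I(nu) = A * nu**(-alpha), an illustrative stand-in for a
# synchrotron intensity model (NOT synax's API).
nu = np.linspace(1.0, 10.0, 50)          # frequency grid, arbitrary units
A_true, alpha_true = 2.0, 0.7
data = A_true * nu ** (-alpha_true)

def loss_and_grad(A, alpha):
    pred = A * nu ** (-alpha)
    resid = pred - data
    loss = np.mean(resid ** 2)
    # Hand-written derivatives; a framework like JAX derives these automatically.
    dA = np.mean(2.0 * resid * nu ** (-alpha))
    dalpha = np.mean(2.0 * resid * A * (-np.log(nu)) * nu ** (-alpha))
    return loss, dA, dalpha

# Plain gradient descent recovers the generating parameters.
A, alpha, lr = 1.0, 0.5, 0.1
for _ in range(5000):
    loss, dA, dalpha = loss_and_grad(A, alpha)
    A -= lr * dA
    alpha -= lr * dalpha

print(round(A, 2), round(alpha, 2))
```

The same gradients drive HMC's Hamiltonian dynamics, which is why AD makes full posterior inference tractable for otherwise expensive models.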

Speaker 2: Matthew O’Callaghan

Title: Hamiltonian Monte Carlo with Normalizing Flow Priors

Abstract: Complex, data-driven priors are of paramount interest to the astronomical community. Bayesian inference involves selecting a prior distribution over parameters, a likelihood function for the observed data, and an appropriate inference algorithm. Hamiltonian Monte Carlo (HMC) has emerged as an efficient Markov Chain Monte Carlo (MCMC) algorithm, improving posterior sampling by simulating Hamiltonian dynamics in the proposal step and leading to faster convergence than traditional methods. Normalizing flows (NFs) are generative models that, under certain architectural assumptions, can serve as universal density approximators, enabling flexible modeling of complex distributions. In this talk, we begin with a brief introduction to HMC and NFs, then investigate the conditions necessary for implementing NF priors in an HMC inference algorithm.
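As background, here is a minimal HMC sampler for a one-dimensional standard normal target – a generic textbook sketch, not the speaker's implementation. With an NF prior, the hand-written `grad_logp` would be replaced by the gradient of the flow's log-density (which is exactly what requires the differentiability conditions the talk investigates).

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: standard normal, log p(x) = -x^2/2 (up to a constant).
logp = lambda x: -0.5 * x ** 2
grad_logp = lambda x: -x

def hmc_step(x, eps=0.2, n_leapfrog=10):
    p = rng.normal()                          # resample momentum
    x_new, p_new = x, p
    # Leapfrog integration of Hamiltonian dynamics.
    p_new += 0.5 * eps * grad_logp(x_new)
    for _ in range(n_leapfrog - 1):
        x_new += eps * p_new
        p_new += eps * grad_logp(x_new)
    x_new += eps * p_new
    p_new += 0.5 * eps * grad_logp(x_new)
    # Metropolis accept/reject on the Hamiltonian (potential + kinetic energy).
    h_old = -logp(x) + 0.5 * p ** 2
    h_new = -logp(x_new) + 0.5 * p_new ** 2
    return x_new if np.log(rng.uniform()) < h_old - h_new else x

samples, x = [], 0.0
for _ in range(5000):
    x = hmc_step(x)
    samples.append(x)
samples = np.asarray(samples)
print(samples.mean(), samples.std())          # close to 0 and 1
```

Because the leapfrog integrator nearly conserves the Hamiltonian, proposals are distant yet accepted with high probability, which is the source of HMC's efficiency over random-walk methods.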

Watch the talk below!

Speaker: Bonny Wang (Carnegie Mellon University)

Title: Machine-Learning Cosmology from Cosmic Voids

Abstract: Cosmic voids, the underdense regions in the galaxy distribution, are dominated by dark energy and account for most of the volume of the Universe. Thanks to their underdense nature, voids are particularly sensitive to cosmological information. In the past, the low number of known voids and small survey volumes limited research on voids, but recent large-scale surveys now enable big-data approaches to fully explore voids’ cosmological implications. Current methods of extracting cosmological information from voids are limited by progress in modeling void statistics, typically focusing on void size functions and void-galaxy cross-correlation functions; the relationship between other void properties and cosmological parameters remains underexplored. Machine learning provides a well-established framework for performing this task. Furthermore, recent work relying on machine learning has shown that extracting cosmological constraints from the properties of a single galaxy is possible. However, these results raise underlying questions that still need to be answered: how should we select the one galaxy, or group of galaxies, to optimize the constraints? A machine-learning approach enabled us to address this question by considering galaxies in cosmic voids. We show that void galaxies provide stronger constraints on the matter density parameter than randomly selected galaxies. This result suggests that the distinctive characteristics of void galaxies may provide a cleaner and more effective environment for extracting cosmological information.

Watch the talk below!

Speaker: Liam Connor (Harvard University)

Title: Ill-posed inverse problems in radio astronomy

Abstract: In the past decade, tremendous progress has been made by the computer vision community on classical ill-posed inverse problems, largely thanks to efficient neural network architectures for reconstruction. These include deblurring, deconvolution, super-resolution, image inpainting, and 3D reconstruction. Radio interferometers have battled sparse image reconstruction for decades, relying mostly on iterative algorithms like CLEAN. In this talk, I will describe our work on learning-based methods for imaging in radio interferometry and for 3D reconstruction in the context of cosmology. I will discuss the scientific value of super-resolution imaging with the upcoming DSA-2000 radio camera, for example in strong gravitational lensing.
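For readers unfamiliar with CLEAN, the classical baseline the abstract mentions, here is a minimal one-dimensional Högbom-style sketch (illustrative numpy code only, not production interferometry software): iteratively locate the brightest residual peak and subtract a scaled copy of the point-spread function (PSF) there.

```python
import numpy as np

# Toy 1D sky with two point sources; the "dirty image" is the sky
# convolved with a Gaussian PSF (peak normalized to 1).
n = 64
x = np.arange(n)
psf = lambda center: np.exp(-0.5 * ((x - center) / 2.0) ** 2)

sky = np.zeros(n)
sky[20], sky[45] = 1.0, 0.6
dirty = sum(sky[j] * psf(j) for j in range(n) if sky[j] > 0)

# Högbom CLEAN loop: peak-find, subtract a fraction (the loop gain)
# of the PSF at the peak, and accumulate the model components.
residual = dirty.copy()
components = np.zeros(n)
gain = 0.1
for _ in range(300):
    peak = int(np.argmax(np.abs(residual)))
    flux = gain * residual[peak]
    components[peak] += flux
    residual -= flux * psf(peak)

print(int(np.argmax(components)))            # brightest recovered source: 20
print(float(np.abs(residual).max()) < 0.05)  # True: residual is driven down
```

Learning-based reconstruction replaces this greedy deconvolution loop with a network trained to invert the measurement process directly, which is where the speed and super-resolution gains come from.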

Watch the talk below!

Speaker: Yanke Song (Harvard Stats)

Title: A Poisson-process AutoDecoder for X-ray Sources with Applications in the Time Domain

Abstract: X-ray observing facilities such as the Chandra X-ray Observatory and the eROSITA all-sky survey have detected millions of astronomical sources associated with high-energy phenomena. The arrival of photons as a function of time follows a Poisson process and can vary by orders of magnitude, presenting obstacles for downstream tasks such as source classification, physical property derivation, and anomaly detection. Previous work has either failed to directly capture the Poisson nature of the data or focused only on reconstructing the Poisson rate function. In this work, we present the Poisson Process AutoDecoder (PPAD). PPAD is a neural field decoder that maps fixed-length latent features to continuous Poisson rate functions across energy bands and time via unsupervised learning. It reconstructs the rate function and yields a latent representation at the same time. We demonstrate the efficacy of PPAD through reconstruction, regression, classification, and anomaly detection experiments using the Chandra Source Catalog.
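To make the Poisson-process setting concrete, the sketch below (illustrative numpy code with an arbitrary rate function, unrelated to PPAD's implementation) simulates photon arrivals from an inhomogeneous Poisson process by thinning, then evaluates the log-likelihood that a rate-fitting model such as PPAD would maximize.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy variable X-ray source: photon arrivals follow an inhomogeneous
# Poisson process with rate lam(t) (counts per second, illustrative).
lam = lambda t: 50.0 * (1.0 + 0.5 * np.sin(t))
T, lam_max = 10.0, 75.0

# Thinning: draw candidates from a homogeneous process at rate lam_max,
# keep each candidate with probability lam(t) / lam_max.
n_cand = rng.poisson(lam_max * T)
cand = np.sort(rng.uniform(0.0, T, size=n_cand))
arrivals = cand[rng.uniform(size=n_cand) < lam(cand) / lam_max]

# Poisson-process log-likelihood: sum(log lam(t_i)) - integral of lam over [0, T].
integral = 50.0 * T + 25.0 * (1.0 - np.cos(T))   # closed-form integral of lam
loglik = np.sum(np.log(lam(arrivals))) - integral

print(len(arrivals), round(integral, 1))  # count fluctuates around the integral
```

The expected number of events equals the integral of the rate (≈546 here), which is why orders-of-magnitude rate variation directly shapes the statistics any downstream model must handle.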

Watch the talk below!

Speaker: Dominic Chang (Harvard, BHI)

Title: Bayesian Black Hole Photogrammetry

Abstract: We propose an analytic dual-cone accretion model for horizon-scale images of the cores of low-luminosity active galactic nuclei, including those observed by the Event Horizon Telescope (EHT). Our model describes synchrotron emission from an axisymmetric, magnetized plasma, constrained to flow within two oppositely oriented cones that are aligned with the black hole’s spin axis. We show that this model can accurately reproduce images of a variety of time-averaged general relativistic magnetohydrodynamic simulations, and that it accurately recovers the black hole spin, orientation, emission scale height, peak emission radius, and fluid flow direction from these simulations within a Bayesian inference framework using radio interferometric data. We show that nontrivial topologies in the images of relativistic accretion flows around black holes can result in nontrivial multimodal solutions when applied to observations with a sparse array, such as the EHT 2017 observations of M87*. The presence of these degeneracies underscores the importance of employing Bayesian techniques to adequately sample the posterior space for the interpretation of EHT measurements. We fit our model to the EHT observations of M87* and find a 95% highest posterior density interval for the mass-to-distance ratio of θg ∈ (2.84, 3.75) μas and an inclination of θo ∈ (11°, 24°). These new measurements are consistent with mass measurements from the EHT and stellar dynamical estimates, and with the spin-axis inclination inferred from properties of the M87* jet.

Watch the talk below!

Speaker: Shivam Raval (Harvard)

Title: If [0.32, 0.42, -0.18, … 0.86] is Monday, [0.48, -0.27, 0.98, … -0.22] is Interpretability, which direction is Shivam’s AstroAI Lunch Talk?

Abstract: Frontier language models have a unique ability to combine and connect seemingly unrelated concepts to provide novel, surprising, yet seemingly plausible responses. A natural question arises: do they really understand human-interpretable concepts, and if so, can we extract them from the model internals? One of the main goals of machine learning interpretability is to identify and disentangle complex representations of inputs into human-interpretable concepts for transparency, control, and safety. The main focus of this talk will be on techniques used to understand what a “brain scan” of a model encodes and how to decompose it into its most atomic units. Recent findings [1,2] suggest that interpretable features might be represented, surprisingly, as linear directions in the high-dimensional space of the model’s activations. I will briefly discuss empirical findings that support this hypothesis and how they can be operationalized towards designing better, aligned AI systems [3]. This so-called linear representation hypothesis has led to the use of sparse coding to decode internal activations of large language models, and to the introduction of Sparse Autoencoders (SAEs) for interpretability and model steering [4]. Using toy examples and synthetic datasets, I will highlight some benefits and challenges of using SAEs for interpretability, the effect of architectural choices on the learned features, and what the future may hold for language-model interpretability. Finally, I will describe Lumiscope, an in-development platform for interactive interpretability that would allow researchers to study the internals of frontier models without having to implement interpretability techniques themselves. With a case study of Patchscopes [5], a recently introduced interpretability framework, I will describe some early findings on training-free approaches to studying entity-attribute extraction and bias quantification using Lumiscope.

[1] S. Marks and M. Tegmark. “The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets.” arXiv:2310.06824 (2023).

[2] A. Arditi, O. Obeso, A. Syed, D. Paleka, N. Rimsky, W. Gurnee, and N. Nanda. “Refusal in Language Models Is Mediated by a Single Direction.” Mechanistic Interpretability Workshop at ICML (2024).

[3] Y. Chen, A. Wu, T. DePodesta, C. Yeh, K. Li, N. C. Marin, O. Patel, J. Riecke, S. Raval, O. Seow, M. Wattenberg, and F. Viégas. “Designing a Dashboard for Transparency and Control of Conversational AI.” arXiv:2406.07882 (2024).

[4] A. Templeton, T. Conerly, et al. “Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet.” (2024). https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html

[5] N. Hussein, A. Ghandeharioun, R. Mullins, E. Reif, J. Wilson, N. Thain, and L. Dixon. “Can large language models explain their internal mechanisms?” (2024). https://pair.withgoogle.com/explorables/patchscopes/
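The sparse-autoencoder idea behind [4] can be sketched in a few lines. The toy numpy example below uses synthetic data and manual gradients – not an actual LLM-scale SAE – to show the core recipe: reconstruct "activations" through an overcomplete ReLU dictionary with an L1 penalty that pushes the codes toward sparsity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions: d-dim "activations", m-dim overcomplete code (m > d).
d, m, batch, lr, l1 = 8, 32, 256, 0.1, 1e-3

# Synthetic activations: sparse nonnegative combinations of hidden directions.
directions = rng.normal(size=(16, d))
def sample_batch():
    codes = (rng.uniform(size=(batch, 16)) < 0.2) * rng.uniform(size=(batch, 16))
    return codes @ directions

W_e = 0.1 * rng.normal(size=(d, m)); b_e = np.zeros(m)
W_d = 0.1 * rng.normal(size=(m, d)); b_d = np.zeros(d)

losses = []
for _ in range(2000):
    x = sample_batch()
    pre = x @ W_e + b_e
    h = np.maximum(pre, 0.0)              # sparse codes (ReLU)
    x_hat = h @ W_d + b_d                 # reconstruction
    err = x_hat - x
    losses.append(np.mean(err ** 2) + l1 * np.mean(np.abs(h)))
    # Manual backprop (an autodiff framework would derive these).
    d_xhat = 2.0 * err / (batch * d)
    g_Wd = h.T @ d_xhat; g_bd = d_xhat.sum(0)
    d_h = d_xhat @ W_d.T + l1 * (h > 0) / (batch * m)
    d_pre = d_h * (pre > 0)
    g_We = x.T @ d_pre; g_be = d_pre.sum(0)
    W_e -= lr * g_We; b_e -= lr * g_be
    W_d -= lr * g_Wd; b_d -= lr * g_bd

print(round(losses[0], 3), round(losses[-1], 3))  # loss drops as codes form
```

In interpretability work the decoder columns of a trained SAE are read as candidate feature directions; this sketch only demonstrates the optimization, not that the learned features are human-interpretable.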

Watch the talk below!

Speaker: Ethan Tregidga

Title: X-ray Spectral Fitting with Autoencoders

Abstract: Black hole X-ray binaries (BHBs) offer insights into extreme gravitational environments and tests of general relativity. The X-ray spectra collected by NICER offer valuable information on the properties and behaviour of BHBs through spectral fitting. However, traditional spectral fitting methods are slow and scale poorly with model complexity. We developed a new semi-supervised autoencoder neural network for parameter prediction and spectral reconstruction of BHBs, showing significant improvements in speed while maintaining comparable accuracy. The approach maps the spectral features of the numerous outbursts catalogued by NICER and generalizes to new systems for efficient and accurate spectral fitting. The effectiveness of this approach is demonstrated in the spectral fitting of BHBs, and it holds promise for categorizing large data sets in other areas of astronomy and physics.

Watch the talk below!

Speaker 1: Nicolò Pinciroli (Polytechnic of Milan)

Title: Gravitational Lensing with DeepGraviLens

Abstract: Gravitational lensing is the relativistic effect generated by massive bodies, which bend the space-time surrounding them. In recent years, machine learning methods have been used to detect lensing effects in datasets containing single images or combinations of images and brightness time series. Most works considered only single images, neglecting transient phenomena such as gravitationally lensed supernovae. This talk introduces DeepGraviLens, a multi-modal network that classifies spatio-temporal data consisting of a single image and four brightness time series into four classes: no lens, lens, lensed SNIa, and lensed SNCC. This approach surpasses previous state-of-the-art accuracy on four simulated data sets by 3% to 11%. Such an improvement will accelerate the analysis of lensed objects in upcoming surveys, exploiting the petabytes of data collected, e.g., by the Vera C. Rubin Observatory. However, finding gravitational lenses is only a preliminary step towards a deeper analysis of their characteristics. My current research also focuses on the analytical description of gravitational lenses using the Roulette formalism, which approximates them with a 2D Taylor expansion in polar coordinates. Multi-output deep regression can then estimate the expansion coefficients from images of gravitational lenses.

Speaker 2: Ningyue Fan

Title: ForkEoR: A Novel Machine Learning Algorithm for Image Deconvolution and Foreground Removal of EoR Signal

Watch the talk below!

Speaker: Aquib Moin (UAEU)

Title: Development & Deployment of AI/ML tools and utilities for NASA’s Habitable Worlds Observatory (HWO): A Case Study.

Abstract: In recent years, the field of Artificial Intelligence (AI) has witnessed unprecedented advancements, with the emergence of sophisticated models from the likes of OpenAI, Google, and Meta, and advances in machine learning algorithms. This presentation delves into the deployment of Large Language Models (LLMs) and machine learning (ML) models for astronomy and space-research data analytics in general, and for NASA’s HWO in particular. It is an account of an ongoing exploration of how AI/ML can enable fast, automated analysis of vast astronomical datasets, improving accuracy and enhancing the potential for discovery by integrating LLMs for natural language processing and ML models for tasks such as pattern recognition, predictive analytics, and anomaly detection. The discussion will focus on a brief overview of the deployment techniques, platforms, and ecosystems used to operationalize these models, emphasizing the transition from model development to deployment in enhancing data interpretation and R&D productivity.

Watch the talk below!
