Neural scaling laws and astronomy
Presenter: Mike Smith
Date/Time: Monday, June 17th, 2:30 - 4:00 PM; Thursday, June 20th, 3:30 - 5:00 PM
Abstract: Deep learning’s current “hot topics” are foundation models in the vein of ChatGPT and Chinchilla. These remarkably simple models are built from a few standard deep learning building blocks and are trained to predict the next item in a sequence. Surprisingly, their performance scales with dataset and model size according to a predictable power law. Even more astoundingly, these models have been shown to display “emergent abilities”, such as knowledge (albeit not “understanding”) of arithmetic, law, geography, and history. These emergent abilities manifest only once the model reaches a certain scale of data and compute, without any major changes to model architecture. We will explore these models, their abilities, and why they have caused so much excitement within the deep learning community.
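The power-law behaviour mentioned above can be sketched in a few lines. The functional form below, L(N, D) = E + A/N^α + B/D^β, follows the Chinchilla paper (Hoffmann et al., 2022); the constants are assumptions, roughly in line with the values fitted there, and are used purely for illustration.

```python
# Illustrative neural scaling law: predicted loss as a function of
# model size N (parameters) and dataset size D (tokens).
# All constants below are assumed, loosely following Hoffmann et al. (2022).

E = 1.69          # irreducible loss (assumed)
A, B = 406.4, 410.7   # fitted amplitudes (assumed)
ALPHA, BETA = 0.34, 0.28  # power-law exponents (assumed)

def loss(n_params: float, n_tokens: float) -> float:
    """Predicted training loss for n_params parameters and n_tokens tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

# Loss falls smoothly and predictably as either axis grows:
small = loss(1e8, 1e10)    # 100M parameters, 10B tokens
large = loss(1e10, 1e12)   # 10B parameters, 1T tokens
```

Under this form, no amount of scale drives the loss below the irreducible floor E, but each order of magnitude of model or data buys a predictable improvement.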
In 2022, a team at Google DeepMind discovered that, for optimal performance, the size of these foundation models should be scaled in roughly equal proportion to the size of the dataset used to train them. We show that this means the current constraint on state-of-the-art foundation model performance is dataset size, not model size as previously thought. We also show that astronomy and Earth observation are awash with data suitable for training foundation models, so an “observational” foundation model would not be data-constrained. There is therefore an opportunity for astronomers as a community to develop and provide a high-quality, multi-modal public dataset that could be used to advance the cutting edge in both deep learning and astronomy. This dataset is currently being realised as the “AstroPile”. In turn, the AstroPile could be used to train an astronomical foundation model to serve useful downstream tasks such as astronomical object classification, information extraction, and entirely data-driven astronomical simulation.
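The “equal proportion” result above can be made concrete. A minimal sketch, assuming the commonly quoted approximations C ≈ 6·N·D training FLOPs and a compute-optimal ratio of roughly 20 tokens per parameter (both from the Chinchilla analysis, used here as assumptions): for a fixed budget C, both the optimal model size N and dataset size D then grow like √C.

```python
# Sketch of compute-optimal scaling ("Chinchilla" allocation).
# Assumptions: training compute C ≈ 6 * N * D FLOPs, and a
# compute-optimal ratio D/N ≈ 20 tokens per parameter.

import math

TOKENS_PER_PARAM = 20.0  # assumed compute-optimal D/N ratio

def compute_optimal(c_flops: float) -> tuple[float, float]:
    """Split a compute budget C into compute-optimal
    model size N (parameters) and dataset size D (tokens)."""
    # From C = 6 * N * D and D = 20 * N:  N = sqrt(C / 120)
    n_opt = math.sqrt(c_flops / (6.0 * TOKENS_PER_PARAM))
    d_opt = TOKENS_PER_PARAM * n_opt
    return n_opt, d_opt

# Doubling compute scales both N and D by ~sqrt(2), i.e. in equal proportion:
n1, d1 = compute_optimal(1e21)
n2, d2 = compute_optimal(2e21)
```

Because N and D must grow together, a lab that cannot enlarge its dataset hits a wall regardless of available compute, which is why the abstract argues that data, not model size, is now the binding constraint.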
This poster draws on the conclusions of the final chapter of our review paper “Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy” (https://doi.org/10.1098/rsos.221454).