## AI Seminar

*Our seminar series covers a broad set of topics related to artificial intelligence (AI), machine learning (ML), and statistics. The talks range in scope from applications of AI/ML to tackle hard problems in science and engineering, to ML theory and novel ML techniques, to high-performance computing and new software packages. We aim to bring together AI/ML researchers and domain experts to discuss exciting topics at the intersection of AI/ML and science/engineering.*

## Upcoming Talks:

## Acorn Magic for ML Enthusiasts

**Date: August 27, 2021 1:00 pm**

Speaker: Martin Lee

Abstract: TBA

## Past Talks:

## Computational Imaging: Reconciling Physical and Learned Models

**Date: July 2, 2021 1:00 pm**

Speaker: Ulugbek Kamilov (Washington University in St. Louis)

Abstract: Computational imaging is a rapidly growing area that seeks to enhance the capabilities of imaging instruments by viewing imaging as an inverse problem. There are currently two distinct approaches for designing computational imaging methods: model-based and learning-based. Model-based methods leverage analytical signal properties and often come with theoretical guarantees and insights. Learning-based methods leverage data-driven representations for best empirical performance through training on large datasets. This talk presents Regularization by Artifact Removal (RARE) as a framework for reconciling both viewpoints by providing a learning-based extension to the classical theory. RARE relies on pre-trained “artifact-removing deep neural nets” for infusing learned prior knowledge into an inverse problem, while maintaining a clear separation between the prior and the physics-based acquisition model. Our results indicate that RARE can achieve state-of-the-art performance in different computational imaging tasks, while also being amenable to rigorous theoretical analysis. We will focus on the applications of RARE in biomedical imaging, including magnetic resonance and tomographic imaging.

This talk will be based on the following references:

- J. Liu, Y. Sun, C. Eldeniz, W. Gan, H. An, and U. S. Kamilov, “RARE: Image Reconstruction using Deep Priors Learned without Ground Truth,” IEEE J. Sel. Topics Signal Process., vol. 14, no. 6, pp. 1088-1099, October 2020.
- Z. Wu, Y. Sun, A. Matlock, J. Liu, L. Tian, and U. S. Kamilov, “SIMBA: Scalable Inversion in Optical Tomography using Deep Denoising Priors,” IEEE J. Sel. Topics Signal Process., vol. 14, no. 6, pp. 1163-1175, October 2020.
- J. Liu, Y. Sun, W. Gan, X. Xu, B. Wohlberg, and U. S. Kamilov, “SGD-Net: Efficient Model-Based Deep Learning with Theoretical Guarantees,” IEEE Trans. Comput. Imag., vol. 7, pp. 598-610, June 2021.

## Deep Learning for Anomaly Detection

**Date: June 25, 2021 1:00 pm**

Speaker: Ziyi Yang (Stanford)

Abstract: Anomaly Detection (AD) refers to the process of identifying abnormal observations that deviate from what is defined as normal. With applications in many real-world scenarios, anomaly detection has become an important research field in ML and AI. However, detecting anomalies in high-dimensional space is challenging. In some high-dimensional cases, previous AD algorithms fail to correctly model the normal data distribution. Also, the understanding of the detection mechanisms of AD models remains limited. To address these challenges and questions, in this talk, I will first present the Regularized Cycle-consistent GAN (RCGAN), which introduces a penalty distribution in the modeling of the normal data distribution. We theoretically show that the penalty distribution regularizes the discriminator and generator towards the normal data manifold. Second, we explore anomaly detection with domain adaptation where the normal data distribution is non-static. We propose to extract the common features of source and target domain data and train an anomaly detector using the extracted features.

Slides and video.

## Machine-Learning for Modeling Complex Materials and Media

**Date: June 18, 2021 1:00 pm**

Speaker: Serveh Kamrava (USC)

Abstract: In recent years, machine learning (ML) approaches have made it possible to extract and explore intricate patterns from big data. One of the fields that can benefit from the computational advantages that ML offers is materials characterization, where we have complex heterogeneous morphology. The morphology of complex systems is one of the determinant elements that control a variety of their properties, such as flow, transport, and mechanical behaviors. Such properties are often estimated using experimental and computational methods, which can be very costly and time-consuming. As such, faster and more automatic methods are required. Machine learning provides an alternative solution for this problem. In this presentation, I will present a deep learning method that can take the 3D morphology of complex materials and estimate their transport properties. Then, I will talk about a novel method for quantifying the accuracy of augmentation methods that add more data for ML and for identifying the method that provides the best set of data by minimizing the discrepancy and expanding the variability. For the next topic, I will discuss the application of deep learning to time-varying data for a transport problem on a complex membrane system. I will close this topic by describing how the governing equations can be used in ML for filling the gap in data and reducing the amount of data needed for ML. These results will be compared with a fully data-driven ML method.

## Autonomous analysis of synchrotron X-ray experiments with applications to metal nanoparticle synthesis

**Date: May 7, 2021 1:00 pm**

Speaker: Sathya Chitturi (Stanford)

Abstract: A critical step in developing autonomous pipelines for materials synthesis experiments is automatic interpretation of characterization experiments. In this talk, we present an example of a closed-loop Bayesian optimization pipeline for metal nanoparticle synthesis using real-time information from Small-angle X-ray Scattering (SAXS) experiments. This approach has previously been used successfully to create libraries of monodisperse Pd nanoparticles with user-specified sizes. In addition, we describe a CNN-based method used to interpret complementary X-ray diffraction data. Here, CNN regression models are trained for each crystal class to predict lattice parameters for the corresponding unit cell. A key component of this work involves data augmentation schemes which capture sources of experimental noise in order to improve model generalizability. The lattice parameter estimates are subsequently refined using an automatic whole-pattern fitting algorithm.

## Going Beyond Global Optima with Bayesian Algorithm Execution

**Date: April 30, 2021 1:00 pm**

Speaker: Willie Neiswanger

In many real-world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations. One example is budget-constrained global optimization of f, for which Bayesian optimization is a popular method. Other properties of interest include local optima, level sets, integrals, or graph-structured information induced by f. Often, we can find an algorithm A to compute the desired property, but it may require far more than T queries to execute. Given such an A, and a prior distribution over f, we refer to the problem of inferring the output of A using T evaluations as Bayesian Algorithm Execution (BAX). In this talk, we present a procedure for this task, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output. Applying this to Dijkstra's algorithm, for instance, we infer shortest paths in synthetic and real-world graphs with black-box edge costs. Using evolution strategies, we obtain variants of Bayesian optimization that target local, rather than global, optima. We discuss InfoBAX, and give background on other information-based methods for Bayesian optimization as well as on the probabilistic uncertainty models which underlie these methods.

## Signal Decomposition via Distributed Optimization

**Date: April 23, 2021 1:00 pm**

Speaker: Bennet Meyers (Stanford/SLAC)

We consider the well-studied problem of decomposing a time series signal into some components, each with different characteristics. We propose a simple and general framework for decomposition of a signal into a number of signal classes, each defined by a loss function and possibly constraints, via optimization. We describe a number of useful signal classes, and give a distributed optimization method for computing the decomposition that scales well and is extensible. The method finds the optimal decomposition when the signal class constraints and loss functions are convex, and appears to be a good heuristic when they are not.

## Equitable Valuation of Data

**Date: April 16, 2021 1:00 pm**

Speaker: Amirata Ghorbani

As data becomes the fuel driving technological and economic growth, a fundamental challenge is how to quantify the value of data in algorithmic predictions and decisions. For example, in healthcare and consumer markets, it has been suggested that individuals should be compensated for the data that they generate, but it is not clear what an equitable valuation for individual data is. In this talk, we discuss a principled framework to address data valuation in the context of supervised machine learning. Given a learning algorithm trained on a number of data points to produce a predictor, we propose data Shapley as a metric to quantify the value of each training datum to the predictor performance. Data Shapley value uniquely satisfies several natural properties of equitable data valuation. We introduce Monte Carlo and gradient-based methods to efficiently estimate data Shapley values in practical settings where complex learning algorithms, including neural networks, are trained on large datasets. We then briefly discuss the notion of distributional Shapley, where the value of a point is defined in the context of the underlying data distribution.
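
To make the Monte Carlo idea in the abstract concrete, here is a minimal sketch of permutation-sampling Shapley estimation. It is not the speaker's implementation: the function name `monte_carlo_shapley`, the `utility` callback, and the additive toy utility are assumptions chosen for illustration. In practice `utility` would train a model on the given subset and return its validation performance.

```python
import numpy as np

def monte_carlo_shapley(n_points, utility, n_permutations=200, seed=0):
    """Estimate each point's Shapley value by averaging marginal contributions.

    `utility` is a hypothetical callback mapping a list of point indices
    to a score (for example, validation accuracy of a model trained on them).
    """
    rng = np.random.default_rng(seed)
    values = np.zeros(n_points)
    for _ in range(n_permutations):
        order = rng.permutation(n_points)
        subset = []
        previous = utility(subset)
        for idx in order:
            # Marginal gain from adding this point to the points before it.
            subset.append(int(idx))
            current = utility(subset)
            values[idx] += current - previous
            previous = current
    return values / n_permutations

# Toy additive utility (an assumption): each point contributes a fixed amount,
# so the Shapley value of a point equals its own weight.
weights = np.array([0.5, 0.3, 0.2])

def toy_utility(indices):
    return float(sum(weights[i] for i in indices))

estimates = monte_carlo_shapley(len(weights), toy_utility)
```

With a real learning algorithm each call to `utility` involves retraining, which is why sampling a limited number of permutations matters.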

## LassoNet: A Neural Network with Feature Sparsity

**Date: April 2, 2021 1:00 pm**

Speaker: Ismael Lemhadri (Stanford)

Much work has been done recently to make neural networks more interpretable, and one approach is to arrange for the network to use only a subset of the available features. In linear models, Lasso (or L1-regularized) regression assigns zero weights to the most irrelevant or redundant features, and is widely used in data science. However, the Lasso only applies to linear models. Here we introduce LassoNet, a neural network framework with global feature selection. Our approach enforces a hierarchy: specifically, a feature can participate in a hidden unit only if its linear representative is active. Unlike other approaches to feature selection for neural nets, our method uses a modified objective function with constraints, and so integrates feature selection with the parameter learning directly. As a result, it delivers an entire regularization path of solutions with a range of feature sparsity. In systematic experiments, LassoNet significantly outperforms state-of-the-art methods for feature selection and regression. The LassoNet method uses projected proximal gradient descent, and generalizes directly to deep networks. It can be implemented by adding just a few lines of code to a standard neural network.
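
As background for the abstract above, the following sketch shows proximal gradient descent for the plain linear Lasso, where the proximal step is soft-thresholding. This is the standard L1 technique, not LassoNet's hierarchical update; the function names and the toy data are assumptions for illustration.

```python
import numpy as np

def soft_threshold(w, lam):
    """Proximal operator of lam * ||w||_1: shrink toward zero, clip small entries to 0."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def lasso_proximal_gradient(X, y, lam, n_steps=500):
    """Minimize (1 / 2n) * ||X w - y||^2 + lam * ||w||_1 by proximal gradient descent."""
    n_samples, n_features = X.shape
    # Step size 1/L, where L is the Lipschitz constant of the smooth part's gradient.
    step = n_samples / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(n_features)
    for _ in range(n_steps):
        grad = X.T @ (X @ w - y) / n_samples
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Toy problem (an assumption): scaled identity design, so the solution is
# soft-thresholding applied to the targets [2.0, 0.1, -1.0].
X = np.sqrt(3.0) * np.eye(3)
y = np.sqrt(3.0) * np.array([2.0, 0.1, -1.0])
w_hat = lasso_proximal_gradient(X, y, lam=0.5)
```

The weak second feature is set exactly to zero, which is the sparsity behavior the abstract refers to for linear models.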

## Machine Learning for Big Data Cosmology and High Energy Physics

**Date: February 23, 2021 1:00 pm**

Speaker: Agnes Ferte

In the context of future galaxy surveys such as the Legacy Survey of Space and Time (LSST), I proposed an application of unsupervised learning algorithms such as Self-Organizing Maps to efficiently explore the theory space of cosmological models. In the first part of my talk, I will explain the challenges motivating this research and present our first results aiming at categorizing theories of gravity probed by weak gravitational lensing, one of the main cosmological observables that will be measured by LSST. Many experiments of the FPD at SLAC present computational challenges such as data reduction on the fly or physics simulations that require similar machine learning applications and developments. In the second part of my talk, I will present how I will expand the use of unsupervised learning algorithms to other areas at the FPD and contribute to the application of machine learning to LSST, other cosmology experiments and high energy physics experiments.

## Beyond Deep Learning in Fundamental Physics

**Date: February 16, 2021 1:00 pm**

Speaker: Lukas Heinrich

The experiments at the Large Hadron Collider (LHC) are testament to the success of the reductionist approach to science: the analytical modelling of the 100 million data channels of HEP is patently hard, but through a deep, hierarchical stack of simulation across many length and energy scales and a physics-driven, expert-designed dimensionality reduction procedure, inference on the fundamental parameters of quantum field theory is achievable. In recent years, advancements in machine learning techniques have provided physicists with promising new tools to analyze the LHC data. To exploit them, fundamental questions need to be addressed: How do we formulate ML optimization goals to align with our science goals? How can we translate known constraints in the data into appropriate inductive biases of the trained algorithms? Can we express and incorporate uncertainties and maintain interpretability to achieve safe inference? In light of these challenges, I will discuss in this talk recent progress in end-to-end gradient-based optimization, active learning, and simulator-assisted probabilistic programming.

## Machine Learning for Dark Matter

**Date: February 12, 2021 1:00 pm**

Speaker: Bryan Ostdiek (Harvard)

There is five times more dark matter than ordinary matter in the universe, but we have almost no idea what it is. To learn about the possible interactions of dark matter, physicists use complementary data from cosmological probes, astroparticle observations, and particle colliders. There is an increasing need for advanced analytics and machine learning to process these vastly growing datasets. This talk details examples using machine learning in each of the three realms. First, I demonstrate using image recognition techniques on images of strongly lensed galaxies to constrain dark matter properties. Second, I use machine learning to uncover the phase space distribution of dark matter near the Earth, which directly impacts the interpretation of direct detection experiments. Finally, I examine how unsupervised learning methods can aid collider searches for dark matter. The talk concludes with comments on the intersection of machine learning and physics.

## Searching for dark matter in the sky with machine learning

**Date: February 9, 2021 1:00 pm**

Speaker: Siddharth Mishra-Sharma (NYU)

The next decade will see a deluge of new cosmological data that will enable us to accurately map out the distribution of matter in the local Universe, image billions of stars and galaxies to unprecedented precision, and create high-resolution maps of the Milky Way. Signatures of new physics may be hiding in these observations, offering significant discovery potential for uncovering physics beyond the Standard Model, in particular the nature of dark matter. At the same time, the complexity of astrophysical data provides significant challenges to carrying out these searches using conventional methods. I will describe how overcoming these issues will require a qualitative shift in how we approach modeling and inference in cosmology, connecting particle physics properties to cosmological observables and bringing together several recent advances in machine learning and simulation-based inference. I will present several applications of these methods. I will show how they can be used to combine information from tens of thousands of strong gravitational lensing systems in order to infer structural properties of our Universe that can be directly linked to the microphysical properties of dark matter. Finally, I will present an application to the long-standing problem of understanding the nature of the Galactic Center gamma-ray excess, highlighting challenges associated with analyzing real data and discussing ways to overcome them.

Slides are available for those who have a Stanford account. Video is available with a password upon request (contact Kazuhiro Terao).

## Quantum Kernel Methods for the Classification of High-dimensional Data on a Superconducting Processor

**Date: December 11, 2020 1:00 pm**

Speaker: Evan Peters (Fermilab, University of Waterloo IQC)

We present a quantum kernel method for high-dimensional data analysis using the Google Sycamore superconducting quantum computer architecture. Our experiment utilizes the largest number of qubits to date compared to prior quantum kernel method experiments. We study an application in the domain of cosmology - a benchmark supernova type classification problem using 67 features with no dimensionality reduction and without vanishing kernel elements. While most experimental work to date has considered synthetic datasets of low dimension, and disregarded the importance of shot statistics and mean kernel element size, we show that the analysis of real, high dimensional datasets requires careful attention to these features when constructing a circuit ansatz.

## Online Bayesian Optimization for the SECAR Recoil Mass Separator

**Date: December 11, 2020 11:00 am**

Speaker: Sara Miskovich (Michigan State University)

The SEparator for CApture Reactions (SECAR) is a next-generation recoil separator system under commissioning at the National Superconducting Cyclotron Laboratory (NSCL) and Facility for Rare Isotope Beams (FRIB) at Michigan State University. SECAR is optimized for the direct measurement of capture reactions on unstable nuclei that drive some stars to explode and synthesize crucial nuclei that make up our universe. Once SECAR is operational, these precise measurements will improve our understanding of astrophysical processes such as X-ray bursts, novae and supernovae. To maximize the performance of the device, ion optical optimizations and careful beam alignment need to be achieved, which can be time consuming and difficult to achieve through manual tuning. This talk will focus on the first development of an online Bayesian optimization that utilizes a Gaussian process model to tune the beam through the complex system and improve its ion optical properties by optimizing magnet settings. The method is shown to improve recoil separator performance and save operational time for future scientific experiments.

## Machine Learning with Quantum Computers

**Date: December 4, 2020 10:00 am**

Speaker: Maria Schuld (Xanadu, University of KwaZulu-Natal)

A growing number of papers are searching for intersections between High Energy Physics and the emerging field of Quantum Machine Learning. This talk gives an introduction to the latter, while critically discussing potential connections to HEP. A focus lies on the most popular approach to machine learning with quantum computers, which interprets quantum circuits as machine learning models that load input data and produce predictions. By optimizing the quantum circuit, the "quantum model" can be trained like a neural network. To offer a glimpse of the opportunities and challenges of this approach, I will discuss different aspects of such "variational quantum machine learning algorithms", including their close links to kernel methods and integration into modern machine learning pipelines.

## Reservoir computing using digital logic gate networks

**Date: November 20, 2020 11:00 am**

Speaker: Heidi Komkov (The Institute for Research in Electronics and Applied Physics, University of Maryland)

As Moore's law is coming to an end, new types of computing architectures must be explored to continue the pace of advancement in computing power. At the same time, applications of machine learning are exploding. Reservoir computing is a brain-inspired machine learning method which has shown promise for very rapid time series prediction. The reservoir functions as a recurrent neural network, and substituting a physical system for a computer-based simulation has the potential to allow computation at high speed and very low power. We use an autonomous Boolean network as a reservoir, which uses individual CMOS digital logic gates to implement the nonlinear elements used in machine learning architectures. In this talk I'll show results from a field-programmable gate array (FPGA) reservoir and my designs of a 180 nm application-specific integrated circuit (ASIC) that has been fabricated this year.

## Power efficient hardware accelerators for machine learning, combinatorial optimization, and pattern matching applications

**Date: November 13, 2020 11:00 am**

Speaker: Cat Graves (Hewlett Packard Labs)

The dramatic rise of data-intensive workloads has revived special-purpose hardware and architectures for continuing improvements in computational speed and energy efficiency. While traditional CMOS ASICs deliver some performance gains, typically by limiting data movement or implementing “in-memory computation”, such approaches still suffer from low power efficiency. New proposals leveraging emerging non-volatile resistive RAM (ReRAM) devices for in-memory computation are highly attractive in a variety of application domains. While originally developed as digital (binary) high-density non-volatile memories, ReRAM devices have demonstrated a wide range of behaviors and properties – such as a wide range of tunable analog resistance and non-linear dynamics – which motivate their use in novel functions and new computational models. Many recent in-memory compute studies have focused on crossbar circuit architectures, demonstrating their application for neural networks, scientific computing and signal processing. However, other circuit primitives – such as content addressable memories (CAMs) and combined systems such as crossbar arrays and non-linear elements – have shown further promise for mapping a diverse range of complementary computational models such as finite state machines, pattern matching, hashing algorithms and Hopfield neural networks for tackling optimization problems. In this talk, I will review the exciting opportunities for in-memory computational primitives leveraging non-volatile ReRAM devices and their circuits and architectures for enabling low-power, high-throughput computation in a variety of application domains. Recent lab demonstrations of various applications mapped to these in-memory computational circuit primitives based on memristor devices will be shown, and I will also give an outlook on performance.

## Generative Models and Symmetries

**Date: November 5, 2020 10:00 am**

Speaker: Danilo Rezende (Google DeepMind)

The study of symmetries in physics has revolutionized our understanding of the world. Inspired by this, I will focus on our recent work on incorporating gauge symmetries into normalizing flow generative models and its potential applications in the sciences and ML.

## Multi-Objective Bayesian Optimization for Accelerator Tuning

**Date: October 30, 2020 1:00 pm**

Speaker: Ryan Roussel (University of Chicago)

Particle accelerators require constant tuning during operation to meet beam quality, total charge and particle energy requirements for use in a wide variety of physics, chemistry and biology experiments. Maximizing the performance of an accelerator facility often necessitates multi-objective optimization, where operators must balance trade-offs between multiple objectives simultaneously, often using limited, temporally expensive beam observations. Usually, accelerator optimization problems are solved offline, prior to actual operation, with advanced beamline simulations and parallelized optimization methods (NSGA-II, Swarm Optimization). Unfortunately, it is not feasible to use these methods for online multi-objective optimization, since beam measurements can only be done in a serial fashion, and these optimization methods require a large number of measurements to converge to a useful solution. Here, we introduce a multi-objective Bayesian optimization scheme, which finds the full Pareto front of an accelerator optimization problem efficiently in a serialized manner and is thus a critical step towards practical online multi-objective optimization in accelerators. This method uses a set of Gaussian process surrogate models, along with a multi-objective acquisition function, which reduces the number of observations needed to converge by at least an order of magnitude over current methods. We demonstrate how this method can be modified to specifically solve optimization challenges posed by the tuning of accelerators. This includes the addition of optimization constraints, objective preferences and costs related to changing accelerator parameters.
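
The Pareto front mentioned in the abstract is the set of settings that no other setting beats on every objective at once. As a small illustration only (not the speaker's Bayesian optimization scheme), the sketch below filters a set of already-measured objective values down to its non-dominated points. The function name, the made-up data, and the convention that all objectives are minimized are assumptions.

```python
import numpy as np

def pareto_front_mask(objectives):
    """Return a boolean mask of non-dominated rows, assuming every objective is minimized."""
    objectives = np.asarray(objectives, dtype=float)
    n_points = objectives.shape[0]
    mask = np.ones(n_points, dtype=bool)
    for i in range(n_points):
        # Another point dominates point i if it is no worse in every objective
        # and strictly better in at least one.
        no_worse = np.all(objectives <= objectives[i], axis=1)
        strictly_better = np.any(objectives < objectives[i], axis=1)
        if np.any(no_worse & strictly_better):
            mask[i] = False
    return mask

# Hypothetical measurements of two objectives for four machine settings.
candidates = np.array([[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0]])
mask = pareto_front_mask(candidates)
front = candidates[mask]
```

Here the setting `[3.0, 3.0]` is dropped because `[2.0, 2.0]` is better in both objectives; the remaining three points represent different trade-offs.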

## Machine Learning Techniques for Optics Measurements and Corrections

**Date: October 28, 2020 8:00 am**

Speaker: Elena Fol (CERN)

Recently, the application of ML has grown in accelerator physics, in particular in the domain of diagnostics and control. One of the first applications of ML at the LHC is focused on optics measurements and corrections. Unsupervised learning has been applied to the automatic detection of beam position monitor faults to improve optics analysis, demonstrating successful results in operation. A novel ML-based approach for the estimation of magnet errors has been developed, using supervised regression models trained on a large set of LHC optics simulations. Also, autoencoder neural networks have found application in the denoising of measurement data and the reconstruction of missing data points. The results and future plans for these studies will be discussed following a brief introduction to relevant ML concepts.

## Superconducting Radio-Frequency Cavity Fault Classification Using Machine Learning at Jefferson Laboratory

**Date: October 23, 2020 1:00 pm**

Speaker: Christopher Tennant (Jefferson Laboratory)

We report on the development of machine learning models for classifying C100 superconducting radio-frequency (SRF) cavity faults in the Continuous Electron Beam Accelerator Facility (CEBAF) at Jefferson Lab. CEBAF is a continuous-wave recirculating linac utilizing 418 SRF cavities to accelerate electrons up to 12 GeV through 5 passes. Of these, 96 cavities (12 cryomodules) are designed with a digital low-level RF system configured such that a cavity fault triggers waveform recordings of 17 RF signals for each of the 8 cavities in the cryomodule. Subject matter experts (SMEs) are able to analyze the collected time-series data, identify which of the eight cavities faulted first, and classify the type of fault. This information is used to find trends and strategically deploy mitigations to problematic cryomodules. However, manually labeling the data is laborious and time-consuming. By leveraging machine learning, near real-time – rather than post-mortem – identification of the offending cavity and classification of the fault type has been implemented. We discuss the development and performance of the ML models as well as valuable lessons learned in bringing an ML system to deployment.

## Analytical and Parametric Model Fitting for Inverse Problems, Data Reduction, and Pattern Recognition

**Date: October 21, 2020 8:00 am**

Speaker: Youssef Nashed (ANL, Stats Perform)

Many scientific and engineering challenges can be formulated as fitting a model to existing data. Whether it is comparing a scientific simulation to known experimental observations, finding a continuous representation of sparse/discrete data points, or finding the values of model parameters which generalize to unforeseen data examples given historical data, all these tasks share a common underlying principle of model fitting, but with different choices made in the model formulation (parametric or analytical) and the assumptions made about the data (acquisition scheme, noise-to-signal ratio, continuity, or information locality). In this talk I will highlight a few use cases under this framework. Specifically, I will address research conducted at Argonne National Laboratory for X-ray image reconstruction problems, data reduction for scientific simulations, and deep learning approaches for replacing expensive iterative optimization. Additionally, I will present more recent work for sports computer vision applications that enable real-time player detection, tracking, and activity prediction from broadcast video.

## Deep Learning and Quantum Gravity

**Date: October 15, 2020 4:00 pm**

Speaker: Koji Hashimoto (Osaka University)

Formulating quantum gravity is one of the final goals of fundamental physics. Recent progress in string theory brought a concrete formulation called the AdS/CFT correspondence, in which a gravitational spacetime emerges from lower-dimensional non-gravitational quantum systems, but we still lack an understanding of how the correspondence works. I discuss similarities between quantum gravity and deep learning architecture, by regarding the neural network as a discretized spacetime. In particular, questions such as when, why and how a neural network can be a space or a spacetime may lead to a novel way to look at machine learning. I concretely implement the AdS/CFT framework in a deep learning architecture, and show the emergence of a curved spacetime as a neural network from given training data of quantum systems.

## Bayesian Optimization and Machine Learning for Accelerating Scientific Discovery

**Date: October 9, 2020 1:00 pm**

Speaker: Stefano Ermon (Stanford)

Applications of AI in the physical sciences require new advances in representing, reasoning about, and acquiring knowledge from data and domain expertise. Motivated by these challenges, I will present new approaches for calibrating ML systems so that predicted probabilities are more reflective of real-world uncertainty, i.e., better capture what is or isn't known by the system. I will discuss approaches to automatically acquire data to reduce uncertainty through maximally informative experiments, focusing on the design of charging protocols for electric batteries and other challenging problems in science and engineering. Finally, I will discuss opportunities for incorporating domain knowledge to further accelerate the process.

video

## Physics-informed machine learning for accelerated modeling and optimization of complex systems

**Date: October 2, 2020 1:00 pm**

Speaker: Paris Perdikaris (University of Pennsylvania)

The towering empirical success of machine learning is promising a pathway for transforming observations into actionable knowledge. Specific to modeling and optimizing complex physical and engineering systems, there is a need for methods that can seamlessly synthesize data of variable fidelity, leverage prior domain knowledge, respect the laws of physics, and provide robust predictions with quantified uncertainty. In this talk I will provide an overview of data-driven techniques that aim to address these needs, and highlight their advantages and limitations through the lens of different application studies. Specifically, we will discuss the effectiveness of Gaussian processes in integrating multi-fidelity data to accelerate the prediction of large-scale computational models, as well as the potential of physics-informed deep learning models in tackling a diverse range of forward and inverse problems in computational physics. Finally, I will also discuss the role of predictive uncertainty in closing the observations-to-predictions loop as a proxy for judicious data acquisition and experimental design.

video

*More past talks can be accessed here.*
