# The multispecies zero range process and modified Macdonald polynomials

## Speaker:

## Speaker Link:

## Institution:

## Time:

## Host:

## Location:

Over the last couple of decades, the theory of interacting particle systems has found some unexpected connections to orthogonal polynomials, symmetric functions, and various combinatorial structures. The asymmetric simple exclusion process (ASEP) has played a central role in this connection. Recently, Cantini, de Gier, and Wheeler found that the partition function of the multispecies ASEP on a circle is a specialization of a Macdonald polynomial $P_{\lambda}(X;q,t)$. Macdonald polynomials are a family of symmetric functions that are ubiquitous in algebraic combinatorics and specialize to or generalize many other important special functions. Around the same time, Martin gave a recursive formulation expressing the stationary probabilities of the ASEP on a circle as sums over combinatorial objects known as multiline queues, which are a type of queueing system. Shortly after, with Corteel and Williams we generalized Martin's result to give a new formula for $P_{\lambda}$ via multiline queues.

The modified Macdonald polynomials $\widetilde{H}_{\lambda}(X;q,t)$ are a version of $P_{\lambda}$ with positive integer coefficients. A natural question was whether there exists a related statistical mechanics model for which some specialization of $\widetilde{H}_{\lambda}$ is equal to its partition function. With Ayyer and Martin, we answer this question in the affirmative with the multispecies totally asymmetric zero-range process (TAZRP), which is a specialization of a more general class of zero-range particle processes. We introduce a new combinatorial object in the flavor of the multiline queues, which on the one hand expresses stationary probabilities of the multispecies TAZRP, and on the other hand gives a new formula for $\widetilde{H}_{\lambda}$. We define an enhanced Markov chain on these objects that lumps to the multispecies TAZRP, and then use this to prove several results about particle densities and correlations in the TAZRP.

# Mathematics of synthetic data and privacy

## Speaker:

## Speaker Link:

## Institution:

## Time:

## Location:

An emerging way to protect privacy is to replace true data by synthetic data. Medical records of artificial patients, for example, could retain meaningful statistical information while preserving privacy of the true patients. But what is synthetic data, and what is privacy? How do we define these concepts mathematically? Is it possible to make synthetic data that is both useful and private? I will tie these questions to a simple-looking problem in probability theory: how much information about a random vector X is lost when we take conditional expectation of X with respect to some sigma-algebra? This talk is based on a series of papers joint with March Boedihardjo and Thomas Strohmer, mainly this one: https://arxiv.org/abs/2107.05824
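The probability question above can be illustrated on a finite sample. The following is a minimal sketch, assuming the sigma-algebra is generated by a finite partition, so that the conditional expectation simply replaces each point by the mean of its cell. The partition into unit intervals, the function names, and the Gaussian data are illustrative assumptions, not the construction from the papers.

```python
import math
import random
from collections import defaultdict
from statistics import fmean, pvariance

def conditional_expectation(xs, cell_of):
    """Replace each sample by the mean of its partition cell, i.e. E[X | F]
    for the sigma-algebra F generated by the partition `cell_of`."""
    cells = defaultdict(list)
    for x in xs:
        cells[cell_of(x)].append(x)
    means = {c: fmean(v) for c, v in cells.items()}
    return [means[cell_of(x)] for x in xs]

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(10_000)]

# Hypothetical partition: unit intervals [k, k+1).
coarse = conditional_expectation(xs, cell_of=math.floor)

# Information lost, measured as the mean squared error E|X - E[X|F]|^2.
loss = fmean((x - c) ** 2 for x, c in zip(xs, coarse))

# Law of total variance on the empirical measure:
# Var(X) = Var(E[X|F]) + E|X - E[X|F]|^2.
print(pvariance(xs), pvariance(coarse) + loss)
```

A finer partition lowers the loss but keeps each coarse value closer to an individual data point, which is one informal way to see the tension between usefulness and privacy.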

# Markov processes whose jump kernels degenerate at the boundary

## Speaker:

## Institution:

## Time:

## Location:

In this talk, we discuss the potential theory of Markov processes with jump kernels decaying at the boundary of the half space. The boundary part of the kernel is comparable to the product of four terms with parameters appearing as exponents in these terms. We establish sharp two-sided estimates on the Green functions of these processes for all admissible values of the parameters. The estimates take three different forms, depending on the region to which the parameters belong. As an application, we completely determine the region of the parameters in which the boundary Harnack principle holds. This talk is based on joint works with Renming Song and Zoran Vondraček.

# An invariance principle for Markov cookie random walks.

## Speaker:

## Institution:

## Time:

## Host:

## Location:

In joint work with E. Kosygina and J. Peterson, the "natural" diffusive scaling is considered for the recurrent case, and the convergence to Brownian motion perturbed at extrema is shown. The key ideas are coarse graining and the Ray-Knight approach.

# Properties of the Riemann zeta distribution.

## Speaker:

## Speaker Link:

## Institution:

## Time:

## Location:

In this talk we will discuss properties of integers selected according to the Riemann zeta distribution. We will emphasize two aspects of this distribution. The first is its faithful similarity to properties of an integer chosen according to the uniform distribution on a finite interval. The second aspect will be the appearance of Poisson behavior under this distribution. The Riemann zeta function is given for $\mbox{Re} z>1$ by

$$\zeta(z)=\sum_{n=1}^\infty \frac{1}{n^z}.$$

An alternative description is given by

$$\zeta(z)=\prod_{p\in\mathcal{P}}\left(1-\frac{1}{p^z}\right)^{-1},$$

where $\mathcal{P}$ denotes the set of primes.

In our discussions we will replace the complex $z$ by a real number $s>1.$ We will denote by $X_s$ a random variable with the distribution

$$P(X_s=n)=\frac{1}{\zeta(s)n^s},\quad n=1,2,3,\ldots.$$

The statistical properties of $X_s$ are the focus of the talk.
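As a small numerical illustration of the distribution just defined, the sketch below checks a standard consequence of the definition: summing $P(X_s=n)$ over the multiples of a prime $p$ gives $P(p \mid X_s)=p^{-s}$, and by the Euler product, divisibility by distinct primes is independent. The choice $s=2$ (where $\zeta(2)=\pi^2/6$), the truncation level, and the function names are assumptions made for illustration only.

```python
import math

def zeta_pmf(n: int, s: float, zeta_s: float) -> float:
    """P(X_s = n) = 1 / (zeta(s) * n**s)."""
    return 1.0 / (zeta_s * n ** s)

def prob_divisible(m: int, s: float, zeta_s: float, terms: int = 100_000) -> float:
    """Truncated sum of P(X_s = n) over the multiples n = m, 2m, 3m, ..."""
    return sum(zeta_pmf(m * k, s, zeta_s) for k in range(1, terms + 1))

S = 2.0
ZETA_2 = math.pi ** 2 / 6  # closed form for zeta(2)

for p in (2, 3, 5):
    # Numerical value next to the exact value p^{-s}.
    print(p, prob_divisible(p, S, ZETA_2), p ** -S)

# Independence of divisibility by 2 and by 3: P(6 | X) vs P(2 | X) * P(3 | X).
print(prob_divisible(6, S, ZETA_2),
      prob_divisible(2, S, ZETA_2) * prob_divisible(3, S, ZETA_2))
```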

The talk is based on joint work with Adrien Peltzer.

# Clustering a mixture of Gaussians with unknown covariance

## Speaker:

## Speaker Link:

## Institution:

## Time:

## Host:

## Location:

Clustering is a fundamental data science task with broad applications. This talk investigates a simple clustering problem with data from a mixture of Gaussians that share a common but unknown, and potentially ill-conditioned, covariance matrix. We start by considering Gaussian mixtures with two equally-sized components and derive a Max-Cut integer program based on maximum likelihood estimation. We show its solutions achieve the optimal misclassification rate when the number of samples grows linearly in the dimension, up to a logarithmic factor. However, solving the Max-Cut problem appears to be computationally intractable. To overcome this, we develop an efficient spectral algorithm that attains the optimal rate but requires a quadratic sample size. Although this sample complexity is worse than that of the Max-Cut problem, we conjecture that no polynomial-time method can perform better. Furthermore, we present numerical and theoretical evidence that supports the existence of a statistical-computational gap. Finally, we generalize the Max-Cut program to a k-means program that handles multi-component mixtures with possibly unequal weights and has similar guarantees.

# Learning low degree functions in logarithmic number of random queries

## Speaker:

## Speaker Link:

## Institution:

## Time:

## Location:

A basic question in learning theory is the following: we are given a function f on the hypercube {-1,1}^n, and we are allowed to query samples (X, f(X)), where X is uniformly distributed on {-1,1}^n. After obtaining the samples (X_1, f(X_1)), ..., (X_N, f(X_N)), we would like to construct a function h that approximates f up to an error epsilon (say, in L^2). Of course, h is a random function, since its construction involves the i.i.d. random variables X_1, ..., X_N. We therefore want h to fail to approximate f with probability at most delta. Given parameters epsilon, delta in (0,1), the goal is to minimize the number of random queries N. I will show that around log(n) random queries are sufficient to learn bounded "low-complexity" functions. Based on joint work with Alexandros Eskenazis.
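To make the setup concrete, here is a minimal sketch of the classical empirical estimator for low-degree functions: each Fourier coefficient of degree at most d is estimated by a sample mean, and h is the resulting polynomial. This is a textbook baseline shown only to illustrate the random-query model; it is not claimed to be the method of the talk, nor to achieve the logarithmic query count. The target function f, the parameters n, d, N, and all function names are hypothetical.

```python
import itertools
import math
import random

def chi(x, S):
    """Monomial prod_{j in S} x_j on the hypercube (equals 1 for empty S)."""
    return math.prod(x[j] for j in S)

def learn_low_degree(samples, n, d):
    """Estimate every Fourier coefficient of degree <= d by an empirical mean."""
    N = len(samples)
    coeffs = {}
    for k in range(d + 1):
        for S in itertools.combinations(range(n), k):
            coeffs[S] = sum(y * chi(x, S) for x, y in samples) / N
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the learned polynomial h at the point x."""
    return sum(c * chi(x, S) for S, c in coeffs.items())

def f(x):
    """Hypothetical bounded target of degree 2."""
    return 0.5 * x[0] * x[1] + 0.25 * x[2]

random.seed(1)
n, d, N = 5, 2, 4000
samples = []
for _ in range(N):
    x = tuple(random.choice((-1, 1)) for _ in range(n))
    samples.append((x, f(x)))  # one random query (X, f(X))

coeffs = learn_low_degree(samples, n, d)

# Exact squared L^2 error of h against f, averaged over all 2^n points.
cube = list(itertools.product((-1, 1), repeat=n))
l2_error_sq = sum((evaluate(coeffs, x) - f(x)) ** 2 for x in cube) / len(cube)
print(coeffs[(0, 1)], coeffs[(2,)], l2_error_sq)
```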

# Odd subgraphs are odd

## Speaker:

## Speaker Link:

## Institution:

## Time:

## Location:

In this talk we discuss the problem of finding large induced subgraphs of a given graph G subject to degree constraints. We survey some classical results, present some interesting and challenging open problems, and sketch solutions to some of them.

This is based on joint works with Liam Hardiman and Michael Krivelevich.

# Sharp matrix concentration

## Speaker:

## Institution:

## Time:

## Location:

Classical matrix concentration inequalities are sharp up to a logarithmic factor. This logarithmic factor is necessary in the commutative case but unnecessary in many classical noncommutative cases. We will present some matrix concentration results that are sharp in many cases, where we overcome this logarithmic factor by using an easily computable quantity that captures noncommutativity. Joint work with Afonso Bandeira and Ramon van Handel. Paper: https://arxiv.org/abs/2108.06312