ÀÏ°ÄÃÅ¿ª½±Ö±²¥

Matching in Networks: Fundamental Limits and Efficient Algorithms
Ph.D. Dissertation, Duke University, June 2023
Abstract

In recent years, there has been an explosion of online platforms and E-commerce, which generate massive amounts of network data. For example, a friend link on Facebook represents a connection between users, and a rating on Amazon is a connection between a customer and a product. These network data contain extensive information on customers' behaviors and preferences. However, deriving meaningful insights from the data can be challenging. On the one hand, networks are often extremely large and have millions of nodes and billions of edges. On the other hand, the network can be sparse, given that some nodes interact only with a few other nodes with a significant amount of noise from missing or faulty connections. Thus, extracting valuable information from such noisy and vast amounts of data requires highly efficient approaches that can process a large amount of data and detect tenuous statistical signatures. As such, my thesis has developed along the following two interrelated streams.

The first stream aims at developing the fundamental limits and designing simple and scalable algorithms for learning complex networks. We base our study upon the decision-theoretic framework wherein an unknown structure underlies the network data. We aim to detect and recover this hidden structure based on partial or noisy observation. In particular, we focus on the problem of graph matching (network alignment), which refers to finding the bijection between the vertex sets of the two graphs that maximizes the total number of common edges. We have initiated a statistical analysis of detection and recovery problems by matching two randomly generated graphs with an underlying vertex correspondence and correlated edges. We characterize the sharp fundamental limits for both problems and develop new algorithms that are the first to achieve detection and recovery with an explicit constant correlation for both sparse and dense regimes, which settled important open problems in this field.

The second stream focuses on improving network systems' decision-making efficiency under uncertainties and limited information. We study the dynamic matching problem in which a new agent enters the market randomly and waits to be matched, and the arrival rates for different types of agents are unknown. The goal is to maximize market efficiency and, at the same time, reduce the wait time for each participant in the system. This problem is inspired by the car-pooling platform, in which after they make a request, riders may have to wait on the platform for some time for potential matches with other riders traveling in the same direction based on their pick-up and drop-off locations. We develop hindsight-optimal primal-dual policies that achieve both constant regret and wait time for each type of agent on average at all times.