Academic Talk by Xia Xiaochao, Chongqing University (20251108)
Posted: 2025-11-05 16:06  Author: Lü Jing  First review: xn_math  Second review: Tang Yu  Source: Original content of this site

Title: Communication-Efficient Pilot Estimation for Non-Randomly Distributed Data in Diverging Dimensions

Speaker: Xia Xiaochao (Chongqing University)

Time: November 8, 2025 (Saturday), 11:20-12:00

Venue: Lecture Hall 3 (814), Mathematics Building

Audience: Faculty, graduate students, and undergraduates



Abstract: The communication-efficient surrogate likelihood (CSL) framework (Jordan et al., 2019) is a notable approach for handling massive or distributed datasets. CSL methods use the first machine as the central node, optimizing with that machine's data, and assume a fixed dimension when establishing statistical properties. However, CSL may not suit non-randomly or heterogeneously distributed data, and its applicability to diverging- or high-dimensional datasets is limited. To address these issues, we propose a communication-efficient pilot (CEP) estimation strategy. This involves pilot sampling on each machine to create a pilot sample dataset and using a new pilot sample-based surrogate loss to approximate the global one, with the minimizer termed the CEP estimator. We rigorously investigate the theoretical properties of the CEP estimator, including its convergence rate, which reaches the global rate $\sqrt{\frac{p_n}{N}}$, and its asymptotic normality when the dimension $p_n$ diverges with the pilot sample size $r$ and $p_n < n$. Additionally, we extend CEP to high-dimensional cases ($p_n>n$) and propose a regularized version of CEP (CERP). We establish non-asymptotic error bounds for the CERP estimator with the Lasso penalty (CERP-Lasso) and provide convergence rates and asymptotic normality for the CERP estimator with the adaptive Lasso penalty (CERP-aLasso) under generalized linear models. Extensive experiments on synthetic and real datasets demonstrate the effectiveness of our approaches.
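As a rough illustration of the pilot-sampling idea described in the abstract, the sketch below simulates several machines with heterogeneous covariates, pools a small pilot sample from each, and minimizes a gradient-corrected surrogate loss built on the pooled pilot sample. The abstract does not state the exact CEP loss, so the surrogate used here is the CSL-style form with the first-machine loss replaced by the pilot-sample loss. That substitution is an assumption made for illustration, not the speaker's definition. The function names (`make_machines`, `grad`, `cep_sketch`), the squared-error loss, and all simulation settings are hypothetical.

```python
import numpy as np

def make_machines(rng, K=5, n=400, p=5):
    """Simulate K machines whose covariate distributions differ (non-random split)."""
    theta_true = np.arange(1, p + 1) / p
    machines = []
    for k in range(K):
        # Each machine gets its own mean shift and scale, so no single machine
        # is representative of the full dataset.
        X = rng.normal(loc=0.5 * k, scale=1.0 + 0.2 * k, size=(n, p))
        y = X @ theta_true + rng.normal(size=n)
        machines.append((X, y))
    return theta_true, machines

def grad(theta, X, y):
    """Gradient of the average squared-error loss 0.5 * mean((y - X @ theta)**2)."""
    return X.T @ (X @ theta - y) / len(y)

def cep_sketch(machines, r, rng):
    """Assumed CSL-style surrogate built on a pooled pilot sample (illustrative only)."""
    # Step 1: each machine draws r rows without replacement and sends them
    # to the central node.
    pilot = []
    for X, y in machines:
        idx = rng.choice(len(y), size=r, replace=False)
        pilot.append((X[idx], y[idx]))
    Xp = np.vstack([X for X, _ in pilot])
    yp = np.concatenate([y for _, y in pilot])
    m = len(yp)

    # Step 2: initial estimate from the pilot sample alone.
    theta0 = np.linalg.lstsq(Xp, yp, rcond=None)[0]

    # Step 3: one communication round. Every machine reports its local
    # gradient at theta0; the size-weighted average is the global gradient.
    N = sum(len(y) for _, y in machines)
    g_global = sum(len(y) * grad(theta0, X, y) for X, y in machines) / N

    # Step 4: minimize the surrogate
    #   L_pilot(theta) - <grad L_pilot(theta0) - g_global, theta>.
    # For squared-error loss the minimizer solves a linear system.
    Hp = Xp.T @ Xp / m
    shift = grad(theta0, Xp, yp) - g_global
    theta_cep = np.linalg.solve(Hp, Xp.T @ yp / m + shift)
    return theta0, theta_cep

rng = np.random.default_rng(0)
theta_true, machines = make_machines(rng)
theta_pilot, theta_cep = cep_sketch(machines, r=100, rng=rng)

# Full-data least squares as the reference the surrogate tries to approximate.
X_all = np.vstack([X for X, _ in machines])
y_all = np.concatenate([y for _, y in machines])
theta_global = np.linalg.lstsq(X_all, y_all, rcond=None)[0]

print("pilot-only distance to global fit:", np.linalg.norm(theta_pilot - theta_global))
print("surrogate distance to global fit: ", np.linalg.norm(theta_cep - theta_global))
```

In this toy setting the pooled pilot sample draws from every machine, so its curvature reflects the whole dataset rather than only the first machine's share, which is the motivation the abstract gives for moving away from a single central machine.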


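About the speaker: Xia Xiaochao is an associate professor at the College of Mathematics and Statistics, Chongqing University. Xia's research focuses on statistical methods, theory, and applications for high-dimensional and massive data. Xia has led and completed a Young Scientists Fund project of the National Natural Science Foundation of China, has published more than 20 papers, and serves as a member of the 11th Council of the Chinese Association for Applied Statistics.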