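Speaker: Professor Hu Jiang (胡江)
Title: Asymptotic properties of a multicolored random reinforced urn model with an application to multi-armed bandits
Time: Monday, April 13, 2026, 15:40-16:20
Venue: Lecture Hall 304, Building 6, Yunlong Campus
Hosts: School of Mathematics and Statistics, Institute of Mathematics, Institute of Science and Technology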
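About the speaker:
Hu Jiang is a professor and doctoral supervisor, and was selected as a Young Top-notch Talent under the National High-level Talents Special Support Program. Professor Hu's research focuses on large-dimensional random matrix theory and large-dimensional statistical analysis, with interests that include the limiting properties of eigenvalues and eigenvectors of large-dimensional random matrices, high-dimensional estimation and hypothesis testing, and the interpretability of machine learning models. After receiving a PhD from Northeast Normal University in 2012, Professor Hu held visiting positions at the National University of Singapore, Nanyang Technological University (Singapore), the University of Macau, Hiroshima University (Japan), and the Hong Kong University of Science and Technology, among others. Professor Hu has led several National Natural Science Foundation of China projects and has published more than forty SCI-indexed papers, including papers in leading journals such as The Annals of Statistics, Bernoulli, and IEEE Transactions on Information Theory, and currently serves as editor-in-chief of the SCI-indexed journal Random Matrices: Theory and Applications.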
Abstract:
The random self-reinforcement mechanism, characterized by the principle that "the rich get richer", has proved useful across many domains. One prominent model embodying this mechanism is the random reinforced urn model. This paper investigates a multicolored, multiple-drawing variant of the random reinforced urn model. We establish the limiting behavior of the normalized urn composition and prove strong convergence of the suitably scaled counts of each color. We also derive strongly consistent estimators for the reinforcement means, i.e., the expectations of the diagonal elements of the replacement matrix, and prove their joint asymptotic normality. Notably, the estimator of the largest reinforcement mean is asymptotically independent of the estimators of the smaller reinforcement means, and the estimators of the smaller reinforcement means are also asymptotically independent of one another. Finally, we explore the parallels between the reinforcement mechanisms in random reinforced urn models and in multi-armed bandits, and address hypothesis testing for the expected payoffs in the latter setting.