Applications of Text Mining in Comparative Politics
Source Papers
[1] Pan, J., & Chen, K. (2018). Concealing corruption: How Chinese officials distort upward reporting of online grievances. American Political Science Review, 112(3), 602-620.
[2] Pan, J. (2017). How Chinese officials use the internet to construct their public image. Political Science Research and Methods, 1-17.
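Overview
This session will first introduce text mining, in particular the basic ideas behind unsupervised machine learning methods, and give a brief walkthrough of basic operations in R. It will then replicate two papers by Jennifer Pan, Assistant Professor of Communication at Stanford University, that use text mining to study Chinese politics. Finally, it will summarize the current state of text mining applications in comparative politics and, using one research design as an example, show how researchers can combine topic models with classical econometric models.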
01
Paper 1
Title: Concealing Corruption: How Chinese Officials Distort Upward Reporting of Online Grievances
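Summary: Since reform and opening up, China's local governments have shown enormous vitality, which depends to a large extent on the center's ability to gather reliable information about the conduct of lower-level officials. In general, online political participation is one way for transitional countries to improve oversight of lower-level officials. This paper finds, however, that online grievances involving corruption are systematically concealed from higher-level authorities when they implicate lower-level officials or colleagues connected to them. This indicates that even in the digital age, strong state capacity cannot guarantee that reports of corruption are collected in full, and that the center still faces major challenges in monitoring localities.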
Abstract: A prerequisite for the durability of authoritarian regimes as well as their effective governance is the regime’s ability to gather reliable information about the actions of lower-tier officials. Allowing public participation in the form of online complaints is one approach authoritarian regimes have taken to improve monitoring of lower-tier officials. In this paper, we gain rare access to internal communications between a monitoring agency and upper-level officials in China. We show that citizen grievances posted publicly online that contain complaints of corruption are systematically concealed from upper-level authorities when they implicate lower-tier officials or associates connected to lower-tier officials through patronage ties. Information manipulation occurs primarily through omission of wrongdoing rather than censorship or falsification, suggesting that even in the digital age, in a highly determined and capable regime where reports of corruption are actively and publicly voiced, monitoring the behavior of regime agents remains a challenge.
02
Paper 2
Title: How Chinese Officials Use the Internet to Construct Their Public Image
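Summary: The Chinese government has made notable progress in open government. Analyzing the government websites that local officials set up to meet these transparency requirements, this paper draws a random sample of 1.92 million government web pages to show how local officials use these websites to construct their public image. Most content on government websites emphasizes either officials' competence or their closeness to citizens, depending on where the official is in the political tenure cycle. Typically, officials early in their term tend to project a caring image by emphasizing their concern for citizens, whereas those nearing the end of their term are more likely to demonstrate governing competence by highlighting their achievements. By focusing on communication and information flows between higher-level government and grassroots officials, the paper reveals how the internet becomes a platform for local officials' self-presentation.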
Abstract: The Chinese regime has launched a number of online government transparency initiatives to increase the volume of publicly available information about the activities of lower level governments. By analyzing online content produced by local government officials to fulfill these transparency requirements (a random sample of 1.92 million county-level government web pages), this paper shows how websites are commandeered by local-level officials to construct their public image. The majority of content on government websites emphasizes either the competence or benevolence of county executives, depending on where leaders are in the political tenure cycle. Early tenure county executives project images of benevolence by emphasizing their attentiveness and concern toward citizens. Late tenure executives project images of competence by highlighting their achievements. These findings shift the nature of debates concerning the role of the internet in authoritarian regimes from a focus on regime-society interactions to an examination of dynamics among regime insiders. By focusing on communication and the flow of information between upper-level leaders and lower-level regime agents, this paper reveals how the internet becomes a vehicle of self-promotion for local politicians.
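About the Presenter
Zheng Siyao is a master's student at the School of Public Policy and Management, Tsinghua University. He received his bachelor's degree in Political Science and Public Administration from the School of Government, Peking University. His research areas are political methodology and comparative politics. His theoretical interests center on information politics and collective action problems in transitional countries, and his methodological interests center on applying machine learning to traditional causal inference. He has published in 《公共行政评论》 (Journal of Public Administration) and has presented at international conferences including the Midwest Political Science Association annual conference and the Association of Chinese Political Studies.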
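A Note from the Presenter
Scholars of comparative politics are often constrained by data availability and find it difficult to study some important questions. With data science developing rapidly, researchers can use web scraping and machine learning to bring together large-volume, multimodal, and unstructured data that traditional quantitative methods cannot handle, and thereby open up the authoritarian political system, a "black box" that has long been hard to see into.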
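Software Downloads
R download URL:
RStudio download URL:
(Note: please download and install R and RStudio in advance to facilitate discussion during the session.)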
Event Details
Time: Wednesday, April 24, 18:00-20:00
Venue: School of Public Policy and Management, Room 620