Article 1: Modeling count data with marginalized zero-inflated distributions
T. H. Cummings and J. W. Hardin
Abstract: In this article, we present new commands for modeling count data using marginalized zero-inflated distributions. While we mainly focus on presenting new commands for estimating count data, we also present examples that illustrate some of these new commands.
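The abstract does not spell out the parameterization, so the following is only a minimal sketch. It assumes the usual zero-inflated Poisson setup, with excess-zero probability psi and Poisson mean mu, and assumes that the "marginalized" variant is parameterized by the overall mean nu = (1 - psi) * mu instead of mu. The function names are hypothetical and are not the commands presented in the article.

```python
import math

def zip_pmf(k, mu, psi):
    """Zero-inflated Poisson pmf: extra mass psi at zero, Poisson(mu) otherwise."""
    poisson = math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))
    return psi * (k == 0) + (1.0 - psi) * poisson

def marginalized_zip_pmf(k, nu, psi):
    """Same distribution, but indexed by the overall (marginal) mean nu.

    Assumption: nu = (1 - psi) * mu, so the Poisson mean is recovered as
    mu = nu / (1 - psi).
    """
    mu = nu / (1.0 - psi)
    return zip_pmf(k, mu, psi)

# Illustrative values only: overall mean 2.0 with 30% excess zeros.
overall_mean = sum(k * marginalized_zip_pmf(k, 2.0, 0.3) for k in range(100))
print(round(overall_mean, 6))
```

Under this assumption, a regression coefficient on nu describes the whole population mean, not the mean of the non-inflated component.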
Article 2: kg_nchs: A command for Korn–Graubard confidence intervals and National Center for Health Statistics' Data Presentation Standards for Proportions
B. W. Ward
Abstract: In August 2017, the National Center for Health Statistics (NCHS), part of the U.S. Federal Statistical System, published new standards for determining the reliability of proportions estimated using their data. These standards require one to take the Korn–Graubard confidence interval (CI), CI widths, sample size, and degrees of freedom to assess reliability of a proportion and determine whether it can be presented. The assessment itself involves determining whether several conditions are met. In this article, I present kg_nchs, a postestimation command that is used following svy: proportion. It allows Stata users to a) calculate the Korn–Graubard CI and associated statistics used in applying the NCHS presentation standards for proportions and b) display a series of three dichotomous flags that show whether the standards are met. I provide empirical examples to show how kg_nchs can be used to easily apply the standards and prevent Stata users from needing to perform manual calculations. While developed for NCHS survey data, this command can also be used with data that stem from any survey with a complex sample design.
Article 3: konfound: Command to quantify robustness of causal inferences
R. Xu, K. A. Frank, S. J. Maroulis, and J. M. Rosenberg
Abstract: Statistical methods that quantify the discourse about causal inferences in terms of possible sources of biases are becoming increasingly important to many social-science fields such as public policy, sociology, and education. These methods are also known as “robustness or sensitivity analyses”. A series of recent works (Frank [2000, Sociological Methods and Research 29: 147–194]; Pan and Frank [2003, Journal of Educational and Behavioral Statistics 28: 315–337]; Frank and Min [2007, Sociological Methodology 37: 349–392]; and Frank et al. [2013, Educational Evaluation and Policy Analysis 35: 437–460]) on robustness analysis extends earlier methods. We implement these recent developments in Stata. In particular, we provide commands to quantify the percent bias necessary to invalidate an inference from a Rubin causal model framework and the robustness of causal inferences in terms of correlations associated with unobserved variables.
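The "percent bias necessary to invalidate an inference" can be illustrated with a small calculation. This sketch assumes the threshold for inference is the critical value times the standard error, and that the share of the estimate that would have to be bias is 1 - threshold / |estimate|. The function name and the default critical value of 1.96 (a normal approximation) are assumptions for illustration, not the konfound implementation.

```python
def percent_bias_to_invalidate(estimate, std_err, critical_value=1.96):
    """Share of the estimate that must be due to bias to lose significance.

    Assumption: the inference holds while |estimate| exceeds
    critical_value * std_err.
    """
    threshold = critical_value * std_err
    if abs(estimate) <= threshold:
        raise ValueError("estimate is not significant at this critical value")
    return 1.0 - threshold / abs(estimate)

# Hypothetical numbers: coefficient 0.5 with standard error 0.1.
share = percent_bias_to_invalidate(0.5, 0.1)
print(f"{share:.1%} of the estimate would have to be bias")
```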
Article 4: Estimation of pre- and posttreatment average treatment effects with binary time-varying treatment using Stata
G. Cerulli and M. Ventura
Abstract: In this article, we describe tvdiff, a community-contributed command that implements a generalization of the difference-in-differences estimator to the case of binary time-varying treatment with pre- and postintervention periods. tvdiff is flexible and can accommodate many actual situations, enabling the user to specify the number of pre- and postintervention periods and a graphical representation of the estimated coefficients. In addition, tvdiff provides two distinct tests for the necessary condition of the identification of causal effects, namely, two tests for the so-called parallel-trend assumption. tvdiff is intended to simplify applied works on program evaluation and causal inference when longitudinal data are available.
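For orientation, here is the basic two-group, two-period difference-in-differences calculation that estimators of this kind generalize. It is a sketch of the textbook case only. It does not handle time-varying treatment or multiple pre- and postintervention periods, and the data layout and function name are assumptions.

```python
from statistics import mean

def did_estimate(rows):
    """2x2 difference-in-differences from (treated, post, outcome) tuples."""
    def cell(treated, post):
        return mean(y for t, p, y in rows if t == treated and p == post)
    treated_change = cell(1, 1) - cell(1, 0)
    control_change = cell(0, 1) - cell(0, 0)
    # Valid as a causal effect only under the parallel-trend assumption.
    return treated_change - control_change

# Hypothetical data: (treated, post, outcome).
rows = [
    (1, 0, 10.0), (1, 0, 12.0), (1, 1, 15.0), (1, 1, 17.0),
    (0, 0, 8.0), (0, 0, 10.0), (0, 1, 10.0), (0, 1, 12.0),
]
print(did_estimate(rows))
```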
Article 5: Visualizing effect modification on contrasts
N. H. Bruun
Abstract: A recurring problem in statistics is estimating and visualizing nonlinear dependency between an effect and an effect modifier. One approach to handle this is polynomial regressions of some order. However, polynomials are known for fitting well only in limited ranges. In this article, I present a simple approach for estimating the effect as a contrast at selected values of the effect modifier. I implement this approach using the flexible restricted cubic splines for the point estimation in a new simple command, emc. I compare the approach with other classical approaches addressing the problem.
Article 6: Two-sample instrumental-variables regression with potentially weak instruments
J. Choi and S. Shen
Abstract: We develop a command, weaktsiv, for two-sample instrumental-variables regression models with one endogenous regressor and potentially weak instruments. weaktsiv includes the classic two-sample two-stage least-squares estimator whose inference is valid only under the assumption of strong instruments. It also includes statistical tests and confidence sets with correct size and coverage probabilities even when the instruments are weak.
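The two-sample idea can be sketched for the simplest case of one instrument and no other covariates: the reduced form (outcome on instrument) is estimated in one sample, the first stage (endogenous regressor on instrument) in another, and the point estimate is their ratio. This is an illustrative sketch with hypothetical names and made-up data. It does not implement the weak-instrument-robust tests or confidence sets that weaktsiv provides.

```python
from statistics import mean

def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def two_sample_iv(z1, y1, z2, x2):
    """Sample 1 observes (z, y); sample 2 observes (z, x)."""
    reduced_form = ols_slope(z1, y1)   # effect of z on y
    first_stage = ols_slope(z2, x2)    # effect of z on x
    # A first stage near zero (weak instrument) makes this ratio unstable.
    return reduced_form / first_stage

# Hypothetical noiseless data: x = 1 + 2z, y = 1 + 3x = 4 + 6z.
z2, x2 = [0, 1, 2, 3], [1, 3, 5, 7]
z1, y1 = [0, 1, 2, 3, 4], [4, 10, 16, 22, 28]
print(two_sample_iv(z1, y1, z2, x2))
```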
Article 7: Added-variable plots with confidence intervals
J. L. Gallup
Abstract: An added-variable plot is an effective way to show the correlation between an independent variable and a dependent variable conditional on other independent variables. For multivariate estimation, a simple scatterplot showing x versus y is not adequate to show the partial correlation of x with y, because it ignores the impact of the other covariates. Added-variable plots are especially effective for showing the correlation of a dummy x variable with y because the dummy variable conditional on other covariates becomes a continuous variable, making the relationship easier to visualize. Added-variable plots are also useful for spotting influential outliers in the data that affect the estimated regression parameters. Stata provides added-variable plots after ordinary least-squares regressions with the avplot command. I present a new command, avciplot, that adds a confidence interval and other options to the avplot command.
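The points in an added-variable plot are residuals: y and x are each regressed on the other covariates, and the residuals are plotted against each other. By the Frisch–Waugh–Lovell theorem, the slope through those points equals the coefficient on x in the full regression. The sketch below shows this for a single control variable z. The function names and data are hypothetical, and no confidence interval is computed.

```python
from statistics import mean

def residualize(v, z):
    """Residuals from an OLS regression of v on a constant and z."""
    mv, mz = mean(v), mean(z)
    slope = (sum((a - mz) * (b - mv) for a, b in zip(z, v))
             / sum((a - mz) ** 2 for a in z))
    return [b - mv - slope * (a - mz) for a, b in zip(z, v)]

def added_variable_points(y, x, z):
    """Return plot coordinates (ex, ey) and the slope through them."""
    ex = residualize(x, z)   # part of x not explained by z
    ey = residualize(y, z)   # part of y not explained by z
    slope = sum(a * b for a, b in zip(ex, ey)) / sum(a * a for a in ex)
    return ex, ey, slope

# Hypothetical noiseless data generated as y = 1 + 2x + 3z.
z = [0, 1, 2, 3, 4, 5]
x = [1, 0, 2, 1, 3, 5]
y = [1 + 2 * xi + 3 * zi for xi, zi in zip(x, z)]
ex, ey, slope = added_variable_points(y, x, z)
print(round(slope, 6))
```

Note how a dummy x would become continuous in ex once z is partialled out, which is the visualization benefit the abstract mentions.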
Article 8: cvauroc: Command to compute cross-validated area under the curve for ROC analysis after predictive modeling for binary outcomes
M. A. Luque-Fernandez, D. Redondo-Sánchez, and C. Maringe
Abstract: Receiver operating characteristic (ROC) analysis is used for comparing predictive models in both model selection and model evaluation. ROC analysis is often applied in clinical medicine and social science to assess the tradeoff between model sensitivity and specificity. After fitting a binary logistic or probit regression model with a set of independent variables, the predictive performance of this set of variables can be assessed by the area under the curve (AUC) from an ROC curve. An important aspect of predictive modeling (regardless of model type) is the ability of a model to generalize to new cases. Evaluating the predictive performance (AUC) of a set of independent variables using all cases from the original analysis sample often results in an overly optimistic estimate of predictive performance. One can use K-fold cross-validation to generate a more realistic estimate of predictive performance in situations with a small number of observations. AUC is estimated iteratively for k samples (the “test” samples) that are independent of the sample used to predict the dependent variable (the “training” sample). cvauroc implements k-fold cross-validation for the AUC for a binary outcome after fitting a logit or probit regression model, averaging the AUCs corresponding to each fold, and bootstrapping the cross-validated AUC to obtain statistical inference and 95% confidence intervals. Furthermore, cvauroc optionally provides the cross-validated fitted probabilities for the dependent variable or outcome, contained in a new variable named fit; the sensitivity and specificity for each of the levels of the predicted outcome, contained in two new variables named _sen and _spe; and the plot of the mean cross-validated AUC and k-fold ROC curves.
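The procedure described in the abstract (fit on the training folds, score the held-out fold, compute its AUC, average across folds) can be sketched as follows. This is an illustration under simplifying assumptions: one covariate, a logit model fit by plain gradient ascent, deterministic fold assignment by index, and no bootstrap. All names are hypothetical, and this is not how cvauroc is implemented.

```python
import math
from statistics import mean

def auc(labels, scores):
    """P(random positive outscores random negative); ties count one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_logit(x, y, steps=2000, rate=0.1):
    """One-covariate logit by gradient ascent; returns a predict function."""
    a = b = 0.0
    n = len(x)
    for _ in range(steps):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        a += rate * sum(yi - pi for yi, pi in zip(y, p)) / n
        b += rate * sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p)) / n
    return lambda xi: 1.0 / (1.0 + math.exp(-(a + b * xi)))

def cv_auc(x, y, k=3):
    """Mean AUC over k held-out folds (observation i goes to fold i % k)."""
    fold_aucs = []
    for fold in range(k):
        train = [i for i in range(len(x)) if i % k != fold]
        test = [i for i in range(len(x)) if i % k == fold]
        predict = fit_logit([x[i] for i in train], [y[i] for i in train])
        fold_aucs.append(auc([y[i] for i in test],
                             [predict(x[i]) for i in test]))
    return mean(fold_aucs)

# Hypothetical data; every fold contains both outcome classes.
x = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.5, 0.7, 0.3, 0.6, 0.15, 0.85]
y = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1]
print(cv_auc(x, y, k=3))
```

Each test fold must contain at least one positive and one negative case, otherwise the fold AUC is undefined.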
Article 9: The fayherriot command for estimating small-area indicators
C. Halbmeier, A.-K. Kreutzmann, T. Schmid, and C. Schröder
Abstract: We introduce a command, fayherriot, that implements the Fay–Herriot model (Fay and Herriot, 1979, Journal of the American Statistical Association 74: 269–277), which is a small-area estimation technique (Rao and Molina, 2015, Small Area Estimation), in Stata. The Fay–Herriot model improves the precision of area-level direct estimates using area-level covariates. It belongs to the class of linear mixed models with normally distributed error terms. The fayherriot command encompasses options to a) produce out-of-sample predictions, b) adjust nonpositive random-effects variance estimates, and c) deal with the violation of model assumptions.
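The way area-level covariates improve a direct estimate can be illustrated with the standard shrinkage form of the Fay–Herriot predictor: a weighted average of the direct estimate and a regression-based (synthetic) prediction, with more weight on the direct estimate when its sampling variance is small. The sketch assumes the random-effects variance and the synthetic prediction are already known. The function and argument names are hypothetical and do not reflect the fayherriot command's interface.

```python
def fay_herriot_predictor(direct, synthetic, sampling_var, sigma2_u):
    """Shrink an area's direct estimate toward its regression prediction.

    direct       : survey-based direct estimate for the area
    synthetic    : prediction x'beta from area-level covariates
    sampling_var : sampling variance of the direct estimate
    sigma2_u     : variance of the area random effect (assumed known here)
    """
    gamma = sigma2_u / (sigma2_u + sampling_var)  # weight on direct estimate
    return gamma * direct + (1.0 - gamma) * synthetic

# Hypothetical area: noisy direct estimate 10, model prediction 6.
print(fay_herriot_predictor(10.0, 6.0, sampling_var=4.0, sigma2_u=4.0))
```

With a zero sampling variance the predictor returns the direct estimate unchanged. With a zero random-effects variance it returns the synthetic prediction.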
Article 10: intcount: A command for fitting count-data models from interval data
S. Pudney
Abstract: In this article, I describe a community-contributed command, intcount, that fits one of several regression models for count data observed in interval form. The models available are Poisson, negative binomial, and binomial, and they can be fit in standard or zero-inflated form. I illustrate the command with an application to analysis of data from the UK Understanding Society survey on the demand for healthcare services.
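When a count is only observed as an interval (for example "3 to 5 visits" or "6 or more"), each observation contributes the probability of the whole interval to the likelihood. The following sketch shows that contribution for a Poisson model with a single common mean and no covariates. The names, the data, and the use of None for an open-ended upper bound are assumptions for illustration, not the intcount syntax.

```python
import math

def poisson_pmf(k, mu):
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def interval_prob(lo, hi, mu):
    """P(lo <= Y <= hi) for Y ~ Poisson(mu); hi=None means 'lo or more'."""
    if hi is None:
        return 1.0 - sum(poisson_pmf(k, mu) for k in range(lo))
    return sum(poisson_pmf(k, mu) for k in range(lo, hi + 1))

def interval_loglik(intervals, mu):
    """Log-likelihood of interval-observed counts at mean mu."""
    return sum(math.log(interval_prob(lo, hi, mu)) for lo, hi in intervals)

# Hypothetical survey responses: 0, 1-2, 3-5, and 6 or more.
data = [(0, 0), (1, 2), (1, 2), (3, 5), (6, None)]
# Crude grid search for the maximizing mean, for illustration only.
best_mu = max((m / 10 for m in range(1, 101)),
              key=lambda m: interval_loglik(data, m))
print(best_mu)
```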
Article 11: parallel: A command for parallel computing
G. G. Vega Yon and B. Quistorff
Abstract: The parallel package allows parallel processing of tasks that are not interdependent. This allows all flavors of Stata to take advantage of multiprocessor machines. Even Stata/MP users can benefit because many community-contributed programs are not automatically parallelized but could be under our framework.
Article 12: Estimation of dynamic panel threshold model using Stata
M. H. Seo, S. Kim, and Y.-J. Kim
Abstract: In this article, we develop a command, xthenreg, that implements the first-differenced generalized method of moments estimation of the dynamic panel threshold model that Seo and Shin (2016, Journal of Econometrics 195: 169–186) proposed. Furthermore, we derive the asymptotic variance formula for a kink-constrained generalized method of moments estimator of the dynamic threshold model and provide an estimation algorithm. We also propose a fast bootstrap algorithm to implement the bootstrap for the linearity test. We illustrate the use of xthenreg through a Monte Carlo simulation and an economic application.
Article 13: gidm: A command for generalized inflated discrete models
Y. Xia, Y. Zhou, and T. Cai
Abstract: In this article, we describe the gidm command for fitting generalized inflated discrete models that deal with multiple inflated values in a distribution. Based on the work of Cai, Xia, and Zhou (Forthcoming, Sociological Methods & Research: Generalized inflated discrete models: A strategy to work with multimodal discrete distributions), generalized inflated discrete models are fit via maximum likelihood estimation. Specifically, the gidm command fits Poisson, negative binomial, multinomial, and ordered outcomes with more than one inflated value. We illustrate this command through examples for count and categorical outcomes.
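A distribution with "more than one inflated value" can be pictured as a mixture: extra point masses at the inflated values plus a baseline count distribution for the remaining probability. The sketch below does this for a Poisson baseline with fixed, covariate-free inflation probabilities. The function name, the dictionary layout, and the numbers are hypothetical and are not the gidm syntax or its estimation routine.

```python
import math

def inflated_poisson_pmf(k, mu, inflation):
    """Poisson(mu) mixed with extra point masses.

    inflation maps each inflated value to its extra probability,
    e.g. {0: 0.2, 5: 0.1}; the remaining mass follows Poisson(mu).
    """
    baseline_weight = 1.0 - sum(inflation.values())
    poisson = math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))
    return inflation.get(k, 0.0) + baseline_weight * poisson

# Hypothetical counts with heaping at 0 and 5.
inflation = {0: 0.2, 5: 0.1}
for k in (0, 4, 5, 6):
    print(k, round(inflated_poisson_pmf(k, 3.0, inflation), 4))
```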
Article 14: Speaking Stata: The last day of the month
N. J. Cox
Abstract: I discuss three related problems about getting the last day of the month in a new variable. Commentary ranges from the specifics of date and other functions to some generalities on developing code. Modular arithmetic belongs in every Stata user’s coding toolbox.
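One common way to get the last day of a month is to take the first day of the following month and subtract one day. Modular arithmetic handles the December-to-January rollover. The sketch below shows the idea in Python with a hypothetical function name. It is one possible approach and is not necessarily the solution developed in the article.

```python
from datetime import date, timedelta

def last_day_of_month(d):
    """Last calendar day of the month containing d."""
    # Count months from year 0, then step to the next month.
    next_month_index = d.year * 12 + (d.month - 1) + 1
    # divmod splits the index back into year and zero-based month,
    # so December + 1 becomes January of the following year.
    year, month0 = divmod(next_month_index, 12)
    return date(year, month0 + 1, 1) - timedelta(days=1)

print(last_day_of_month(date(2020, 2, 10)))
print(last_day_of_month(date(2019, 12, 5)))
```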
Article 15: Review of Richard Valliant and Jill A. Dever's Survey Weights: A Step-by-Step Guide to Calculation
S. G. Heeringa
Abstract: In this article, I review the Stata Press publication Survey Weights: A Step-by-Step Guide to Calculation by Valliant and Dever (2018).
Article 16: Review of William Gould's The Mata Book: A Book for Serious Programmers and Those Who Want to Be
B. Jann
Abstract: In this article, I review The Mata Book: A Book for Serious Programmers and Those Who Want to Be, by William Gould (2018, Stata Press).