A spatially-adjusted Bayesian additive regression tree model to merge two datasets

Song Zhang, Ya Chen Tina Shih, Peter Müller

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

Scientific hypotheses of interest often involve variables that are not available in a single survey. This is a common problem for researchers working with survey data. We propose a model-based approach to provide information about the missing variable. We use a spatial extension of the BART (Bayesian additive regression tree) model. The imputation of the missing variables and inference about the relationship between two variables are obtained simultaneously as posterior inference under the proposed model. The uncertainty due to imputation is automatically accounted for. A simulation analysis and an application to data on self-perceived health status and income are presented.

Original language: English (US)
Pages (from-to): 611-634
Number of pages: 24
Journal: Bayesian Analysis
Volume: 2
Issue number: 3
DOI: 10.1214/07-BA224
State: Published - 2007

Fingerprint

Regression Tree
Imputation
Survey Data
Simulation Analysis
Health
Model
Model-based
Uncertainty

Keywords

  • BART
  • CART
  • Missing variables
  • Spatial model
  • Survey

ASJC Scopus subject areas

  • Applied Mathematics
  • Statistics and Probability

Cite this

A spatially-adjusted Bayesian additive regression tree model to merge two datasets. / Zhang, Song; Shih, Ya Chen Tina; Müller, Peter.

In: Bayesian Analysis, Vol. 2, No. 3, 2007, p. 611-634.

@article{4bf9ba03487647ac823b13dd7798875b,
title = "A spatially-adjusted Bayesian additive regression tree model to merge two datasets",
abstract = "Scientific hypotheses of interest often involve variables that are not available in a single survey. This is a common problem for researchers working with survey data. We propose a model-based approach to provide information about the missing variable. We use a spatial extension of the BART (Bayesian additive regression tree) model. The imputation of the missing variables and inference about the relationship between two variables are obtained simultaneously as posterior inference under the proposed model. The uncertainty due to imputation is automatically accounted for. A simulation analysis and an application to data on self-perceived health status and income are presented.",
keywords = "BART, CART, Missing variables, Spatial model, Survey",
author = "Song Zhang and Shih, {Ya Chen Tina} and Peter M{\"u}ller",
year = "2007",
doi = "10.1214/07-BA224",
language = "English (US)",
volume = "2",
pages = "611--634",
journal = "Bayesian Analysis",
issn = "1936-0975",
publisher = "Carnegie Mellon University",
number = "3",

}

TY  - JOUR
T1  - A spatially-adjusted Bayesian additive regression tree model to merge two datasets
AU  - Zhang, Song
AU  - Shih, Ya Chen Tina
AU  - Müller, Peter
PY  - 2007
Y1  - 2007
AB  - Scientific hypotheses of interest often involve variables that are not available in a single survey. This is a common problem for researchers working with survey data. We propose a model-based approach to provide information about the missing variable. We use a spatial extension of the BART (Bayesian additive regression tree) model. The imputation of the missing variables and inference about the relationship between two variables are obtained simultaneously as posterior inference under the proposed model. The uncertainty due to imputation is automatically accounted for. A simulation analysis and an application to data on self-perceived health status and income are presented.
KW  - BART
KW  - CART
KW  - Missing variables
KW  - Spatial model
KW  - Survey
UR  - http://www.scopus.com/inward/record.url?scp=77949906680&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=77949906680&partnerID=8YFLogxK
U2  - 10.1214/07-BA224
DO  - 10.1214/07-BA224
M3  - Article
AN  - SCOPUS:77949906680
VL  - 2
SP  - 611
EP  - 634
JO  - Bayesian Analysis
JF  - Bayesian Analysis
SN  - 1936-0975
IS  - 3
ER  -