Title: | Tools for Handling Indices and Proportions in Small Area Estimation |
Version: | 1.0.3 |
Description: | It allows for mapping proportions and indicators defined on the unit interval. It implements Beta-based small area methods comprising the classical Beta regression models, the Flexible Beta model and Zero and/or One Inflated extensions (Janicki 2020 <doi:10.1080/03610926.2019.1570266>). Such methods, developed within a Bayesian framework through Stan https://mc-stan.org/, come equipped with a set of diagnostics and complementary tools, visualizing and exporting functions. A Shiny application with a user-friendly interface can be launched to further simplify the process. For further details, refer to De Nicolò and Gardini (2024 <doi:10.18637/jss.v108.i01>). |
License: | GPL-3 |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
VignetteBuilder: | R.rsp |
RdMacros: | Rdpack |
Biarch: | true |
NeedsCompilation: | yes |
Depends: | R (≥ 3.5.0), shiny (≥ 1.0.3) |
Imports: | methods, Rcpp (≥ 0.12.0), rstan (≥ 2.26.0), ggplot2 (≥ 3.3.2), nlme (≥ 3.1.152), stats, sp, ggpubr, Rdpack |
Suggests: | rstantools (≥ 2.1.1), RcppParallel (≥ 5.0.1), callr, sf, dplyr, leaflet, tmap, spam, spdep, gridExtra (≥ 2.3), R.rsp, shinythemes, shinyFeedback, shinybusy, shinyWidgets, shinyjs, bayesplot, DT (≥ 0.32.1), loo (≥ 2.3.1) |
LinkingTo: | BH (≥ 1.66.0), Rcpp (≥ 0.12.0), RcppEigen (≥ 0.3.3.3.0), RcppParallel (≥ 5.0.1), rstan (≥ 2.26.0), StanHeaders (≥ 2.26.0) |
SystemRequirements: | GNU make |
Packaged: | 2024-09-13 14:15:33 UTC; aldo.gardini2 |
Author: | Silvia De Nicolò |
Maintainer: | Silvia De Nicolò <silvia.denicolo@unibo.it> |
Repository: | CRAN |
Date/Publication: | 2024-09-13 14:40:02 UTC |
The 'tipsae' Package.
Description
It provides tools for mapping proportions and indicators defined on the unit interval, widely used to measure, for instance, unemployment, educational attainment and also disease prevalence. It implements Beta-based small area methods, particularly indicated for unit interval responses, comprising the classical Beta regression models, the Flexible Beta model and Zero and/or One Inflated extensions. Such methods, developed within a Bayesian framework, come equipped with a set of diagnostics and complementary tools, visualizing and exporting functions. A customized parallel computing is built-in to reduce the computational time. The features of the tipsae package assist the user in carrying out a complete SAE analysis through the entire process of estimation, validation and results presentation, making the application of Bayesian algorithms and complex SAE methods straightforward. A Shiny application with a user-friendly interface can be launched to further simplify the process.
Author(s)
Silvia De Nicolò, silvia.denicolo@unibo.it
Aldo Gardini, aldo.gardini@unibo.it
References
De Nicolò S, Gardini A (2024). “The R Package tipsae: Tools for Mapping Proportions and Indicators on the Unit Interval.” Journal of Statistical Software, 108(1), 1–36. doi:10.18637/jss.v108.i01.
Stan Development Team (2020). “RStan: the R interface to Stan.” R package version 2.21.2, https://mc-stan.org/.
Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M, Brubaker M, Guo J, Li P, Riddell A (2017). “Stan: A probabilistic programming language.” Journal of Statistical Software, 76(1), 1–32.
Janicki R (2020). “Properties of the beta regression model for small area estimation of proportions and application to estimation of poverty rates.” Communications in Statistics-Theory and Methods, 49(9), 2264–2284.
Vehtari A, Gelman A, Gabry J (2017). “Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC.” Statistics and Computing, 27(5), 1413–1432.
Datta GS, Ghosh M, Steorts R, Maples J (2011). “Bayesian benchmarking with applications to small area estimation.” Test, 20(3), 574–588.
Kish L (1992). “Weighting for Unequal Pi.” Journal of Official Statistics, 8(2), 183.
Fabrizi E, Ferrante MR, Pacei S, Trivisano C (2011). “Hierarchical Bayes multivariate estimation of poverty rates based on increasing thresholds for small domains.” Computational Statistics & Data Analysis, 55(4), 1736–1747.
Morris M, Wheeler-Martin K, Simpson D, Mooney SJ, Gelman A, DiMaggio C (2019). “Bayesian hierarchical spatial models: Implementing the Besag York Mollié model in stan.” Spatial and Spatio-Temporal Epidemiology, 31, 100301.
De Nicolò S, Ferrante MR, Pacei S (2023). “Small area estimation of inequality measures using mixtures of Beta.” https://doi.org/10.1093/jrsssa/qnad083.
Chang W, Cheng J, Allaire JJ, Sievert C, Schloerke B, Xie Y, Allen J, McPherson J, Dipert A, Borges B (2021). “shiny: Web Application Framework for R.” R package version 1.6.0, https://CRAN.R-project.org/package=shiny.
Benchmarking Procedure for Model-Based Estimates
Description
The benchmark()
function gives the chance to perform a benchmarking procedure on model-based estimates. Benchmarking could target solely the point estimates (single benchmarking) or, alternatively, also the ensemble variability (double benchmarking). Furthermore, an estimate of the overall posterior risk is provided, aggregated for all areas. This value is only yielded when in-sample areas are treated and a single benchmarking is performed.
Usage
benchmark(
x,
bench,
share,
method = c("raking", "ratio", "double"),
H = NULL,
time = NULL,
areas = NULL
)
Arguments
x |
Object of class |
bench |
A numeric value denoting the benchmark for the whole set of areas or a subset of areas. |
share |
A numeric vector of areas weights, in case of proportions it denotes the population shares. |
method |
The method to be specified among |
H |
A numeric value denoting an additional benchmark, to be specified when the |
time |
A character string indicating the time period to be considered, in case of temporal models, where a benchmark can be specified only for one time period at a time. |
areas |
If |
Details
The function allows performing three different benchmarking methods, according to the argument method.
The
"ratio"
and"raking"
methods provide benchmarked estimates that minimize the posterior expectation of the weighted squared error loss, see Datta et al. (2011) andtipsae
vignette.The
"double"
method accounts for a further benchmark on the weighted ensemble variability, whereH
is a prespecified value of the estimators variability.
Value
A benchmark_fitsae
object being a list of the following elements:
bench_est
A vector including the benchmarked estimates for each considered domain.
post_risk
A numeric value indicating an estimate of the overall posterior risk, aggregated for all areas. This value is only yielded when in-sample areas are treated and a single benchmarking is performed.
method
The benchmarking method performed as selected in the input argument.
time
The time considered as selected in the input argument.
areas
The areas considered as selected in the input argument.
data_obj
A list containing input objects including in-sample and out-of-sample relevant quantities.
model_settings
A list summarizing all the assumptions of the model: sampling likelihood, presence of intercept, dispersion parametrization, random effects priors and possible structures.
model_estimates
Posterior summaries of target parameters for in-sample areas.
model_estimates_oos
Posterior summaries of target parameters for out-of-sample areas.
is_oos
Logical vector defining whether each domain is out-of-sample or not.
direct_est
Vector of direct estimates for in-sample areas.
References
Datta GS, Ghosh M, Steorts R, Maples J (2011). “Bayesian benchmarking with applications to small area estimation.” Test, 20(3), 574–588.
De Nicolò S, Gardini A (2024). “The R Package tipsae: Tools for Mapping Proportions and Indicators on the Unit Interval.” Journal of Statistical Software, 108(1), 1–36. doi:10.18637/jss.v108.i01.
See Also
summary.fitsae
to produce the input object.
Examples
library(tipsae)
# loading toy dataset
data("emilia_cs")
# fitting a model
fit_beta <- fit_sae(formula_fixed = hcr ~ x, data = emilia_cs, domains = "id",
type_disp = "var", disp_direct = "vars", domain_size = "n",
# MCMC setting to obtain a fast example. Remove next line for reliable results.
chains = 1, iter = 150, seed = 0)
# check model diagnostics
summ_beta <- summary(fit_beta)
# creating a subset of the areas whose estimates have to be benchmarked
subset <- c("RIMINI", "RICCIONE", "RUBICONE", "CESENA - VALLE DEL SAVIO")
# creating population shares of the subset areas
pop <- emilia_cs$pop[emilia_cs$id %in% subset]
shares_subset <- pop / sum(pop)
# perform benchmarking procedure
bmk_subset <- benchmark(x = summ_beta,
bench = 0.13,
share = shares_subset,
method = "raking",
areas = subset)
# check benchmarked estimates and posterior risk
bmk_subset$bench_est
bmk_subset$post_risk
Density Plot Function for a summary_fitsae
Object
Description
The method density()
provides, in a grid (default) or sequence, the density plot of direct estimates versus HB model estimates and the density plot of standardized posterior means of the random effects versus standard normal.
Usage
## S3 method for class 'summary_fitsae'
density(x, grid = TRUE, ...)
Arguments
x |
Object of class |
grid |
Logical indicating whether plots are displayed in a grid ( |
... |
Currently unused. |
Value
Two ggplot2
objects in a grid or in sequence.
See Also
summary.fitsae
to produce the input object.
Examples
library(tipsae)
# loading toy dataset
data("emilia_cs")
# fitting a model
fit_beta <- fit_sae(formula_fixed = hcr ~ x, data = emilia_cs, domains = "id",
type_disp = "var", disp_direct = "vars", domain_size = "n",
# MCMC setting to obtain a fast example. Remove next line for reliable results.
chains = 1, iter = 150, seed = 0)
# check model diagnostics
summ_beta <- summary(fit_beta)
# visualize estimates and random effect densities via density() function
density(summ_beta)
Poverty in Emilia-Romagna (Italy) Health Districts
Description
The emilia
dataset consists of a panel on poverty mapping concerning 38 health districts within the Emilia-Romagna region, located in North-East of Italy, with annual observations recorded from 2014 to 2018.
Usage
emilia
Format
Dataframe with 190 observations and 8 variables.
id
Character, name of the health district.
prov
Character, name of NUTS-3 region related to the district.
year
Numeric, year of the observation.
hcr
Numeric, head-count ratio estimate (used as response variable).
vars
Numeric, sampling variance of head-count ratio estimator.
n
Numeric, area sample size.
x
Numeric, fake covariate.
pop
Numeric, population size of the area.
Details
It has been built starting from model-based estimates and related CV freely available on Emilia-Romagna region website. Since it is used for illustrative purposes only, such estimates are assumed to be unreliable direct estimates, requiring a SAE procedure.
Examples
library(tipsae)
data("emilia")
Poverty in Emilia-Romagna (Italy) Health Districts in 2016
Description
The emilia
dataset consists of a dataset on poverty mapping concerning 38 health districts within the Emilia-Romagna region, located in North-East of Italy, with observations recorded in 2016.
Usage
emilia_cs
Format
Dataframe with 38 area observations and 8 variables.
id
Character, name of the health district.
prov
Character, name of NUTS-3 region related to the district.
year
Numeric, year of the observation.
hcr
Numeric, head-count ratio estimate (used as response variable).
vars
Numeric, sampling variance of head-count ratio estimator.
n
Numeric, area sample size.
x
Numeric, fake covariate.
pop
Numeric, population size of the area.
Details
It has been built starting from model-based estimates and related CV freely available on Emilia-Romagna region website. Since it is used for illustrative purposes only, such estimates are assumed to be unreliable direct estimates, requiring a SAE procedure.
See Also
emilia
for the panel dataset including observation from 2014 to 2018.
Examples
library(tipsae)
data("emilia_cs")
Shapefile of Emilia-Romagna (Italy) Health Districts
Description
The emilia_shp
shapefile consists of a SpatialPolygonsDataFrame
object of 38 health districts within the Emilia-Romagna region, located in the North-East of Italy.
Usage
emilia_shp
Format
A shapefile of class SpatialPolygonsDataFrame
.
COD_DIS_SA
Code of the health district.
NAME_DISTRICT
Name of the health district. It can be linked to the variable
id
inemilia
andemilia_cs
See Also
emilia
and emilia_cs
for the provided datasets.
Examples
library(tipsae)
library(sp)
data("emilia_shp")
Exporting Results of a Small Area Model Fitting
Description
The function export()
allows for exporting model estimates in CSV format.
Usage
export(x, file, type = "all", ...)
Arguments
x |
An object of class |
file |
A character string indicating the path (if different from the working directory) and filename of the CSV to be created. It should end with .csv. |
type |
An option between |
... |
Additional arguments of |
Value
A CSV file is created in the working directory, or at the given path, exporting the estimates_fitsae
object given as input.
See Also
extract
to produce the input object and write.csv
.
Examples
## Not run:
library(tipsae)
# loading toy dataset
data("emilia_cs")
# fitting a model
fit_beta <- fit_sae(formula_fixed = hcr ~ x, data = emilia_cs, domains = "id",
type_disp = "var", disp_direct = "vars", domain_size = "n",
# MCMC setting to obtain a fast example. Remove next line for reliable results.
chains = 1, iter = 150, seed = 0)
# check model diagnostics
summ_beta <- summary(fit_beta)
# extract model estimates
HB_estimates <- extract(summ_beta)
# export model estimates
export(HB_estimates, file = "results.csv", type = "all")
## End(Not run)
Extract Posterior Summaries of Target Parameters
Description
The extract()
function provides the posterior summaries of target parameters, including model-based estimates, and possibly benchmarked estimates, related to a fitted small area model.
Usage
extract(x)
Arguments
x |
An object of class |
Value
An object of class estimates_fitsae
, being a list of two data frames, distinguishing between $in_sample
and $out_of_sample
areas, which gathers domains name, direct and HB estimates, as well as posterior summaries of target parameters. When the input is a benchmark_fitsae
object, benchmarked estimates are also included.
See Also
summary.fitsae
and benchmark
to produce the input object.
Examples
library(tipsae)
# loading toy dataset
data("emilia_cs")
# fitting a model
fit_beta <- fit_sae(formula_fixed = hcr ~ x, data = emilia_cs, domains = "id",
type_disp = "var", disp_direct = "vars", domain_size = "n",
# MCMC setting to obtain a fast example. Remove next line for reliable results.
chains = 1, iter = 150, seed = 0)
# check model diagnostics
summ_beta <- summary(fit_beta)
# extract model estimates
HB_estimates <- extract(summ_beta)
head(HB_estimates)
Fitting a Small Area Model
Description
fit_sae()
is used to fit Beta-based small area models, such as the classical Beta, zero and/or one inflated Beta and Flexible Beta models. The random effect part can incorporate either a temporal and/or a spatial dependency structure devoted to the prior specification settings. In addition, different prior assumptions can be specified for the unstructured random effects, allowing for robust and shrinking priors and different parametrizations can be set up.
Usage
fit_sae(
formula_fixed,
data,
domains = NULL,
disp_direct,
type_disp = c("neff", "var"),
domain_size = NULL,
household_size = NULL,
likelihood = c("beta", "flexbeta", "Infbeta0", "Infbeta1", "Infbeta01", "ExtBeta"),
prior_coeff = c("normal", "HorseShoe"),
p0_HorseShoe = NULL,
prior_reff = c("normal", "t", "VG"),
spatial_error = FALSE,
spatial_df = NULL,
domains_spatial_df = NULL,
temporal_error = FALSE,
temporal_variable = NULL,
scale_prior = list(Unstructured = 2.5, Spatial = 2.5, Temporal = 2.5, Coeff. = 2.5),
adapt_delta = 0.95,
max_treedepth = 10,
init = "0",
...
)
Arguments
formula_fixed |
An object of class |
data |
An object of class |
domains |
Data column name displaying the domain names. If |
disp_direct |
Data column name displaying given values of sampling dispersion for each domain. In out-of-sample areas, dispersion must be |
type_disp |
Parametrization of the dispersion parameter. The choices are variance ( |
domain_size |
Data column name indicating domain sizes (optional). In out-of-sample areas, sizes must be |
household_size |
Data column name indicating the number of sampled households. Required for the |
likelihood |
Sampling likelihood to be used. The choices are |
prior_coeff |
Prior distribution of the regression coefficients. The choices are |
p0_HorseShoe |
If |
prior_reff |
Prior distribution of the unstructured random effect. The choices are: |
spatial_error |
Logical indicating whether to include a spatially structured random effect. |
spatial_df |
Object of class |
domains_spatial_df |
Column name of the |
temporal_error |
Logical indicating whether to include a temporally structured random effect. |
temporal_variable |
Data column name indicating temporal variable. Required if |
scale_prior |
List with the values of the prior scales. 4 named elements must be provided: "Unstructured", "Spatial", "Temporal", "Coeff.". Default: all equal to 2.5. |
adapt_delta |
HMC option: target average proposal acceptance probability. See |
max_treedepth |
HMC option: target average proposal acceptance probability. See |
init |
Initial values specification. See the detailed documentation for
the init argument in |
... |
Arguments passed to |
Value
A list of class fitsae
containing the following objects:
model_settings
A list summarizing all the assumptions of the model: sampling likelihood, presence of intercept, dispersion parametrization, random effects priors and possible structures.
data_obj
A list containing input objects including in-sample and out-of-sample relevant quantities.
stanfit
A
stanfit
object, outcome ofsampling
function containing full posterior draws. For details, seestan
documentation.pars_interest
A vector containing the names of parameters whose posterior samples are stored.
call
Image of the function call that produced the
fitsae
object.
References
Janicki R (2020). “Properties of the beta regression model for small area estimation of proportions and application to estimation of poverty rates.” Communications in Statistics-Theory and Methods, 49(9), 2264–2284.
Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M, Brubaker M, Guo J, Li P, Riddell A (2017). “Stan: A probabilistic programming language.” Journal of Statistical Software, 76(1), 1–32.
Morris M, Wheeler-Martin K, Simpson D, Mooney SJ, Gelman A, DiMaggio C (2019). “Bayesian hierarchical spatial models: Implementing the Besag York Mollié model in stan.” Spatial and Spatio-Temporal Epidemiology, 31, 100301.
De Nicolò S, Ferrante MR, Pacei S (2023). “Small area estimation of inequality measures using mixtures of Beta.” https://doi.org/10.1093/jrsssa/qnad083.
De Nicolò S, Gardini A (2024). “The R Package tipsae: Tools for Mapping Proportions and Indicators on the Unit Interval.” Journal of Statistical Software, 108(1), 1–36. doi:10.18637/jss.v108.i01.
See Also
sampling
for sampler options and summary.fitsae
for handling the output.
Examples
library(tipsae)
# loading toy cross sectional dataset
data("emilia_cs")
# fitting a cross sectional model
fit_beta <- fit_sae(formula_fixed = hcr ~ x, data = emilia_cs, domains = "id",
type_disp = "var", disp_direct = "vars", domain_size = "n",
# MCMC setting to obtain a fast example. Remove next line for reliable results.
chains = 1, iter = 150, seed = 0)
# Spatio-temporal model: it might require time to be fitted
## Not run:
# loading toy panel dataset
data("emilia")
# loading the shapefile of the concerned areas
data("emilia_shp")
# fitting a spatio-temporal model
fit_ST <- fit_sae(formula_fixed = hcr ~ x,
domains = "id",
disp_direct = "vars",
type_disp = "var",
domain_size = "n",
data = emilia,
spatial_error = TRUE,
spatial_df = emilia_shp,
domains_spatial_df = "NAME_DISTRICT",
temporal_error = TRUE,
temporal_variable = "year",
max_treedepth = 15,
seed = 0)
## End(Not run)
Map Relevant Quantities from a Small Area Model
Description
The map()
function enables to plot maps containing relevant model outputs by accounting for their geographical dimension. The shapefile of the area must be provided via a SpatialPolygonsDataFrame
or sf
object.
Usage
map(
x,
spatial_df,
spatial_id_domains,
match_names = NULL,
color_palette = c("snow2", "deepskyblue4"),
quantity = c("HB_est", "Direct_est", "SD"),
time = NULL,
style = "quantile",
...
)
Arguments
x |
An object of class |
spatial_df |
A object of class |
spatial_id_domains |
A character string indicating the name of |
match_names |
An encoding two-columns |
color_palette |
A vector with two color strings denoting the extreme bounds of colors range to be used. |
quantity |
A string indicating the quantity to be mapped. When a |
time |
A string indicating the year of interest for the quantities to be treated, in case of temporal or spatio-temporal objects. |
style |
Method to process the color scale, see |
... |
Arguments passed to |
Value
Atmap
object.
See Also
summary.fitsae
to produce the input object and SpatialPolygonsDataFrame
to manage the shapefile.
Examples
## Not run:
library(tipsae)
# loading toy dataset
data("emilia_cs")
# fitting a model
fit_beta <- fit_sae(formula_fixed = hcr ~ x, data = emilia_cs, domains = "id",
type_disp = "var", disp_direct = "vars", domain_size = "n",
# MCMC setting to obtain a fast example. Remove next line for reliable results.
chains = 1, iter = 150, seed = 0)
# check model diagnostics
summ_beta <- summary(fit_beta)
# load shapefile of concerned areas
data("emilia_shp")
# plot the map using model diagnostics and areas shapefile
map(x = summ_beta,
spatial_df = emilia_shp,
spatial_id_domains = "NAME_DISTRICT")
## End(Not run)
Plot Method for benchmark_fitsae
Object
Description
The method plot()
provides the boxplots of original and benchmarked estimates in comparison with the benchmark value. Note that share weights are not considered.
Usage
## S3 method for class 'benchmark_fitsae'
plot(x, ...)
Arguments
x |
A |
... |
Currently unused. |
Value
A ggplot2
object.
See Also
benchmark
to produce the input object.
Examples
library(tipsae)
# loading toy dataset
data("emilia_cs")
# fitting a model
fit_beta <- fit_sae(formula_fixed = hcr ~ x, data = emilia_cs, domains = "id",
type_disp = "var", disp_direct = "vars", domain_size = "n",
# MCMC setting to obtain a fast example. Remove next line for reliable results.
chains = 1, iter = 150, seed = 0)
# check model diagnostics
summ_beta <- summary(fit_beta)
# creating a subset of the areas whose estimates have to be benchmarked
subset <- c("RIMINI", "RICCIONE", "RUBICONE", "CESENA - VALLE DEL SAVIO")
# creating population shares of the subset areas
pop <- emilia_cs$pop[emilia_cs$id %in% subset]
shares_subset <- pop / sum(pop)
# perform benchmarking procedure
bmk_subset <- benchmark(x = summ_beta,
bench = 0.13,
share = shares_subset,
method = "raking",
areas = subset)
plot(bmk_subset)
Plot Method for smoothing_fitsae
Object
Description
The plot()
method provides (a) the boxplot of variance estimates, when effective sample sizes are estimated through kish
method; (b) a scatterplot of both original and smoothed estimates versus the area sample sizes, when variance smoothing is performed through methods ols
and gls
.
Usage
## S3 method for class 'smoothing_fitsae'
plot(x, size = 2.5, alpha = 0.8, ...)
Arguments
x |
A |
size |
Aesthetic option denoting the size of scatterplots points, see |
alpha |
Aesthetic option denoting the opacity of scatterplots points, see |
... |
Currently unused. |
Value
A ggplot2
object.
See Also
smoothing
to produce the input object.
Examples
library(tipsae)
# loading toy dataset
data("emilia_cs")
# perform smoothing procedure
smoo <- smoothing(emilia_cs, direct_estimates = "hcr", area_id = "id",
raw_variance = "vars", areas_sample_sizes = "n",
var_function = NULL, method = "ols")
plot(smoo)
Plot Method for a summary_fitsae
Object
Description
The generic method plot()
provides, in a grid (default) or sequence, (a) a scatterplot of direct estimates versus model-based estimates, visually capturing the shrinking process, (b) a Bayesian P-values histogram, (c) a boxplot of standard deviation reduction values, and, if areas sample sizes are provided as input in fit_sae()
, (d) a scatterplot of model residuals versus sample sizes, in order to check for design-consistency i.e., as long as sizes increase residuals should converge to zero.
Usage
## S3 method for class 'summary_fitsae'
plot(
x,
size = 2.5,
alpha = 0.8,
n_bins = 15,
grid = TRUE,
label_names = NULL,
...
)
Arguments
x |
Object of class |
size |
Aesthetic option denoting the size of scatterplots points, see |
alpha |
Aesthetic option denoting the opacity of scatterplots points, see |
n_bins |
Denoting the number of bins used for histogram. |
grid |
Logical indicating whether plots are displayed in a grid ( |
label_names |
Character string indicating the model name to display in boxplot x-axis label. |
... |
Currently unused. |
Value
Four ggplot2
objects in a grid.
See Also
summary.fitsae
to produce the input object.
Examples
library(tipsae)
# loading toy dataset
data("emilia_cs")
# fitting a model
fit_beta <- fit_sae(formula_fixed = hcr ~ x, data = emilia_cs, domains = "id",
type_disp = "var", disp_direct = "vars", domain_size = "n",
# MCMC setting to obtain a fast example. Remove next line for reliable results.
chains = 1, iter = 150, seed = 0)
# check model diagnostics
summ_beta <- summary(fit_beta)
# visualize diagnostics via plot() method
plot(summ_beta)
Print Method for a benchmark_fitsae
Object
Description
The generic method print()
allow to explore relevant outputs of the input object
Usage
## S3 method for class 'benchmark_fitsae'
print(x, digits = 3L, ...)
Arguments
x |
Object of class |
digits |
Number of digits to display. |
... |
Currently unused. |
Value
Printed information on a benchmark_fitsae
object.
Print Method for a estimates_fitsae
Object
Description
The generic method print()
allow to explore relevant outputs of the input object
Usage
## S3 method for class 'estimates_fitsae'
print(x, digits = 3L, ...)
Arguments
x |
Object of class |
digits |
Number of digits to display. |
... |
Currently unused. |
Value
Printed information on a estimates_fitsae
object.
Print Method for a fitsae
Object
Description
The generic method print()
allow to explore relevant outputs of the input object
Usage
## S3 method for class 'fitsae'
print(x, ...)
Arguments
x |
Object of class |
... |
Currently unused. |
Value
Printed information on a fitsae
object.
Print Method for a smoothing_fitsae
Object
Description
The generic method print()
allow to explore relevant outputs of the input object
Usage
## S3 method for class 'smoothing_fitsae'
print(x, digits = 3L, ...)
Arguments
x |
Object of class |
digits |
Number of digits to display. |
... |
Currently unused. |
Value
Printed information on a smoothing_fitsae
object.
Print Method for a summary_fitsae
Object
Description
The generic method print()
allow to explore relevant outputs of the input object
Usage
## S3 method for class 'summary_fitsae'
print(x, digits = 3L, ...)
Arguments
x |
Object of class |
digits |
Number of digits to display. |
... |
Currently unused. |
Value
Printed information on a summary_fitsae
object.
Lauch Shiny App to Performs Small Area Estimation
Description
The command launches a Shiny application that assists the user from the data loading step to the export of the outputs. See the vignette for further details.
Usage
runShiny_tipsae()
Value
No value returned.
Examples
library(tipsae)
# Starting the Shiny application
if(interactive()){
runShiny_tipsae()
}
Variance Smoothing and Effective Sample Sizes Estimation
Description
The smoothing()
function implements three methods, all yielding refined estimates of either variance or effective sample size, to account for indicators with different variance functions. The output estimates are ready to be used as known parameters in an area-level model, and they need to be added to the analysed data.frame
object. All the implemented methods enable the estimation of the effective sample sizes, whereas "ols"
and "gls"
also perform a variance smoothing procedure.
Usage
smoothing(
data,
direct_estimates,
area_id = NULL,
raw_variance = NULL,
areas_sample_sizes = NULL,
additional_covariates = NULL,
method = c("ols", "gls", "kish"),
var_function = NULL,
survey_data = NULL,
survey_area_id = NULL,
weights = NULL,
sizes = NULL
)
Arguments
data |
A |
direct_estimates |
Character string specifying the variable in |
area_id |
Character string indicating the variable with domain names included in |
raw_variance |
Character string indicating the variable name for raw variance estimates included in |
areas_sample_sizes |
Character string indicating the variable name for domain sample sizes included in |
additional_covariates |
A vector of character strings indicating the variable names of possible additional covariates, included in |
method |
The method to be used. The choices are |
var_function |
An object of class |
survey_data |
An additional dataset to be specified when method |
survey_area_id |
Character string indicating the variable denoting the domain names included in the |
weights |
Character string indicating the variable including sampling weights in |
sizes |
Character string indicating the variable including unit sizes in |
Value
An object of class smoothing_fitsae
, being a list of vectors including dispersion estimates: the variances and, when no alternative variance functions are specified, the effective sample sizes. When "ols"
or "gls"
method has been selected, the list incorporates also an object of class gls
from nlme
package.
References
Kish L (1992). “Weighting for Unequal Pi.” Journal of Official Statistics, 8(2), 183.
Fabrizi E, Ferrante MR, Pacei S, Trivisano C (2011). “Hierarchical Bayes multivariate estimation of poverty rates based on increasing thresholds for small domains.” Computational Statistics & Data Analysis, 55(4), 1736–1747.
De Nicolò S, Gardini A (2024). “The R Package tipsae: Tools for Mapping Proportions and Indicators on the Unit Interval.” Journal of Statistical Software, 108(1), 1–36. doi:10.18637/jss.v108.i01.
See Also
gls
for details on estimation procedure for "ols"
and "gls"
methods.
Examples
library(tipsae)
# loading toy dataset
data("emilia_cs")
# perform smoothing procedure
smoo <- smoothing(emilia_cs, direct_estimates = "hcr", area_id = "id",
raw_variance = "vars", areas_sample_sizes = "n",
var_function = NULL, method = "ols")
Summary Method for fitsae
Objects
Description
Summarizing the small area model fitting through the distributions of estimated parameters and derived diagnostics using posterior draws.
Usage
## S3 method for class 'fitsae'
summary(
object,
probs = c(0.025, 0.25, 0.5, 0.75, 0.975),
compute_loo = TRUE,
...
)
Arguments
object |
An instance of class |
probs |
A numeric vector of |
compute_loo |
Logical, indicating whether to compute |
... |
Currently unused. |
Details
If printed, the produced summary displays:
Posterior summaries about the fixed effect coefficients and the scale parameters related to unstructured and possible structured random effects.
Model diagnostics summaries of (a) model residuals; (b) standard deviation reductions; (c) Bayesian P-values obtained with the MCMC samples.
Shrinking Bound Rate.
-
loo
information criteria and related diagnostics from theloo
package.
Value
A list of class summary_fitsae
containing diagnostics objects:
raneff
A list of
data.frame
objects storing the random effects posterior summaries divided for each type:$unstructured
,$temporal
, and$spatial
.fixed_coeff
Posterior summaries of fixed coefficients.
var_comp
Posterior summaries of model variance parameters.
model_estimates
Posterior summaries of the parameter of interest
\theta_d
for each in-sample domaind
.model_estimates_oos
Posterior summaries of the parameter of interest
\theta_d
for each out-of-sample domaind
.is_oos
Logical vector defining whether each domain is out-of-sample or not.
direct_est
Vector of input direct estimates.
post_means
Model-based estimates, i.e. posterior means of the parameter of interest
\theta_d
for each domaind
.sd_reduction
Standard deviation reduction, see details section.
sd_dir
Standard deviation of direct estimates, given as input if
type_disp="var"
.loo
The object of class
loo
, for details seeloo
package documentation.shrink_rate
Shrinking Bound Rate, see details section.
residuals
Residuals related to model-based estimates.
bayes_pvalues
Bayesian p-values obtained via MCMC samples, see details section.
y_rep
An array with values generated from the posterior predictive distribution, enabling the implementation of posterior predictive checks.
diag_summ
Summaries of residuals, standard deviation reduction and Bayesian p-values across the whole domain set.
data_obj
A list containing input objects including in-sample and out-of-sample relevant quantities.
model_settings
A list summarizing all the assumptions of the input model: sampling likelihood, presence of intercept, dispersion parametrization, random effects priors and possible structures.
call
Image of the function call that produced the input
fitsae
object.
References
Janicki R (2020). “Properties of the beta regression model for small area estimation of proportions and application to estimation of poverty rates.” Communications in Statistics-Theory and Methods, 49(9), 2264–2284.
Vehtari A, Gelman A, Gabry J (2017). “Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC.” Statistics and Computing, 27(5), 1413–1432.
De Nicolò S, Gardini A (2024). “The R Package tipsae: Tools for Mapping Proportions and Indicators on the Unit Interval.” Journal of Statistical Software, 108(1), 1–36. doi:10.18637/jss.v108.i01.
See Also
fit_sae
to estimate the model and the generic methods plot.summary_fitsae
and density.summary_fitsae
, and functions map
, benchmark
and extract
.
Examples
library(tipsae)
# loading toy dataset
data("emilia_cs")
# fitting a model
fit_beta <- fit_sae(formula_fixed = hcr ~ x, data = emilia_cs, domains = "id",
type_disp = "var", disp_direct = "vars", domain_size = "n",
# MCMC setting to obtain a fast example. Remove next line for reliable results.
chains = 1, iter = 150, seed = 0)
# check model diagnostics via summary() method
summ_beta <- summary(fit_beta)
summ_beta