One-step, split-sample estimator for E[Y(t)], E[Y(t)|R=1], and E[Y(t)|R=0]

Estimates study-specific and overall outcome means (and difference from baseline) using cross-fitting with single index models (SIMs) and nuisance models for treatment and outcome missingness (mgcv::gam). The function also computes influence-function-based variances, confidence intervals, and optional truncated influence-function diagnostics.

est_psi(
  Y,
  M,
  R,
  X,
  t,
  trt,
  gamma,
  fold,
  seed,
  IF_output,
  simple_trunc,
  quant,
  kernel,
  single_index_method,
  method = "optim",
  use_mave = TRUE,
  s_t_y = NULL,
  coef_g.fit = NULL,
  coef_t_R0.fit = NULL,
  coef_t_R1.fit = NULL,
  coef_M_R0.fit = NULL,
  coef_M_R1.fit = NULL
)

Arguments

Y

Numeric outcome vector. Missing values are internally replaced with 0 prior to model fitting.

M

Binary indicator for observed outcome (1 = observed, 0 = missing).

R

Binary randomization consent indicator (1 for RCT, 0 for PPS).

X

Data frame or matrix of baseline covariates.

t

Treatment assignment vector.

trt

Treatment level for which the target estimand is computed.

gamma

Numeric vector of sensitivity parameters.

fold

Number of cross-fitting folds.

seed

Optional integer random seed for fold assignment. Use NULL to leave RNG state unchanged.

IF_output

Logical; if TRUE, include influence-function vectors in the returned list.

simple_trunc

Logical; if TRUE, apply quantile truncation to inverse probability weights. If FALSE, apply IF truncation diagnostics.

quant

Numeric in (0, 1) used as the upper quantile for simple weight truncation when simple_trunc = TRUE.

kernel

Character string; kernel used for SIMs. Use K2_Biweight for the Epanechnikov kernel or dnorm for the Gaussian kernel.

single_index_method

Character string selecting one of three SIM implementations: fixed_bandwidth fixes the bandwidth at 1, fixed_coef fixes the first coefficient at 1, and norm1coef constrains the coefficient vector to have norm 1.

method

Character string; optimization method used for SIMs. Choices are optim, nlminb, and nmk. Note that method is set to optim if single_index_method = norm1coef.

use_mave

Logical; if TRUE, use the Minimum Average Variance Estimation (MAVE) method to obtain initial coefficient values for SIMs. If FALSE, use sliced inverse regression. Default is TRUE.

s_t_y

A function of Y in the exponential tilting model. If NULL, s_t_y is set to pnorm((y-60)/25).

coef_g.fit

Optional starting values for a treatment model; currently retained for interface compatibility.

coef_t_R0.fit

Optional starting coefficients for the treatment model fit in the t = trt, R = 0 stratum.

coef_t_R1.fit

Optional starting coefficients for the treatment model fit in the t = trt, R = 1 stratum.

coef_M_R0.fit

Optional starting coefficients for the missingness model fit in the R = 0 stratum.

coef_M_R1.fit

Optional starting coefficients for the missingness model fit in the R = 1 stratum.
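
As a rough illustration of what simple_trunc = TRUE and quant describe, the sketch below caps inverse probability weights at their upper quant quantile. The names truncate_weights, p_hat, and w are hypothetical and not part of est_psi, and the data are simulated; the function's internal truncation may differ in detail.

```r
# Hypothetical helper (not part of est_psi): cap weights at their
# upper `quant` quantile so that a few extreme weights cannot dominate.
truncate_weights <- function(w, quant = 0.99) {
  stopifnot(is.numeric(w), quant > 0, quant < 1)
  cap <- stats::quantile(w, probs = quant, names = FALSE)
  pmin(w, cap)
}

set.seed(1)
p_hat <- runif(200, min = 0.02, max = 0.98)  # simulated probabilities
w <- 1 / p_hat                               # inverse probability weights
w_trunc <- truncate_weights(w, quant = 0.99)
```

Values of quant closer to 1 leave more of the original weights untouched, while smaller values cap more of them.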

Value

A named list of estimates and uncertainty summaries for each value in gamma. Core elements include point estimates (est, est_R1, est_R0), variance estimates (var, var_R1, var_R0), and confidence interval bounds (lowerCI*, upperCI*). Additional components depend on simple_trunc and IF_output:

  • simple_trunc = TRUE: returns quantile-weight-truncated summaries only.

  • simple_trunc = FALSE: additionally returns truncated summaries and truncated IF objects when requested.

  • IF_output = TRUE: includes influence-function lists (IF*) and, when relevant, truncated IF lists (IF_trunc*).
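
To make the relationship between the influence-function vectors and the reported variances and interval bounds concrete, here is a minimal sketch of the usual Wald-type construction. It assumes a 95% level and a variance of the form var(IF)/n; the helper if_ci and the simulated data are hypothetical, and est_psi may compute these quantities differently.

```r
# Hypothetical helper (not part of est_psi): Wald interval from a point
# estimate and the per-observation influence-function values.
if_ci <- function(est, if_vals, level = 0.95) {
  n <- length(if_vals)
  v <- stats::var(if_vals) / n            # IF-based variance of the estimate
  z <- stats::qnorm(1 - (1 - level) / 2)  # normal critical value
  list(est = est, var = v,
       lowerCI = est - z * sqrt(v),
       upperCI = est + z * sqrt(v))
}

set.seed(1)
y <- rnorm(500, mean = 60, sd = 25)
est <- mean(y)
# For a plain sample mean, the influence function is y - mean(y).
res <- if_ci(est, if_vals = y - est)
```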

Details

The procedure uses sample-splitting/cross-fitting to reduce overfitting bias in nuisance estimation. Outcome regressions are fit with single index models and then integrated over estimated conditional distributions to obtain conditional means and sensitivity-adjusted moments.
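
The cross-fitting idea can be sketched as follows: each observation's nuisance prediction comes from a model fit on the other folds only. This is an illustration under simplifying assumptions, with lm standing in for the single index and gam fits; make_folds and the simulated data are hypothetical and not part of est_psi.

```r
# Hypothetical helper (not part of est_psi): assign n observations to
# `fold` folds of near-equal size. With seed = NULL, set.seed is skipped.
make_folds <- function(n, fold, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  sample(rep(seq_len(fold), length.out = n))
}

set.seed(42)
n <- 100
x <- rnorm(n)
y <- 2 * x + rnorm(n)
fold_id <- make_folds(n, fold = 5, seed = 1)

pred <- numeric(n)
for (k in 1:5) {
  train <- fold_id != k
  # Fit on the other folds only (lm is a stand-in for the nuisance model).
  fit <- lm(y ~ x, data = data.frame(x = x[train], y = y[train]))
  # Predict for the held-out fold, which the fit has never seen.
  pred[!train] <- predict(fit, newdata = data.frame(x = x[!train]))
}
```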

Examples

# out <- est_psi(Y, M, R, X, t, trt = 1, gamma = c(0, 0.5),
#                fold = 5, seed = 1, IF_output = FALSE,
#                simple_trunc = TRUE, quant = 0.99, kernel = "dnorm",
#                single_index_method = "norm1coef", method = "optim")
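
The documented default for s_t_y can be written as an ordinary R function. The sketch below defines it and shows a generic exponential-tilting reweighting to illustrate the role of gamma. The names s_default and tilted_mean are hypothetical, the data are simulated, and the tilted mean is a textbook form rather than est_psi's internal computation.

```r
# The default tilting function described under the s_t_y argument.
s_default <- function(y) pnorm((y - 60) / 25)

# Hypothetical illustration: mean of y under weights exp(gamma * s(y)).
tilted_mean <- function(y, gamma, s = s_default) {
  w <- exp(gamma * s(y))
  sum(w * y) / sum(w)
}

set.seed(1)
y <- rnorm(300, mean = 60, sd = 25)
m0 <- tilted_mean(y, gamma = 0)    # gamma = 0 gives the plain mean
m1 <- tilted_mean(y, gamma = 0.5)  # positive gamma upweights larger y
```

A function of this shape could then be supplied as s_t_y = s_default, or swapped for another function of y.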