Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

variable selection for uncorrelated random variables #435

Open
wlangera opened this issue Jul 24, 2023 · 3 comments
Open

variable selection for uncorrelated random variables #435

wlangera opened this issue Jul 24, 2023 · 3 comments
Labels
enhancement Enhancements of existing features, but also new feature requests. exceptions Needs enhanced warning or error message.

Comments

@wlangera
Copy link

Hi

I want to fit a Poisson model with uncorrelated random variables and perform projection prediction.
Is it true that this is not yet possible with projpred or can I specify my model in a different way such that I can use uncorrelated random variables?
I also wonder if it matters for projection prediction whether they are specified as correlated or not.

Unexpected number of | characters in group terms. Please contact the package maintainer.

library(brms)
#> Loading required package: Rcpp
#> Warning: package 'Rcpp' was built under R version 4.3.1
#> Loading 'brms' package (version 2.19.0). Useful instructions
#> can be found by typing help('brms'). A more detailed introduction
#> to the package is available through vignette('brms_overview').
#> 
#> Attaching package: 'brms'
#> The following object is masked from 'package:stats':
#> 
#>     ar
library(projpred)
#> Warning: package 'projpred' was built under R version 4.3.1
#> This is projpred version 2.6.0.
#> 
#> Attaching package: 'projpred'
#> The following object is masked from 'package:brms':
#> 
#>     do_call

# simulate data
species <- paste("species", 1:5, sep = "_")
periods <- paste("period", 1:2, sep = "_")
df <- expand.grid(species, periods)

set.seed(123)
lambdas <- c(rgamma(length(species), 0.7, 0.2),
             rgamma(length(species), 0.5, 0.25))

data_list <- lapply(seq_along(df[, 1]), function(i) {
  out <- tidyr::expand_grid(df[i, ], rpois(30, lambdas[i]))
  names(out) <- c("species", "period", "count")
  return(out)
  })
dat_df <- do.call(rbind.data.frame, data_list)

# MCMC parameters
nchains <- 3 # number of chains
niter <- 4000 # number of iterations (incl. burn-in)
burnin <- niter / 4 # number of initial samples to discard (burn-in)
nparallel <- nchains # number of cores used for parallel computing

## Uncorrelated random random variables
# Fit model
fit1 <- brm(bf(count ~ period + (1 + period || species)),
            data = dat_df,
            family = poisson(),
            chains = nchains, warmup = burnin, iter = niter, cores = nparallel,
            control = list(adapt_delta = 0.99,
                           max_treedepth = 12))
#> Compiling Stan program...
#> Start sampling

# Projection prediction
cvvs_fast1 <- cv_varsel(fit1,
                        method = "forward",
                        nclusters_pred = 20, # speed
                        cv_method = "LOO",
                        # only for the sake of speed
                        validate_search = FALSE)
#> Error in FUN(X[[i]], ...): Unexpected number of `|` characters in group terms. Please contact the package maintainer.

Created on 2023-07-24 with reprex v2.0.2

Kind regards,
Ward

@fweber144
Copy link
Collaborator

Hi Ward,

Thank you for this issue.

According to your brms code, you have uncorrelated group-level intercepts and period slopes for species (because of the || syntax; see the brms documentation and Table 2 in the lme4 vignette Fitting Linear Mixed-Effects Models Using lme4; beware that "uncorrelated" refers to zero correlation of the group-level intercept (regarded as a random variable) and the group-level period slope (regarded as a random variable), so to zero correlation in the corresponding bivariate normal distribution).

You are right that this is not supported by projpred yet. Sorry :/

I'll leave this issue open because (i) we should throw a more informative error message in that case and (ii) as a reminder that this could be implemented in the future.

@fweber144 fweber144 added exceptions Needs enhanced warning or error message. enhancement Enhancements of existing features, but also new feature requests. labels Jul 24, 2023
@fweber144
Copy link
Collaborator

I forgot to answer two more specific questions:

can I specify my model in a different way such that I can use uncorrelated random variables?

Currently, I don't think this is possible.

I also wonder if it matters for projection prediction whether they are specified as correlated or not.

Yes, it does. But the reason why I added some more explanations concerning the || syntax above is that given these explanations, you might perhaps decide that you don't need the || syntax and instead rely on the ordinary | syntax.

@wlangera
Copy link
Author

Thank you for your reply and extra information!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancements of existing features, but also new feature requests. exceptions Needs enhanced warning or error message.
Projects
None yet
Development

No branches or pull requests

2 participants