Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unclear what posterior parameter relates to what formula predictor #36

Open
rikhuijzer opened this issue Feb 5, 2022 · 5 comments
Open

Comments

@rikhuijzer
Copy link
Contributor

Currently, the posterior parameters have the form β[1], ..., β[n] whereas formulas used something along the lines of y ~ a + b + c. This is now tricky to link again because I don't know without looking into the code what parameter belongs to what predictor.

@rikhuijzer
Copy link
Contributor Author

rikhuijzer commented Feb 6, 2022

I've been thinking about a possible solution. We probably could add methods for sample which take a GLMModel. For example,

import Turing: sample

using MCMCChains: replacenames

struct GLMModel
    model::DynamicPPL.Model
    name_mapping::Dict{String,String}
end

function sample(gm::GLMModel, sampler, n_samples)
    chns = sample(gm.model, sampler, n_samples)
    updated_chns = replacenames(chns, gm.name_mapping)
    return updated_chns
end

# And more additional methods for `sample`
[...]

From the user-side, things wouldn't change so much. In most cases, it would still be:

fm = @formula(...)
model = turing_model(fm, data)
chns = sample(model, NUTS(), 2_000)

@phipsgabler
Copy link
Member

Just loading off a random idea here, but what about keeping the predictor names in β itself? E.g., a NamedArrays.jl solution:

julia> β = NamedArray(rand(3), ["a", "b", "c"], "Predictor")
3-element Named Vector{Float64}
Predictor  │ 
───────────┼─────────
a          │ 0.143747
b          │ 0.253579
c          │ 0.143526

@kleinschmidt
Copy link
Contributor

Ideally we can build something on top of coefnames(::AbstractTerm) which can do that substitution for you, although it might be a bit tricky since it looks like the intercept column is manually removed during construction.

@storopoli
Copy link
Member

Yes, a coefnames would be great.
Intercept is always \alpha. It was removed because brms constructs the model as fast as possible for a MCMC NUTS sampler and this involves not using the 1 or 0 column in the model matrix.

@storopoli
Copy link
Member

We could do the same thing we do with the idx for random-intercept:

julia> using TuringGLM

julia> cheese = CSV.read(download("https://github.com/TuringLang/TuringGLM.jl/raw/main/data/cheese.csv"), DataFrame);

julia> f = @formula(y ~ (1 | cheese) + background);

julia> m = turing_model(f, cheese);
The idx are Dict{String1, Int64}("B" => 2, "A" => 1, "C" => 3, "D" => 4)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants