-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix handling of unconventional variable names #97
base: main
Are you sure you want to change the base?
Conversation
I am not familiar with regular expressions. @sunxd3 / @penelopeysm, can you take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could just be my purist tendencies, but I have a personal preference for "doing the right thing": string processing does give the right answer, but it's not technically what we want to do. As the function name suggests, we're trying to extract the index from an expression and Julia gives us tools to do that:
function ind_from_string(x)
expr = Meta.parse(x)
Meta.isexpr(expr, :ref) && expr.args[2] isa Int || error("invalid expression for ind_from_string: '$(x)'")
return expr.args[2]
end
julia> ind_from_string("var[1][2]")
2
julia> ind_from_string("var[1]")
1
I usually feel better writing code like this because I can feel more confident that it accomplishes what I want it to do, rather than having to reason about whether string splitting behaves correctly in all cases.
Just a suggestion and not at all meant as a criticism of your PR, after all the original code was string processing too :) won't be offended at all if we stick to the current implementation
IIRC there are quite a few suboptimal implementations in ParetoSmooth (even a few more in the function alone that is touched by this PR), but I haven't really used the package a lot, so there hasn't been a strong incentive from my side to improve things. IMO it would be good to make the code more robust and better tested. Or maybe just join forces with https://github.com/arviz-devs/PSIS.jl which (AFAIK) has better code quality and is better maintained. |
Regarding this PR more concretely: Can't we avoid parsing strings completely and just work with |
I'm happy to change to this if that's the better option. I just made the PR based on the minor change I made to get it working. I'm not entirely sure yet how much work it would be to convert it to using VarName as I'm not familiar with the interface. |
Sorry, slightly off-topic response. There were early attempts to unify ParetoSmooth and PSIS (offline discussions and #41, #42, #43, #44, #51), but it didn't go anywhere because I'm quite pleased with the PSIS.jl API and code quality, and it would be more work to unify the packages than it was to write PSIS.jl to begin with. There's also the difference in scope; PSIS.jl is only an ultra-lightweight implementation of the PSIS method designed to be used by downstream packages (e.g. any VI package could use this; Pathfinder.jl uses it as a diagnostic). LOO is implemented in PosteriorStats.jl. PSIS.jl and PosteriorStats.jl provide no integration with PPLs or objects that store MCMCChains and won't in the future, being deliberately PPL-agnostic. PPL maintainers are expected to add integrations to their own packages, so that they can ensure these integrations are kept up-to-date. (e.g. InferenceObjects integrates with PosteriorStats, and MCMCChains will as well once I finish TuringLang/MCMCChains.jl#430). For |
@sethaxen, apart from the integration with PPLs, are there any other differences between ParetoSmooth and PSIS? |
It's been a long time since I made a comparison, but besides the differences mentioned in #97 (comment) (scope, package weight, API, code quality, test coverage), there are a few differences in functionality:
Both packages are outdated in their use of the diagnostics. PSIS is shortly getting updates from the recent PSIS paper revision. |
Maybe we can make ParetoSmooth dependent on PSIS? |
I think we should merge this PR, and move the other discussions to an issue. |
But is it clear whether the sorting is actually needed? #97 (comment) |
@devmotion if I understand your point correctly, are you suggesting we should give some kind of ordering to Without a |
If you have a variable named
something["z"][()]
the loo* functions will fail, because it tries to split, but will get the wrong part, changing this to rsplit and limiting the number of elemetns to 2 resolves this and will always take the rightmost[]
which contains the index.