[RFC] Causes vs. Associates_with #33
Comments
Semantics of the causes and associates relationships & language design
Based on the user studies, I think that it makes sense to clarify what I can see us applying some basic heuristics, like if we have The
The Hidden Variables Confusion
I'm not entirely sure how someone would specify a hidden variable. Would the user hypothesize "there may be
On Causal Modeling
I, for one, do not feel like I have a good enough grasp on what causal modeling is to really make a lot of statements about what would be the right thing to do. My brief impression is that it seems theoretically nice but in actuality is kind of impractical? (I have to wonder if there is a gender difference in how people would use the
Implementation Changes
I think that these all sound reasonable. I think it would also make sense to provide some feedback to the user if they stipulate an
Regarding Semantics
In my opinion, causal definitions should be reserved for universal truths (like drunk driving causes accidents). For all other possible causal relationships, we should "nudge" the user towards using something less powerful. This can be done through Audrey's idea of providing varying degrees of confidence for defining causal relationships. On the implementation front, we could multiply by, or raise to the power of, some constant to make certain relationships more or less powerful. I am not sure how we would determine such a constant.
Regarding Hidden Variables
Could Tisane provide a list of viable hidden variables? For example, consider two variables X and Y for which the user has defined a causal relationship. Tisane could list all other variables that have a relationship with X and Y, presenting them as possible options for the hidden variable. Potential problem: asking users to pick a hidden variable might force them to make more assumptions or define more relationships than they're comfortable with.
Regarding the Working List
I think everything mentioned on the list is a great idea. In addition to all of that, do we want to explore having different workflows in Tisane? We could provide users with different options and paths depending on their use case. This will help us add more features to Tisane that benefit a certain type of user without having to worry about the impact it could have on another type of user.
On Next Steps
Python alternative for DAGitty: https://github.com/pgmpy/pgmpy
Tisane currently provides two types of conceptual relationships: causes and associates_with. This doc covers when and how to use these verbs.
If a user provides associates_with, we walk them through possible association patterns to identify the underlying causal relationships. In other words, associates_with indicates a need for disambiguation so that it can compile to a series of causes statements.
To do this well, we need to resolve two competing interests: causal accuracy and usability. To prioritize causal accuracy, the system should help an analyst distinguish and choose among an exhaustive list of possible causal situations. However, that may make the system unusable, because differentiating among numerous possible causal situations may be unrealistic for analysts unfamiliar with causality. These concerns do not seem insurmountable.
With an infinite number of hidden variables, there are an infinite number of possible causal relationships. We could restrict the number of hidden variables an analyst considers. This decision compromises causal accuracy for usability. If we had a justifiable cap on hidden variables, it may be worthwhile to take this approach.
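To make the trade-off concrete, here is a minimal sketch of how one associates_with statement could expand into candidate sets of causes edges under a cap on hidden variables. It is not Tisane's implementation: the function name candidate_causal_patterns, the hidden_pool argument, and the max_hidden cap are all hypothetical, and the enumeration deliberately covers only three simple patterns (x causes y, y causes x, and hidden common causes), leaving out mediation and mixed patterns.

```python
from itertools import combinations


def candidate_causal_patterns(x, y, hidden_pool, max_hidden=1):
    """Enumerate candidate explanations for `x associates_with y`.

    Each pattern is a list of (cause, effect) pairs. Hypothetical helper,
    not part of Tisane's API.
    """
    patterns = [
        [(x, y)],  # x causes y
        [(y, x)],  # y causes x
    ]
    # Add patterns where up to `max_hidden` hidden variables are common
    # causes of both x and y. The cap keeps the list finite.
    for k in range(1, max_hidden + 1):
        for hidden in combinations(hidden_pool, k):
            pattern = [(h, x) for h in hidden] + [(h, y) for h in hidden]
            patterns.append(pattern)
    return patterns


patterns = candidate_causal_patterns("X", "Y", ["Z1", "Z2"], max_hidden=1)
for p in patterns:
    print(p)
```

With two candidate hidden variables and a cap of one, the analyst chooses among four patterns; raising the cap to two adds only one more here, but the count grows combinatorially with the size of the pool, which is the usability cost described above.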
Another perspective: If the goal is to translate each associates_with into a set of causes, why provide associates_with at all?
The primary reason I wanted to provide both was because of the following:
In all these cases, it seems important to acknowledge what is known, what is hypothesized/the focus of inquiry, and what is asserted for the scope of the analysis. (accurate documentation, transparency)
In the current version of Tisane, analysts can express any relationships they might know or are probing into using causes. If analysts do not want to assert any causal relationships due to a perceived lack of evidence in their field, they should use associates_with. Whenever possible, analysts should use causes instead of associates_with.
Tisane's model inference process makes arguably less useful covariate selection recommendations based on associates_with relationships. Tisane looks for variables that have associates_with relationships with both one of the IVs and the DV. Tisane suggests these variables as covariates with caution, including a warning in the Tisane GUI and a tooltip explaining to analysts that associates_with edges may have additional causal confounders that are not specified or detectable with the current specification.
For the causes relationships, Tisane uses the disjunctive cause criterion, developed for settings where researchers may be uncertain about their causal models, to recommend possible confounders as covariates.
We assume that the set of IVs an end-user provides in their query are the ones they are most interested in and want to treat as exposures.
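The two covariate selection rules above can be sketched as follows. This is an illustration under stated assumptions, not Tisane's actual code: relationships are represented as plain tuples, the function names are hypothetical, and the causes rule is simplified to direct causes of an exposure or the outcome (the criterion as usually stated also covers indirect causes).

```python
def associated_with(var, associates):
    """Variables that share an undirected associates_with edge with `var`."""
    neighbors = set()
    for a, b in associates:
        if a == var:
            neighbors.add(b)
        elif b == var:
            neighbors.add(a)
    return neighbors


def covariates_from_associations(ivs, dv, associates):
    """Variables associated with both the DV and at least one IV.

    These would be suggested with caution, since the edges carry no
    causal direction.
    """
    dv_neighbors = associated_with(dv, associates)
    candidates = set()
    for iv in ivs:
        candidates |= associated_with(iv, associates) & dv_neighbors
    return candidates - set(ivs) - {dv}


def covariates_from_causes(ivs, dv, causes):
    """Simplified disjunctive cause rule: direct causes of any IV or the DV."""
    targets = set(ivs) | {dv}
    return {cause for cause, effect in causes if effect in targets} - targets


# Hypothetical example data.
associates = [("age", "exercise"), ("age", "health"), ("diet", "health")]
causes = [
    ("exercise", "health"),
    ("income", "exercise"),
    ("genetics", "health"),
    ("health", "mood"),
]
print(covariates_from_associations(["exercise"], "health", associates))
print(covariates_from_causes(["exercise"], "health", causes))
```

In this example, diet is not suggested by the association rule because it is linked only to the DV, and mood is not suggested by the causes rule because it is an effect of the outcome, not a cause.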
Moving forward
I would like to see the following (working list, no priority given yet):
Implementation changes:
Follow-up work/Paper ideas: