Function for multiple models #19

Open
mattnuttall00 opened this issue Mar 7, 2019 · 7 comments
Labels
question Further information is requested

Comments

@mattnuttall00

I'm making my first tentative steps in writing very simple functions just to help keep my scripts tidy, but have hit a snag.

I am trying to write a function that makes running lots of models a bit neater. The models I am running estimate the detection function for animals from line transect surveys, using the package 'Distance' and its function ds(). The structure of the model call is:

mod1 <- ds(data, truncation, key, formula)

Where
data = my data,
truncation = a truncation distance
key = the key model function to use (options are uniform, half-normal, hazard rate)
formula = simple formula for if you are adding covariates into the model (e.g. formula = ~habitat)

In my models, data and truncation will not change.
I wrote the below function to make running a bunch of models slightly neater (although probably not by much!)

detfunc <- function(name,key,covar) {
  
  name <- ds(distdata, truncation = 50, key=key, formula = ~covar)
  
  par(mfrow=c(1,2))
  plot(name, showpoints=FALSE, pl.den=0, lwd=2)
  ddf.gof(name$ddf)
  summary(name)
}

The idea being that I can simply write:

detfunc(mod1,"uniform",habitat)
detfunc(mod2,"hn",elevation)
detfunc(mod3,"hr",transect)

etc etc rather than writing the model calls out in full, and the model summary and resulting plots would be spat out.

The function seems to be struggling with the third argument, covar, though. It throws an error (from the ds() call rather than from my function) saying that "covar" is not in my data frame. So it doesn't seem to be recognising the third argument in my function call. I tried adding

covar <- data$covar

at the top of the function to try and assign the term to a column in my dataframe but that hasn't worked.
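One reason the covar <- data$covar attempt cannot help, separate from the formula problem: the $ operator does not evaluate its right-hand side, so data$covar looks for a column literally named "covar". A minimal sketch in base R, using a hypothetical toy data frame df and a variable col (neither is from the original script):

```r
# Toy data frame standing in for the survey data (hypothetical values).
df <- data.frame(habitat  = c("forest", "grass", "forest"),
                 distance = c(12, 30, 7))

col <- "habitat"          # the column we actually want, held in a variable

by_dollar  <- df$col      # looks for a column literally named "col"
by_bracket <- df[[col]]   # evaluates col first, then looks up "habitat"

is.null(by_dollar)        # TRUE: there is no column called "col"
by_bracket                # the habitat column
```

When a column name is held in a variable, [[ ]] is the usual way to look it up.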

Can someone offer any advice? If you think this is a pointless use of a function, I am open to that advice too ;)

Matt

@mattnuttall00 mattnuttall00 added the question Further information is requested label Mar 7, 2019
@bradduthie
Member

I've downloaded the Distance package, just to see how everything works. Looks like the output should be fine to do what you want.

One issue is that distdata isn't defined as an argument in your function, so the first line in the function will look for it in the global environment. The result of ds() is then assigned to name, which overwrites whatever was passed in as the name argument. How do things work when you replace the name argument with distdata, as below?

detfunc <- function(distdata, key, covar) {
  
  name <- ds(distdata, truncation = 50, key=key, formula = ~covar)
  
  par(mfrow=c(1,2))
  plot(name, showpoints=FALSE, pl.den=0, lwd=2)
  ddf.gof(name$ddf)
  summary(name)
}

Does this give you the same error message?

@mattnuttall00
Author

Hey Brad,

Yea I assumed it would be able to look in the global environment for distdata, and just use it normally, but perhaps not.

I changed it as per your example, but when testing it I wasn't sure whether I was supposed to pass distdata or the name of the model as the first argument. So I tried both.

When I do detfunc(distdata,"hn",habitat) I get the error:

Variable(s): covar are in the model formula but not in the data.

And when I try detfunc(mod1,"hn",habitat) I get the error:

object 'mod1' not found

I have also just tried to add a dat argument:

detfunc <- function(dat,name,key,covar) {
  
  name <- ds(dat, truncation = 50, key=key, formula = ~covar)
  
  par(mfrow=c(1,2))
  plot(name, showpoints=FALSE, pl.den=0, lwd=2)
  ddf.gof(name$ddf)
  summary(name)
}

Followed by detfunc(distdata,mod1,"hn",habitat)

But get the same error: variable(s): covar are in the model formula but not in the data

@bradduthie
Member

bradduthie commented Mar 8, 2019

Thanks @mattnuttall00 -- you're correct that distdata could be found in the global environment, but it's generally a good idea to include everything required by the function as an argument just to make the function self-contained. The argument name wasn't doing anything because it was immediately overwritten by the first line of code.

Sorry, I should have explained what I was doing better! By changing to distdata, I meant to make your function detfunc read in whatever ds needs as its first argument. This way, it doesn't matter what the data set is named outside the function; ds will still receive what it needs and run accordingly. For example:

detfunc(distdata = data_I_want_ds_to_use, key, covar);

You could also just call data_I_want_ds_to_use the same thing, as below.

detfunc(distdata = distdata, key, covar);

But your detfunc won't care either way -- it will happily take whatever you specify in the argument and apply it to the ds function.

In the redefined function you sent (including dat), the argument name still isn't doing anything. If you removed it, then the function would do the exact same thing because you are assigning name to the result of ds() and name isn't used in this assignment (i.e., the right hand side).
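The point about the unused name argument can be sketched in base R. The functions with_name and without_name below are hypothetical stand-ins for detfunc, with simple arithmetic in place of the ds() call:

```r
# name is accepted as an argument but overwritten before it is ever read.
with_name <- function(name, x) {
  name <- x * 2   # the value passed in as name is discarded here
  name
}

# Same body without the unused argument.
without_name <- function(x) {
  result <- x * 2
  result
}

# mod1 does not need to exist for this call to succeed: R evaluates
# arguments lazily, and name is never read before being overwritten.
a <- with_name(mod1, 21)
b <- without_name(21)

identical(a, b)   # TRUE: the name argument changed nothing
```

The call also does not create an object called mod1 in the calling environment; the assignment to name only exists inside the function.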

I'm not quite sure I understand what the formula argument is doing, but it looks like ds is having a hard time finding it as a column in distdata (this is what I think is meant by in the model formula but not in the data). Are you specifying a column of data here?

@mattnuttall00
Author

Thanks @bradduthie,

Right I see, the distdata bits make sense. John also mentioned to me about keeping functions self contained...it just doesn't seem to be sinking in!

In terms of name, in order to assign a model name would you suggest then doing this outside the function? e.g.

mod1 <- detfunc(distdata,"hn",habitat)
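Assigning the result outside the function is the usual pattern, but it stores whatever the function returns. As written above, detfunc ends with summary(name), so mod1 would hold the summary rather than the fitted model. A minimal sketch of one way around this, assuming lm() and the built-in mtcars data as stand-ins so that it runs without the Distance package (fit_and_report is a hypothetical name):

```r
# Stand-in for the ds() wrapper: lm() and mtcars are used only so the
# sketch runs without the Distance package.
fit_and_report <- function(dat) {
  fit <- lm(mpg ~ wt, data = dat)
  print(summary(fit))   # show the summary as a side effect
  invisible(fit)        # hand the fitted model itself back to the caller
}

mod1 <- fit_and_report(mtcars)   # the model name is chosen by the caller
class(mod1)                      # "lm"
```

Returning the model with invisible() keeps the console output to the printed summary while still allowing the assignment.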

So the formula bit is where in ds() you are able to include covariates in the model. It's really simple to do normally. For example, the below code runs absolutely fine:

modtest <- ds(distdata, truncation=50, key = "hn", formula=~habitat)

So for some reason the covar argument in my function is not being recognised. Any ideas why?

M

@bradduthie
Member

No worries @mattnuttall00! I'm trying to verbalise what I think is the general function-related issue, but I'm not quite sure how to phrase it in a way that makes sense to me. I'm hoping the below will help.

Your line showing modtest really helps. What I suspect is happening is that ds is looking for the name of a column in the formula, but inside your function ~covar is taken literally, so ds looks for a column called covar rather than habitat. Instead, I think you will need to specify the formula itself as an argument, or do some very crafty pasting within the function to get the format correct (let me know if you need to go this route for some reason). Try the below.

detfunc <- function(dat, key, trnc = 50, covar) {

  # covar is a one-sided formula such as ~habitat, passed straight to ds()
  name <- ds(data = dat, truncation = trnc, key = key, formula = covar)
  
  par(mfrow=c(1,2))
  plot(name, showpoints=FALSE, pl.den=0, lwd=2)
  ddf.gof(name$ddf)
  summary(name)
}

But then run the function as below.

detfunc(dat = distdata, key = "uniform", trnc = 50, covar = ~habitat)

Does that work okay?
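Why the ~ matters can be checked in base R without the Distance package. A formula is not evaluated when it is created, so ~covar written inside a function keeps the literal symbol covar no matter what was passed in. The sketch below uses three hypothetical helper functions; the last one shows the "pasting" route using stats::reformulate(), which builds a formula from a column name given as a string:

```r
# Inside a function, ~covar is not evaluated: the formula keeps the
# literal symbol covar, whatever was passed in.
literal_formula <- function(covar) {
  ~covar
}
all.vars(literal_formula(habitat))   # "covar"

# Option 1 (the approach above): pass the formula itself.
passed_formula <- function(covar) {
  covar
}
all.vars(passed_formula(~habitat))   # "habitat"

# Option 2: pass the column name as a string and build the formula.
built_formula <- function(covar_name) {
  reformulate(covar_name)            # reformulate("habitat") gives ~habitat
}
all.vars(built_formula("habitat"))   # "habitat"
```

The first result matches the error message from ds: the formula contains covar, which is not a column in the data.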

@mattnuttall00
Author

Eureka! That worked. I never even suspected that the culprit was the ~

Many thanks @bradduthie , much appreciated!

@bradduthie
Member

It seems like it's always the most subtle possible thing that causes the most critical problem :-) -- glad that worked, @mattnuttall00!
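With the formula passed in as an argument, the original goal of running several models can be sketched as a loop over a named list of formulas. This is only an illustration under stated assumptions: lm() and the built-in mtcars data stand in for ds() and distdata so the code runs without the Distance package, and fit_one and covariates are hypothetical names.

```r
# Stand-in for detfunc(): lm() replaces ds() so the sketch runs without
# the Distance package.
fit_one <- function(dat, covar) {
  # covar is a one-sided formula such as ~wt; add the response to it.
  lm(update(covar, mpg ~ .), data = dat)
}

# One-sided formulas, named after the models they will produce.
covariates <- list(mod1 = ~wt, mod2 = ~hp, mod3 = ~wt + hp)

# Fit one model per formula; the list names carry over to the results.
models <- lapply(covariates, fit_one, dat = mtcars)

names(models)         # "mod1" "mod2" "mod3"
sapply(models, AIC)   # one AIC value per model, for comparison
```

Other arguments that vary between models, such as key, could be handled the same way by iterating over several lists in parallel with Map().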
