Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable AD for Nmixture models #25

Open
wants to merge 8 commits into
base: AD-rc0
Choose a base branch
from
Open

Enable AD for Nmixture models #25

wants to merge 8 commits into from

Conversation

weizhangstats
Copy link
Collaborator

@dochvam @perrydv I created a branch based on the AD-rc0 branch to make the AD work for N-mixture models. This PR lets you see what has been changed. For some of the distributions requiring q** functions to calculate Nmin and Nmax, I disabled this which is not supported by AD.

@weizhangstats
Copy link
Collaborator Author

Apparently I do not know what's happening here. But when I made AD for those distributions, I did check that everything compiles ok. Maybe @perrydv @dochvam you have better ideas here on the failures.

@dochvam
Copy link
Collaborator

dochvam commented Mar 11, 2024

Apparently I do not know what's happening here. But when I made AD for those distributions, I did check that everything compiles ok. Maybe @perrydv @dochvam you have better ideas here on the failures.

@weizhangstats Looks like a problem with the Github Action config rather than your code. I'll look into it.

@dochvam
Copy link
Collaborator

dochvam commented Mar 21, 2024

I've got it working up through executing the tests, but test-AD.R is failing.

@weizhangstats
Copy link
Collaborator Author

weizhangstats commented Apr 5, 2024

I've got it working up through executing the tests, but test-AD.R is failing.

Hey Ben @dochvam, I updated the tests for N-mixture models which passed on my machine with nimble 1.1.0 (the version on CRAN) installed. It seems nimble 1.0.0 is used for the tests on GitHub. I tried to make it install nimble from CRAN but did not work. Anyway, it is worth having a look at why all the AD tests do not pass with the latest nimble on devel or nimble 1.0.0.

@perrydv
Copy link
Contributor

perrydv commented May 15, 2024

@weizhangstats @dochvam I am looking at this, with apologies for such delay. Do we want to provide AD and non-AD versions of the Nmixture distributions? Here are two reasons why we might.

  1. We had a way to create default values of Nmin and Nmax, based on parameters and data. These use a quantile function and would not work through AD. The current PR removes those default options and requires the user always to provide the Nmin and Nmax. But that means for cases where an Nmixture distribution will be used not for AD, we would be removing an option a user might want.
  2. The insights for numerical stability of the potentially long summation that @paciorek helped us come up with are now in nimNmixPois_logFac. However, this has control flow code like while, if, and for. The outcomes of these conditions or loop extents will be baked in by the first values used in a call, i.e. when taping the operations for AD. Subsequently they may not be good choices (and I'd have to remember what is going on to see if this might lead to wrong results or just not-necessarily-numerically-stable results). At the moment this trick is still called by the AD versions of the Nmixture distributions in this PR, which I don't think we want to do. And OTOH we want to allow this trick to be used in non-AD cases.

The other distributions in this package will be getting AD support without a proliferation of new AD-specific cases. However for Nmixture do we want non-AD versions that (as before) allow default Nmin and Nmax and use the numerically more stable calculations and also AD versions that require the user to choose Nmin and Nmax and can't do the numerically more stable calculations (or possibly, with more thought, use some statically [one-time] chosen split point)?

@dochvam
Copy link
Collaborator

dochvam commented May 15, 2024

@perrydv You've laid out the arguments nicely. I agree that we should retain functionality from both AD and non-AD versions of dNmixture, so non-AD users can take advantage of the tricks.

Is it the case that, by default, we expect most users to not use AD versions, and that users who want AD versions will be actively thinking about that? If so, I suggest keeping the current dNmixture functions as-is, and defining new, separately named functions for AD. Maybe with the additional suffix AD, e.g. dNmixture_BNB_v_AD.

@weizhangstats
Copy link
Collaborator Author

Totally agreed on the AD and non-AD opinion, @perrydv. And yes a good idea to keep the non-AD functions and add AD ones as new.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants