Inconsistencies in definitions of bigint operations and Fraction operations #3374

Open
gwhitney opened this issue Jan 31, 2025 · 6 comments

Comments

@gwhitney
Collaborator

gwhitney commented Jan 31, 2025

Describe the bug
Many operations on integers, like division or square root or logarithm, do not necessarily produce integers. Similarly, many operations on rational numbers, like square root or logarithm, do not necessarily produce rational numbers. However, the approach that mathjs currently takes in determining whether to accept such arguments and what to return if it does varies widely both between the two different types and within the types. For some examples:

  • Division of a bigint by a bigint returns the exact answer rounded toward zero when that is not an integer (this is the floor for positive quotients)

  • Square root of a bigint returns the number/Complex (approximation, or exact) square root, even when the answer is an exact integer.

  • logarithm of a bigint follows the same scheme

  • Division of Fraction by Fraction is no issue, it's always a Fraction except 0/0 which throws (as it presumably will in any scheme)

  • Square root of a Fraction always throws

  • logarithm of a Fraction to a Fraction base returns a Fraction when the exact answer happens to be rational, and throws an error otherwise, meaning that you must use try/catch to use it unless you for some reason happen to know that the outcome will be rational (which seems like an unusual situation).

To Reproduce
There are many examples, such as:

math.log(math.fraction(3,2), math.fraction(9,4)) // returns 1/2
math.sqrt(math.fraction(9,4)) // throws error, even though there is a rational answer

math.sqrt(math.fraction(9,1)) // throws error
math.sqrt(9n) // returns **number** 3

Discussion
With growing support of bigint, it seems important to adopt and make clear a consistent philosophy on how mathjs will handle the results of mathematical operations that go outside the domain that the inputs come from. Otherwise, it seems likely the problems illustrated above will become worse, leaving mathjs prone to producing inscrutable behavior.

Here are some possibilities:


Proposal (1): Take a page from the very early story of mathjs and sqrt and number. There is no number that is the square root of a negative number such as -4. So mathjs has a config option predictable. When it is true, sqrt(-4) therefore returns NaN -- an entity of type number that informs the user there was no appropriate number answer. When it is false (the default), mathjs is allowed to go to the best available type to represent the result, and so returns complex(0,2).

A slight hitch in the case of bigint and Fraction is that neither domain contains an analogue of NaN. So to truly remain within the type, when there is no suitable answer, our only alternative would be to throw. On the other hand, we might want to return null or NaN (choose one and use it always) even though it is not a bigint or Fraction, so that we are returning a sentinel value that can be checked for without try/catch, and which will presumably propagate into any further calculation, making the whole thing null or NaN to signal that somewhere within, something failed. So this option (1) splits into (1a) and (1b) depending on whether operations that cannot be satisfied throw an error, or whether they return null or NaN. To not have to repeat the below, I will just say the computation "barfs", and we would just need to pick one consistently (or allow it to be configured, perhaps by allowing additional values of the predictable option). (For the particular case of Fraction, it could be extended to allow the indeterminate ratio 0/0 as its own sort of NaN within its domain, as its particular kind of barfing.)

So proposal (1) would be:

  • When predictable is false (default), all operations strive to return the best result in the best available type whenever possible. When a result is irrational, that best type could potentially be number or bignumber, and perhaps the numberFallback configuration option should control the choice of which one. I will just say "floating point" below to be agnostic between number and bignumber. So for example, in this situation:

    • Dividing a bigint by another would produce a bigint when the quotient is an integer, and a Fraction otherwise

    • sqrt of a bigint would produce a bigint for perfect squares, a floating point for other positive numbers, and a Complex for negative bigints. (Note no integer has a rational but non-integer square root, as a matter of mathematical fact.)

    • logarithm of a bigint would produce a bigint for perfect powers, a Fraction for rational powers, a floating point for other positive arguments, and a Complex otherwise.

    • Dividing a Fraction by a Fraction would always be a Fraction

    • sqrt of a rational square would produce a Fraction, a floating point for other positive numbers, and a Complex for negative Fractions.

    • logarithm of Fraction to a Fraction base would produce a Fraction when it happens to be a rational power; in all other cases or if either input is a floating point type, it would produce a floating point or Complex result if possible.

  • When predictable is true, all operations barf whenever the answer cannot be the same type (with perhaps the variant of barfing being configurable). In particular, in this situation

    • Dividing a bigint by another would produce a bigint when the quotient is an integer, and barf otherwise

    • sqrt of a bigint would produce a bigint for perfect squares, and barf otherwise

    • logarithm of a bigint would produce a bigint for perfect powers, and barf otherwise

    • sqrt of a rational square would produce a Fraction, and barf otherwise

    • logarithm of Fraction to a Fraction base would produce a Fraction when it is a rational power, and barf otherwise.
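As a rough illustration of Proposal (1) for sqrt of a bigint, here is a minimal sketch in plain JavaScript. The names isqrt and sqrtBigint are hypothetical and not part of mathjs, "barf" is arbitrarily taken to mean throwing a RangeError, "floating point" is taken to mean number, and a plain { re, im } object stands in for Complex.

```javascript
// Hypothetical helper (not mathjs API): floor of the square root of a
// nonnegative bigint, computed exactly with Newton's iteration.
function isqrt (n) {
  if (n < 2n) return n
  let x = n
  let y = (x + 1n) / 2n
  while (y < x) {
    x = y
    y = (x + n / x) / 2n
  }
  return x
}

// Hypothetical sketch of Proposal (1) for sqrt on bigint.
function sqrtBigint (n, predictable) {
  if (n >= 0n) {
    const r = isqrt(n)
    if (r * r === n) return r // perfect square: stay in bigint either way
  }
  if (predictable) {
    // Assumption: "barf" means throw; it could equally be null or NaN.
    throw new RangeError(`sqrt(${n}n) is not a bigint`)
  }
  // predictable = false: promote to the best available type.
  if (n < 0n) return { re: 0, im: Math.sqrt(Number(-n)) } // stand-in for Complex
  return Math.sqrt(Number(n))
}
```

With this sketch, sqrtBigint(9n, true) and sqrtBigint(9n, false) both give the bigint 3n, while sqrtBigint(2n, true) throws and sqrtBigint(2n, false) gives a number approximation.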


Proposal (2): Take a page from JavaScript's definition of bigint division, and have mathjs always strive to produce the "best" approximation to an answer within the arguments' domain:

  • Dividing a bigint by a bigint produces the bigint obtained by rounding the actual quotient toward zero, which is its floor when the quotient is positive (the current behavior).

  • sqrt of a nonnegative bigint would produce the floor of the actual square root, otherwise a sentinel value like 0 or -1.

  • logarithm of a bigint would produce the floor of the actual logarithm when that is real, otherwise a sentinel value.

  • sqrt of a nonnegative fraction would produce the exact Fraction when there is one, otherwise an approximation with minimal denominator within some set precision. One could press the existing precision configuration option into use, and say we want an approximation within 10^(-precision), to match roughly the precision we would get out of BigNumber. Or we could add a new configuration 'rationalPrecision', perhaps as the log of the maximum denominator allowed. For a negative fraction, you would get a sentinel value like 0, -1, or if we add such a thing, 0/0.

  • logarithm of Fraction would produce the exact Fraction when there is one, otherwise a minimum-denominator rational approximation within a set precision when there is a real value, and if there is no real-number logarithm, a sentinel value.

Note for full consistency throughout mathjs in Proposal (2), predictable really ought to be abolished (it's only used in sqrt, pow, logarithms, and inverse trig and hyperbolic functions anyway) and sqrt on number should just return a sentinel value like NaN or 0 or -1 for negative numbers (etc.).
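To make the "minimal denominator within some set precision" idea for sqrt of a Fraction concrete, here is one possible sketch in plain JavaScript. It is not mathjs or Fraction.js code: approxSqrtFraction is a hypothetical name, a fraction is passed as two bigints, the result is a [numerator, denominator] pair, and digits plays the role of the precision setting (tolerance 10^(-digits)). It walks the Stern-Brocot tree, whose first node inside an interval is the fraction of smallest denominator in that interval, and does every comparison in exact bigint arithmetic.

```javascript
// Hypothetical sketch: simplest fraction p/q with |p/q - sqrt(a/b)| <= 10^(-digits).
// Assumes a >= 0n and b > 0n. Returns [p, q] as bigints.
function approxSqrtFraction (a, b, digits) {
  const E = 10n ** BigInt(digits) // tolerance is 1/E

  // Exact test of p/q - 1/E <= sqrt(a/b) <= p/q + 1/E, cleared of denominators.
  const within = (p, q) => {
    const lo = p * E - q // (p/q - 1/E) scaled by q*E
    const hi = p * E + q // (p/q + 1/E) scaled by q*E
    const target = a * (q * E) ** 2n
    const loOk = lo <= 0n || lo * lo * b <= target
    const hiOk = hi * hi * b >= target
    return loOk && hiOk
  }

  if (within(0n, 1n)) return [0n, 1n]

  // Stern-Brocot walk between 0/1 and the formal bound 1/0.
  let lp = 0n; let lq = 1n
  let rp = 1n; let rq = 0n
  while (true) {
    const p = lp + rp
    const q = lq + rq
    if (within(p, q)) return [p, q]
    if (p * p * b < a * q * q) {
      lp = p; lq = q // mediant is below sqrt(a/b): move right
    } else {
      rp = p; rq = q // mediant is above sqrt(a/b): move left
    }
  }
}
```

Under these assumptions a rational square gives its exact root (9/4 gives 3/2), and sqrt(2) with two digits gives 17/12. The step-by-step walk is slow for very large values, so this is only meant to show the idea, not an efficient implementation.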


Proposal (3): Both proposals (1) and (2) have their virtues, so further extend/enhance/replace the predictable config with settings that produce any one of these classes of consistent behavior. E.g., an outOfDomainRule parameter that could be 'throw', 'null', 'NaN', 'sentinel' (to produce a specific sentinel chosen strictly within each domain) -- this first group of options are all roughly like predictable: true but just differ in detail; 'closest' -- proposal (2); or 'promote' -- the current behavior with predictable: false except extended to Fraction, which currently acts most closely like 'throw' but not exactly. (And in the case of 'promote', the type(s) to promote to, number or bignumber, might have to be configurable, perhaps by numberFallback.)
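A sketch of how such an outOfDomainRule might drive bigint division, in plain JavaScript. Everything here is hypothetical: outOfDomainRule is only the option name floated above, divideBigint is not a mathjs function, 0n is an arbitrary choice of in-domain sentinel, a { n, d } object stands in for Fraction, and a zero divisor is simply rejected in every mode to keep the example short.

```javascript
// Greatest common divisor of two bigints (result is nonnegative).
function gcd (a, b) {
  while (b !== 0n) {
    const t = a % b
    a = b
    b = t
  }
  return a < 0n ? -a : a
}

// Hypothetical sketch: bigint division under a configurable out-of-domain rule.
function divideBigint (x, y, outOfDomainRule) {
  if (y === 0n) throw new RangeError('Division by zero') // assumption: not modeled here
  if (x % y === 0n) return x / y // exact quotient: always a bigint

  switch (outOfDomainRule) {
    case 'throw':
      throw new RangeError(`${x}n / ${y}n is not a bigint`)
    case 'null':
      return null
    case 'NaN':
      return NaN
    case 'sentinel':
      return 0n // assumption: 0n chosen as the in-domain sentinel
    case 'closest':
      return x / y // JavaScript's built-in bigint division, rounds toward zero
    case 'promote': {
      // Stand-in for Fraction: reduced pair with positive denominator.
      const g = gcd(x, y)
      let n = x / g
      let d = y / g
      if (d < 0n) { n = -n; d = -d }
      return { n, d }
    }
    default:
      throw new TypeError(`Unknown outOfDomainRule: ${outOfDomainRule}`)
  }
}
```

For example, divideBigint(7n, 2n, 'closest') gives 3n, divideBigint(7n, 2n, 'null') gives null, and divideBigint(6n, -4n, 'promote') gives the pair -3/2.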


Proposal (4): Each of the settings in (3) has its virtues, but this whole configuring thing is overcomplex and bogs down implementations too much. Any one consistent class of behavior is feasible to work with, and you can always get other reasonable behavior(s) by explicit casts or by trying and then casting if need be. So just abolish "predictable" and any other config option like it, and in essence pick one outOfDomainRule, and implement it everywhere.



Frankly, any of these proposals is defensible and there are surely other reasonable options I haven't thought of. The key thing is that any consistent approach will be more understandable and scalable than the current type-dependent hodgepodge. And as I dive into the details of the bigint implementation, it would really be helpful to settle, sooner rather than later, on a general philosophical direction that mathjs will commit to in the long run, even if it doesn't move quickly toward strict compliance. It would really inform the refinement of the bigint implementation. Thanks so much for your thoughts!

@gwhitney
Collaborator Author

(I guess if I had to vote myself, at this moment I would say that in the spirit of the existing predictable config but in a new world where there are more types, Proposal (3) seems the most consistent with mathjs history. But the only options I can ever imagine actually using myself are 'promote' and 'closest', so I'd actually be fine only supporting those two options. And as I already said, if it had to be just one scheme, period, I'd choose 'closest' but it only very slightly edges out 'promote'.)

@gwhitney
Collaborator Author

Upon sleeping on it, while what I said about 'closest' (Proposal 2) providing new mathematical capabilities is true, in a scheme where it is the only one directly supported by mathjs, it makes it the hardest for someone using the library to implement a different strategy themselves. The obstruction is that under 'closest', it can be difficult to tell when you've gotten an exact answer and when you've only gotten an approximation (in which case maybe you want to try another datatype). So I will edit out my recommendation that if there is only one that should be it. I guess if there is only one it should be 'promote'. But maybe it is best for mathjs to be at least somewhat configurable in this, as each proposal has its virtues.

@josdejong
Owner

Thanks Glen, this is a good discussion point.

We have to propose your try/barf construct to the TC39 committee and make it a part of JavaScript :)

For context: the general idea in mathjs is that the output type is the same as the input type. When that is not possible and the config.predictable option is false, the output will be the type configured as config.number. If that is not possible config.numberFallback is used. And if that is not possible use what is best suited.

I think there are two main use cases that we need to serve, and the idea behind the option config.predictable was to cater for these two use cases:

  1. A user who "just" wants a correct answer formatted on screen. This user is fine with "any" data type, and does not want type conversion errors (i.e. no-barfing-mode).
  2. A user who uses the results programmatically and needs predictability in the returned data type.

Thoughts:

  1. I think removing the option config.predictable would be a serious step back in usability (hurting use case 1). It is really helpful when having mixed real numbers and complex numbers. We can indeed think through whether we can refine the configuration option(s).
  2. About outOfDomainRule: my initial feeling is that we're overengineering things if we make the behavior for returning throw, null, or NaN configurable. Before we go into that I would love to list all the relevant cases and see if we can come up with sensible behavior that is not configurable and is as consistent as possible. Thinking aloud here: maybe it is a good idea to go for throw, since, as soon as you get a NaN somewhere in a nested operation, you can't do any meaningful operations on it anyway. Are there any cases where NaN is a desirable outcome?
  3. About proposal (1): interesting idea to rethink more "special" cases where the functions can return a better answer by returning a different type.
  4. I know from practice that with bigint/bigint you can easily shoot yourself in the foot when the returned result is a bigint which is the floor of the actual float. In order to cater for use case (1), I think we should not return the bigint floor but the configured config.number (so you can configure that as Fraction if you want).
  5. Great to have an open brainstorm right now, but in the end, let's think through whether the planned changes are breaking changes and/or backward compatible.

@gwhitney
Collaborator Author

gwhitney commented Feb 5, 2025

Thanks for that feedback! With that we can start to converge to a workable plan.

First, your description of the current state of mathjs w/r/t config.predictable is not quite reflected in the code as it stands. Nowhere in the code when predictable is false does mathjs consult config.number or config.numberFallback. Here is a complete catalog of the current uses of predictable:

  1. When the units cancel in a Unit operation, leaving a unitless value, whether to return the numeric type (predictable = false) or the unitless Unit type (predictable = true).
  2. In operations on real numbers whose domain can be extended by providing Complex results on some inputs, for numbers/BigNumbers whether to return that Complex result (predictable = false) or number/BigNumber NaN (predictable = true), with sporadic/inconsistent outcomes for other types like Fraction.

That's all.

So in listening to your feedback in terms of the two use cases, I would suggest that we keep predictable, and have its two values mean:

T. When predictable is true and all inputs are a given type, any result returned must be of that same type.

F. When predictable is false, and the result of the Platonic mathematical operation cannot be represented by the input type(s), mathjs returns its judgment of the "best" type it knows of to return that Platonic exact result.

Those two options seem to correspond well to the two use cases you mention. But they do leave two open questions:

M. (For "mixed") When predictable is true and inputs are of mixed type, what type shall we return? Should we just presume that the conversion operations will promote this situation in some way or another to a case of all inputs of the same type, and then apply principle T? I think that is roughly what is happening in the status quo, but there could well be cases in which we could produce "better" answers by choosing one of the types of the operands based on knowing what all of the supplied values are (e.g. a bigint times a fraction could result in a bigint if the denominator cancels). Should we worry about finding any of those cases? Or is it central to the idea of principle T that operations should first be reduced to cases of uniform type?

B. (For "barf") All current cases of T applied in practice in mathjs code deal with number and BigNumber, which are convenient in that those number systems contain their respective NaN values, which can never be "wrong" as the outcome of an operation. But as we become systematic about predictable, consider say pow applied to two bigints. Often it can have a bigint value, but for many cases where one or both inputs are negative, there is no correct bigint value. But bigint has no element analogous to NaN -- it only contains the integers. So in such a case of pow applied to two bigints when predictable is true, do we:

  1. Return a sentinel bigint (say 0n) and let the client of the library worry about whether that was the actual answer or it was a case that had no actual answer and so mathjs is resorting to this sentinel?
  2. Actually make the type that mathjs uses be an "ExtendedBigint" which is (say) the union of bigint and the numbers NaN and ±Infinity so that those three (only!) are OK to return even when predictable is true (and those three are handled gracefully with other bigint inputs). (Alternatively, we could make our own Symbols to be -InfiniteBigint, InfiniteBigint, and NaB (Not a Bigint), but I think that would be more of a pain because nothing else in JavaScript could smoothly deal with those values.)
  3. Throw a RangeError (essentially forcing clients to either pre-check their inputs for validity or use try/catch).
  4. Do something else?

We need to make some at least initial decision on M and B. My recommendations:

M. At least for now, leave the status quo where we presume typed-function/conversions transmute everything we need to implement into the uniform-type case, and just focus on implementing that case.

B. Option 2 in which we explicitly say that mathjs will operate on this ExtendedBigint type (to the point of the TypeScript typings, etc.), and not precisely on the built-in bigint type, seems like the path of least resistance. (1) seems like a potential trap for clients ("How can I trust the result when I get back a 0?") and (3) seems like more of a pain ("I just want the answer, I don't want to have to wrap all my mathjs calls in try/catch"). But I could totally be convinced otherwise: if 0n is the only sentinel bigint uniformly used in this way, maybe it's not too hard to check if it's a real or sentinel answer; or maybe I shouldn't be so allergic to try/catch. So really I would be fine with any proposal here, as long as we do it uniformly across types that don't have NaNs and across all functions. (It was this ambivalence that led me to propose that the barfing style be configurable.) The answer could be different for bigint and Fraction, because Fraction is so easy/natural to extend to infinities and not a number, but bigint isn't.
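For concreteness, here is a minimal sketch of what pow on an "ExtendedBigint" could look like in plain JavaScript. It is hypothetical throughout: powExtended and isExactBigint are illustrative names, the ExtendedBigint type is just "a bigint, or one of the numbers NaN, Infinity, -Infinity", and returning Infinity for 0n raised to a negative power is one possible convention, not a settled decision.

```javascript
// Hypothetical ExtendedBigint: bigint | NaN | Infinity | -Infinity.
// pow on two bigints that never leaves that extended type.
function powExtended (base, exp) {
  if (exp >= 0n) return base ** exp // always an exact bigint

  // Negative exponent: the exact value is 1 / base ** (-exp).
  if (base === 1n) return 1n
  if (base === -1n) return exp % 2n === 0n ? 1n : -1n
  if (base === 0n) return Infinity // assumption: 1/0 is reported as Infinity
  return NaN // a proper fraction such as 2 ** -1 has no bigint value
}

// A client can tell a real answer from a special value without try/catch.
function isExactBigint (value) {
  return typeof value === 'bigint'
}
```

Unlike a 0n sentinel, the special values cannot be confused with a genuine result: powExtended(2n, -1n) is NaN, whereas powExtended(0n, 5n) is the real answer 0n.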

And I think with decisions on M and B, we would be ready to systematize the (existing and future) mathjs functions. To examine the cases in my original post, under these recommendations:

  • We extend the Fraction type with -1/0, 0/0, and 1/0 for -FractionInfinity, NotAFraction (NaF), and FractionInfinity. (Sort of not sure why they aren't in the Fraction package already, they are definitely meaningful and useful.)

  • Division:

    • Fractions: always returns a Fraction, now truly possible with the extended type. So predictable irrelevant.
    • Bigints: when the quotient happens to be a bigint, we return it. Otherwise:
      • predictable: return the ExtendedBigint -Infinity, NaN, or Infinity as appropriate (e.g. 3n/5n -> NaN, -2n/0n -> -Infinity)
      • !predictable: return the appropriate (extended) Fraction.
  • sqrt:

    • Fractions: return the rational square root of a rational square. Otherwise:
      • predictable: return NaF
      • !predictable: return BigNumber of the square root for positive fractions (since Fraction is arbitrary precision, BigNumber is the best match and the client can always just convert to number to throw away precision); return Complex for negative fractions
    • Bigints: return the integer square root of perfect squares, otherwise proceed exactly as for Fraction
  • log(x, base)

    • Fractions: when x is a rational power of base, return that fraction. When x is 0, return -FractionInfinity. Otherwise:
      • predictable: return NaF
      • !predictable: return BigNumber approximation when x is positive, and Complex when x is negative
    • Bigints: when x is an integer power of base, return that bigint. When x is 0, return -Infinity. Otherwise:
      • predictable: return NaN
      • !predictable: proceed exactly as for Fractions.
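The bigint row of the log(x, base) table above could be sketched as follows in plain JavaScript. This is an illustration under simplifying assumptions, not mathjs code: logBigint is a hypothetical name, base is assumed to be at least 2n, the rational-power check of the !predictable branch is omitted, number is used in place of BigNumber, and a { re, im } object stands in for Complex.

```javascript
// Hypothetical sketch of log(x, base) for bigints, assuming base >= 2n.
function logBigint (x, base, predictable) {
  if (x === 0n) return -Infinity

  if (x > 0n) {
    // Search for an exact integer exponent k with base ** k === x.
    let power = 1n
    let k = 0n
    while (power < x) {
      power *= base
      k += 1n
    }
    if (power === x) return k
  }

  if (predictable) return NaN // no bigint answer exists

  // !predictable: approximate (number used here instead of BigNumber).
  const lnBase = Math.log(Number(base))
  if (x < 0n) {
    // Principal value: ln(x) = ln|x| + i*pi for negative x.
    return { re: Math.log(Number(-x)) / lnBase, im: Math.PI / lnBase }
  }
  return Math.log(Number(x)) / lnBase
}
```

So logBigint(8n, 2n, true) is the bigint 3n, logBigint(10n, 2n, true) is NaN, and logBigint(10n, 2n, false) is a number close to 3.32.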

How does that all sound? What are your feelings on questions M and B?

@josdejong
Owner

First, your description of the current state of mathjs w/r/t config.predictable is not quite reflected in the code as it stands. Nowhere in the code when predictable is false does mathjs consult config.number or config.numberFallback.

Ah, you're right, sorry for the confusion. config.number is used only when there is no information on what the desired type of output is.

M. (For "mixed") that is an interesting point. I agree with you, I think it's fine to keep it like it's working now: there is a set way to resolve mixed data types, like mixing a number and BigNumber will always return a BigNumber. So it boils down to: when predictable is true, mixing types will always return a predictable type, but different numeric values will not affect the returned data type. On a side note: it would be good to document the actual mixed type resolutions.

B. (For "barf") I agree that option (1) a sentinel bigint would be tricky to use. I expect that option (2) introducing an ExtendedBigint will alienate mathjs from JavaScript and make it harder to interop between those two, this is not my preference. Another option (4) would be to introduce a new Symbol('NaN') and use that as NaN value for bigint calculations, or use the number NaN value.

When doing a calculation of which the result cannot be represented as a bigint, somehow the user needs to be informed of that and the error must be propagated. We can do that in two ways: return a special value (like a NaN value), or throw an exception. In both cases, the user has to check the outcome. To me, having to write either if (isNaN(result)) {...} or try {...} catch (err) {...} is just a different syntax. (side note: a third option would be returning an [err, success] pair like more and more programming languages use, but that doesn't align with our current API I think). My preference goes to a try/catch, since it stops all calculations at once with an exception and rules out the possibility of accidentally producing a false result, hiding that there was an error somewhere during the calculation. Also, with an exception, you can pass an explanatory message that can help debugging. And third, this just aligns with how bigint is implemented in JavaScript, so no inconsistency there. Would it be ok with you to go for try/catch?
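The two checking styles compared above might look like this in plain JavaScript. The function names are hypothetical and this is not mathjs behavior; it only illustrates the throwing variant with an explanatory message, and how a caller could turn it back into a checked value. One detail worth keeping in mind for the special-value style: the global isNaN throws a TypeError when handed a bigint, so Number.isNaN is the safer test when a result may be either a bigint or NaN.

```javascript
// Hypothetical sketch: exact bigint division that throws when the
// mathematical quotient is not an integer.
function divideExact (x, y) {
  if (y === 0n || x % y !== 0n) {
    throw new RangeError(`${x}n / ${y}n cannot be represented as a bigint`)
  }
  return x / y
}

// Exception style: one try/catch around a whole computation.
function halfThenThird (x) {
  try {
    return divideExact(divideExact(x, 2n), 3n)
  } catch (err) {
    if (err instanceof RangeError) return null // caller-chosen fallback
    throw err
  }
}

// Special-value style: check the outcome after the fact.
function isUsable (result) {
  // Number.isNaN returns false for bigints; global isNaN would throw on them.
  return !Number.isNaN(result)
}
```

Here halfThenThird(12n) gives 2n, while halfThenThird(10n) stops at the second division and falls back to null.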

What I am thinking about though is whether we should change the cases where mathjs currently returns NaN, like math.divide(0, 0). Should we remove support for NaN altogether and throw exceptions instead? Python throws too in case of 0/0 for example. I have the feeling that NaN may be a mistake, just like having both null and undefined in JavaScript. This would be a serious breaking change though, and feels a bit like going for purity at the cost of practicality.

I like the idea of improving Fraction support by always returning a Fraction (and possibly internally using number to actually do the calculation, but convert that into a Fraction in the end, like with trigonometric functions).

@gwhitney
Collaborator Author

gwhitney commented Feb 7, 2025

  • Of course it would be fine to throw when predictable is true and a bigint operation leads to a non-integer value. If so, I think we should keep division to mean mathematical division and throw on 3n / 5n, and introduce a means for "integer division" like Python's //. How do you feel about that? I would just like each mathjs function to represent the same mathematical operation, regardless of the types it is operating on.

  • I think this throwing behavior should be unique to bigint since it has no special values. I think that the IEEE is quite accomplished and wise, and that NaN is actually better than throwing, because its propagation gives you the option of not looking until you care -- you can check for NaN at the end of a computation, you don't have to check for it everywhere, and you don't have to handle throws in the middle of unfinished computations -- maybe some later condition doesn't even use that particular intermediate result, and so the NaN is completely harmless. Exceptions eliminate that possibility to wait and see if an intermediate even ends up being used. I think that JavaScript has not been developed with the same level of deliberative care that the IEEE used and it is a pity bigint has no infinities nor a "NotABigint" value.

  • So in particular that means I maintain my suggestion to add negative infinity, not a fraction, and positive infinity to Fraction (as -1/0, 0/0, and 1/0) because they are entirely natural elements of the representation that are currently artificially disallowed, they have clear semantics, and for all the reasons that IEEE kept infinities and NaN in floats. How do you feel about that? There is no issue of "breaking a standard" with Fraction, since it is not an official "Rational" type of the JavaScript language.

  • I strongly believe that in the approach we are developing, when predictable is true, then calls like sin(fraction(1/3)) should return NotAFraction. One of the hallmarks of the design/philosophy of this rational arithmetic (stated well in the docs of the fraction class) is that when you get a result, it is a mathematically exact value. There is no rational number equal to the sine of 1/3 or the square root of two, so these things should return NotAFraction to tell you so. (Or you can turn predictable off and get a BigNumber floating point approximation, and/or we could add interval arithmetic or ball arithmetic types that would give you two rationals the value lies between or a rational number and a bound on how far the true value could be, respectively.)
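To show how naturally -1/0, 0/0, and 1/0 fit into a numerator/denominator representation, here is a small sketch in plain JavaScript. It is not the Fraction.js API: frac, mul, reciprocal, and div are hypothetical names, values are { n, d } pairs of bigints, and only multiplication and division are covered (addition would need extra care for the infinite cases).

```javascript
// Greatest common divisor of two bigints (result is nonnegative).
function gcd (a, b) {
  a = a < 0n ? -a : a
  b = b < 0n ? -b : b
  while (b !== 0n) {
    const t = a % b
    a = b
    b = t
  }
  return a
}

// Hypothetical extended fraction: reduced { n, d } with d >= 0n, where
// 1/0 is +infinity, -1/0 is -infinity, and 0/0 is "not a fraction" (NaF).
function frac (n, d) {
  if (d < 0n) { n = -n; d = -d }
  const g = gcd(n, d)
  if (g === 0n) return { n: 0n, d: 0n } // 0/0: NaF
  return { n: n / g, d: d / g }
}

function mul (x, y) {
  return frac(x.n * y.n, x.d * y.d)
}

// Swap numerator and denominator, keeping the sign on the numerator.
function reciprocal (x) {
  return x.n < 0n ? { n: -x.d, d: -x.n } : { n: x.d, d: x.n }
}

function div (x, y) {
  return mul(x, reciprocal(y))
}
```

With these definitions no operation needs to throw: 1 divided by 0 is 1/0, -2 divided by 0 is -1/0, 0 divided by 0 is 0/0, infinity times 0 is 0/0, and any operation on 0/0 stays 0/0, mirroring how NaN propagates in floating point.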

How do these ideas sit with you? I think we are close to being able to start systematizing mathjs functions along these lines. Thanks for the productive conversation!


2 participants