Which doesn't imply anything about how fast the operations can be implemented for fixed size numbers.
Also, did you even read the article you linked? The lowest complexity for multiplication is an open question; it's absurd to say that it "has the same big-O complexity".
Your last point there is nonsense. We can reduce division to multiplication, so we can say division is O(M(n)), where M(n) is the complexity of multiplication, even if we don't know what M(n) is. Note that big-O is an upper asymptotic bound, not an exact asymptotic bound (that's big-Theta).
In fact, one can also show that M(n) is O(D(n)) (where D(n) is the time needed to divide n bit numbers). So they really are big-theta of each other.
(See Aho, Hopcroft, and Ullman, The Design and Analysis of Computer Algorithms, 1976, section 8.2)
As explained in the Wikipedia article, we use the Newton–Raphson method with O(log n) steps to find the reciprocal of the divisor, then we multiply this reciprocal by the dividend. Each step of the reciprocal method requires a single multiplication, with the precision doubling at each step. So the steps take M(1), M(2), M(4), ..., M(n/2), M(n) time. Since multiplication isn't sublinear, M(cn) ≥ c·M(n) for c ≥ 1 and "most" n (very roughly); thus, M(n) + M(n/2) + M(n/4) + ... ≤ M(n) + 1/2·M(n) + 1/4·M(n) + ... ≤ 2·M(n). Adding the final multiplication, we get an upper bound of 3·M(n), or O(M(n)).
See the book referenced. Basically, if you implement division by Newton-Raphson, using multiplication as a subroutine, the complexity is O(M(n)).
In the other direction, one can use division to do multiplication by: (1) doing reciprocal with division, (2) doing squaring with reciprocal, and (3) doing multiplication with squaring. All these operations (division, multiplication, reciprocal, and squaring) turn out to be equivalent in complexity up to a constant factor. All this is asymptotically in n, the number of bits.
You don't understand what "reduce to" means in this context. In the intended meaning (where X reduces to Y means that if we can do Y in O(T(n)) time, we can do X in O(T(n)) time) those aren't known. We can't reduce division to a constant number of additions of the same size.
Adding on to this, and somewhat related: binary addition, even with carry lookahead, is a multi-clock-cycle process for non-trivial word sizes. Arithmetic is not trivial by any means. We all just take it for granted.
By "non-trivial word sizes," I presume you mean larger than a single word of most computers? My understanding is that modern BDD/ZDD techniques have gone a long way to showing how to do the entire operation in a single pass for most sized numbers. Such that propagating the carry information isn't really how it is done, anymore. (This is a genuine question, btw. I have been a long way from my VLSI classes back in college.)
I'm talking about constructing boolean circuits. The usual parallel prefix algorithm yields a circuit of depth O(log n) with O(n log n) gates. Using some trickery, however, this can be reduced to O(n) gates, with a constant factor increase in depth.
Not for cryptographically useful groups like elliptic curves. There, scalar multiplication is defined as repeated addition and can be made fast by double-and-add. The inverse operation, division (i.e. the discrete logarithm), is computationally infeasible for large curves.
Some operations have much more complex inverses, in a computational sense, so I don't think it's so simple. The key to showing this is to reduce the problem to its inverse: https://en.wikipedia.org/wiki/Reduction_(complexity)
I noticed it is very fast when I was experimenting with algorithms for unbiased bounded random numbers https://dotat.at/@/2022-04-20-really-divisionless.html - the fancy nearly and really divisionless algorithms had much less advantage on an Apple M1 than on an old Intel CPU.
It's actually not more "complicated" in the sense of operation count. You divide in the standard algorithm by: shifting your divisor so it's within one bit of the dividend (or one "place" if you're doing non-binary math), subtracting (or doing a one-digit divide for human numbers), and then shifting the result left by one place and repeating until you're out of divisor.
Multiplication is completely symmetric: shift off the lowest bit (digit), multiply (which is just logical AND for binary), add to the result, shift the result right, and repeat.
The reason multiplication is faster than division isn't about complexity; it's about dependency. Each division step depends on the previous iteration, whereas multiplication can be parallelized.
It's a slight tangent, but in the general realm of division being complicated, divide by zero is a longstanding annoyance. Does anyone know (or can refer to) a reasonable definition of +,-,*,/,% (in their usual meanings) on rational numbers such that A op B always evaluates to a rational?
A/1 divided by 0/1 being stored as 1/0 and then propagated around seems reasonable, but I keep putting off the bookwork of finding out which arithmetic relations still hold at that point. I think one loses folding multiply-by-zero to zero, for example: A * 0 is now only 0 for A known to have a non-zero denominator.
Context is a language with arbitrary precision rationals as the number type, and all I'm really looking for is a way to declare that the type of division is a rational for any rational arguments. Searching for this mostly turns up results on the naturals or integers which I'm not particularly interested in.
Does anyone know (or can refer to) a reasonable definition of +,-,*,/,% (in their usual meanings) on rational numbers such that A op B always evaluates to a rational?
Given the usual definition of multiplication (and division as its inverse), this is not possible:
Assume some multiplicative inverse 0⁻¹ of 0 exists. Then, writing · for multiplication, since 0·x = 0 for every x, we would have 1 = 0⁻¹·0 = 0, contradicting 1 ≠ 0. So no such inverse can exist.
The Riemann sphere has always existed beyond my grasp of mathematics. I remember the maths undergraduates being excited about it. Thank you for the references, it looks like I need to learn more before proceeding with certainty.
In ordinary arithmetic and algebra, there isn't any reasonable definition of n/0 because if it is allowed, then one can prove contradictory things.
Outside of proof systems/pure math, like in software you write, you can define n/0 to be anything you like, you just can't depend on being able to derive mathematically consistent results.
There are also more exotic math systems. Infinity is not literally a number (it would lead to contradictions), but mathematicians can happily deal with number systems augmented with infinity as an extra member -- one just has to step carefully, since it still isn't a number. The "number system" has one non-number member requiring special handling.
Similarly there are modern approaches to infinitesimals where there's a thing "e" which is not zero, but e^2 is zero. That is not normal arithmetic but when handled carefully it can be useful.
Huh. That's surprising but quite persuasive, thank you for the references. I like the general view that a non-axiomatic divide can't introduce unsoundness so define it however is useful.
It has somewhat kicked the can down the road to defining the multiplicative inverse, but that's still a good step forward. Thank you
Not a very satisfying answer (feels like if a satisfying answer to this question was possible, maths would look very different) but I think the trick to not getting annoyed by it is to shift your perspective rather than try and redefine maths. It's not that division is a badly behaved operator; it's that division isn't an operator. It's a shorthand for multiplication by the reciprocal. Since the reciprocal of 0 doesn't exist, we can no more multiply by it than we can compute 5 + the sound of one hand clapping. And we don't fret over how the addition operator's inadequacy there is unsatisfying!
It's worth noting that you can generally divide much faster if you know the divisor ahead of time, because you can turn it into a multiplication problem. But doing this takes some effort, as compared to turning subtraction into addition which is trivial.
It's worth noting that this asymmetry between addition and multiplication happens because our usual representations of numbers (i.e. on a piece of paper or in a computer) favour addition over multiplication. You can come up with different representations of rationals; for example you could store prime exponents as integers, so the number 20 gets stored as (2,0,1,0...) because it's 2^2 × 3^0 × 5^1, or 5/6 would be (-1, -1, 1, ...).
With a representation like this, or just by taking logs if you like, you can come up with a representation where multiplication and division are very cheap but addition and subtraction are more expensive.
Nitpick, 50 is 2^1 × 5^2, not the other way around. But anyway, thanks a ton for this post, I felt my brain shift at a new way of thinking about numbers, seriously. I have to wonder what serious and useful implementations of this method might have been done somewhere.
By the way, if you're wondering if anyone has done anything useful with the method of describing numbers as prime exponents I mentioned the answer is yes, depending on your definition of "useful".
The encoding in terms of prime exponents is known as the Gödel encoding, and it's a fairly important step in proving things like Gödel's incompleteness theorems and the undecidability of the Halting problem.
Essentially it's a one-to-one map between natural numbers and finite sequences of natural numbers (since you can encode (a,b,c,d...) as 2^a × 3^b × 5^c × 7^d etc), which turns out to be mathematically a convenient thing to have.
I don't think that anything exists which is going to be better than the normal representations we use in anything except fantastically niche situations.
Mostly a tongue-in-cheek comment, but if you don't know the log rules at all, check out log tables [0]. You'll see that if you use log values then multiplication and division become addition and subtraction, and are equally simple.
Relies on you either having stored the needed data or efficiently calculating logarithms and antilogarithms.
That's essentially how IEEE 754 floating point works. You can think of it as a piecewise-linear approximation of the base-2 logarithm.
For non-negative values, the exponent (upper) bits give the integer part of the logarithm and the significand (lower) bits approximate the fractional part (i.e., the mantissa).
In other words, you can do:
#include <bit>       // std::bit_cast (C++20)
#include <cassert>
#include <cstdint>

float approxLog2( float value )
{
    assert( value > 0.0f );  // only meaningful for positive, normal inputs
    auto bits = std::bit_cast< int32_t >( value );
    auto exponent = ( bits >> 23 ) - 127;            // unbiased exponent: integer part
    auto significand = bits & ( ( 1 << 23 ) - 1 );   // low 23 bits: scaled fractional part
    return exponent + significand / static_cast< float >( 1 << 23 );
}
This will be exact at powers of two, and within 0.0860748291 of the true value in between (always being slightly too low).
Actually, everything will work out fine if we just could pick the optimal algorithm for each situation. How? Use an oracle to pick the right algorithm!
Dr. Douglas Jones wrote a pretty great article on the subject, https://homepage.cs.uiowa.edu/~jones/bcd/divide.html. Depending on the divisor it can be a bit messy, since reciprocal multiplication sometimes needs extra precision. For example, on an 8-bit machine, x/6 == (x*171>>10) for every 8-bit x, which is nice because 171 fits in 8 bits. But x/105 == (x*313>>15) needs 9 bits to hold 313. On some systems (thinking 8-bit AVR), it'd probably be faster to branch on the 3 cases rather than perform a full 16-bit multiplication.
To dive into this a bit more without too much math jargon: in modular arithmetic, every integer division is a trivial integer multiplication, but the number you multiply by has to be figured out beforehand.
The above doesn't hold for divisors sharing a factor with the modulus, but when the modulus is a power of two those divisors are just right shifts, so that part is easy.
E.g. under mod 35, if I want to divide by 3 I can just multiply by 12, since 12 × 3 ≡ 1 (mod 35).
Another e.g.: under mod 35, if I want to divide by 11 I can just multiply by 16, since 16 × 11 ≡ 1 (mod 35).
I can do this for any number that doesn't share a factor with 35, since there's always a multiplier for that number that gives 1. I could also construct base-2 examples similarly. It's called the multiplicative inverse.
To clarify, there are two separate notions of division-as-multiplication discussed here.
Modular division is straightforward to convert to a multiplication; however, this is mainly for specialized applications, like cryptography and number theory. It isn't ordinary division: modular "division" by 2 can make the value larger, which is rarely what you want in everyday code.
Ordinary flooring division can also be converted to a multiplication problem when the dividend is bounded. This uses the "magic number" approach where you multiply by a rounded scaled reciprocal, and then do some work to correct the error from rounding.
For flooring integer division with a bounded dividend and known constant divisor, the idea is to multiply by a scaled reciprocal. For example with 32-bit division by 3, you might precompute 2^32 / 3, multiply by that, and then flooring-divide by 2^32 (right shift).
It gets tricky because 2^32/3 is not an integer, so you must round it. This introduces error; the idea is to show that the error is wiped out by the flooring divide. I did the monster math here:
Exactly. If you know you’re going to later be dividing by 23.7, you can precompute 1/23.7 and store that as a constant. Then you can later multiply by it.
Doesn't the risk of lack of precision apply to both floating point and fixed-point?
I'd think the reciprocal would need to be exactly represented in the type, or to have more bits than the original type to produce a correct result in the last bit (or two bits?).
Also, with integers, a signed right shift is rounding down (towards negative infinity), whereas the division operator/instruction in many languages/hardware is rounding towards 0.
To adjust the rounding, you'd add the sign-bit to the first fractional bit before shifting the last step. Let's say that 'x' is a signed long, and a signed long has 64 bits, then:
> Doesn't the risk of lack of precision apply to both floating point and fixed-point? I'd think the reciprocal would need to be exactly represented in the type, or to have more bits than the original type to produce a correct result in the last bit (or two bits?).
Yes, but this was historically okay on an x87 FPU which had more precise representation than the common external formats.
Not being snarky, at the simplest level by knowing x ahead of time you can change your code from y = n/2 into y = n * 0.5. Not wildly important by itself, but in a loop - maybe… I suspect compilers optimize for that already.
I still don't get it. How do you know that 1/x = 0.abc... without performing the division? I mean in the general case, not in something matching binary tricks, such as 1/3 for example. Unless you mean you somehow know the value of 1/x ahead of time. But where does it come from?
If you know the divisor x ahead of time you can pre-compute 1/x at compile time, so that your actual compiled code never does the division--only your compiler does. Your actual compiled code just does the multiplication by a pre-computed constant (and the compiled code doesn't have to know that that constant was pre-computed as the reciprocal of x).
Often, you can amortize a division or reciprocal by calculating it once and then reusing it. Frequently the divisor is dynamic, but reused locally.
For example, if you want to normalize a 3D vector you could do:
mag = sqrt(x*x + y*y + z*z)
x /= mag
y /= mag
z /= mag
That's three divisions with the same divisor. But you could instead do:
invMag = 1.0 / sqrt(x*x + y*y + z*z)
x *= invMag
y *= invMag
z *= invMag
There's still a single division (or reciprocal) done here. But you've eliminated at least the other two. (And it's even better if you have an rsqrt function.)
I enjoyed the economics answer and comment by Mark Booth:
> If generic floating point division were more important to modern CPU's then it might make sense to dedicate enough silicon area to make it single cycle, however most chip makers have obviously decided that they can make better use of that silicon by using those gates for other things. ..
> CPU manufacturers would only dedicate a large swath of silicon to a low latency floating point divide unit if they couldn't use that silicon more effectively to speed up the CPU in other ways. Generally speaking though, having more long latency FDIVs are a more efficient use of silicon than fewer shorter latency FDIVs
But you'd have to cram so many levels of logic into that single cycle that you'd have to decrease the clock frequency to meet timing constraints, meaning that code that is not dominated by division would slow down. And a multi-cycle division can often happen in parallel with other operations.
If you simply try to take a design and make it take a single cycle while using the same area, yes. But usually you can trade off area (and power) for latency, by essentially having redundant logic throughout the operation. That's the point being made: the CPU manufacturers don't really consider it worth pushing this tradeoff (which is often quite non-linear).
Remember the Cray-1. It had no division instruction but could perform reciprocals. Division was performed as 1 reciprocal and 2 or 3 multiplications (depending on the precision needed). Thus division was only a little more complex than multiplication. (A reciprocal instruction took about twice the time of a multiply.)
So really, it's tradition that division (which is needed relatively rarely) is not performed quickly in hardware. The old time versus silicon tradeoff strikes again - you can have it quick but it will be expensive, how much do you want it?
Re the original question - (almost) anything can be done in enough silicon. Both multiplication and division could be done as very fast "ROM" lookups, but that would use an enormous amount of silicon.
I find it interesting that the arithmetic operations which are more difficult for school children are also more difficult for computers (addition/subtraction < multiplication < division).
If you set up a logarithmic number system, division would be easy but addition and subtraction would be very difficult; worth noting that difficulty is a function of the representation.
Yes, but converting an integer into its reciprocal (which is what you're doing in decimal) is the computation that makes it more complex; you're just offloading that from the machine to your head. Try doing this with x/31: show me the operation to convert 31 into its reciprocal in decimal and you'll see where the complexity lies.
Conceptually yes, computationally no. For starters, no tricky rules to observe when multiplying, but when dividing? Oh boy, we best not put zeros wherever we please or we shit the bed real quick. On top of that, you ever multiplied integers into non-integers? I haven't. Real easy to get real fucked-up numbers REAL fast if you're not careful when dividing. Add in floating point? Forget about it. Wanna multiply up to some big number using only small primes? Easy! Wanna divide down to the smallest possible prime factors? Good luck, buddy. Division is a whole different ball game. Cryptography relies on that fact.
The result you get can be reversed with multiplication, yes. But the operation of division is not simply the reverse of the operation of multiplication. Subtraction is the exact opposite operation of addition. Division is multiplication by a reciprocal.
Probably because add, sub, and mul are int -> int and div is int -> real. The fundamental operation maps from ℵ0 to ℵ1, which the other operations don't do. Intuitively, the complexity comes from this fact.
1. We're discussing here floored integer division, which outputs an integer, not something more general like a real.
2. Even if we were discussing exact division, the ratio of two integers is a rational. The rationals are countable. More generally, any image of a countable set is countable, and the cartesian product of two countable sets is countable; there is no way you could get something uncountable from something like this.
3. Aleph_1 is not (necessarily) the same thing as the cardinality of the reals, which is 2^(aleph_0). The statement that these are the same is known as the continuum hypothesis and is undecidable in standard mathematics.
...and, also, none of this really relates to the complexity of performing integer division, even ignoring the fact that these statements are false.
https://en.wikipedia.org/wiki/Computational_complexity_of_ma...