Fast and accurate sine/cosine approximation

Trigonometric functions are costly operations and can slow down your application if they are extensively used. There are two reasons why: First, Math.sin() is a function, and thus needs a function call which simple eats up some time. Second, the result is computed with much more precision than you would ever need in most situations.

Most often you just want the periodic wave-like characteristics of the sine or cosine, which can be approximated in various ways. One common way of making it faster is to create a lookup-table by computing the sine at discrete steps and storing the result in an array. For example:

var sineTable:Array = [];
for (var i:int = 0; i < 90; i++)
{
    sineTable[i] = Math.sin(Math.PI/180 * i)
}

Due to the symmetry of the sine wave, it's sufficient to compute one quadrant only (0..pi/2), and the other 3/4's of the circle can be computed by shifting and wrapping the input value. The biggest drawback is that the values are stored at a fixed resolution and so the result is not very accurate. This can be enhanced with linear interpolation:

x = 22.5;
y = sineTable[int(x)] + (sineTable[int(x + .5)] - sineTable[int(x)]) / 2;

Much better, but yet the error exists. It also involves accessing array elements which makes the code rather slow. Another technique uses taylor series approximation:

sin(x) = x - (x^3)/3! + (x^5)/5! - (x^7)/7! + ...

Like with the lookup-table, evaluating this term is costly.

After searching for alternatives, I finally found a fantastic solution using a quadratic curve which blows everything away in terms of performance and accuracy. For a detailed derivation, please follow the link because I won't go into it.

I did minor optimizations to figure out what AS3 likes most, and arrived at some code that can be up to 14x faster, while still being very accurate. However, you have to use it directly - do not place the code inside a function, because the additional function call sweeps out the performance gain, and you are left with an approximation that is actually slower compared to a native Math.sin() or Math.cos() call. Also note that cos(x) = sin(x + pi/2) or cos(x - pi/2) = sin(x), so computing the cosine is just of matter adding pi/2 to the input value.

Download source: fastTrig.as.

Below is a simple visualization to show you the quality of the approximation. The high precision version can replace the Math.sin() and Math.cos() calls in nearly all situations.

Low precision sine/cosine (~14x faster)

//always wrap input angle to -PI..PI
if (x < -3.14159265)
    x += 6.28318531;
else
if (x >  3.14159265)
    x -= 6.28318531;

//compute sine
if (x < 0)
    sin = 1.27323954 * x + .405284735 * x * x;
else
    sin = 1.27323954 * x - 0.405284735 * x * x;

//compute cosine: sin(x + PI/2) = cos(x)
x += 1.57079632;
if (x >  3.14159265)
    x -= 6.28318531;

if (x < 0)
    cos = 1.27323954 * x + 0.405284735 * x * x
else
    cos = 1.27323954 * x - 0.405284735 * x * x;
}

High precision sine/cosine (~8x faster)

//always wrap input angle to -PI..PI
if (x < -3.14159265)
    x += 6.28318531;
else
if (x >  3.14159265)
    x -= 6.28318531;

//compute sine
if (x < 0)
{
    sin = 1.27323954 * x + .405284735 * x * x;
    
    if (sin < 0)
        sin = .225 * (sin *-sin - sin) + sin;
    else
        sin = .225 * (sin * sin - sin) + sin;
}
else
{
    sin = 1.27323954 * x - 0.405284735 * x * x;
    
    if (sin < 0)
        sin = .225 * (sin *-sin - sin) + sin;
    else
        sin = .225 * (sin * sin - sin) + sin;
}

//compute cosine: sin(x + PI/2) = cos(x)
x += 1.57079632;
if (x >  3.14159265)
    x -= 6.28318531;

if (x < 0)
{
    cos = 1.27323954 * x + 0.405284735 * x * x;
    
    if (cos < 0)
        cos = .225 * (cos *-cos - cos) + cos;
    else
        cos = .225 * (cos * cos - cos) + cos;
}
else
{
    cos = 1.27323954 * x - 0.405284735 * x * x;

    if (cos < 0)
        cos = .225 * (cos *-cos - cos) + cos;
    else
        cos = .225 * (cos * cos - cos) + cos;
}

61 thoughts on “Fast and accurate sine/cosine approximation”

  1. The article on DevMaster mentions that you should avoid branching, especially in loops. I wonder if the same problem applies to AS3? Is there a speed improvement if you replace the branches by binary logic twists?

  2. Hi, it didn’t work for me (the author also says that this is not valid C code at all) Binary & converts the result to an integer. Only this works:
    x -= int(x > y) && z;

    But the code above is a lot slower than the pure if..else statements.

  3. Nice work Michael – this one goes a long way back. I think it may have been in Hart’s ‘Computer Approximations.’

    One thing I’ve found helpful when working with compilers that don’t support function inlining is to use the C preprocessor or awk or something similar to do all my debugging with library calls and then have the tool replace them with inline approximations going into production.

    Another benefit of having inline code is that you can combine the sin or cos computation with whatever equation it’s used in and realize additional gains by having the trig function result immediately available in a register.

    regards,

    – jim

  4. hey cool I didn’t know there is even a whole book about approximations only ;-) talking about preprocessors, what IDE do you use ? I was trying to integrate the C preprocessor into FlashDevelop, but failed at configuring make or ant to use it.

  5. Michael:

    Another classic is Cody’s ‘A Software Manual for the Elementary Functions’. There is a relatively new book by Muller – ‘Elementary Functions: Algorithms and Implementation’, but I haven’t read it yet.

    I’ve given up on direct integration of other tools to any IDE – don’t have the bandwidth to work out all the issues, so I run awk or another text processor directly on the .AS files — very old school.

    Best of luck in your continued (and superb) efforts!

    regards,

    – jim

  6. I’ev tested this in AS2, and it’s not faster. So this optimisation trick is ONLY for AS3. Just to let you know.

  7. Thanks. This post was useful and more accurate than using an MS-Excell trendline function (2nd degree). Made minor changes that should speed things up since addition is cheaper than mult usually. This is the c code:
    double sin(double rad){
    double sine;
    if (rad < 0)
    sine = rad*(1.27323954 + .405284735 * rad);
    else
    sine = rad*(1.27323954 – 0.405284735 * rad);

    if (sine < 0)
    sine = sine*(-0.225 * (sine + 1) + 1);
    else
    sine = sine*(0.225 * (sine – 1) + 1);
    return sine;
    }

  8. @DYessick: I tried your version in AS3 and there is no noticeable difference unfortunately. Probably due to the weirdness of AS3.

    Overall the high precision approximation gives a nice boost though, although it is closer to ~6.7 faster on my tests. This could depend on a variety of factors I imagine, not least of all your processor architecture.

  9. You wrote:

    “[…] it’s sufficient to compute one quadrant only (0..pi), and the other 3/4’s […]”

    But … 1/4 of 2pi is pi/2 and not pi, so a quadrant is (0..pi/2), i’m wrong ?

    Greg

  10. More over, you can use a Quadrant but an Octant is sufficient due to the symetry (x y, -x y, -x -y, x -y, y x, -y x, -y -x, y -x)

  11. That makes a difference? Damn I didn’t know that…I’ll have to go find my release player, haven’t even used that thing in forever.

  12. This may be a bit outdated – I think Adobe took pretty harsh to your approximation genius and generated a super fast trig table…

    Compiling using the Flex 3.2 SDK, I get these results…

    iterations: 1,000,000
    algorithm: FastMath [ high-quality approximation ]
    angle: 1.302888286613767

    FastMath.sin( 1.302888286613767 ) = 0.9643253756148902
    – Time: 475 ms

    iterations: 100,000
    algorithm: Math [ flash ]
    angle: 1.302888286613767

    Math.sin( 1.302888286613767 ) = 0.9643267785343497
    – Time: 167 ms

    Can anyone else confirm this? Thanks!

  13. hi matt,

    I cannot confirm this. make sure you are using the mxmlc compiler with debug=false and run the swf inside a release player. otherwise you get misleading results. with the Flex3.2 SDK the high quality approximation is now almost exactly 10x faster on my machine.

  14. Thanks alot, this really helps. But I was wondering if anyone could tell me where the numbers 0.405284735 and 0.225 came from?

  15. Hi,

    I know this is outdated, but these results may help somebody – these tests were done in Flash Lite 2.0 on a Nokia 6121 (I will test on more phones)

    Calculating both sin and cos of 100 angles using Math.sin and Math.cos
    = 6216 ms

    Calculating the same thing using FastMath inline
    = 4685 ms

    In Flash Lite FastMath is around 25-30% faster

    Just a note (it says this in the post too) that wrapping FastMath in a function call negates the speed increase, and you get the same results as just using Math.sin/cos.

  16. for sin, radians 0 <= x <= pi
    temp = x * 2 / pi = x * 0.636619…
    sin = temp * ( 2 – temp )
    sin = sin * ( 0.225 * ( sin – 1 ) + 1 )

    without refinement step, maximum error is 0.056 at +- 0.47
    with refinement, maximum error is 0.001 at +- 1.38

    for degrees, swap pi with 90

  17. Thanks for this. Excellent post.

    Does anyone have a similar technique for Atan? I’m scratching my head a bit on this one.

    Thanks.

  18. Man, thanks for taking the time to post this. I’m sure you’ve been getting a few of these generic thank-you’s, but this blog is basically a godsend for us as3 game devs!

  19. I am using C in a different application, but found this useful. I will have to compare this quadratic with my current approximation. Mine, after wrapping the function, this is a modified form using the first two iterations of the Taylor series centered around x=0, but it results in approx 1.5% max error, which is found in the area between pi/4 to pi/2.
    y = x + x*x*x*fact3
    fact3 = -1/6.25, a constant I have defined for the preprocessor that normalizes sin(pi/2) = 1.0.

    Comparing, it’s 3 multiplies & 1 add so we’re probably equal performance, except maybe this form might take advantage of SSE instructions in a more beneficial way ???

    I will compare the accuracy of the quadratic method.

    Thanks for the post.

  20. your mileage may vary using, obviously, a different language and platform, but for those coding in plain C or C++, the following performs slightly faster than the Low Precision sin/cos and is incredibly more accurate.
    above low precision sin: 0.085516% avg error
    max error ~ 27%
    code below for sin: 0.006660% avg error
    max error ~1.5%

    both were run computing millions of loops of 1000-step increment from x = 0 to 2*pi.

    Below slightly outperforms in speed, greatly exceeds accuracy:
    #define MM_PI 3.14159265358979f
    #define D_PI 6.28318530717959f

    //globals
    static const float p2 = MM_PI/2.0f;
    static const float mp2 = -p2;
    static const float fact3 = 0.148148148148148;//6.75f;
    float tmp;
    bool sign;

    static inline float f_sin(float x) {
    float y = 0.0f;
    float tmp = 0.0f;
    bool sign = 0;
    if ((x>D_PI) || (x<-D_PI)) x = fmod(x,D_PI);

    if (x MM_PI) {
    x = D_PI – x;
    sign = 1;
    }

    if (x <= p2) y = x – x*x*x*fact3;
    else {
    tmp = x – MM_PI;
    y = -tmp + tmp*tmp*tmp*fact3;
    }

    if (sign) y = -y;

    return y;
    }

    Then for curiosity I decided to compare accuracy the high precision version:
    accuracy is very comparable, speed is not much worse than the lower precision version, so the cubic approximation I have posted above seems to be in my opinion a good balance of speed and precision.

  21. I tried your fast version against Math lib (FlashDevelop, Flex SDK 4, FP Debugger 10.1)


    package TestSuit
    {
    import flash.display.Sprite;
    import flash.events.MouseEvent;
    import flash.utils.getTimer;

    public class FastMathTest extends Sprite
    {
    private var Outer:int, Inner:int;
    private var i:int, j:int;

    private var Angle:Number;
    private var Rad:Number;

    private var SinA:Number;
    private var CosA:Number;

    private var SinB:Number;
    private var CosB:Number;

    private var Time:int;
    private var TimeA:int = 0;
    private var TimeB:int = 0;

    public function FastMathTest()
    {
    Inner = 100;
    Outer = 10000;

    stage.addEventListener(MouseEvent.CLICK, RunTest);
    }

    private function RunTest(e:MouseEvent):void
    {
    i = Outer
    while(--i != 0)
    {
    Angle = Math.random() * 360;

    j = Inner
    while(--j != 0)
    {
    Time = getTimer();

    /* Fast Math.sin() & Math.cos() */
    Rad = Math.PI / 180 * Angle;

    //always wrap input angle to -PI..PI
    if (Rad 3.14159265)
    Rad -= 6.28318531;

    //compute sine
    if (Rad 3.14159265)
    Rad -= 6.28318531;

    if (Rad < 0)
    CosA = 1.27323954 * Rad + 0.405284735 * Rad * Rad
    else
    CosA = 1.27323954 * Rad - 0.405284735 * Rad * Rad;

    TimeA += getTimer() - Time;
    }

    j = Inner
    while(--j != 0)
    {
    Time = getTimer();

    SinB = Math.sin(Math.PI / 180 * Angle);
    CosB = Math.cos(Math.PI / 180 * Angle);

    TimeB += getTimer() - Time;
    }
    }

    trace("Fast", TimeA, SinA, CosA);
    trace("Math", TimeB, SinB, CosB);
    }
    }
    }

    Fast 486 -0.8421372695852312 -0.6367761151784564
    Math 420 -0.8114847762502067 -0.5843735602456291

  22. I was wondering if there is an arc sine optimisation (to aim a Display object to your mouse), and I found this aproximation:
    asin = x + x * x * x * x * x * 0.4971
    But I don’t know if it faster than the Math.asin() function. How would I know this?

Comments are closed.