fixed point was a mistake
-
@ratsnakegames and I commend you for that! In a past life I worked for a high-frequency trading firm and their software did financial calculations using floating point numbers 🤷
@gabrielesvelto @ratsnakegames
Eh, as long as there's enough bits in the mantissa...
-
fixed point was in fact *not* a mistake
@eniko following with interest.
-
fixed point was in fact *not* a mistake
I have completed the triangle rasterizer fixed point conversion
The benchmarks have all improved between 0 and 35%
Except for threaded random triangles with flat color. That metric has increased 106%. As in it's twice as fast as before
I have no idea why but I'm fairly sure I've ruled out bugs in my benchmarking
I am very confused
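
For readers wondering what a fixed point rasterizer conversion looks like, here is a minimal sketch. It is not eniko's code: the 4 fractional bits, the names `to_fixed`, `edge` and `covered_pixels`, and the inclusive edge test (no top-left fill rule) are all assumptions made for illustration. The point is only that once vertices are snapped to 1/16 of a pixel, the inner loop is pure integer arithmetic.

```python
# Hypothetical fixed-point format: 4 fractional bits, i.e. 1/16 pixel steps.
FRAC_BITS = 4
ONE = 1 << FRAC_BITS


def to_fixed(x: float) -> int:
    """Snap a float coordinate to the nearest 1/16 of a pixel."""
    return round(x * ONE)


def edge(ax, ay, bx, by, px, py):
    """Integer edge function: sign says which side of a->b the point p is on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def covered_pixels(tri, width, height):
    """Return (x, y) pixels whose centers are inside or on the triangle."""
    (ax, ay), (bx, by), (cx, cy) = [(to_fixed(x), to_fixed(y)) for x, y in tri]
    # Swap two vertices if needed so all three edge tests use the same sign.
    if edge(ax, ay, bx, by, cx, cy) < 0:
        (bx, by), (cx, cy) = (cx, cy), (bx, by)
    hits = []
    for y in range(height):
        py = (y << FRAC_BITS) + ONE // 2  # pixel center, in fixed point
        for x in range(width):
            px = (x << FRAC_BITS) + ONE // 2
            if (edge(ax, ay, bx, by, px, py) >= 0
                    and edge(bx, by, cx, cy, px, py) >= 0
                    and edge(cx, cy, ax, ay, px, py) >= 0):
                hits.append((x, y))
    return hits
```

A real rasterizer would step the three edge values incrementally per pixel and apply a fill rule so shared edges are not drawn twice; both are left out here to keep the sketch short.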
-
fixed point was in fact *not* a mistake
@eniko fixed point is very fun
-
I have completed the triangle rasterizer fixed point conversion
The benchmarks have all improved between 0 and 35%
Except for threaded random triangles with flat color. That metric has increased 106%. As in it's twice as fast as before
I have no idea why but I'm fairly sure I've ruled out bugs in my benchmarking
I am very confused
@eniko Hyperthreading? That has very different behavior depending on the kind of code being run. A single core might have twice the int resources to handle two threads of throughput but not twice the float resources (and then it depends on how bottlenecked by things like memory latency the benchmark is).
-
@eniko Hyperthreading? That has very different behavior depending on the kind of code being run. A single core might have twice the int resources to handle two threads of throughput but not twice the float resources (and then it depends on how bottlenecked by things like memory latency the benchmark is).
@lina I don't know? >_> I just split the screen between 4 worker threads that all draw each triangle to their quadrant
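
The setup described above could be sketched roughly like this. Everything here is an assumption for illustration: the names `quadrants`, `clip_box`, `draw_in_quadrant` and `render` are made up, and a filled bounding box stands in for the real triangle fill. Each worker receives every shape but only writes inside its own quadrant, so no locking is needed. Python threads will not actually run this in parallel because of the GIL, so treat it as a picture of the structure, not of the speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def quadrants(width, height):
    """Split the screen into four (x0, y0, x1, y1) rectangles, end-exclusive."""
    mx, my = width // 2, height // 2
    return [(0, 0, mx, my), (mx, 0, width, my),
            (0, my, mx, height), (mx, my, width, height)]


def clip_box(box, rect):
    """Intersect a shape's bounding box with a quadrant; None if they miss."""
    x0, y0 = max(box[0], rect[0]), max(box[1], rect[1])
    x1, y1 = min(box[2], rect[2]), min(box[3], rect[3])
    return (x0, y0, x1, y1) if x0 < x1 and y0 < y1 else None


def draw_in_quadrant(framebuffer, rect, boxes, color):
    """Worker body: draw every shape, clipped to this worker's quadrant."""
    for box in boxes:
        clipped = clip_box(box, rect)
        if clipped is None:
            continue
        x0, y0, x1, y1 = clipped
        for y in range(y0, y1):
            row = framebuffer[y]
            for x in range(x0, x1):
                row[x] = color  # only this worker ever touches these pixels


def render(width, height, boxes, color=1):
    """Draw all boxes using four workers, one per screen quadrant."""
    framebuffer = [[0] * width for _ in range(height)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        jobs = [pool.submit(draw_in_quadrant, framebuffer, rect, boxes, color)
                for rect in quadrants(width, height)]
        for job in jobs:
            job.result()  # re-raise any worker exception
    return framebuffer
```

One consequence of this layout is that a triangle sitting entirely in one quadrant keeps only one worker busy, so random triangles and large triangles can load the four threads very differently.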
-
@lina I don't know? >_> I just split the screen between 4 worker threads that all draw each triangle to their quadrant
@eniko How many cores/threads does your machine have? Can you pin app threads to specific cores/threads?
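
For anyone who wants to try the pinning experiment, here is one way it might look. This is a sketch under stated assumptions: it uses Python's `os.sched_setaffinity`, which only exists on Linux and some other Unix systems, and the helper names `pin_current_thread` and `run_pinned` are made up. On other platforms the sketch simply reports that nothing was pinned. Which logical CPU numbers share a physical core varies by machine, so that has to be checked separately.

```python
import os
import threading


def pin_current_thread(cpu: int) -> bool:
    """Try to restrict the calling thread to one logical CPU."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # e.g. Windows or macOS: not available, do nothing
    os.sched_setaffinity(0, {cpu})  # 0 means "the caller"
    return True


def run_pinned(cpus, work):
    """Start one thread per CPU id, pin it, then run work(cpu) on it.

    Returns {cpu: (was_pinned, result)}. Assumes the ids in cpus are unique.
    """
    results = {}

    def body(cpu):
        pinned = pin_current_thread(cpu)
        results[cpu] = (pinned, work(cpu))

    threads = [threading.Thread(target=body, args=(c,)) for c in cpus]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Running the same benchmark once with the workers pinned to separate physical cores and once with them pinned to sibling hardware threads would show whether shared-core effects are involved at all.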
-
@lina I don't know? >_> I just split the screen between 4 worker threads that all draw each triangle to their quadrant
@eniko@mastodon.gamedev.place @lina@vt.social do you happen to own a Bulldozer CPU?
-
@eniko How many cores/threads does your machine have? Can you pin app threads to specific cores/threads?
@lina not sure offhand and I'm in bed now but my CPU is a Ryzen 7 5600g
-
I have completed the triangle rasterizer fixed point conversion
The benchmarks have all improved between 0 and 35%
Except for threaded random triangles with flat color. That metric has increased 106%. As in it's twice as fast as before
I have no idea why but I'm fairly sure I've ruled out bugs in my benchmarking
I am very confused
@eniko mutex/locking/semaphores?
-
I have completed the triangle rasterizer fixed point conversion
The benchmarks have all improved between 0 and 35%
Except for threaded random triangles with flat color. That metric has increased 106%. As in it's twice as fast as before
I have no idea why but I'm fairly sure I've ruled out bugs in my benchmarking
I am very confused
@eniko (Highly speculative, since FPUs are pretty good these days:) If the CPU has hyperthreading: Maybe two fixed-point threads share ALUs better than two floating-point threads share FPUs?
You might even find that combined fixed and float makes better overall use of a modern CPU, provided you can do both on a single thread or jump through the hoops to get both threads scheduled on the same core.
-
@lina not sure offhand and I'm in bed now but my CPU is a Ryzen 7 5600g
@eniko That doesn't exist... Ryzen 5 or different model number?
If it's the 5 5600G then that's 6 cores, so with 4 threads you shouldn't have HT effects as long as the OS scheduler isn't dumb about it...
-
fixed point was in fact *not* a mistake
@eniko these two messages are my constant bistable state about fixed points
-
@eniko That doesn't exist... Ryzen 5 or different model number?
If it's the 5 5600G then that's 6 cores, so with 4 threads you shouldn't have HT effects as long as the OS scheduler isn't dumb about it...
@lina er yeah 5 sorry
-
@eniko mutex/locking/semaphores?
@slyecho wouldn't that make it slower, not faster?
-
@slyecho wouldn't that make it slower, not faster?
@eniko one would assume 4 times as fast with four threads, not 30% faster. But I don't know exactly what the code is doing without seeing it
-
I have completed the triangle rasterizer fixed point conversion
The benchmarks have all improved between 0 and 35%
Except for threaded random triangles with flat color. That metric has increased 106%. As in it's twice as fast as before
I have no idea why but I'm fairly sure I've ruled out bugs in my benchmarking
I am very confused
@eniko do you already have experience with the kind of profiler that lets you get performance counter values?
on linux my go-to first step is
perf stat -d ./myprogram
(-d for details gives a couple more numbers. gotta have numbers!) then you'll see a few numbers that may point at a drastic difference.
I'm thinking a higher instructions-per-cycle number probably means fewer instructions that take many cycles (though I hear integer division is much better nowadays?), or your cache hit rate for data or instruction cache may be a lot better, or maybe your code ends up with fewer total instructions for some reason?
-
@eniko do you already have experience with the kind of profiler that lets you get performance counter values?
on linux my go-to first step is
perf stat -d ./myprogram
(-d for details gives a couple more numbers. gotta have numbers!) then you'll see a few numbers that may point at a drastic difference.
I'm thinking a higher instructions-per-cycle number probably means fewer instructions that take many cycles (though I hear integer division is much better nowadays?), or your cache hit rate for data or instruction cache may be a lot better, or maybe your code ends up with fewer total instructions for some reason?
@timotimo I'm incredibly new at running benchmarks at this level so I don't really know what that is