Piero Bosio Social Web Site Personale

A social forum federated with the rest of the world. Instances don't count, people do.

An olive tree that nobody wanted was thrown away by a nursery for being "ugly".


The last eight messages received from the Federation
  • @youen @eniko @lina as I mentioned in the other reply, it is of course possible, but I'd find it extremely surprising. I'm assuming the data size and access patterns are similar, and the FPU should have more registers and should be able to retire more instructions per cycle than the ALU. Plus, with fixed point there's generally a higher op density for the equivalent abstract operation on the numbers (consider a multiplication in fixed point vs. in floating point).

  • @eniko says "i won't infodump without permission, but here's the short version", writes two hundred words anyway. the struggle to contain oneself can sometimes really be immense, huh 🫠

  • @eivind @NanoRaptor the skibidi is going great on the other hand

  • @eniko I am swamped with work right now, otherwise I would ask for permission to infodump :D

    TLDR is there are thousands (or at least several hundred) counters inside the CPU that tick up for all kinds of performance-related and otherwise relevant events (cache misses, for example), and each OS gives you a way to read these out, plus the kernel can make sure it's properly accounted per process or thread and whatnot. Some of these counters are preposterously specific, like uhhhhhh "Number of cycles dispatch is stalled for integer scheduler queue 3 tokens". But there's usually a little selection of commonly useful ones available with some extra simple command.

    It's absolutely fascinating what you can, in theory, do if you want to dig really really really far down, but to be honest, I usually get very few really actionable insights from anything more intricate than the most basic ones ;(

    absolutely a skill issue from my end I'm convinced!

  • @eniko

    How well would it work to do it in two passes at 160x50x4?

  • @TheBreadmonkey @jbenjamint geniusinely ;-)

  • @ygathgoch @TheBreadmonkey that's how capoeira was born. Also I'm sure there's some Ranma episode about a Dancing School of Martial Arts

  • @eniko @oblomov @lina I was mostly trying to find a potential explanation to the x5 gain instead of x4.

    There might be multiple factors playing together to make things even more confusing...

    About the difference between floating and fixed point, could it be that floating point is FPU-bound, while for fixed point the bottleneck was the memory cache? Though honestly I'm not experienced enough with such low-level considerations.
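
The "higher op density" remark about fixed-point multiplication in the messages above can be made concrete with a small sketch. It assumes a Q16.16 layout (16 integer bits, 16 fractional bits), which is only an illustration, since the thread never says which fixed-point format was used. The helper names `to_fixed`, `fixed_mul` and `float_mul` are hypothetical. Python integers do not overflow, so the comments note where a C implementation would need a 64-bit intermediate.

```python
FRAC_BITS = 16            # assumed Q16.16 layout; the thread names no format
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a float to Q16.16 by scaling and rounding to nearest."""
    return int(round(x * ONE))


def to_float(q: int) -> float:
    """Convert a Q16.16 value back to a float."""
    return q / ONE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values and return a Q16.16 result.

    The raw product carries 32 fractional bits, so one abstract
    multiplication turns into several integer operations.
    """
    wide = a * b                    # step 1: widening multiply (64-bit in C)
    wide += 1 << (FRAC_BITS - 1)    # step 2: add half an LSB to round
    return wide >> FRAC_BITS        # step 3: arithmetic shift back to Q16.16


def float_mul(a: float, b: float) -> float:
    """The floating-point counterpart: a single multiply."""
    return a * b


if __name__ == "__main__":
    a, b = 1.5, 2.25
    print(to_float(fixed_mul(to_fixed(a), to_fixed(b))), float_mul(a, b))
```

On hardware with an FPU, the floating-point version typically maps to one multiply instruction, while the fixed-point version needs the multiply plus the shift (and the optional rounding add). That is one plausible reading of the op-density argument, not a measurement of the code discussed in the thread.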
