<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<br>
<blockquote type="cite"
cite="mid:5fd66107.1c69fb81.9ecf5.ba03@mx.google.com">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <div dir="auto">Beefing up out-of-order execution and branch
      prediction is definitely the main reason why the M1 SoC performs
      well - but this sort of execution can only be efficient if the
      instruction size stays constant, as is the case with fixed-length
      RISC architectures like ARM. This is where SGI MIPS was headed
      before they died.</div>
</blockquote>
  <p>The pipelined approach to design essentially partitions tasks so
    that the speed-of-light-sensitive parts stay localized in small
    regions, allowing the overall silicon estate to sprawl.<br>
</p>
<blockquote type="cite"
cite="mid:5fd66107.1c69fb81.9ecf5.ba03@mx.google.com">
<div dir="auto">The main takeaway here IMO is that now that Apple
has demonstrated that a phone SoC can be beefed up to perform
      general-purpose computing well, we'll start seeing more of this
hit the market in the workstation space. And when those fast
SoC systems start running Linux, developers will flock to them
and that will accelerate the adoption of ARM in the
cloud/datacenter. Yes, Amazon has their nice Graviton platform,
but without developers running ARM on their workstations,
adoption of ARM in the cloud/datacenter is not going to gain a
lot of traction.</div>
<div dir="auto"><br>
</div>
<div dir="auto">If Apple allowed Linux to run natively on their M1
SoC, it would actually be a game-changer in this space. But that
      would require that they release their SoC documentation to the
      open source community, as well as digitally sign Linux boot
      components in their secure enclave (neither of which is likely
because Apple is as closed as Oracle's wallet ;-)</div>
<div dir="auto"><br>
</div>
<div dir="auto">What I'm most interested in seeing in the coming
      years is what Nvidia is planning for ARM (no matter what they
      say, they definitely have a plan in mind, given they're buying ARM).</div>
</blockquote>
  <p>There is this new "make your own RISC-V silicon" shop:
    <a class="moz-txt-link-freetext" href="https://www.sifive.com/risc-v-core-ip">https://www.sifive.com/risc-v-core-ip</a>. They also seem to have a
    whole RISC-V board for sys devs.<br>
</p>
  <p>Maybe this approach will open the floodgates of choice, goodness,
    rainbows, etc. :)<br>
</p>
<blockquote type="cite"
cite="mid:5fd66107.1c69fb81.9ecf5.ba03@mx.google.com">
<blockquote type="cite">
<div>
          <div dir="auto">Found a nice blog post explaining why the M1
            is fast. </div>
<div><a
href="https://debugger.medium.com/why-is-apples-m1-chip-so-fast-3262b158cba2"
moz-do-not-send="true">https://debugger.medium.com/why-is-apples-m1-chip-so-fast-3262b158cba2</a></div>
</div>
</blockquote>
    <p>I knew it! I felt it all my life! It takes an insurmountable
      amount of time to prepare the place for painting, more than the
      painting itself takes. ... Eight preppers (decoders) of micro-ops
      in the M1 versus four in Intel/AMD.<br>
</p>
    <p>I still have a feeling that co-locating the memory also helps
      the preppers' results, besides the benefit of RISC's constant
      instruction length.</p>
    <p>It also explains the talk of AMD going with ARM. RISC-y business
:)<br>
</p>
<blockquote type="cite">
<div>
<div class="gmail_quote">
<blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex" class="gmail_quote">
<div>
<div>Rust provides both Atomic Reference Counting
(called Arc) and non-atomic Reference Counting (called
Rc). You choose the one that makes sense. Hopefully
the type system complains if you use Rc in a context
where atomicity is required, but I don't use Rust. C++
provides only atomic refcounting in the standard
library; for the other kind you roll your own (which I
have done).<br>
<blockquote id="m_816261540811853656qt" type="cite">
              <p>&lt;moving into discussing silicon and near
                it&gt;<br>
</p>
<blockquote type="cite">
<div>Another trick is that Apple's dev languages
and frameworks (Swift and Objective-C) use
reference counting, which requires atomic
increments and decrements. On Intel, these
operations are five times slower than non-atomic
operations; on Apple Silicon they run at the
same speed. This is something I wish the other
CPU vendors would get right, because refcounting
has some technical advantages over tracing GC,
and I use it in software I write. C++ and Rust,
both "performance" languages, provide
refcounting but not tracing GC.<br>
</div>
<blockquote id="m_816261540811853656qt-qt"
type="cite">
                <div>Regarding the M1: my understanding is that placing
                  the RAM inside the processor package is the trick
                  that makes it run fast. Is there anything else?<br>
</div>
<div> <br>
</div>
<blockquote type="cite">
<div dir="ltr"> The Apple M1 looks decent, but
since Apple no longer lets you run Linux on
their hardware, I have no desire to ever buy
one.<br>
</div>
</blockquote>
</blockquote>
</blockquote>
            <div>Does Rust's standard refcounting, or any
              implementation of such pointers, need to use atomic
              in/decrements? Can't it use something non-atomic,
              given more detailed knowledge of ownership? Just
              wondering.<br>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</blockquote>
<br>
</body>
</html>