y'know

processor chips are really just a very clever hack to split one ginormous circuit up across time, and memory is a very clever hack to save the description of how to split that circuit up and feed it into that processor

i really wish fpgas were more affordable and performed better than they do, because i could see a software compiler literally compiling really hot loops directly into a soft coprocessor and opcodes to call out into it.

like user-reprogrammable microcode But More

or like

if you want to write a timing-perfect sega genesis emulator, you could literally specify an implementation of that isa and rewire your fpga to be that, and set up async interrupts when code running there tries to access the sound or graphics chip and emulate those in software on the host processor

@KitRedgrave I believe that is a thing that currently exists, with the only limitation being our understanding of how the original hardware worked

@KitRedgrave there's uuh, whachamacallits.
coarse-grained heterogeneous reconfigurable something something. it sits somewhere between an FPGA and discrete CPUs.
you have a bunch of functional modules, you wire them up, a compiler figures out the optimal routing for your code, badamim bada boom. maybe those could work better than FPGAs. or you could maybe use an FPGA as just another component.

@csepp @KitRedgrave getting connection machine vibes here

and I was literally just doing my yearly read of the wiki page on the Thinking Machines company haha

@alexandria @KitRedgrave AFAIK that had homogeneous cores, but i might be wrong. but i'm pretty sure the routing was fixed in hardware.
from what i understand, this is physically reconfigurable, so you can eg. put the memory next to the module that you know would need to access it the fastest, or something like that? idk.

@csepp @KitRedgrave I never actually got all the way through the book (that's what I get for having a physical copy I guess), but IIRC the routing is fixed but each processor has an independent memory buffer that's its own, and there are routing chips and buffers to allow communication across far-away nodes.

And yeah, that's why I said it gives me connection machine vibes :)

@alexandria @KitRedgrave oooh, physical copy?? can i see????
i've only read some of that technical report about its use in sonar imaging that basically boiled down to "yup, parallel good"

@alexandria @KitRedgrave also, does that have anything about how its parallel Lisp worked?

@csepp @KitRedgrave Sure! Like I said it's been ages since I opened it and I'm not actually sure what parts I got to 😅😅

A quick skim indicated it goes over:

- The design and architecture (Not just the layout but also stuff like the routing -- I think each processor has a separate communications bus as linkage, as well as direct connections between nodes?)
- The programming environment including some of the lisp
- Data structures and the design and optimization of programs for the machine

It's really thin so it does not go *deep* but it covers things reasonably well as an introduction.

DjVu of this book is here:
(libgen page ): http://libgen.rs/book/index.php?md5=D026289F3E50944E47EE334922C7BA29
(libgen direct): http://library.lol/main/D026289F3E50944E47EE334922C7BA29

And the CmLisp manual is officially here:
https://dl.acm.org/doi/10.1145/319838.319870

But you can get it from scihub.st

(CC @OCRbot )

OCR Output (chars: 2841) 

@alexandria
Image 1:
These special purpose flags and the eight general purpose flags are accessible to the programmer through the microcode but are not visible from the macrocode.

4.3 The Topology

Each router handles messages for 16 processing cells. The communications network of the CM-1 is formed by 4,096 routers connected by 24,576 bidirectional wires. The routers are wired in the pattern of a Boolean n-cube. The addresses of the routers within the network depend on their relative position within the n-cube. Assume that the 4,096 routers have addresses 0 through 4,095. Then the router with address i will be connected to the router with address j if and only if |i - j| = 2^k for some integer k. In this case, we say that the routers are connected along the kth dimension. Geometrically, the Boolean n-cube can be interpreted as a generalization of a cube to an n-dimensional Euclidean space. Each dimension of the space corresponds to one bit position in the address. An edge of the cube pointing along the kth dimension connects two vertices whose addresses differ by 2^k, that is, they differ in the kth bit of the address. Because any two 12-bit addresses differ by no more than 12 bits, any vertex of the cube can be reached from any other by traveling over no more than 12 edges. Each router is no more than 12 wires away from a neighboring router. Smaller networks can be constructed by deleting nodes from the 12-cube. Networks with more than 2^12 nodes would require a larger router, although this would be a simple extension of the current design.

The operations of the router can be divided into five categories: injection, delivery, forwarding, buffering, and referral. The 16 processors that a router serves can send new messages into the network by the process of injection.
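
The addressing rule quoted above (routers connected iff their 12-bit addresses differ in exactly one bit, i.e. |i - j| = 2^k) is easy to sketch in code. This is just an illustration of the rule, not anything from the CM-1 toolchain; all names here are made up:

```python
def is_neighbor(i: int, j: int) -> bool:
    """True iff addresses i and j differ in exactly one bit position."""
    diff = i ^ j
    # A nonzero power of two has exactly one bit set.
    return diff != 0 and (diff & (diff - 1)) == 0

def neighbors(i: int, dims: int = 12) -> list:
    """All routers one edge away from i on the Boolean n-cube."""
    return [i ^ (1 << k) for k in range(dims)]

def hop_distance(i: int, j: int) -> int:
    """Hamming distance = edges on a shortest route between routers."""
    return bin(i ^ j).count("1")
```

With 12 dimensions every router has 12 neighbors, and `hop_distance(0, 4095)` is 12, matching the "no more than 12 edges" claim in the text.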

Image 2:
HOW TO PROGRAM A CONNECTION MACHINE 33

Xectors

All concurrent operations in CmLisp involve a simple data structure called a xector (pronounced zek'tor). A xector corresponds roughly to a set of processors with a value stored in each processor. Because a xector is distributed across many processors, it is possible to operate on all its elements simultaneously. To add two xectors together, for example, the Connection Machine directs each processor to add the corresponding values locally, producing a third xector of the sums. This requires only a single addition time, even though the xector may have hundreds of thousands of elements.

CmLisp supports many potentially concurrent operations to combine,
create, modify, and reduce xectors. These operations could be implemented
on a conventional computer, but they would be much slower, perhaps tens
of thousands of times slower, than they are on the Connection Machine.

[…] execute concurrently. This is the source of its power.
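
A tiny serial sketch of the xector addition described above, assuming a xector is modeled as a plain list of per-processor values (on the Connection Machine each processor adds its element simultaneously; this just models the semantics, not the parallelism):

```python
def xector_add(a, b):
    """Element-wise sum of two xectors of equal length.

    Conceptually each "processor" k adds a[k] + b[k] locally;
    the CM does all of these in a single addition time.
    """
    if len(a) != len(b):
        raise ValueError("xectors must have the same length")
    return [x + y for x, y in zip(a, b)]
```

Other xector operations (combine, reduce, etc.) follow the same pattern: one local step per processor, all at once.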

@alexandria @KitRedgrave that's sci-hub instead of scihub, right? (the latter didn't work)

@KitRedgrave @xj9 I see where you’re going, but given the questionable quality of optimizing compilers, I don’t want them anywhere *near* the ability to tweak my hardware.
