This happens naturally if you bump-allocate them in a garbage-collected runtime, particularly under a copying collector. Free-list allocations also tend to be co-located, because free lists are rebuilt during the sweep phase of GC, which walks the heap in address order.
Don't make me bring out the L word for the billionth time.
> A flat array of Exprs can make it fun and easy to implement hash consing
OK, it's not a case of L-ignorance, just willful neglect.
> A sufficiently smart memory allocator might achieve the same thing, especially if you allocate the whole AST up front and never add to it
> Again, a really fast malloc might be hard to compete with—but you basically can’t beat bump allocation on sheer simplicity.
- Finding and Understanding Bugs in C Compilers. Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. PLDI 2011. https://dl.acm.org/citation.cfm?id=1993532
- Compiler Validation via Equivalence Modulo Inputs. Vu Le, Mehrdad Afshari, and Zhendong Su. PLDI 2014. https://dl.acm.org/citation.cfm?id=2594334
Front end is lexing, parsing, building the AST (in whole or just keeping pieces of it around), semantic analysis. When that is done, you can say "Yup, valid program. It is possible to generate code." So, to your question, yes. Those techniques from the course are 100% in play.
Back-end is turning the AST into something you can optimize and generate code from. (Sometimes there is a "middle-end" in there....) High-level optimizations, low-level optimizations, memory allocation, register allocation.
Tooling is everything that keeps your life from being miserable as a user -- starting with debug symbols and these days people expect library and package management, etc, etc.
So if you are interested in exploring the Next Great Syntax or some semantic problems that can be re-written into C, then doing a C-front is a great way to do that. Let the C compiler handle all the really-really-really-hard back-end issues for you, and avoid all that distraction. But.... expect that getting debug symbols from your spiffy language into the ready-to-link object file is going to be, as they say, "non-trivial". So... great for experiments and a great learning exercise, but it's hard to do a production compiler that way.
> The Vanilla-5 cores are a 5-stage in-order pipeline RV32IM cores so they support the integer and multiply extensions
So, roughly comparable to a high-speed Cortex M.
> instead of using caches, the entire memory address space is mapped across all the nodes in the network using a 32-bit address scheme. This approach, which also means no virtualization or translation, simplifies the design a great deal.
The diagram shows each core has icache and dcache; what they've ditched is cache coherency. That certainly makes it simpler to implement but now the cores have to be responsible for their own coherency. Also, none of your protected mode operating system nonsense - this is designed to run a single program and get everything out of the way. Every core can potentially overwrite any other core's memory, and if it does so you won't know until you have a cache miss. Good luck figuring that one out in the debugger.
This is very clearly intended for the sort of AI or image processing workload where you can clearly partition it two-dimensionally across the array to identical nodes, and then have those nodes collaborate locally by passing messages across the edges.
This is not quite true: the local data memories are not caches, i.e., they do not implicitly move memory in from a more distant tier in the memory hierarchy. They are just plain explicitly managed local memories (sometimes called "scratchpads" to distinguish them from caches).
To this day I’m grateful I stumbled across the Hotline software and the server.