For example, SystemVerilog has no real typing, which sucks, so a typical thing to do is to build a massively improved type system for a new HDL. However, in my experience good use of Verilog style guides and decent linting tools solves most of the problem. You do still get bugs caused by missed typing issues, but they're usually caught quickly by simple tests. It's certainly annoying to have to deal with all of this, but fundamentally, even if it's all made easier, it doesn't significantly improve your development time or final design quality.
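To make that concrete, here's a minimal sketch (hypothetical module and signal names) of the kind of typing issue SystemVerilog accepts silently; a linter or a basic directed test will usually flag it, but the language itself won't:

    module trunc_example (
      input  logic [7:0] a_i,
      input  logic [7:0] b_i,
      output logic [7:0] sum_o   // too narrow: the carry bit is silently dropped
    );
      // No compile error and often no warning: the carry out of the 8-bit
      // addition is simply lost on assignment.
      assign sum_o = a_i + b_i;
    endmodule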
Another typical improvement you'll find in an alternative HDL is vastly improved parameterization and generics. Again, this is great to have, but it mostly makes tedious and annoying tasks simpler without producing major impact. The reason is that writing good HDL that works across a huge parameterization space is very hard. You have to verify every part of the parameter space you're using, and you need to ensure you get good power/performance/area results out of it too. Doing this can require very different microarchitectural decisions (e.g. single-, dual- and triple-issue CPUs will all need to be built differently; improved parameterization doesn't save you from this). Ultimately you often only want to use a small portion of the parameter space anyway, so just doing it in SystemVerilog, possibly with some auto-generated code using Python, works well enough even if it's tedious.
So if the practical benefits turn out to be minor, why not take all the nice quality-of-life improvements anyway? Because there's a large impact on the hard things. From a strictly design perspective these are things like clock domain crossing, power, area and frequency optimization. You generally need a good understanding of what the actual circuit is doing, and to be able to connect the tool output (e.g. the gates your synthesis tool has produced) to your HDL. Here the typical flow of HDL -> SystemVerilog -> tool output can become a big problem. The HDL-to-SystemVerilog step can produce very hard-to-read code that's difficult to connect to your input HDL. This adds a new and tricky mental step when you're working with the design: first understand the circuit issue, then map that to the hard-to-read SystemVerilog, then map that back to your HDL and work out what you need to change.
Outside of design alone, a major cost of building silicon is verification. Alternative HDLs generally don't address this at all and can again make it harder. Either you simulate the HDL itself entirely, which can be fine, but then you're banking on minimal bugs in that simulator and no bugs in the HDL -> SystemVerilog step. Alternatively, you simulate the SystemVerilog directly with an existing simulator, but then you've got the HDL-to-SystemVerilog mapping problem all over again.
I think my ideal HDL at this point is a stripped-down SystemVerilog with a good type system and better generative capability that, crucially, produces plain SystemVerilog that's human readable (maintaining comments, signal and module names, and module hierarchy as much as possible).
These are good points, and I think Chisel has actually been improving in these areas recently. Chisel is now built on top of the CIRCT[1] compiler infrastructure, which uses MLIR[2] and allows capturing much more information than just RTL in the design's intermediate representations. This has several benefits.
Regarding the problem of converting from HDL to System Verilog and associating the tool outputs with your inputs: a ton of effort has gone into CIRCT to ensure its output is decently readable by humans _and_ has good PPA with popular backend tools. There is always room for improvement here, and new features are coming to Chisel in the form of intrinsics and new constructs to give designers fine-grained control over the output.
On top of this, a new debug[3] intermediate representation now exists in CIRCT, which associates constructs in your source HDL with the intermediate representation of the design as it is optimized and lowered to System Verilog. Think of it like a source map that allows you to jump back and forth between the final System Verilog and the source HDL. New tooling to aid in verification and other domains is being built on top of this.
Besides this, the combination of Chisel and CIRCT offers a unique solution to a deeper problem than dealing with minor annoyances in System Verilog: capturing design intent beyond the RTL. New features have been added to Chisel to capture higher-level system descriptions, and new intermediate representations have been added to CIRCT to maintain this information and its association to the design. For example, you could add information about bus interfaces directly in Chisel, and have a single source of truth generate both the RTL and other collateral like IP-XACT. As the design evolves, the collateral stays up to date with the RTL. I gave a talk[4] at a CIRCT open design meeting that goes into more detail about what's possible here.
[1] https://circt.llvm.org/
[2] https://mlir.llvm.org/
[3] https://circt.llvm.org/docs/Dialects/Debug/
[4] https://sifive.zoom.us/rec/share/MhHtXPg_7iZk-QWw0A66CaBJDGs...
> good PPA with popular backend tools
Getting good PPA for any given thing you can express in the language is only part of the problem. The other aspect is how easy the language makes it to express the thing you need in order to get the best PPA (discussed in the example below).
> Think of it like a source map that allows you to jump back and forth between the final System Verilog and the source HDL.
This definitely sounds useful (I wish synthesis tools did something similar!), but again it's only part of the puzzle here. It's all very well to identify the part of the HDL that relates to some physical part of the circuit, but how easy is it to go from that to working out how to manipulate the HDL such that you get the physical circuit you want?
As a small illustrative example, here's a commit for a timing fix I did recently: https://github.com/lowRISC/opentitan/commit/1fc57d2c550f2027.... It's for a specialised CPU for asymmetric crypto. It has a call stack that's accessible via a register (actually a general stack, but typically used for return addresses for function calls). The register file looks to see if you're accessing the stack register, in which case it redirects your access to an internal stack structure and, when reading, returns the top of the stack. If you're not accessing the stack it just reads directly from the register file as usual.
The problem comes (as it often does in CPU design) in error handling. When an error occurs you want to stop the stack push/pop from happening (there are multiple error categories and one instruction could trigger several of them; see the documentation at https://opentitan.org/book/hw/ip/otbn/index.html for details). Whether or not you observed an error was factored into the 'are you doing a stack push or pop' calculation, which in turn was factored into the mux that chose between data from the top of the stack and data from the register file. The error calculation is complex and comes later in the cycle, so factoring it into the mux was bad: it made the register file data turn up too late. The solution, once the issue was identified, was simple: separate the logic deciding whether the action itself should occur (effectively the flop enables for the logic making up the stack) from the logic calculating whether we had a stack or register access (which is based purely on the register index being accessed). The read mux then uses the stack-or-register-access calculation without the 'action actually occurs' logic, and the timing problem is fixed.
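To sketch the shape of that change in plain SystemVerilog (signal names are illustrative, not the actual OTBN source):

    module rf_read_mux_sketch (
      input  logic        addr_is_stack_i,   // register index decodes to the stack register
      input  logic        rd_en_i,
      input  logic        err_i,             // late-arriving error summary
      input  logic [31:0] stack_top_data_i,
      input  logic [31:0] rf_rd_data_i,
      output logic [31:0] rd_data_o,
      output logic        stack_pop_en_o     // flop enable for the stack itself
    );
      // Before: the read mux select folded in the late error term,
      //   rd_data_o = (addr_is_stack_i & rd_en_i & ~err_i) ? ...
      // so the register file read data turned up too late in the cycle.
      //
      // After: the select uses only the early register index decode; the error
      // term now only gates whether the push/pop actually happens.
      logic stack_access;
      assign stack_access   = addr_is_stack_i & rd_en_i;
      assign stack_pop_en_o = stack_access & ~err_i;
      assign rd_data_o      = stack_access ? stack_top_data_i : rf_rd_data_i;
    endmodule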
To get to this fix you have two things to deal with: first, taking the identified timing path and choosing a sensible point to target for optimization, and second, actually being able to do the optimization. Simply having a mapping saying this gate relates to this source line only gets you so far, especially if you've got abstractions in your language such that a single source line can generate complex structures. You need to be able to easily understand how all those source lines relate to one another along the path in order to choose where to optimise.
Then there's the optimization itself, pretty trivial in this case as it was isolated to the register file, which already had separate logic to determine whether we were actually going to take the action vs. whether we were accessing the stack register or a normal register. Because of SystemVerilog's lack of powerful abstractions, making a tweak to get the read mux to use the earlier signal was easy to do, but how does that work when you've got more powerful abstractions that deal with all the muxing for you in cases like this and the tool is producing the mux select signal for you? How about when the issue isn't isolated to a single module but spread around (e.g. see another fix I did, https://github.com/lowRISC/opentitan/commit/f6913b422c0fb82d..., which again boils down to separating the 'this action is happening' logic from the 'this action could happen' logic and using each appropriately in different places)?
I haven't spent much time looking at Chisel so it may be that there are answers to this, but if it gives you powerful abstractions you end up having to think harder to connect those abstractions to the physical circuit result. A tool telling you gate X was ultimately produced by source line Y is useful but doesn't give you everything you need.
> the combination of Chisel and CIRCT offers a unique solution to a deeper problem than dealing with minor annoyances in System Verilog: capturing design intent beyond the RTL

> you could add information about bus interfaces directly in Chisel, and have a single source of truth generate both the RTL and other collateral like IP-XACT.
Your example here certainly sounds useful, but to me at least it falls into the bucket of annoying and tedious tasks that won't radically alter how you design, nor the final quality and speed of development. Sure, if you need to generate IP-XACT for literally thousands of variations of some piece of IP this kind of thing is essential, but practically you have far fewer variations you actually want to work with, and the manual work required is annoying busy work that will generate some issues, but you can deal with it. Then for the thousands-of-variations case the good old pile o' Python doing auto-generation can work.
Certainly having a solution based upon a well-designed language with a sound type system sounds great, and I'll happily have it, but not if it means things like timing fixes and ECOs become a whole lot harder.
Thanks for the link to the video, I'll check it out.
Maybe I should make it one of my new year's resolutions to finally get around to looking at Chisel and CIRCT more deeply! Could even have a crack at a toy HDL in the form of the fixed-up SystemVerilog with a decent type system I proposed above, using CIRCT as an IR...
> Because of SystemVerilog's lack of powerful abstractions, making a tweak to get the read mux to use the earlier signal was easy to do, but how does that work when you've got more powerful abstractions that deal with all the muxing for you in cases like this and the tool is producing the mux select signal for you?
Thanks for the example and for illustrating a real-world change. In this specific case, Chisel provides several kinds of Mux primitives[1], which CIRCT tries to emit in the form you'd expect, and I think Chisel/CIRCT would admit a similarly simple solution.
That said, there are other pain points here where Chisel's higher-level abstractions make it hard to get the gates you want, or to make a simple change when you know how you want the gates to be different. A complaint we hear from users is the lack of a direct way to express complex logic in enable signals to flops. This is definitely something we can improve, and the result will probably be new primitive constructs in Chisel that are lower-level and map more directly to the System Verilog that backend tools expect. This is one example of what I was alluding to in my previous reply about new primitives in Chisel.
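For what it's worth, the shape people are usually after is trivial to write and to tweak in plain SystemVerilog; a sketch with hypothetical signal names:

    module en_flop_sketch (
      input  logic       clk_i,
      input  logic       push_valid_i,
      input  logic       stall_i,
      input  logic       err_i,
      input  logic [3:0] stack_wr_ptr_d_i,
      output logic [3:0] stack_wr_ptr_q_o
    );
      // An arbitrary enable expression gating the flop directly, with no
      // recirculation mux added on the data path.
      always_ff @(posedge clk_i) begin
        if (push_valid_i & ~stall_i & ~err_i) begin
          stack_wr_ptr_q_o <= stack_wr_ptr_d_i;
        end
      end
    endmodule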
> Your example here certainly sounds useful, but to me at least it falls into the bucket of annoying and tedious tasks that won't radically alter how you design, nor the final quality and speed of development.
I guess it depends on your goals. I spoke[2] about CIRCT and the new features in this realm at Latch-Up 2023, and after the talk people from different companies seemed very excited about this. For example, someone from a large semiconductor company was complaining about how brittle it is to maintain all their physical constraints when RTL changes.
> Maybe I should make it one of my new year's resolutions to finally get around to looking at Chisel and CIRCT more deeply!
We'd love to hear any feedback!
> Could even have a crack at a toy HDL in the form of the fixed-up SystemVerilog with a decent type system I proposed above, using CIRCT as an IR...
That's exactly what the CIRCT community is hoping to foster. If you're serious about diving in, I'd recommend swinging by a CIRCT open design meeting. The link is at the top of the CIRCT webpage. These can be very informal, and we love to hear from people interested in using CIRCT to push hardware description forward.
[1] https://www.chisel-lang.org/docs/explanations/muxes-and-inpu...
[2] https://www.youtube.com/watch?v=w_W0_Z3n9PA