Readit News
ghayes · a year ago
I just started playing around with PIO and DMA on a Pico, and it’s really fun just how much you can do on the chip without invoking the main CPU. For context, PIO is a mini-language you can program at the edge of the chip that can directly respond to and write to external IO. DMA lets you tell the chip to move data between memory and peripherals on its own, and can be programmed to loop or raise an interrupt so the CPU rarely has to be re-invoked. The linked repo uses these heavily for its fast Ethernet communication.
stackghost · a year ago
For added clarity, the Pico includes an RP2040 which is where the PIO runs.
ghayes · a year ago
Thanks, and you're correct; not sure why you got downvoted for this. For anyone curious here are the data sheets for RP2040 [for original Pico] and RP2350 [for Pico 2], which describe the systems in detail.

RP2040: https://datasheets.raspberrypi.com/rp2040/rp2040-datasheet.p...

RP2350: https://datasheets.raspberrypi.com/rp2350/rp2350-datasheet.p...

__michaelg · a year ago
> receive side uses a per-packet interrupt to finalize a received packet

This has left much faster systems unable to process packets at line speed. A classic example: standard Gigabit network cards and contemporary CPUs could not process VoIP packets (which are tiny) at line speed, while they could easily download files (which are basically MTU-sized packets) at line speed.
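To put rough numbers on the per-packet interrupt cost, here is a minimal sketch that computes the maximum frame rate on a gigabit link. It assumes minimum-size 64-byte frames as a stand-in for "tiny" packets (real VoIP frames are somewhat larger) and one interrupt per frame; the function and constant names are made up for illustration.

```python
# Bytes on the wire around each frame:
# 7-byte preamble + 1-byte SFD + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 7 + 1 + 12


def frames_per_second(link_bps: int, frame_bytes: int) -> float:
    """Maximum frame rate for frames of `frame_bytes` (dst MAC through CRC)."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


GIGABIT = 1_000_000_000
small = frames_per_second(GIGABIT, 64)    # minimum-size frame (assumed stand-in)
large = frames_per_second(GIGABIT, 1518)  # frame carrying a full 1500-byte payload

print(f"64-byte frames:   {small:,.0f} per second")
print(f"1518-byte frames: {large:,.0f} per second")
print(f"ratio: {small / large:.1f}x more frames (and interrupts) for small packets")
```

With one interrupt per frame, the small-packet case asks the CPU for well over an order of magnitude more interrupts per second at the same bit rate.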

rscott2049 · a year ago
Fortunately, the receive ISR isn't cracking packets, just calculating a checksum and passing the packet on to LWIP. I wish there were two DMA sniffers, so that the checksum could be calculated by the DMA engine(s), as that's where a lot of processor time is spent (even with a table-driven CRC routine).
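For readers unfamiliar with the term, a table-driven CRC trades a 256-entry lookup table for the inner bit loop, so the per-byte work is one XOR, one shift, and one table load. The sketch below is a generic CRC-32 (the reflected polynomial 0xEDB88320 used for the Ethernet FCS), not the routine from the repo, and it is checked against Python's `zlib.crc32`.

```python
import zlib


def make_crc32_table() -> list[int]:
    """Precompute the 256-entry table for the reflected polynomial 0xEDB88320."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
        table.append(crc)
    return table


CRC32_TABLE = make_crc32_table()


def crc32(data: bytes) -> int:
    """One table lookup per input byte instead of eight shift/XOR steps."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = (crc >> 8) ^ CRC32_TABLE[(crc ^ b) & 0xFF]
    return crc ^ 0xFFFFFFFF


sample = b"123456789"
assert crc32(sample) == zlib.crc32(sample)
```

Even in this form the CPU still touches every byte of every frame, which is why offloading the calculation to the DMA sniffer or PIO is attractive.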
dmitrygr · a year ago
You can do it using PIO. I did that for emulating a Memory Stick slave on RP2040. One PIO SM plus two DMA channels with chained descriptors. XOR is achieved using any IO reg you don’t need, with the 0x1000 offset (the manual describes this as the atomic XOR alias; 0x3000 is the bit-clear alias)
crote · a year ago
Luckily the RP2040 has a dualcore CPU so one core can be dedicated entirely to receiving the interrupts, passing it to user code on the other core via a FIFO or whatever else you fancy.
sulandor · a year ago
almost

context switching between processors will reduce cache coherence and hence hits, but yea, it might be worth the tradeoff on busy systems

montecarl · a year ago
Why is the transfer rate non-linear with respect to the system clock? At 100 MHz the rate is 1.38 Mbit/s and at 200 MHz it is 65.4 Mbit/s.
nyrikki · a year ago
Latency kills...and Ethernet uses exponential backoff.
crote · a year ago
More specifically, TCP uses exponential backoff. Ethernet will happily keep drowning you in packets at line rate, if I'm not mistaken.
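As a rough illustration of why a few lost segments can flatten throughput, here is a sketch of a doubling retransmission timeout. The 1-second starting value and 60-second cap are illustrative assumptions, and real stacks (including LWIP) differ in their timer details; `rto_schedule` is a made-up name.

```python
def rto_schedule(initial_rto: float, retries: int, max_rto: float = 60.0) -> list[float]:
    """Return the timeout used before each successive retransmission."""
    timeouts = []
    rto = initial_rto
    for _ in range(retries):
        timeouts.append(rto)
        rto = min(rto * 2, max_rto)  # double after every loss, up to the cap
    return timeouts


print(rto_schedule(1.0, 8))
```

Each consecutive loss doubles the idle wait, so a link that corrupts frames regularly spends most of its time waiting rather than sending.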
vardump · a year ago
Maybe a lot of CRC errors or something. Just a guess.
rscott2049 · a year ago
Wish I could answer that! All I can guess is that the slower processing speed creates a bottleneck in the LWIP stack somewhere...
vardump · a year ago
Impressive.

At first I thought it was the new Pico 2 (RP2350), but no, it’s the old Pi Pico with RP2040.

rscott2049 · a year ago
I expect the RP2350 to perform much better in this scenario! At the minimum, one of the DMA channels should be eliminated, and I'm hoping the CRC calculation will get faster.
molticrystal · a year ago
I see some examples that show this can be used as a lite http daemon.

Is there enough room to have it control the ethernet port for another weaker or perhaps more powerful microcontroller?

Can you combine multiple picos with one being the ethernet stack and another that modifies certain packets?

Are there any other interesting things that can be done?

bangaladore · a year ago
> Is there enough room to have it control the ethernet port for another weaker or perhaps more powerful microcontroller?

Well there is a whole unused core and plenty of built in SRAM. Seems like a good way to have an open-source version of Wiznet chips [1]. It could support full protocol offloading like Wiznet's or a lower-level raw packet sender/receiver like the ENC424J600.

[1] https://docs.wiznet.io/Product/iEthernet

MartijnBraam · a year ago
I just quickly tried to fit the whole rp2040+ethernet phy in the WIZ850io form factor (mainly because I already used that module in some projects before) and have not yet been able to make it fit without using the more expensive jlcpcb features like buried vias. It would be very cool to have though since the W5500 really needs an update.
Lerc · a year ago
Make a package that has a rp2050 mounted on a microSD and you've got a NAS that nobody will ever find.

Back when I was doing a dumb-server/smart-client desktop environment, something like this would have been pretty cool. It needed a tiny API to save files, but the bulk of the environment worked as a static server.

throwaway81523 · a year ago
This stuff all already exists: Raspberry Pi Zero 2 W. The board is slightly bigger than a Pico but has a full-blown Linux system, 4-core arm64 CPU, 512MB RAM, SD card slot, and wifi, though no ethernet (add-ons are available). Or you could use a larger Pi.
crote · a year ago
Very impressive!

It would be interesting to see a short writeup of what kind of magic was required to achieve this, as there have been multiple failed attempts before this.

I'm also curious about the performance boost from 2.81Mbit/link failure at 150MHz to 65.4Mbit/31.4Mbit at 200MHz. That doesn't sound like basic processor bottlenecks, but rather some kind of catastrophic breakdown at a lower level? Does it just occasionally completely fail to lock onto an incoming clock signal or something?

rscott2049 · a year ago
I did some further investigating - it's apparently due to not having enough setup time on the RX PIO SM. Even though the PIO clocking is fixed at 100 MHz, there are CRC errors at the lower system clocks. I tried changing the delay in the PIO instruction that starts the RX sampling, but that only made things worse (as expected). I also tried disabling the synchronizers, with no improvement.
crote · a year ago
Hmm, interesting. Am I understanding it correctly that you're doing some kind of reset on the RX PIO from regular C code, and the time for "RX finish -> interrupt CPU -> reset RX PIO" is longer than the gap between packets?

If so, might it be possible to use two RX PIOs, automatically starting the next one via inter-PIO IRQ when a packet is finished? That'd give you an entire packet receive time to reset the original PIO, which should be plenty.

drones · a year ago
> Achieves 94.9 Mbit/sec when Pico is overclocked to 300 MHz, as measured by iperf

Is this an effective rate, or just the reflection of a hardware limit?

gonzo · a year ago
A 1500 byte (octet) MTU frame is 1538 bytes “on the wire”.

7 byte preamble

1 byte SFD

6 byte dst MAC

6 byte src MAC

2 byte ethertype or length

46-1500 bytes of payload (ignoring “Jumbo” frames and 802.1q tags)

4 byte CRC

12 byte IFG (which is silence, but still counts for time on the wire)

Add it up and you have 1538 bytes “on the wire”.

TCP overhead for IPv4 is 20 bytes for IP(v4) (no options) and 20 bytes for TCP (again, no options).

So 1460 bytes of data for 1538 bytes on the wire. 1460/1538 = 0.949284

So for 100M Ethernet, 94.9284Mbps is “perfect”.
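The same arithmetic as a small script, in case you want to try other MTUs or header sizes. The field sizes are the ones listed above; the function names are made up for illustration, and IP/TCP options are assumed absent.

```python
# Per-frame byte counts from the breakdown above.
PREAMBLE, SFD, DST_MAC, SRC_MAC, ETHERTYPE, CRC, IFG = 7, 1, 6, 6, 2, 4, 12
IP_HEADER, TCP_HEADER = 20, 20  # IPv4 and TCP, no options


def wire_bytes(mtu: int = 1500) -> int:
    """Bytes of wire time consumed by one frame carrying `mtu` payload bytes."""
    return PREAMBLE + SFD + DST_MAC + SRC_MAC + ETHERTYPE + mtu + CRC + IFG


def tcp_goodput_fraction(mtu: int = 1500) -> float:
    """Fraction of the line rate left for TCP payload."""
    return (mtu - IP_HEADER - TCP_HEADER) / wire_bytes(mtu)


link_mbps = 100
print(f"wire bytes per frame: {wire_bytes()}")
print(f"best-case TCP goodput: {tcp_goodput_fraction() * link_mbps:.2f} Mbit/s")
```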

dlenski · a year ago
Usually I can grok the significance of almost any item on HN that catches my eye, but here I'm at a loss. Can someone explain why this matters?

As far as I can tell, someone has figured out how to send Ethernet packets at a relatively high rate using hardware with very limited CPU. Cool, but what can you _do with that_? If the RPi Pico has the juice to run interesting network _application-level traffic_ at line rate it's more intriguing, but I doubt that anyone's going to claim that it can serve web traffic at line rate on this device, for example.

What am I missing?

boffinAudio · a year ago
It's quite popular in the retro-computing scene, for example, to bring these old machines into the 21st century with modern microcontrollers being used to add peripheral support.

For example, the Oric-1/Atmos computers recently got a project called "LOCI" which adds USB support to the 40-year-old computer[1], by using an RP2040's PIO capabilities to interface the 8-bit DATA bus with a microcontroller capable of acting as the 'gateway' to all of the devices on the USB peripheral bus.

This is amazing, frankly.

And now, being able to do Ethernet in such a simple way means that hundreds of retro-computing platforms could be put on the Internet with relative ease ..

[1] - https://forum.defence-force.org/viewtopic.php?t=2593&sid=2d3...

vardump · a year ago
RP2040/2350 are IO monsters. You could for example make a logic analyzer that transfers logic data through ethernet.

This "very limited" microcontroller has two cores. Both of them can execute about 25 instructions per byte for generating "application-level traffic". You could definitely saturate a 100 Mbps connection with just one core.
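A back-of-the-envelope check of that per-byte budget, as a sketch: it assumes roughly one instruction per clock cycle (optimistic, since loads and stores on these cores take longer) and uses the 200 MHz and 300 MHz clocks mentioned elsewhere in the thread. `cycles_per_byte` is a made-up helper name.

```python
def cycles_per_byte(sys_clock_hz: float, link_bps: float) -> float:
    """CPU cycles available per byte arriving at the given line rate."""
    bytes_per_second = link_bps / 8
    return sys_clock_hz / bytes_per_second


LINK_BPS = 100e6  # 100 Mbit/s Ethernet
for mhz in (200, 300):
    budget = cycles_per_byte(mhz * 1e6, LINK_BPS)
    print(f"{mhz} MHz: {budget:.0f} cycles per byte, per core")
```

The "about 25" figure matches the 300 MHz overclock; at 200 MHz the per-core budget is smaller.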

_flux · a year ago
Now that you mention it, I think I would like to see a logic analyzer that does just that. No buffering, just straight up shovel the data to a mac address, or even IP address, and be done with it (maybe lose a few frames here and there). Let the PC worry about what to do with it, like triggers etc.

Should be cheap, right? Though 1Gbit version might still be expensive..