acidx (u/acidx) - Readit News

acidx commented on Love C, hate C: Web framework memory problems alew.is/lava.html... · Posted by u/OneLessThing

Can you do parsing of JSON and XML without allocating?

acidx · 4 months ago

Yes! The JSON library I wrote for the Zephyr RTOS does this. Say, for instance, you have the following struct:

    struct SomeStruct {
        char *some_string;
        int some_number;
    };

You would need to declare a descriptor, linking each field to how it's spelled in the JSON (e.g. the some_string member could be "some-string" in the JSON), the byte offset from the beginning of the struct where the field is (using the offsetof() macro), and the type.

The parser is then able to go through the JSON, and initialize the struct directly, as if you had reflection in the language. It'll validate the types as well. All this without having to allocate a node type, perform copies, or things like that.
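A minimal sketch of the descriptor idea, not the Zephyr library's actual API: `field_descr`, `some_struct_descr`, and `parse_object` are hypothetical names made up for illustration. It assumes a flat object with no escapes or nesting, and it edits the input buffer in place so that string members point straight into it.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct SomeStruct {
    char *some_string;
    int some_number;
};

enum field_type { FIELD_STRING, FIELD_INT };

/* Hypothetical descriptor: JSON spelling, member offset, expected type. */
struct field_descr {
    const char *json_name;
    size_t offset;
    enum field_type type;
};

static const struct field_descr some_struct_descr[] = {
    { "some-string", offsetof(struct SomeStruct, some_string), FIELD_STRING },
    { "some-number", offsetof(struct SomeStruct, some_number), FIELD_INT },
};

static char *skip_ws(char *p)
{
    while (isspace((unsigned char)*p))
        p++;
    return p;
}

/* Parses a "quoted" token in place by overwriting the closing quote with
 * a NUL. Returns a pointer past the token, or NULL on error. */
static char *parse_string(char *p, char **out)
{
    if (*p != '"')
        return NULL;
    char *end = strchr(p + 1, '"'); /* no escape handling in this sketch */
    if (!end)
        return NULL;
    *end = '\0';
    *out = p + 1;
    return end + 1;
}

/* Returns 0 on success, -1 on syntax error, unknown key or type mismatch. */
int parse_object(char *json, const struct field_descr *descr, size_t n,
                 void *obj)
{
    assert(json != NULL && descr != NULL && obj != NULL);

    char *p = skip_ws(json);
    if (*p++ != '{')
        return -1;

    for (;;) {
        const struct field_descr *d = NULL;
        char *key;

        p = parse_string(skip_ws(p), &key);
        if (!p)
            return -1;
        p = skip_ws(p);
        if (*p++ != ':')
            return -1;
        p = skip_ws(p);

        for (size_t i = 0; i < n && !d; i++) {
            if (!strcmp(descr[i].json_name, key))
                d = &descr[i];
        }
        if (!d)
            return -1;

        /* Write straight into the struct member; no node tree, no copies. */
        void *member = (char *)obj + d->offset;
        if (d->type == FIELD_STRING) {
            p = parse_string(p, (char **)member);
            if (!p)
                return -1;
        } else {
            char *end;
            errno = 0;
            long v = strtol(p, &end, 10);
            if (end == p || errno == ERANGE || v < INT_MIN || v > INT_MAX)
                return -1;
            *(int *)member = (int)v;
            p = end;
        }

        p = skip_ws(p);
        if (*p == ',') {
            p++;
            continue;
        }
        return *p == '}' ? 0 : -1;
    }
}
```

Calling `parse_object(buf, some_struct_descr, 2, &s)` on a writable buffer holding `{"some-string": "hello", "some-number": 42}` fills in both members; a value of the wrong type for its key makes it return -1.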

This approach has its limitations, but it's pretty efficient -- and safe!

Someone wrote a nice blog post about it (and even made a video) a while back: https://blog.golioth.io/how-to-parse-json-data-in-zephyr/

The opposite is true, too -- you can use the same descriptor to serialize a struct back to JSON.

I've been maintaining it outside Zephyr for a while, although with different constraints (I'm not using it for an embedded system where memory is golden): https://github.com/lpereira/lwan/blob/master/src/samples/tec...

acidx commented on Love C, hate C: Web framework memory problems alew.is/lava.html... · Posted by u/OneLessThing

MathMonkeyMan · 4 months ago

Unspecified, really? cppreference's [C documentation][1] says that it returns zero. The [OpenGroup][2] documentation doesn't specify a return value when the conversion can't be performed. This recent [draft][3] of the ISO standard for C says that if the value cannot be represented (does that mean over/underflow, bad parse, both, neither?), then it's undefined behavior.

So three references give three different answers.

You could always use sscanf instead, which tells you how many values were scanned (e.g. zero or one).

[1]: https://en.cppreference.com/w/c/string/byte/atoi.html

[2]: https://pubs.opengroup.org/onlinepubs/9799919799/functions/a...

[3]: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf

acidx · 4 months ago

The Linux man page (https://man7.org/linux/man-pages/man3/atoi.3.html#VERSIONS) says that POSIX.1 leaves it unspecified. As you found out, it's really something that should be avoided as much as possible, especially if you value portability, because pretty much every reference disagrees on how it should behave.

sscanf() is not a good replacement either! It's better to use strtol() instead. Either do what Lwan does (https://github.com/lpereira/lwan/blob/master/src/lib/lwan-co...), or look (https://cvsweb.openbsd.org/src/lib/libc/stdlib/strtonum.c?re...) at how OpenBSD implemented strtonum(3).

For instance, if you try to parse a number that's preceded by a lot of spaces, sscanf() will take a long time going through it. I've been hit by that when fuzzing Lwan.

Even cURL is avoiding sscanf(): https://daniel.haxx.se/blog/2025/04/07/writing-c-for-curl/

acidx commented on Love C, hate C: Web framework memory problems alew.is/lava.html... · Posted by u/OneLessThing

acidx · 4 months ago

One thing to note, too, is that `atoi()` should be avoided as much as possible. On error (parse error, overflow, etc), it has an unspecified return value (!), although most libcs will return 0, which can be just as bad in some scenarios.

Also not mentioned is that atoi() can return a negative number -- which is then passed to malloc(), which takes a size_t. Since size_t is unsigned, a negative argument gets converted to a very large number.
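To make the conversion concrete, here is a small sketch. `as_size` and `alloc_body` are hypothetical helpers (not from the framework being discussed), and the `max_length` cap is an assumption about what a caller would want.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* What happens implicitly when an int is passed where size_t is expected:
 * -1 becomes SIZE_MAX, -2 becomes SIZE_MAX - 1, and so on. */
size_t as_size(int n)
{
    return (size_t)n;
}

/* Hypothetical helper: validate a parsed length before allocating. */
void *alloc_body(int content_length, int max_length)
{
    if (content_length < 0 || content_length > max_length)
        return NULL; /* reject instead of silently converting */

    return malloc((size_t)content_length + 1); /* +1 for a trailing NUL */
}
```

With this, `alloc_body(-1, 4096)` returns NULL rather than asking malloc() for `SIZE_MAX` bytes.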

It's better to use strtol(), but even that is a bit tricky to use: it only sets errno when something like overflow happens and never clears it on success, so you need to set errno to 0 before calling the function and check it afterwards. The man page explains how to use it properly.
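A sketch of that pattern follows. `parse_int` is a hypothetical helper, not Lwan's or OpenBSD's implementation, and rejecting trailing characters is a design choice of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parses a base-10 int. Returns false on empty input, trailing garbage,
 * or a value that does not fit in an int. */
bool parse_int(const char *str, int *out)
{
    char *end;

    errno = 0; /* strtol() never clears errno on success */
    long v = strtol(str, &end, 10);

    if (end == str)
        return false; /* no digits were consumed */
    if (*end != '\0')
        return false; /* trailing characters after the number */
    if (errno == ERANGE)
        return false; /* value does not fit in a long */
    if (v < INT_MIN || v > INT_MAX)
        return false; /* fits in a long, but not in an int */

    *out = (int)v;
    return true;
}
```

Note that strtol() still skips leading whitespace and accepts a leading sign, so a caller that wants neither has to check the first character itself.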

I think it would be a very interesting exercise for that web framework's author to run its HTTP request parser through a fuzz tester; clang comes with one that's quite good and easy to use (https://llvm.org/docs/LibFuzzer.html), especially if used alongside address sanitizer or the undefined behavior sanitizer. Errors like the one I mentioned will most likely be found by a fuzzer really quickly. :)
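A libFuzzer harness can be quite small. In the sketch below, `parse_request` is a hypothetical toy stand-in for the framework's real parser; only the `LLVMFuzzerTestOneInput` entry point is libFuzzer's own interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy parser: accepts "GET <path> ..." and nothing else. */
static int parse_request(const char *buf, size_t len)
{
    if (len < 4 || memcmp(buf, "GET ", 4) != 0)
        return -1;

    const char *path_end = memchr(buf + 4, ' ', len - 4);
    return path_end ? 0 : -1;
}

/* libFuzzer calls this repeatedly with mutated inputs. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    /* Copy into an exact-size heap buffer so AddressSanitizer can catch
     * reads past the end of the request. */
    char *copy = malloc(size + 1);
    if (!copy)
        return 0;
    if (size)
        memcpy(copy, data, size);
    copy[size] = '\0';

    parse_request(copy, size);

    free(copy);
    return 0; /* libFuzzer expects 0 from the harness */
}
```

Built with something like `clang -g -fsanitize=fuzzer,address,undefined harness.c`, the resulting binary generates inputs on its own and stops at the first sanitizer report.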

acidx commented on Go has added Valgrind support go-review.googlesource.co... · Posted by u/cirelli94

1718627440 · 5 months ago

Valgrind is way faster and can be attached to a running program.

acidx · 5 months ago

Programs running under any Valgrind tool will be executed using a CPU emulator, making it quite a bit slower than, say, running the instrumented binaries as required by sanitizers; it's often an order of magnitude slower, but could very well be close to two orders of magnitude slower in some cases. This also means that it can't just be attached to an already-running program, because, well, it's emulating a whole CPU to track everything it can.

(Valgrind using a CPU emulator allows for a lot of interesting things, such as also emulating cache behavior and whatnot; it may be slow and have other drawbacks -- it has to be updated every time the instruction set adds a new instruction for instance -- but it's able to do things that aren't usually possible otherwise precisely because it has a CPU emulator!)

acidx commented on Go has added Valgrind support go-review.googlesource.co... · Posted by u/cirelli94

DishyDev · 5 months ago

Very cool. Should flush out a few bugs.

I'd be interested to know why Valgrind vs the Clang AddressSanitizer and MemorySanitizer. These normally find more types of errors (like use-after-return) and I find them significantly faster than Valgrind.

acidx · 5 months ago

Go has had its own version of msan and asan for years at this point.

acidx commented on Go has added Valgrind support go-review.googlesource.co... · Posted by u/cirelli94

stingraycharles · 5 months ago

Somewhat yes, but as soon as you enter the world of multi-threading (which Go does a lot), the abstraction doesn’t work anymore: as I understand it (or rather, understood: last time I really spent a lot of time digging into it with C++ code was a while ago) it uses its own scheduler, and as such, a lot of subtle real world issues that would arise due to concurrency / race conditions / etc do not pop up in valgrind. And the performance penalty in general is very heavy.

Having said that, it saved my ass a lot of times, and I’m very grateful that it exists.

acidx · 5 months ago

I wrote the Lwan web server, which, similarly to Go, has its own scheduler and makes use of stackful coroutines. I have spent quite a bit of time Valgrinding it after adding the necessary instrumentation to not make Valgrind freak out due to the stack pointer changing like crazy. Despite a lot of Valgrind's limitations due to the way it works, it has been instrumental to finding some subtle concurrency issues in the scheduler and vicinity.

From a quick glance, it seems that Go is now registering the stacks and emitting stack change commands on every goroutine context switch. This is most likely enough to make Valgrind happy with Go's scheduler.
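For reference, this kind of instrumentation boils down to a couple of Valgrind client requests. The sketch below is neither Lwan's nor Go's actual code: `struct coro`, `coro_new`, and `coro_free` are hypothetical, the context switching itself is left out, and the macros fall back to no-ops when `valgrind/valgrind.h` isn't installed.

```c
#include <assert.h>
#include <stdlib.h>

#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#    define HAVE_VALGRIND 1
#  endif
#endif
#ifndef HAVE_VALGRIND
/* No-op fallbacks so the sketch builds without the Valgrind headers. */
#  define VALGRIND_STACK_REGISTER(start, end) 0u
#  define VALGRIND_STACK_DEREGISTER(id) ((void)(id))
#endif

#define CORO_STACK_SIZE (64 * 1024)

/* Hypothetical coroutine: only the stack bookkeeping is shown. */
struct coro {
    unsigned char *stack;
    unsigned int stack_id; /* handle returned by Valgrind */
};

struct coro *coro_new(void)
{
    struct coro *c = malloc(sizeof(*c));
    if (!c)
        return NULL;

    c->stack = malloc(CORO_STACK_SIZE);
    if (!c->stack) {
        free(c);
        return NULL;
    }

    /* Tell Valgrind this heap block will be used as a stack, so a stack
     * pointer jumping into it is not treated as an error. */
    c->stack_id = VALGRIND_STACK_REGISTER(c->stack,
                                          c->stack + CORO_STACK_SIZE);
    return c;
}

void coro_free(struct coro *c)
{
    if (!c)
        return;

    VALGRIND_STACK_DEREGISTER(c->stack_id);
    free(c->stack);
    free(c);
}
```

The client requests do nothing meaningful when the program isn't running under Valgrind, so leaving them in a regular build is harmless.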

acidx commented on Show HN: I built a minimal Forth-like stack interpreter library in C · Posted by u/Forgret

bertili · 5 months ago

Another thread on small forth interpreters from just 15 days ago:

https://news.ycombinator.com/item?id=45039301

Forth can be beautifully and efficiently implemented in portable C++ using continuation-passing style via the clang musttail attribute.

Have a look at Tails (not my project):

[1] https://github.com/snej/tails

acidx · 5 months ago

I recently wrote one, in C, using tail calls to implement dispatch with CPS: https://tia.mat.br/posts/2025/08/30/forth-haiku.html
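For those unfamiliar with the technique, here is a minimal sketch of tail-call dispatch. It is not the code from the post: `struct vm`, `union instr`, the `op_*` functions, and `run` are hypothetical, and `MUSTTAIL` expands to nothing on compilers without the attribute (where the calls are then ordinary calls that the optimizer may or may not turn into jumps).

```c
#include <assert.h>

#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

#define STACK_MAX 64

struct vm {
    long stack[STACK_MAX];
    int sp;
};

union instr;
typedef long op_fn(const union instr *ip, struct vm *vm);

/* A program is an array of cells: either an operation or its operand. */
union instr {
    op_fn *op;
    long value;
};

/* Every operation has the same signature and ends by tail-calling the next
 * one, passing along a pointer to the cell right after it. */
static long op_push(const union instr *ip, struct vm *vm)
{
    assert(vm->sp < STACK_MAX);
    vm->stack[vm->sp++] = ip->value; /* operand follows the opcode */
    ip++;
    MUSTTAIL return ip->op(ip + 1, vm);
}

static long op_add(const union instr *ip, struct vm *vm)
{
    assert(vm->sp >= 2);
    long b = vm->stack[--vm->sp];
    vm->stack[vm->sp - 1] += b;
    MUSTTAIL return ip->op(ip + 1, vm);
}

static long op_mul(const union instr *ip, struct vm *vm)
{
    assert(vm->sp >= 2);
    long b = vm->stack[--vm->sp];
    vm->stack[vm->sp - 1] *= b;
    MUSTTAIL return ip->op(ip + 1, vm);
}

/* Ends the chain of tail calls and yields the top of the stack. */
static long op_halt(const union instr *ip, struct vm *vm)
{
    (void)ip;
    assert(vm->sp >= 1);
    return vm->stack[vm->sp - 1];
}

long run(const union instr *program, struct vm *vm)
{
    return program->op(program + 1, vm);
}
```

The Forth phrase `2 3 + 4 *` would then be laid out as the cells `op_push, 2, op_push, 3, op_add, op_push, 4, op_mul, op_halt`. There is no central dispatch loop: each operation jumps straight to the next one.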

It's already pretty efficient but I'm working on it to make it even more efficient so I can use it as some sort of primitive fragment shader for an art project. This Forth variant is intended to execute Forth Haikus, as defined by the Forth Salon website.

acidx commented on Implementing a Forth ratfactor.com/forth/imple... · Posted by u/todsacerdoti

acidx · 8 months ago

I recently implemented a Forth, compatible with the Forth Haiku dialect used by https://forthsalon.appspot.com/ -- it uses a tail call/continuation-passing dispatching method, and performs some rudimentary optimizations. At some point it also spit out some C but I decided to give this feature the axe until it has a better infrastructure for optimizations and codegen.

The idea is to use it to drive an LED matrix and have a simple web UI to develop "fragment shaders" in Forth. It's developed as part of the Lwan project, although currently it generates GIF files on the fly rather than drive an LED matrix.

The source code is here for those that want to play with it: https://github.com/lpereira/lwan/tree/master/src/samples/for...

acidx commented on A map of torii around the world google.com/maps/d/viewer?... · Posted by u/ilamont

ricardobeat · a year ago

Missing the small one in Porto Alegre, likely the southernmost gate in Brazil: https://diariogaucho.clicrbs.com.br/dia-a-dia/noticia/2024/0...

acidx · a year ago

There's another small one in Campinas, near the Nipo-Brazilian Institute: https://viajantesemfim.com.br/rua-camargo-paes-um-pedacinho-... (or on GMaps: https://maps.app.goo.gl/P17sBzkNG3Gk4frq5)

I'm sure there are many, many more throughout the country, especially in the states of São Paulo and Paraná.

acidx commented on New York, Ukraine en.wikipedia.org/wiki/New... · Posted by u/earksiinni

acidx · 2 years ago

Also in Brazil: https://en.wikipedia.org/wiki/Nova_Iorque