If you enjoy stretching what you can get done in limited environments: I recently discovered that BusyBox includes not only an AWK implementation but also an HTTP server [0] that supports CGI. I spent a weekend setting up a really simple web app using recutils [1] as the database and BusyBox AWK as the programming language.
If you enjoy bricolage and you feel like wasting a weekend, I would recommend giving it a try!
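For anyone who hasn't touched CGI before, here is a rough sketch of what such a script does. It is in Python rather than the BusyBox AWK used above, and the `name` parameter and the `render` helper are made up for illustration. It assumes only the standard CGI contract: the server passes the query string in the `QUERY_STRING` environment variable, and the script writes headers, a blank line, then the body to stdout.

```python
import os
import sys
from urllib.parse import parse_qs


def render(environ) -> str:
    """Build a CGI response (headers, blank line, body) from the CGI environment."""
    # QUERY_STRING is set by the web server; "name" is a hypothetical parameter.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    body = f"hello, {name}\n"
    # A CGI script emits at least a Content-Type header, then an empty line.
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    sys.stdout.write(render(os.environ))
```

Keeping the response logic in a function that takes the environment as an argument makes it easy to try out without a web server in front of it.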
That looks so featureful (thinking of attack surface here) that you might as well install a regular web server at that point. What's the benefit of using BusyBox for this?
Edit: ah, is this about file size? Or does this refer to RAM? Either way, it sounds like that's the benefit:
> BB httpd can be compiled with only basic features like CGI and ETag and will have only 8Kb
BusyBox provides a complete set of POSIX utilities and more in a single minimal binary; there are lots of embedded systems with pretty much just BusyBox installed, although I'm not sure how common it is to compile it with httpd. GP linked an OpenWrt article, so I'm guessing (the site seems to be down) that it's included in that OS.
All this tells me is that preventing directory traversal can only be done by checking that the resolved absolute file path lies within a bounded range (the directory you intend to serve), and nothing else.
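A minimal sketch of that kind of check, in Python rather than AWK. The `resolve_inside` helper and the `/srv/www/site` root in the usage note are hypothetical; the idea is to resolve the requested path to an absolute one first and only then compare it against the document root, instead of trying to scrub `..` out of the string.

```python
import os
from typing import Optional


def resolve_inside(doc_root: str, request_path: str) -> Optional[str]:
    """Return the absolute path for request_path if it stays inside doc_root, else None."""
    root = os.path.realpath(doc_root)
    # Strip leading slashes so os.path.join() is never handed an absolute path.
    relative = request_path.lstrip("/")
    # realpath() collapses ".." components and follows symlinks before we compare.
    candidate = os.path.realpath(os.path.join(root, relative))
    # commonpath() compares whole components, so /srv/www2 does not count as inside /srv/www.
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate
```

With a root of `/srv/www/site`, a request for `/../../etc/passwd` resolves outside the root and is rejected, while `/index.html` is accepted. A directory literally named `....` is also accepted, since it is an ordinary name that stays inside the root.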
There are other layers you can add: running the server as a service account that can only read its own directories, running it in a chroot, running it in a mount and PID namespace, and using SELinux to further restrict what files it can read even in principle.
Of course, if you're trying to go superminimal anyway, it's not that big a deal to create a server that doesn't even have sensitive data on it. You can make init simply mount a root filesystem that contains only BusyBox and whatever files you want to serve, then start the httpd process and nothing else. That basically turns Linux into a unikernel. If you compile BusyBox yourself, you can also remove all the subcommands you don't actually need.
Neat! I was aware of bash's built-in TCP client at /dev/tcp, but gawk being able to function as a TCP server is way cooler. Thanks for that bit of knowledge :>
What does the loop and sleep 1 do? Is that to respawn upon crashes (why'd it crash?) or does it exit socat after handling a request?
I recently made a netcat webserver that returns only one static response, just to have a tiny info page for my new email server. That needs a while true loop but no sleep. I then benchmarked the performance and was very surprised to find that a slow VPS manages [spoiler answer] https://lgms.nl/p/cau/?b64&bY%2FBTsQwDER%2FZbivViDxAxw58Q1pO... Performance graph: https://snipboard.io/Vtn0MO.jpg
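For comparison, a rough Python sketch of the same idea: a server that answers every connection with one fixed response. This is not the netcat setup described above; the response text, the `serve` helper and the `max_requests` parameter are invented for the example. The accept loop here plays the role that the shell's while loop plays around a listener that handles a single connection per run.

```python
import socket

# One fixed response; the body "ok\n" is 3 bytes, matching Content-Length.
RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 3\r\n"
    b"Connection: close\r\n"
    b"\r\n"
    b"ok\n"
)


def serve(listener: socket.socket, max_requests=None) -> None:
    """Answer each incoming connection with RESPONSE, one connection at a time."""
    handled = 0
    while max_requests is None or handled < max_requests:
        conn, _addr = listener.accept()
        with conn:
            conn.recv(4096)  # read and ignore whatever the client sent
            conn.sendall(RESPONSE)
        handled += 1
```

Something like `serve(socket.create_server(("127.0.0.1", 8080)))` would run it until interrupted; the port is arbitrary. No sleep is needed because `accept()` blocks until the next client arrives.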
I've heard of Awk for years and (probably like many) only used it for single-line snippets for the vast majority of that time.
Imagine my surprise when I decided to look into it one day (after finding slightly more complex Awk scripts that did a lot with very little code, which piqued my curiosity) and found this very nice line-oriented DSL that has aged SHOCKINGLY well given how old it is.
I just wish it had interrupt handling of some sort without having to run a custom fork or patched version.
[0] https://openwrt.org/docs/guide-user/services/webserver/http....
[1] https://www.gnu.org/software/recutils/
Hmmmm
http://localhost:8888/..../..../..../..../..../..../.../.......
Was gonna write:
http://localhost:8888/..../..../..../..../..../..../etc/host...
mypc
These regex substitutions are so easy to bypass :)
> gsub(/\/\.\.+\/?/, "/", request_filename) # avoid directory traversal
source: am "regex expert" >..< (and know how to spell)
https://github.com/chebykinn/sedmario
https://news.ycombinator.com/item?id=22085459
Pretty printed: https://gist.github.com/tyingq/4e568425e2e68e6390f3105e58878...
The patch is now included in bash 5.2
homepage https://marek.terminus.sk/prog/hawkh.shtml