I work on a complex desktop application, and it's been astounding the number of bugs that have appeared over the years triggered by spaces and other unusual characters in file names. If you do anything with subprocesses or path processing, it's absurdly easy to hit in a thousand different ways, over and over again.
Pro tip: rename your development directory (or even better: the workspace path in CI) to put a space and/or special characters in it.
Forces you to deal with this properly, and immediately ensures that every automated test checks this case without you having to remember every time. Hasn't been particularly inconvenient, since I'm autocompleting it 99% of the time anyway, and I haven't shipped a single path parsing bug since.
It doesn't only keep people on their toes because of the whitespace. The folder name is even localized. E.g. with German settings there is C:\Programme and C:\Programme (x86).
I wonder how much global work could have been saved if Microsoft also provided a covered interface for all paths in the system. Not sure if there is any, but one good implementation might save thousands of poor implementations required to handle it.
On the other hand, their case-sensitivity behaviour means that “cross-platform” Java applications can break if they are run on a non-Windows platform where opening files is case sensitive (unlike on Windows).
I just wish they had a decent way to execute programs with arguments that might include spaces. But no, every program can do argument delineation differently.
I know that at least like, idk like 3-5 years ago, when I had gotten a new Windows laptop (Windows 7 or 8 I think), setting the main account to have the name "" (without the quotes) caused some problems with the basic functioning, including, I think, with some pre-installed programs.
So, some things were still being handled not quite right (whether that's because it shouldn't be allowed to be the username, or because programs should handle it being in the path, I'm not sure, but probably one of those.)
And then to really mess you up and ensure you handle parens properly, threw “(x86)” into the mix. (A real pain on some REPLs as well as dealing with environment variables).
Except for programs that were too old / obscure to fix I guess. I think at least the Symbian Development Kit was such that builds would fail with strange errors unless you installed it in any other path than the default immediate subdirectory of C:\, let alone under "Program Files".
I set my nickname to U+FFFD at one point in one work system, resulting in a variety of bug reports and concerned emails. I think I dropped it since it was generating false reports from people who didn't check what character the page contained before reporting it.
To have such thoughtful coworkers. On an old team I had two coworkers named Chris, and once in a blue moon when they reviewed each other's code, master would start crashing because one of them accidentally left in an absolute path starting with "/home/chris/".
A related tip for CI: set the system time zone to one where, during your work hours, it is already a different day than in UTC. Really helped with getting failures earlier than 4pm PST.
One of the systems I built is being used by a group of younger people. I included an emoji in the superuser account name, just to make sure it would work. And to remind me to think more broadly about user input.
I used to have a space in my user name and even contemplated adding a bit of non-1252 Unicode. You find a lot of issues, but unfortunately often in tools you have little control over, and you end up not being able to work effectively at times. It ended up being more frustrating than helpful.
I add a Japanese character into any .py, .js and .html file to ensure that Unicode is working properly through the entire chain. Mostly in form of a variable which gets passed along, even in URL parameters.
> it's been astounding the number of bugs that have appeared over the years triggered by spaces and other unusual characters in file names
If you consider spaces “unusual” I would say you haven’t encountered a single average user in your lifetime. Spaces in file names are the single most common thing people have, outside programming environments.
As a x-plat developer, the only platforms where I (still) regularly encounter this kind of bug are platforms where solving problems through scripting is common, like Linux, where the primary means of operation is through stringly-typed statements getting parsed and processed in an untyped fashion. It's not very reliable.
On Windows people more often use “real APIs” (because scripting doesn't really work as well), but then these problems just go away.
It's especially funny that it affects Linux so much. Most file systems allow everything except `/` and NUL in file names. Early AT&T UNIX even allowed NULs! POSIX shells use the IFS variable to perform field splitting, and it defaults to <space>, <tab>, and <newline>. The choice to perform field splitting by default (particularly with spaces in the default IFS set) has caused no end of headaches for developers and users.
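As a rough illustration of why default field splitting bites, here is a minimal Python sketch. It uses the standard `shlex` module, which only approximates POSIX shell word splitting, and the file name is made up for the example.

```python
import shlex

filename = "my config.txt"  # hypothetical file name containing a space

# Naive interpolation: the name falls apart into two fields at the space,
# much like an unquoted variable expansion in a POSIX shell.
unquoted_fields = shlex.split("cat " + filename)

# Quoting the name first keeps it together as a single field.
quoted_fields = shlex.split("cat " + shlex.quote(filename))

print(unquoted_fields)
print(quoted_fields)
```

The unquoted version hands `cat` two arguments that do not exist as files; the quoted version hands it the one name the user meant.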
It doesn't even have to be complex, often basic automation tasks fail with spaces and special characters. Honestly, treating a file system like a natural language processor is a bad idea. Besides at this point with how digital we have all become who can't understand...
thisismyconfig.txt vs this is my config.txt or this_is_my_config.txt
...I've forced myself to stop using spaces, special characters, and even capitals. They are all constructs that provide minimal value for the extra complexity.
I'm similar, but I would like to support labels intended for humans, along with various translations, as metadata on top of e.g. filesystem path components.
Why stop there. A computer works more efficiently with numbers rather than strings, so let’s just give each file a number instead of a string. Besides, at this point with how digital we have all become who can’t understand… But wait, that already exists and is called an inode.
A file system has a human interface and a computer interface. Don’t mix them. Let users give file names in whichever way they please.
My favorite filename special character bug was when I implemented CD ripping in 2005, and one of our beta testers ripped a CD with a song called "Have You Ever?". My code wasn't prepared to filter out the question mark on Windows.
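A minimal sketch of the kind of filter that was missing, in Python. The function name and the underscore replacement are assumptions for illustration; the character set is the one Windows rejects in file names (`< > : " / \ | ? *` and control characters). It does not handle reserved device names such as CON or NUL.

```python
import re

# Characters Windows does not allow in a file name, plus ASCII control characters.
_WINDOWS_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

def sanitize_for_windows(title: str, replacement: str = "_") -> str:
    """Turn an arbitrary track title into something usable as a Windows file name."""
    cleaned = _WINDOWS_FORBIDDEN.sub(replacement, title)
    # Windows also strips or rejects trailing spaces and dots.
    cleaned = cleaned.rstrip(" .")
    # Never return an empty name.
    return cleaned or replacement

print(sanitize_for_windows("Have You Ever?"))
```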
I changed my username to not contain a space because it was too annoying to deal with all the random dev tools breaking. The worst offender was probably npx on Windows [1] (resolved after four years by deprecating npx), but it was far from the only one (though the JS ecosystem was somehow the worst in this regard of all languages I worked with).
Saw a few hacks where malware authors used the RTL feature (which is baked into Windows) to obfuscate file extensions. It looked like .exe.innocuous-document.docx, but was actually .docx.innocuous-document.exe
I once returned a printer because the Mac driver and support software expected and enforced case insensitive access and basically couldn't install properly on my case-sensitive HFS+ volume. It half installed and blatantly just didn't work in any way when installed.
I have coworkers on Mac that write node/JS code. Every once in a while I'd pull down the latest code and it wouldn't run. I'm on Linux.
Sure enough, they had SomeFile and were importing Somefile and it works fine on Mac but not on Linux (which, of course, is what our production servers use). It amazes me that "works fine on my machine" is still a thing when I definitely worked at companies that solved this back in the 2000s. It was solved. It was done. Then devs became enamored with running everything locally. Even dozens of microservices or databases. Even though JS is fairly isolated, you still have NPM packages that need to be built against the local OS and C/C++ libraries and compilers, etc. Which also has caused issues in the past.
I also enjoyed doing that, but had to make a DMG just for Steam because it straight-up refuses to run on a case-sensitive FS (that's true on Windows, also, which I suspect is how we all got here). I think the most recent Steam versions either caught wind of my trickery or -- more likely -- run something from $HOME/Library/SomethingOrOther, and thus the workaround no longer works.
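One cheap guard against the SomeFile/Somefile problem is a check that runs on every platform and compares the requested name to what is really on disk, byte for byte. A minimal Python sketch follows; the function name and the idea of feeding it a directory listing are assumptions for illustration, not something any of the tools mentioned here provide.

```python
def check_import_case(requested: str, existing_names: list) -> str:
    """Classify a requested file name against the names actually present."""
    if requested in existing_names:
        return "ok"
    # Index the real names by their case-folded form to spot near misses.
    by_folded = {name.casefold(): name for name in existing_names}
    actual = by_folded.get(requested.casefold())
    if actual is not None:
        return f"case mismatch: requested {requested!r}, on disk {actual!r}"
    return "missing"

print(check_import_case("Somefile.js", ["SomeFile.js", "index.js"]))
```

Run in CI or a pre-commit hook, a check like this fails on a case-insensitive Mac in the same way the import would fail on a case-sensitive Linux server.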
When I got a new Mac, I just gave up and acquiesced to the case-retentive world :-(
Circa Y2k, I learned that the OSX Palm Pilot software didn't work with case-sensitive file systems. I've since given up and stuck with the default. (I'm anti-case folding in general, because of the ambiguity.)
I maintain a similar system, where a variety of companies submit files that get processed through multiple services - it is astounding how ridiculous people’s naming of files can be; spaces are the least concerning!
I'm begging software developers to stop using subprocess APIs that take a string argument (system(), child_process.exec(), Process.Start(string)) and start using subprocess APIs that take an array of arguments (execvp(), child_process.execFile(), Process.Start(string, IEnumerable<string>).)
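The same distinction exists in Python's standard library: `subprocess.run` accepts a list of arguments and, without `shell=True`, never routes them through a shell. A minimal sketch, where the helper name and the echoing child process are made up for the demonstration:

```python
import subprocess
import sys

def run_with_argv(path_arg: str) -> str:
    """Pass one argument to a child process as a list element, with no shell involved."""
    # The child simply prints argv[1], so we can see exactly what arrived.
    child_code = "import sys; print(sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, path_arg],  # list form: no manual quoting
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")

print(run_with_argv("my file (x86) & more.txt"))
```

Spaces, parentheses and the ampersand reach the child untouched because nothing ever re-parses them as a command line string.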
I don't know if it's still a problem, but it used to break Python virtualenv badly. If your working directory had a space anywhere in the path, it would throw a huge fit and not work. Which is problematic when the expected name for a Mac's boot drive is "Macintosh HD" (if you ever had a reason to run a virtualenv outside of your home directory).
The only common place where it doesn't work is in CMD for executing programs and as arguments for built-in commands. Everything else goes directly to the relevant APIs which don't care about / or \.
These days using CMD instead of PowerShell should be rare enough and PowerShell certainly doesn't mind the slashes.
It's easy to tell users to make a folder with no spaces if you're setting up a global path, however if you have an application that runs in user directories things can become painful fast. Changing your user name is a pain and can leave things inconsistent, but having to handle all the variations in people's names with spaces, punctuation, international characters, can just be mind boggling.
I did something similar on accident. I used to keep all my development work synced with Dropbox and I had a work and a personal account. So any of my own projects would have /Dropbox (Personal)/ in the path which did catch some bugs. Dropbox renamed my folder to "Dropbox (Personal)" automatically when connecting a work account.
More importantly than your source files, put your testing data on such a path as well. Nobody uses absolute paths in testing so it doesn't matter how many spaces your absolute path has if your input is "./tests/file1". Put those files in a folder with spaces too and throw in a unicode character for good measure.
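A minimal Python sketch of that setup, assuming a test helper of your own (both function names and the directory name are hypothetical): the fixture directory itself carries a space, parentheses and a non-ASCII letter, so every test that touches test data exercises those cases.

```python
from pathlib import Path

def make_awkward_workspace(base: Path) -> Path:
    """Create a test-data directory with a space, parentheses and a non-ASCII letter in its name."""
    workspace = base / "test data (\u00e9)"  # "test data (é)"
    workspace.mkdir(parents=True, exist_ok=True)
    return workspace

def round_trip(workspace: Path, name: str, content: str) -> str:
    """Write a file inside the awkward directory and read it back."""
    target = workspace / name
    target.write_text(content, encoding="utf-8")
    return target.read_text(encoding="utf-8")
```

In a real suite the `base` argument would typically be a temporary directory provided by the test framework.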
> Pro tip: rename your development directory (or even better: the workspace path in CI) to put a space and/or special characters in it.
The problem with that is that YOUR code may handle it, but your tooling may not. If my code formatter breaks on spaces, I'm not going to change the formatter.
Somewhat related to injecting unusual characters, in my experience in localization efforts:
Inject a Turkish 'I'. I don't know how to type or paste it here, but picture an English lower case 'i' that is upper case. It is a splendid way among many to shake out some loc bugs.
That would only shake out anything if you'd also test in a Turkish locale, wouldn't it? Since Unicode casing rules are locale-dependent and en-US doesn't care much about dotless i or dotted i.
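To make this concrete, here is a small Python sketch. Python's `str.lower()` and `str.upper()` apply the locale-independent Unicode default mappings, so they show what a non-Turkish environment does with the four I's; the function name is made up for the example.

```python
def default_case_mappings() -> dict:
    """Show what locale-independent casing does with the dotted and dotless I's."""
    return {
        "I": "I".lower(),            # plain capital I
        "i": "i".upper(),            # plain small i
        "\u0130": "\u0130".lower(),  # İ, capital I with dot above
        "\u0131": "\u0131".upper(),  # ı, small dotless i
    }

for source, mapped in default_case_mappings().items():
    print(repr(source), "->", repr(mapped), "length", len(mapped))
```

Under these default rules, lowercasing İ yields two code points (an `i` followed by a combining dot above), and uppercasing ı yields a plain `I` that lowercases back to `i` rather than ı. So even without a Turkish locale the characters can expose code that assumes casing preserves length or round-trips, while the Turkish-specific mappings (I to ı, i to İ) only appear with locale-aware casing.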
Late '90s I worked on Java software that got installed on several Unix platforms, including Linux for IBM mainframes. When you deal with the default en/de-coding of Unicode to EBCDIC you never have trouble with Java byte encodings ever again.
Even capitalization is a pain in the ass thanks to how OSes treat file names. I pretty much stick with either `file-name.ext` or `file_name.ext` exclusively now.
In my case, and for many people writing desktop software, and for absolutely everybody writing open-source tools or libraries, unfortunately you can't control the environment.
Non-ASCII paths are extremely common (e.g. the user's home directory on Windows, for the large majority of users outside the English-speaking world) and spaces, punctuation and weirder characters will definitely happen when you least expect it.
Yes if you can avoid it then absolutely that's great, but I don't think most people can.
It's also not usually very difficult to deal with, as long as you actually spot the issue in the first place.
only allow ASCII, maybe dashes, and up to twelve characters. Problem solved
...and only hire people from the exact same background as you, who will never have unusual characters or accents in their name. And also make sure not to have any users who aren't exactly like you, and conform to this very narrow requirement. Surely, excluding 90% of the world won't hurt revenue in any way.
Ugh, we have the 15 character Active Directory limit now with hostnames, and a previous IT administration has imposed a convention that every name had to follow [prod|dev]-[ph|vm]-[service]-[nn]. So basically every production service is prod-vm-owtf-01— you get exactly four characters to actually describe what the machine does. Works great when the service is "jira" or "wiki", but there are a lot that are pretty mystical-sounding, like jkns, jwrk, cntr, hrbr, etc, where you kind of just have to know.
I spent hours trying to figure out why an entire folder suddenly stopped syncing. Turns out I accidentally added a hidden space to the end of a folder name.
> Pro tip: rename your development directory (or even better: the workspace path in CI) to put a space and/or special characters in it.
This will also break any code in external tools that are called during the builds of your application and do not handle spaces correctly for whatever reason, thus making it so that you won't be able to successfully finish the build.
Then again, you probably shouldn't be relying on technologies like that, but when you're struggling to keep an old enterprise system alive, causing yourself more problems is not necessarily what you should do.
Overly aggressive is right! I don't know if this is genius or deranged! I'm leaning towards genius and stealing the idea.
By the way: what's your beef with en dashes? I mean, if it was "everything should be 'HYPHEN-MINUS' (U+002D)", then fine, but why specifically en dashes and not em dashes?
I totally agree that for some people, this could be a terrible command to have around. However, I know that it has been working for me for about 8+ years or so. I almost always run it in my ~/Downloads folder on files that I don't really care about. I download a lot of academic papers and books, and this just saves me a lot of time to put files in the format I like: author--paper-title.pdf. And that's part of the reason why I make all of the dashes the same, so if I'm opening something by an author, I can easily autocomplete and not have to remember how to make other sorts of dashes on the command line.
I use this snippet to change spaces to underscores for directories and files in the current directory and below. Haven't made it a function yet, but should. I got it from Stack Overflow or somewhere, but no attribution. Thanks to whoever did it first:
find . -depth -name '* *' | while IFS= read -r f ; do mv -i "$f" "$(dirname "$f")/$(basename "$f"|tr ' ' _)" ; done
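For comparison, a rough Python sketch of the same idea. It is not a translation of the one-liner above: the function name is made up, and instead of prompting like `mv -i` it simply skips any rename whose target already exists.

```python
import os
from pathlib import Path

def underscore_spaces(root: Path) -> list:
    """Rename entries below root so spaces become underscores; return the (old, new) pairs."""
    renamed = []
    # topdown=False visits children before their parent directory, like `find -depth`,
    # so a directory is only renamed after everything inside it has been handled.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            if " " not in name:
                continue
            old = Path(dirpath) / name
            new = Path(dirpath) / name.replace(" ", "_")
            if os.path.lexists(new):
                # Refuse to clobber an existing entry; leave the original untouched.
                continue
            old.rename(new)
            renamed.append((old, new))
    return renamed
```

Working on names as strings from `os.walk` also sidesteps the quoting questions entirely, since nothing is ever parsed by a shell.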
I've thrown some edge cases at it, and it handles it super well. It deals with consecutive "_", remove leading garbage, normalize unicode, and even prevents naming conflicts by opting out early.
I agree. I am a developer and I know how to deal with special characters. But this isn't something I use professionally. I just prefer not to have to deal with special characters in the pdfs, m4as, txts, and other files that I use on a daily basis. When I write papers, I'll write ū or Ñ or ç or whatever (incidentally, I have a lot of shortcuts in my .vimrc for those). I would not say I am "afraid" to use spaces in filenames, but I get a certain satisfaction storing academic papers in the author--paper-title.pdf format and my notes in author--paper-title.md because it helps me find things.
Yeah, sometimes I end up renaming things I don't want to, but it really doesn't happen all that often. And sometimes I throw caution to the wind, add some excitement to my life, and rename a bunch of files (not for anything professional) in some really old directory and hope I don't break anything. But I'm not aiming for perfect with this comment. I just mentioned in another comment, but the vast majority of times I run this is in my ~/Downloads folder on files I don't really worry about breaking.
That's the most beautiful part! After running this script there are no more conflicts, because it just silently overwrites all but one version of the "cleaned" filename.
(Also—that entire function is super inefficient and could be replaced with a single invocation of "rename".)
Define "space". Is the Hangul filler we talked about yesterday a spacing character? Is the zero-width non-breaking space a spacing character? What about the typographic spacing characters?
You had better be very afraid of using spaces in filenames.
You should do everything you can to support them but you have to know you'll invariably encounter countless cases where you'll have this or that tool that won't work properly with them.
I still live in a world where I cannot name a song from the French group L'impératrice with an eacute in the filename or my car's media system will display garbage (it's running QNX and I don't know which filesystem).
FWIW, and it should be food for thought, every single Git repository in the world contains a pre-commit hook sample (disabled by default but it's there) that enforces that every committed file in the repo is named using a subset of ASCII characters.
Every Git repository in the world has that example: let that sink in.
> FWIW, and it should be food for thought, every single Git repository in the world contains a pre-commit hook sample (disabled by default but it's there) that enforces that every committed file in the repo is named using a subset of ASCII characters.
I use Git for documents too, not only code. Why shouldn't I use my native language?
> I still live in a world where I cannot name a song from the French group L'impératrice with an eacute in the filename or my car's media system will display garbage (it's running QNX and I don't know which filesystem).
I have an Android phone and I tell MusicBrainz Picard to save all files with ASCII-only names and Windows-compatible names for the ones that get sent over to the phone. Basically for this reason. Sometimes it's players on Android itself, but even more frequently, whatever Bluetooth radio I'm connected to freaking out with non-ASCII characters.
You get all those space characters working and then some jerk comes along and uploads a file like this: ŗ̶̧̢͓̳͍͙͔̳̻̥͉̭͓̫̟͍̞̭͉͓͉̮̹͍͚̳̹̬͉͚̰͈̘̐̊̾̈̀̒͒̀͛̓̋̔͊̏͘̚ę̴̨̛̣͙̤̟̬̩̟͙͖̥̹̱̱̊͑͗̇̇͛̆̈́̃͋̓̀̔̍̍̌̐͊̎̓̅̀̕ͅģ̴̹̜̘͍̱̑͐̉̌̐̄̊͛̎́̐̌̅̈́͂͑̈́̋̔͂̊̊̒̒̔͛͆̚͘̕͠e̶̙͕̫̳̘͐̾́̑͆̓͂̿͊̊̍͛͐̌̆͗̌̅̅̔͊̂͛͗̅̕͝͝͝͝x̵̢̧̦̫͖̝̥̹͓̬͖̤̩͚̝̫̋̃̅̈́̆͋̌͑́̎̈́̊̾͒̀̒̎̓͛͊̿̓͊̀̍͐̆̚͝͝-̴̨̮̯͖͖̠̜̲̪͕̘͈͖̮̈́̓̐̃́̅̄̏́̍̉̐͌́̔̓̄͋͗̐̕͜͝ţ̴̢̧̖̗͖̞̮̫̦̼̝̺̼̱̳͓͉̜̟̤̲͖̻͙́̌̈̌̈͆̾̄͊̿̏̓͗̈́̕͜ͅh̶̢̧̨̥̭̼̟̣͖̯̗̤̖̙͉͕̙͎̰̠̝̖͈̻͙̪̮̘̯̻̼͕͓̖̣͈̽́͊̎͐͌̆̍̎̏̿͐̒́͋͑̍̿̎͆̑͆̄͂̀͐̄͑̀͗̿̽̎̾̊̕͝͝͝͝͝ͅi̴͚͈͍̫̮̝̣͖͉͓̯̠̙̭̟̖̘̾̓̄̈́̒̏̽̆̉̿͛̀́̃̋̒̈́͋̂̇̈́͛̕͜͠͠͝ͅs̶͇̖̳̞͉̱̞͓̖͔͔͍̗͇̖̮̹̅͊̔͋͊̈́̎̐̆̋̒̀̍̕͜ͅ.̴̧͎͇̰͉̼̱̰̦̟̑̋̏͌̍͊͑̄̀͌́̆̓͛̒̆̾̉͐̄̂̈́͆̒̃͗̐̂̎̈́̈͛̿́͛̾̚͘͜͝͝ͅȩ̷̡̲̪̱̪̥̳͍̼̰̘̗̹͙͙͓̣̟̩̥̥̖̠̪̮̹̞̥̻͎͖͍̯̂͑̏̑̆̍͋̎͛̅̑̑̏̎̓̀̓̒̈́͊͌̀̈́̒̌͐͂͛̊̍̐͂́̔̌̾͐̈́̋̇̏̚͜͝͝͝͠ͅx̶̧̛͚̗̜̪͍͖̘̙͎͚͇͙̬̱̟̭͓̺̙͍̖̱͚̣̘̪̭͔͔̮͎̬̪̤̹̟͔̩͍̬͕͔̩͐̈́̒̂͛̂̈̀̿̍̔̓̓̀̃̍͆̈́̍̓̌͐̈́̾̇̎̑͌͒̄̆̿̍͆̅͗͆͘͠͝͝ͅͅͅe̷̢̡̡̨̧̛͕͚̬̮̞̥̼͍͔̝̟̝̯͈̟̥͖̱̹̣̩̼̩̅̌͌̑̎̐̀̽̏́͐̋̏̎̎͛͌̀̊͊͒̑͌̎̎̑͊̌̉͆̾̚͘̚͜͠͠͠͝͝ͅͅͅ
For anyone who is curious (and acolytes of Zalgo): "In Unicode, character rendering does not use a simple character cell model where each glyph fits into a box with given height. Combining marks may be rendered above, below, or inside a base character. So you can easily construct a character sequence, consisting of a base character and “combining above” marks, of any length, to reach any desired visual height, assuming that the rendering software conforms to the Unicode rendering model."
Until now, I haven't actually thought of what would happen if zalgotext occurred anywhere other than a web browser. Looking forward to the five minutes of fun with the file manager and whatnot.
768 characters is too long for macOS it seems. (References online say HFS+ has a limit of 255 UTF-16 characters. Didn't find anything for APFS immediately... edit: same for APFS)
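If you want to check a name against a limit like that before handing it to the file system, the two numbers of interest can be computed in Python. This is a minimal sketch with made-up function names; the 255-unit figure is the one quoted above and is not verified here.

```python
import unicodedata

def count_combining_marks(text: str) -> int:
    """Count code points that are combining marks (the 'Zalgo' part of the text)."""
    return sum(1 for ch in text if unicodedata.combining(ch) != 0)

def utf16_code_units(text: str) -> int:
    """Length of the text in UTF-16 code units, the unit the quoted limit is stated in."""
    return len(text.encode("utf-16-le")) // 2

stacked = "e" + "\u0301" * 3  # one visible letter carrying three acute accents
print(count_combining_marks(stacked), utf16_code_units(stacked))
```

One visible character can therefore cost many code units, which is how a short-looking Zalgo name blows past a length limit.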
I have an uneasy feeling whenever I see a path parameter declared as string. Path is not a string - it's a sequence of path components and should be treated as such by our APIs. A path should be parsed once - on user input - and then used in its "sequence form" throughout the software stack.
And "path component" is not an arbitrary string either - e.g. appending a path component to the path should first require converting/parsing the string into the path component, and only if that's successful appending it to the path.
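A minimal Python sketch of that idea, built on the standard `pathlib` module. `PathComponent` and `append_component` are hypothetical names, and the validation rules assume POSIX conventions (no `/`, no NUL, no empty, `.` or `..` segments).

```python
from pathlib import PurePosixPath

class PathComponent:
    """A single validated path segment (hypothetical helper, POSIX rules assumed)."""

    def __init__(self, raw: str):
        if raw in ("", ".", "..") or "/" in raw or "\0" in raw:
            raise ValueError(f"not a valid path component: {raw!r}")
        self.value = raw

def append_component(base: PurePosixPath, raw: str) -> PurePosixPath:
    """Parse raw into a component first; only a valid component is appended."""
    return base / PathComponent(raw).value

path = append_component(PurePosixPath("/srv/data"), "my file.txt")
print(path.parts)
```

`parts` exposes the sequence form directly, and an input such as `"../etc"` is rejected at the parsing step instead of silently changing where the path points.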
"Path is not a string - it's a sequence of path components and should be treated as such by our APIs."
For maximum correctness, you want to turn it into a file handle as soon as possible, and do all operations through the variations of the file functions that end in "at", like: https://linux.die.net/man/2/openat
The downside of this approach is that you still technically have to carry the path around with you if you ever want to present it back to the user, because once you have a directory handle, you can get back to the root directory easily enough by following parent links and seeing what directories you end up in, but that may not be what the user "thinks" the path is, and they want to see their path, not a canonicalized one. And they're mostly right. And it's not easy to correctly track changes to their intended path from this basis either.
Basically, I don't know of a really solid, 100% correct way to handle this with any reasonable degree of effort.
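For reference, the `openat` style is reachable from Python through the `dir_fd` parameter of `os.open`. A minimal sketch, assuming a POSIX system where `dir_fd` and `os.O_DIRECTORY` are available; the function name is made up.

```python
import os

def read_relative(dir_path: str, name: str) -> bytes:
    """Open the directory once, then open name relative to that handle (openat semantics)."""
    dir_fd = os.open(dir_path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # The lookup of `name` starts at dir_fd, not at whatever dir_path
        # happens to point to by the time this line runs.
        fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
        with os.fdopen(fd, "rb") as handle:
            return handle.read()
    finally:
        os.close(dir_fd)
```

As the comment above notes, the handle says nothing about the path the user typed, so a caller that needs to display it still has to keep the original string alongside.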
> For maximum correctness, you want to turn it into a file handle as soon as possible
That's not right. You want to resolve a file/folder path to a file/folder at the exact point it makes sense.
It's a problem if you're using a path when you wanted the file. The file can be switched/modified out from underneath you.
It's also a problem if you've got the file when you only wanted a reference. Now you can't simply switch/modify the file independent of the reference. E.g., maybe you want config file changes to take effect immediately and transparently.
You can also have the hybrid case, e.g., where you want the folder directly, but have a relative path to a file that is resolved late.
If you're unsure, I'd err on the side of late resolution.
"you want to turn it into a file handle as soon as possible"
But no sooner.
For example, I've run into problems where I'm configuring program A server to talk to file location B... but I don't have access to file location B. But the client-side library for talking to the server tries to convert location B into a file handle and then freaks out because I can't access it. When I don't want to access it. I want that program to serve it.
If it was using simple "path" objects that didn't confirm that I have access to the path, everything would be hunky dory. But because it tried to convert it into a file handle unnecessarily, I get blocked.
Another inconvenience with this approach is that you can keep thousands of paths in memory no problem. But thousands of FDs may cause you to exceed per-process limits.
This goes for most instances of user input. Timestamps are the other common one people get wrong. I've even seen programs that pass around timestamps as strings in multiple formats and as integers (Unix time).
Why not just hold onto both? The user's representation and the file handle. Only ever "display" the representation, while you do all operations on the handle. (Not trying to be sarcastic, just curious).
> I have an uneasy feeling whenever I see a path parameter declared as string. Path is not a string
I guess that depends on what you mean by "string". `open` and `fopen` need a char* path to open a file. Whatever fancy Path abstraction you use eventually becomes a char* string, because that's what the kernel needs.
Strings following certain rules are entirely valid representations of paths, just like sequences of path components in the chosen language/framework are. Similarly, the sequences of bits that make up the sequences of your language/framework in memory are an entirely valid representation of said sequences of components.
Yes, paths have structure, but saying "a path is not a string" is the equivalent of saying "C source code is not a string". Both are strings, and both are something else, represented by strings according to rules. Different internal representations have different advantages and disadvantages. I fully agree that for things such as "adding components" an internal sequence/list representation is better, but strings can pass arbitrary IPC or even ABI boundaries much more easily, for example. (And you wouldn't bat an eye, for example, when you see FQDNs like "www.google.com" passed as a string instead of as ["www","google","com"] because the string representation works pretty well.)
C source code and paths are both representable by strings, true, but the fact that they're not actually strings is still important, because most people don't know that, and in the case of paths that leads to a lot of edge cases (in the case of source code it leads to a bunch of inefficient and weak tooling, which isn't quite as bad).
Because neither are strings, their native representation shouldn't be such - it should be something structured, and only when necessary (IPC, FFI, serdes) be serialized into a string representation. This would save people a lot of time and effort.
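POSIX filenames may contain any byte except 0x2F (/) and 0x00 (NUL); the much narrower "portable filename character set" (letters, digits, period, underscore, hyphen) is only a portability recommendation. That means file names can include line feeds, backspaces, EOF, etc.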
"This is `a
perfectly vali'd.\010! file name\377, despite the weirdness"
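A small Python sketch of why such names hurt line-oriented tooling. It assumes a POSIX file system that accepts newlines in names; the function names are made up, and `names_via_lines` only simulates parsing the output of a tool that prints one name per line.

```python
import os

def names_via_api(directory: str) -> list:
    """Ask the OS for the directory entries; each name arrives as one intact string."""
    return sorted(os.listdir(directory))

def names_via_lines(directory: str) -> list:
    """Simulate parsing line-oriented output: join names with newlines, then split again."""
    listing = "\n".join(sorted(os.listdir(directory)))
    return listing.split("\n")
```

For a directory holding a single file named `"a\nb.txt"`, the API version reports one entry while the line-based version reports two, neither of which exists.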
things like this are why the Unix philosophy is so bad.
text processing is hard if you must support Unicode, and that means every Unix command line tool must implement or employ a text processor to handle input. it would be much easier if objects were passed back and forth. PowerShell got this right.
Everything seems to be going this way in Linux land. Longer names, harder to type names, camelcase names, spaces... I'm looking forward to an OS that treats command line ergonomics as a first class feature and where camelcase & spaces are verboten.
Well, if you think that's bad, behold the recent trend in network interface names on Linux.
We started out with 'eth0', 'eth1', etc. Which adapter was which could change when adding and removing a network card. That was bad, so that prompted the evolution.
Now we have 'enp1s0', 'enp0s31f6', 'enp13s0' and many similar variations. These are supposedly more stable across device changes. As it turns out, they weren't.
But wait, there is more! Now we have the "predictable names" scheme that produces interface names that are even longer, and not even slightly easier to remember.
I do get that it is not an easy problem to solve, especially in the face of removable network interfaces (like USB Ethernet / WLAN). But surely this is not the best we can do.
I was actually ranting about this on IRC last night (yeah now my laptop has two enp* interfaces and enx[MAC])..
One thing I like about OpenBSD is that buses are scanned and drivers probe in order and there's no race between drivers coming up. Unless your hardware is physically tampered with or broken, all interfaces come up with the same name across reboots. Linux isn't like that (even if you don't touch your hardware, interfaces could swap across reboots), so you need to do something about it.
As is typical on Linux, the default is unergonomic and if you want something nice, you're on your own to make it so.
If you already have userspace daemons responsible for device insertion and naming, it really wouldn't have been so hard for them to e.g. automatically add a config file / database entry for each interface the first time it is seen. So the devices that came up as eth0 and eth1 are still eth0 and eth1 on the next boot; if I unplug eth0 and add a new card, the new one would be eth2 because eth0 is still reserved for the first card I had.
I find this attitude misguided. More descriptive names are more ergonomic for things you only use rarely but they need to be combined with much better autocompletion than most shells provide by default.
You state that as if that were objective.. but that's not my subjective experience at all. Somehow I have a hard time remembering these long names, (is it --conf or --config or --config-file or --config-path? -c would've done it for me. --set or --set-prop or --set-property or --prop or --property?), and I need to look them up in a man page anyway, and I make more typos typing them, and shell completion rarely works well if at all. I also find it harder to read and edit long lines that wrap.
Somehow these short letters stick much better for me, and the effort for finding them in the manual is the same, although in case of extra complexity as with xinput, it's even worse with the long names. I don't use either command often, but it's hard to forget xset m. The only thing I remember about xinput is that it's a horribly long litany of things which I need to look up every time, and the syntax still feels weird.
Spaces don't make anything more descriptive, they just cause completely unnecessary quoting and escaping hassle.
The amount of time that has been wasted by Windows using "C:\Program Files" instead of "C:\Program_Files" far outweighs any highly questionable aesthetic benefit IMO.
Short option for interactive terminal. Long option in automation.
I’ll be damned if I have to remember or lookup what -n means to some obscure program, when reading someone else’s script. Exception given for super common tools where everybody knows like ls -la.
With the disclaimer that shell scripts, especially ls, aren’t exactly suitable for reliable automation in the first place.
Cue nmcli (CLI for GNOME's NetworkManager) which uses UUIDs for everything and (at least a while ago) did not accept partial-but-unique UUIDs. Basically goes "nmcli connection up 5095665a-d82c-4ae6-8964-283623387941".
Weird, I haven't had to do this. Most(/all?) connections have nice names you can see with `nmcli c`... and so I can do `nmcli c up id DroidNet` and that's pretty dang nice. Pretty sure this worked with Ubuntu 14.04 (though, nmcli has gotten much more featureful since then)
(The ability to shorthand connection->c and similar is great, too; obviously not unique to nmcli)
Sure. One could also make "move-down-one-line" be the incantation to move the cursor down a line in vi, but I prefer j.
Ergonomics isn't all about making everything self-descriptive for someone seeing the thing for the first time. It's about making things comfortable to actually use. If it's so long and complicated that you can't even remember how to do it, it's not very comfortable to use. Even if I could remember, xset m 0 0 is still far more comfortable.
And fwiw you still don't know what 0, 1 in accel profile do; you need to look that up or take a wild guess, and if you want to use that command, you'll also have to know how to look up the device because chances are yours is not the same as mine. So it's not any less magical in the end, just more verbose.
The "cool" thing about the xinput command is that you don't even find accel profile in the man page. You gotta look elsewhere if you want to understand what it is and what it does and what the parameters are.
But which case should software interfaces optimize for? Ergonomics of someone who uses a tool frequently, or interpretability for casual by-standers of some out-of-context shell command?
That may be a part of the problem but honestly I don't feel like all these new crazy interfaces are easy to learn either. I mean, how do you come up with the litany that xinput calls for? You need to understand the syntax for specifying a device. You need to know that you're to set a libinput property, and you need to know the name of that property, and it's not documented in the xinput man page, and of course you need to know the values to pass, which again are not documented in the xinput man page. You can play with --list-props and then take your search elsewhere because it is completely opaque and doesn't explain what the properties actually do.
I suspect the number of people who figured all that out without having to find it by googling / arch wiki / whatever is very very low.
Now I'm not gonna say xset is the easiest interface to figure out, but the syntax for setting mouse acceleration is right there in the synopsis, and if you search down the man page, you'll learn a little more (and also if you just run xset without arguments, it'll tell you how to set mouse acceleration). It might not be the best designed tool but it's something I learned back in the day as a teenager just by looking at the man page.
I think the real issue is that people nowadays are designing these interfaces to be consumed by interactive configuration tools, GUI apps, and desktop environments; they're more dynamic, more complex, more flexible, but not easier to figure out, not for you on the command line. The command line is just a last resort. Second class citizen if you will.
On some level it makes sense. The problem with the command line is familiarity.
How often do you reach for iptables? If you're like myself, and most home/desktop users, then probably once in a blue moon to set it up and then you leave it alone. But a system admin? Maybe they touch it a few times a week or month. Every time I use iptables I have to relearn how Linux networking works.
Similarly, the xset/xinput thing. When I need those tools I just create a script or throw it in .bashrc. I adjust the settings once and will not touch them again for a couple years. It makes sense to have long parameters that are readable. I can look at my .bashrc and see exactly what device is getting adjusted.
Long option names are more descriptive, more easily distinguished, and easier to remember. Your shell should be intelligent enough to provide tab completion for option names, assuming it is configured to.
Long option names are more difficult to remember because a long option name can be spelled multiple ways and it is difficult to remember which spelling is correct.
IMO, PowerShell got it right. Yeah, its syntax is strange, but it has standard flag usage with proper autocomplete, and you can shorten any flag the way you want (e.g. fuzzy match) if it is unambiguous.
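Unambiguous shortening of long flags is not unique to PowerShell. As a rough illustration, Python's argparse accepts any unambiguous prefix of a long option by default. The tool name and the two options below are made up for this sketch.

```python
import argparse

# Hypothetical tool with two long options (names invented for illustration).
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("--dry-run", action="store_true")
parser.add_argument("--device", default=None)

# "--dr" and "--dev" each match exactly one option, so argparse expands them.
# A bare "--d" would be rejected as ambiguous.
args = parser.parse_args(["--dr", "--dev", "mouse0"])
print(args.dry_run, args.device)
```

So the long name stays in scripts for readability, while interactive use can get away with a few letters.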
These changes are meant to make it easier to read and understand command-line incantations (and to make them more explicit, which is always good), because the command-line paradigm, being text-based, imposes an unavoidable trade-off between ergonomics and understandability/ease-of-use. It sounds like you prefer ergonomics - although I wouldn't be surprised if most users would prefer ease-of-use.
Of course, if one doesn't write a CLI to begin with, this trade-off doesn't exist - you can have your cake and eat it too.
Needlessly long parameter/command names and the bizarre insistence on capital letters are the #1 and #2 reasons I detest PowerShell. Like GP, I resent that Linux tools are moving in that direction.
I am also that age, and kebab-case is the best case for filenames.
2021-01-01-some-important-document.pdf gives me the warm fuzzies. On the off chance that some more differentiation is needed, throw in an underscore and a whole new world opens up
I see and applaud your use of the underscore there, but I must reject the premise!
work/client/project/2021-11-11-file.ext is more or less how I lay stuff out. I’d say client/project is a folder level distinction (arguably dates too).
[EDIT] Realistically most of the stuff under <project> is git repos and I usually make a “home” repo where I keep org files for tracking hours, notes, and resources related to the engagement.
> But underscore separates fields, hyphens for space replacement
But why not the other way, hyphen-minus for separating fields and underscore for space replacement? That seems to me more consistent with how underscores and dashes are used.
I've recently shifted sharply toward the dash from the underscore. I find it more readable, and it doesn't require the shift key. However, I do find it useful to use underscores to create groups, e.g. test-001_2021-10-11.log. Including hours, minutes, seconds is still awkward.
There's a customer for everything. I've just never liked the aesthetics of the underscore. Also if your underscored thing gets put in some document and then underlined the underscores can become invisible.
If you hadn't heard kebab-case called that before there's a chance you haven't heard SCREAMING_SNAKE_CASE called that before, and I couldn't live with myself if I didn't let you know.
I'm of the opinion that kebab-case is the best case for all identifiers, because it's easy to read and to type. As always, Lispers were right all along.
Kebab case is the often overlooked benefit of prefix notation and semantic white space in programming languages. Honestly the best case of all cases imo.
I found that some_document_2021-01-01_v03.pdf works best because it keeps the same document next to its other versions alphabetically, keeps them in date order, and keeps them in a sub-day version order.
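A quick sketch of why that layout sorts well, assuming ISO dates and zero-padded version numbers. The file names are invented for illustration.

```python
# Hypothetical file names following the name_date_version pattern.
names = [
    "some_document_2021-03-15_v01.pdf",
    "other_report_2021-01-01_v01.pdf",
    "some_document_2021-01-01_v10.pdf",
    "some_document_2021-01-01_v03.pdf",
]

# Plain lexicographic sorting groups by document, then date, then version,
# but only because the date is ISO 8601 and the version is zero-padded.
ordered = sorted(names)
for name in ordered:
    print(name)
```

Without the zero padding, v10 would sort before v3, so the padding is doing real work here.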
As a side note, in the good ol' times of ISO9660 level 1-4 and the various mkisofs parameters, an underscore _ which is a CAPITAL -, may have given issues, only for the record/as a curiosity:
One of the main reasons why Windows used "Program Files" and "Documents and Settings" was to force the programs (and programmers) to deal with paths with spaces. And you know, for the most part it kinda, more or less worked out although of course even today you will find programs that ask you to install them in a folder without spaces in the path.
And that was a good idea, if only Microsoft also fixed the CreateProcess function, Windows would be somewhat sane in this regard. But somehow nobody seemed to think of it. Seriously, look at it:
The arguments are a single string. So you want to pass parameters with spaces in them? You've got to add quotes and stuff all of that into a single string. Instead of doing it in a more sane manner, like oh, the arguments to main().
The root cause is that argv isn't a first-class citizen like on linux, but an abstraction. The kernel only cares about a single string argument. If you use main instead of WinMain, the CRT will transform the single string into an argv for you.
Oh and cmd.exe uses a different escaping scheme than the CRT.
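To make the "stuff it all into a single string" step concrete: CPython's subprocess module carries a helper, list2cmdline, that it uses on Windows to turn an argument list into one command line following the Microsoft C runtime quoting rules. This is only a sketch of what the quoting looks like, and the arguments are made up.

```python
import subprocess

# Hypothetical argument list: one argument with a space, one with embedded quotes.
argv = ["notepad.exe", "My file.txt", 'say "hi"']

# Build the single command-line string a CreateProcess-style API expects.
# Arguments containing whitespace get wrapped in double quotes, and embedded
# double quotes are backslash-escaped so the CRT can split them back apart.
command_line = subprocess.list2cmdline(argv)
print(command_line)
```

Note that this only covers the CRT convention. A program that parses its command line differently, or a trip through cmd.exe, needs different escaping.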
The main culprit for space issues is stuff relying on BAT or CMD files, where escaping variables seems to be a black art.
Sadly such set includes loads of Java programs. If only SUN had shipped a standard way to generate isolated exe files in 1998... but they worked under the presumption that you'd have a JVM already there, because distributing that monster was difficult in dialup times, so you could just hand people a jar; and the enterprise market did not care, since they had webapp servers. Sadly it's an "optimization" that became obsolete very quickly but wasn't rectified until it was too late (java 9+).
> The main culprit for space issues is stuff relying on BAT or CMD files, where escaping variables seems to be a black art.
Actually it isn't, just use double quotes and add a '~'. It's just about the only thing batch files handle better than shell scripts.
set "VARIABLE=%~PATH"
That annoys me every time I use a Windows system. It was a terrible decision, especially since neither the command prompt nor the new PowerShell accepts, as bash does, a backslash before a space; you have to quote the whole path! I get that most users on Windows don't use the shell, but as a developer I do a lot, and every time it's a pain (no wonder they added the WSL in Windows after the failure of PowerShell...)
Why would they accept a backslash? Backslash is a path separator on Windows. In most Windows programs, you don't even need to escape the space - arguments can contain spaces and it will understand it, like `notepad My file.txt`
The escape character on PowerShell is backtick, and on cmd it is caret. You don't need to quote everything.
Yes, I was doing code to quickly read FAT folders (on a micro controller) and got to the bit about filenames more than 8.3. I decided my life was too short (and processing time) to go and sort out what the "real" file name is. Enforced 8.3 as a requirement!
They may have thought that would happen but I saw just as much stuff end up in C:\Windows or \Users or (always my favorite) those “Documents” that are really just “whatever random crap every app wants to put there”.
> Microsoft intentionally made programs install to C:\Program Files on Windows 95+ to force programmers to deal with spaces in filenames.
Not least, it makes writing scripts for various shells and getting the quoting rules right an absolute pain as well...
https://docs.microsoft.com/en-gb/archive/blogs/twistylittlep...
In *nix at least you can call execve or other APIs that take a char *argv[] and the whole problem is largely solved and you don't need to quote things.
> C:\P̷̧̽r̸̬͘ŏ̵̮g̷̜͘r̸̦̋a̴͎̒m̶̲̈́ ̷̠̉F̵͇̈ĩ̴̫l̶̨͗ë̵̦s̸͚͆\
Easy fix!
A former co-worker changed his name in our auth system to include an apostrophe, so that whenever we handled names wrong he'd find it.
it keeps everybody on their toes lol.
If you consider spaces “unusual” I would say you haven’t encountered a single average user in your lifetime. Spaces in file-names is the single most common thing people have, outside programming environments.
As an x-plat developer, the only platforms where I (still) regularly encounter this kind of bug are those where solving problems through scripting is common, like Linux, where the primary means of operation is stringly-typed statements getting parsed and processed in an untyped fashion. It's not very reliable.
On Windows people more often use “real APIs” (because scripting doesn't really work as well), but then these problems just go away.
Pros and cons, I guess.
thisismyconfig.txt vs this is my config.txt or this_is_my_config.txt
...i've forced myself to stop using spaces, character, and even cap. They are all constructs that provide minimal value for the extra complexity.
Just wondering, what is the readability of this for people who are dyslexic?
A file system has a human interface and a computer interface. Don’t mix them. Let users give file names in whichever way they please.
could you please explain what you mean by that?
I changed my username to not contain a space because it was too annoying to deal with all the random dev tools breaking. The worst offender was probably npx on Windows [1] (resolved after four years by deprecating npx), but it was far from the only one (though the JS ecosystem was somehow the worst in this regard of all languages I worked with).
1: https://github.com/zkat/npx/issues/100
Saw a few hacks where malware authors used the RTL feature (which is baked into Windows) to obfuscate file extensions. It looked like .exe.innocuous-document.docx, but was actually .docx.innocuous-document.exe
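A minimal sketch of how one might flag such names, assuming the trick relies on Unicode bidirectional control characters such as U+202E (RIGHT-TO-LEFT OVERRIDE). The function name and the sample file name are hypothetical.

```python
import unicodedata

# Bidirectional classes of the explicit embedding/override/isolate controls.
BIDI_CONTROLS = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

def has_bidi_control(name: str) -> bool:
    """Return True if the name contains a character that can reorder its display."""
    return any(unicodedata.bidirectional(ch) in BIDI_CONTROLS for ch in name)

# Hypothetical spoofed name: everything after U+202E is displayed reversed,
# so a viewer may show it ending in "docx" while the real extension is ".exe".
spoofed = "innocuous-document\u202excod.exe"

print(has_bidi_control(spoofed))
print(has_bidi_control("innocuous-document.docx"))
```

Whether to reject, strip, or merely warn about such names is a policy question this sketch does not answer.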
League of legends doesn’t run until I sed files for instance.
Sure enough, they had SomeFile and were importing Somefile and it works fine on Mac but not on Linux (which, of course, is what our production servers use). It amazes me that "works fine on my machine" is still a thing when I definitely worked at companies that solved this back in the 2000s. It was solved. It was done. Then devs became enamored with running everything locally. Even dozens of microservices or databases. Even though JS is fairly isolated, you still have NPM packages that need built against the local OS and C/C++ library and compilers, etc. Which also has caused issues in the past.
When I got a new Mac, I just gave up and acquiesced to the case-retentive world :-(
I'm begging software developers to stop using subprocess APIs that take a string argument (system(), child_process.exec(), Process.Start(string)) and start using subprocess APIs that take an array of arguments (execvp(), child_process.execFile(), Process.Start(string, IEnumerable<string>).)
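For the Python crowd, the same advice is "pass a list to subprocess.run and leave shell=False". A small sketch, where the child is just the current interpreter echoing its first argument and the tricky file name is invented:

```python
import subprocess
import sys

# Hypothetical file name with a space and a shell metacharacter in it.
tricky = "my file; echo oops.txt"

# The child program: print the first argument exactly as received.
child = "import sys; print(sys.argv[1])"

# Each list element becomes exactly one argv entry in the child.
# No shell is involved, so nothing needs quoting or escaping here.
result = subprocess.run(
    [sys.executable, "-c", child, tricky],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.rstrip("\n")
print(received == tricky)
```

The string-taking variants would need the name quoted correctly for whichever shell ends up parsing it, which is exactly where these bugs come from.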
A small extra step but something you get used to if you spend a lot of time in the cli.
I never use / on Windows as a result.
These days using CMD instead of PowerShell should be rare enough and PowerShell certainly doesn't mind the slashes.
The problem with that is that YOUR code may handle it, but your tooling may not. If my code formatter breaks on spaces, I'm not going to change the formatter.
Inject a Turkish 'I'. I don't know how to type or paste it here, but picture an English lower case 'i' that is upper case. It is a splendid way among many to shake out some loc bugs.
From https://en.wikipedia.org/wiki/%C4%B0
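A small sketch of why that character shakes out bugs, using Python's locale-independent str methods. Other runtimes that apply Turkish locale rules behave differently again (there, lowercasing "I" gives the dotless "ı"), which is not shown here.

```python
dotted_capital = "\u0130"  # İ, LATIN CAPITAL LETTER I WITH DOT ABOVE
dotless_small = "\u0131"   # ı, LATIN SMALL LETTER DOTLESS I

# Lowercasing İ yields two code points: "i" plus COMBINING DOT ABOVE (U+0307),
# so the string changes length and no longer equals a plain "i".
lowered = dotted_capital.lower()
print(len(lowered), [hex(ord(ch)) for ch in lowered])

# Uppercasing the dotless ı gives a plain ASCII "I", so a round trip through
# upper() and lower() does not return the original character.
round_trip = dotless_small.upper().lower()
print(round_trip == dotless_small)
```

Any code that compares file names or identifiers by case-folding them first is a candidate for this kind of surprise.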
Mysterious character requirements that do not conform with Microsoft’s OS limits, limits on the fully qualified pathname length, etc.
Naming freedom needs a stdlib module
Enforce this in LDAP.
Strict convention is better than flexibility and predicting obscure edge cases that can fail.
Non-ASCII paths are extremely common (e.g. the user's home directory on Windows, for the large majority of users outside the English-speaking world) and spaces, punctuation and weirder characters will definitely happen when you least expect it.
Yes if you can avoid it then absolutely that's great, but I don't think most people can.
It's also not usually very difficult to deal with, as long as you actually spot the issue in the first place.
...and only hire people from the exact same background as you, who will never have unusual characters or accents in their name. And also make sure not to have any users who aren't exactly like you, and conform to this very narrow requirement. Surely, excluding 90% of the world won't hurt revenue in any way.
But then your program will crash hard and unexpectedly when a user decides to save under "~/house plans" or ~/Téléchargements.
I think it's better to exercise this in CI, that's what CI is for.
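One way to do that without renaming anything by hand is to have the test suite create its own awkwardly named workspace. A minimal sketch, assuming a hypothetical helper called awkward_workspace; the prefix is arbitrary, and non-ASCII characters could be added where the CI filesystem allows it.

```python
import os
import tempfile

def awkward_workspace():
    """Hypothetical helper: a temp directory whose name has spaces and punctuation."""
    return tempfile.TemporaryDirectory(prefix="ci work (x86) & it's ")

with awkward_workspace() as workdir:
    # Anything under test that touches files now sees a path with spaces in it.
    target = os.path.join(workdir, "house plans.txt")
    with open(target, "w", encoding="utf-8") as fh:
        fh.write("ok")
    with open(target, encoding="utf-8") as fh:
        content = fh.read()
    had_space = " " in workdir

print(content, had_space)
```

This keeps developer machines untouched while every CI run still exercises the awkward path.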
Spaces are very useful for readability.
This will also break any code in external tools that are called during the builds of your application and do not handle spaces correctly for whatever reason, thus making it so that you won't be able to successfully finish the build.
Then again, you probably shouldn't be relying on technologies like that, but when you're struggling to keep an old enterprise system alive, causing yourself more problems is not necessarily what you should do.
Still a good idea in most cases, though.
In addition to CLI I use it from emacs dired-mode too:
I bind it to "_" in dired-mode.
By the way: what's your beef with en dashes? I mean, if it was "everything should be 'HYPHEN-MINUS' (U+002D)", then fine, but why specifically en dashes and not em dashes?
Of all the changes in that list, removing the character that doesn't appear on a standard keyboard seems like the least controversial...
I've thrown some edge cases at it, and it handles it super well. It deals with consecutive "_", remove leading garbage, normalize unicode, and even prevents naming conflicts by opting out early.
Thank you.
https://github.com/dharple/detox
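For anyone curious what this kind of cleanup involves, here is a rough sketch. It is not detox's algorithm, just an illustration of the steps mentioned above (normalize Unicode, collapse consecutive "_", drop leading garbage); the function name and the allowed character set are assumptions.

```python
import re
import unicodedata

def sanitize(name: str) -> str:
    """Illustrative file name cleanup; not how detox itself works."""
    # Decompose accented letters, then drop whatever is not ASCII.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace each run of disallowed characters with a single underscore.
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", ascii_only)
    # Collapse consecutive underscores and strip leading separators.
    cleaned = re.sub(r"_+", "_", cleaned)
    return cleaned.lstrip("_-") or "_"

print(sanitize("  L'impératrice  live.mp3"))
```

This sketch does nothing about naming conflicts, so two different inputs can still map to the same output.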
I have:
But also: Would not like to lose files like the srt.
(Also, that entire function is super inefficient and could be replaced with a single invocation of "rename".)
You should better be very afraid of using spaces in filenames.
You should do everything you can to support them but you have to know you'll invariably encounter countless cases where you'll have this or that tool that won't work properly with them.
I still live in a world where I cannot name a song from the French group L'impératrice with an eacute in the filename or my car's media system will display garbage (it's running QNX and I don't know which filesystem).
FWIW, and it should be food for thought, every single Git repository in the world contains a pre-commit hook sample (disabled by default but it's there) that enforces that every committed file in the repo is named using a subset of ASCII characters.
Every Git repository in the world has that example: let that sink in.
I use Git for documents too, not only code. Why shouldn't I use my native language?
I have an Android phone and I tell MusicBrainz Picard to save all files with ASCII-only names and Windows-compatible names for the ones that get sent over to the phone. Basically for this reason. Sometimes it's players on Android itself, but even more frequently, whatever bluetooth radio I'm connected to freaking out with non-ASCII characters.
L'imp?ratrice? L'imp�ratrice? L'impératrice? L'imp‚ratrice? L'impÚratrice?
[https://stackoverflow.com/questions/6579844/how-does-zalgo-t...]
The characters reach up off the screen as I reply to this. They overlay the comment above you. Amazing. How?
And "path component" is not an arbitrary string either - e.g. appending a path component to the path should first require converting/parsing the string into the path component, and only if that's successful appending it to the path.
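A minimal sketch of that idea, assuming POSIX-style rules (no "/" and no NUL inside a component, and "." and ".." rejected). The class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PathComponent:
    """A single validated path segment (POSIX-style rules assumed)."""
    name: str

    @classmethod
    def parse(cls, raw: str) -> "PathComponent":
        if raw in ("", ".", ".."):
            raise ValueError(f"not a plain component: {raw!r}")
        if "/" in raw or "\x00" in raw:
            raise ValueError(f"separator or NUL in component: {raw!r}")
        return cls(raw)

def append_component(path: tuple, raw: str) -> tuple:
    """Parse first; only a successfully parsed component gets appended."""
    return path + (PathComponent.parse(raw),)

path = append_component((), "house plans")
print([component.name for component in path])
```

With this shape, something like "../etc" is rejected at the parsing step instead of silently becoming part of the path.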
For maximum correctness, you want to turn it into a file handle as soon as possible, and do all operations through the variations of the file functions that end in "at", like: https://linux.die.net/man/2/openat
The downside of this approach is that you still technically have to carry the path around with you if you ever want to present it back to the user, because once you have a directory handle, you can get back to the root directory easily enough by following parent links and seeing what directories you end up in, but that may not be what the user "thinks" the path is, and they want to see their path, not a canonicalized one. And they're mostly right. And it's not easy to correctly track changes to their intended path from this basis either.
Basically, I don't know of a really solid, 100% correct way to handle this with any reasonable degree of effort.
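For reference, a sketch of the handle-first style from Python, where the dir_fd parameter of os.open maps onto openat on platforms that support it (Linux and macOS, not Windows). The file name is made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Resolve the directory path once and keep a handle to it.
    dir_fd = os.open(root, os.O_RDONLY)
    try:
        # Later operations are relative to the handle, not to a path string,
        # so renaming or replacing the directory path does not redirect them.
        fd = os.open("notes.txt", os.O_WRONLY | os.O_CREAT | os.O_EXCL,
                     0o600, dir_fd=dir_fd)
        with os.fdopen(fd, "w") as fh:
            fh.write("hello")

        fd = os.open("notes.txt", os.O_RDONLY, dir_fd=dir_fd)
        with os.fdopen(fd) as fh:
            content = fh.read()
    finally:
        os.close(dir_fd)

print(content)
```

As noted above, this does nothing for the presentation problem: the handle does not remember what path the user typed.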
That's not right. You want to resolve a file/folder path to a file/folder at the exact point it makes sense.
It's a problem if you're using a path when you wanted the file. The file can be switched/modified out from underneath you.
It's also a problem if you've got the file when you only wanted a reference. Now you can't simply switch/modify the file independent of the reference. E.g., maybe you want config file changes to take effect immediately and transparently.
You can also have the hybrid case, e.g., where you want the folder directly, but have a relative path to a file that is resolved late.
If you're unsure, I'd err on the side of late resolution.
But no sooner.
For example, I've run into problems where I'm configuring program A server to talk to file location B... but I don't have access to file location B. But the client-side library for talking to the server tries to convert location B into a file handle and then freaks out because I can't access it. When I don't want to access it. I want that program to serve it.
If it was using simple "path" objects that didn't confirm that I have access to the path, everything would be hunky dory. But because it tried to convert it into a file handle unnecessarily, I get blocked.
This is why I get stressed out when I see paths turned into special objects encoding separators and such.
It tells me the path is living for way too long compared to the file handle.
I only want to see path-specific objects if we're modifying the path, and even then I want that to happen as late as possible.
I guess that depends on what you mean by "string". `open` and `fopen` need a char* path to open a file. Whatever fancy Path abstraction you use eventually becomes a char* string, because that's what the kernel needs.
Yes, paths have structure, but saying "a path is not a string" is equivalent of saying "C source code is not a string". Both are strings, and both are something else, represented by strings according to rules. Different internal representations have different advantages and disadvantages. I fully agree that for things such as "adding components" an internal sequence/list representation is better, but strings can pass arbitrary IPC or even ABI boundaries much easier for example. (And you wouldn't bat an eye for example when you see FQDNs like "www.google.com" passed as a string instead of as ["www","google","com"] because the string representation works pretty well.)
Because neither are strings, their native representation shouldn't be such - it should be something structured, and only when necessary (IPC, FFI, serdes) be serialized into a string representation. This would save people a lot of time and effort.
"This is `a
perfectly vali'd.\010! file name\377, despite the weirdness"
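A sketch of what it takes to carry a name like that through a program on a POSIX system, where file names are bytes. Python's os.fsdecode and os.fsencode use the surrogateescape error handler there, so undecodable bytes survive a round trip. The byte string below mirrors the name above (newline, octal 010, octal 377).

```python
import os

# The name as raw bytes: contains a newline, a backspace (0x08) and 0xFF,
# the last of which is not valid UTF-8.
raw = b"This is `a\nperfectly vali'd.\x08! file name\xff, despite the weirdness"

# Decode to str for internal use, then encode back for the OS.
as_text = os.fsdecode(raw)
round_tripped = os.fsencode(as_text)

print(round_tripped == raw)
```

The decoded str is fine for passing back to file APIs, but printing or logging it may still fail or look odd, since the escaped bytes are not real characters.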
text processing is hard if you must support Unicode, and that means every Unix command line tool must implement or employ a text processor to handle input. it would be much easier if objects were passed back and forth. PowerShell got this right.
We started out with 'eth0', 'eth1', etc. Which adapter was which could change when adding and removing a network card. That was bad, so that prompted the evolution.
Now we have 'enp1s0', 'enp0s31f6', 'enp13s0' and many similar variations. These are supposedly more stable across device changes. As it turns out, they weren't.
But wait, there is more! Now we have the "predictable names" scheme that produces interface names that are even longer, and not even slightly easier to remember.
Read about the whole sorry saga here:
https://wiki.debian.org/NetworkInterfaceName
I do get that it is not an easy problem to solve, especially in the face of removable network interfaces (like USB Ethernet / WLAN). But surely this is not the best we can do.
One thing I like about OpenBSD is that buses are scanned and drivers probe in order and there's no race between drivers coming up. Unless your hardware is physically tampered with or broken, all interfaces come up with the same name across reboots. Linux isn't like that (even if you don't touch your hardware, interfaces could swap across reboots), so you need to do something about it.
As is typical on Linux, the default is unergonomic and if you want something nice, you're on your own to make it so.
If you already have userspace daemons responsible for device insertion and naming, it really wouldn't have been so hard for it to e.g. automatically add a config file / database entry for each interface the first time it is seen. So the devices that came up as eth0 and eth1 are still eth0 and eth1 on the next boot; if I unplug eth0 and add a new card, the new one would be eth2 because eth0 is still reserved for the first card I had.
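The reservation scheme described here fits in a few lines. This is only a sketch of the idea, assuming interfaces are identified by MAC address and that the registry is persisted somewhere between boots; the function name and the addresses are made up.

```python
def assign_name(registry: dict, mac: str) -> str:
    """Return the reserved name for this MAC, reserving a new one on first sight."""
    if mac not in registry:
        # Names are never released, so the next free index is the registry size.
        registry[mac] = f"eth{len(registry)}"
    return registry[mac]

# Hypothetical persisted registry (would be a config file or small database).
registry = {}

first = assign_name(registry, "aa:aa:aa:aa:aa:01")   # first card seen
second = assign_name(registry, "aa:aa:aa:aa:aa:02")  # second card seen
# The first card is unplugged and a new one is added: it gets a fresh name.
third = assign_name(registry, "aa:aa:aa:aa:aa:03")
# Plugging the first card back in gives it its old name again.
again = assign_name(registry, "aa:aa:aa:aa:aa:01")

print(first, second, third, again)
```

The obvious trade-off is that the registry only ever grows, and cloned or randomized MAC addresses would need some other identifier.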
https://wiki.debian.org/NetworkInterfaceNames
No. These are stable across reboots. The old eth? weren't. And yes, that had been a PITA.
Somehow these short letters stick much better for me, and the effort for finding them in the manual is the same, although in case of extra complexity as with xinput, it's even worse with the long names. I don't use either command often, but it's hard to forget xset m. The only thing I remember about xinput is that it's a horribly long litany of things which I need to look up every time, and the syntax still feels weird.
The first one is some magical incantation.
xset m? Yes, that is documented in the man page.
On the first, you think you know what it does, but you're not sure. So maybe it gets looked up.
On the second, you know you don't know what it does. So you know to look it up.
Personally, I'll take the second. Assumptions during debugging are dangerous things.
This is a lesson I had to learn the hard way, multiple times.
They are so easy to remember that you need to configure your shell to remember them for you?
Wait, are you saying that I need to change my shell or config to make up for another tool's poor design?
No, thanks.
Programs like Powershell eschew ease of use in CLI for readability in scripts.
If you had comma-delimiting like in an algol-derived language, you wouldn't need to quote things with spaces.
edit: also, code is read more times than it is written, so optimizing for readability over brevity is generally a good move.
Ironically, even NASA doesn't like space.
https://www.nas.nasa.gov/hecc/support/kb/portable-file-names...
2021-11-11_client_projectName.ext is also OK. But underscore separates fields, hyphens for space replacement.
I hadn't heard that before and I love it.
https://developers.google.com/style/word-list#letter-k
https://web.archive.org/web/20151007005513/http://www.911cd....
P.S. should anyone want to see/run the actual batch, a copy has been uploaded here:
http://reboot.pro/index.php?showtopic=18962&page=29#entry204...
Yes on the date format.
Saves you so much time.
211111_foobar_v1.txt
I am old enough that I still save before printing. I think it was Lotus 123 that engrained it for me.
2021-01-01_what-happened_who-did-it_possible-reason
https://docs.microsoft.com/en-us/windows/win32/api/processth...