I'm “still afraid to use spaces in file names” years old

I work on a complex desktop application, and it's been astounding the number of bugs that have appeared over the years triggered by spaces and other unusual characters in file names. If you do anything with subprocesses or path processing, it's absurdly easy to hit in a thousand different ways, over and over again.

Pro tip: rename your development directory (or even better: the workspace path in CI) to put a space and/or special characters in it.

Forces you to deal with this properly, and immediately ensures that every automated test checks this case without you having to remember every time. Hasn't been particularly inconvenient, since I'm autocompleting it 99% of the time anyway, and I haven't shipped a single path parsing bug since.
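
A rough sketch of what this can look like in a test suite, in Python. The directory name and the helper names are made up for illustration; the point is only that the tests run from a path full of awkward characters and that paths reach child processes as separate argv elements.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical directory name with a space, parentheses, an ampersand and an apostrophe.
AWKWARD_NAME = "my proj (x86) & it's"


def make_awkward_workspace(base: str) -> Path:
    """Create a workspace directory whose name contains awkward characters."""
    workspace = Path(base) / AWKWARD_NAME
    workspace.mkdir(parents=True, exist_ok=True)
    return workspace


def size_via_subprocess(path: Path) -> int:
    """Ask a child Python process for the size of the file at `path`.

    The path travels as its own argv element, so nothing has to be quoted.
    """
    child_code = "import os, sys; print(os.path.getsize(sys.argv[1]))"
    result = subprocess.run(
        [sys.executable, "-c", child_code, str(path)],
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        data_file = make_awkward_workspace(base) / "input file.txt"
        data_file.write_bytes(b"hello")
        print(size_via_subprocess(data_file))
```

Non-ASCII characters could be added to the name as well; they are left out here only to keep the sketch independent of the machine's locale settings.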

alpaca128 · 4 years ago

Seems like MS had the same idea according to an answer in the link:

> Microsoft intentionally made programs install to C:\Program Files on Windows 95+ to force programmers to deal with spaces in filenames.

ealexhudson · 4 years ago

I wish they did "User Files" instead of "Users" too, because so much software breaks on the home area having a space in it.

Not least, it makes writing scripts for various shells and getting the quoting rules right an absolute pain as well...

323 · 4 years ago

Laughs in C:\PROGRA~1\ (try it, still works in Windows 10)

lifthrasiir · 4 years ago

And yet they introduced C:\ProgramData in later versions.

Matthias247 · 4 years ago

It not only keeps people on their toes due to the whitespace. The folder name is even localized. E.g. with German settings there is C:\Programme and C:\Programme (x86).

hetspookjee · 4 years ago

I wonder how much global work could have been saved if Microsoft also provided a covered interface for all paths in the system. Not sure if there is any, but one good implementation might save thousands of poor implementations required to handle it.

dan-robertson · 4 years ago

On the other hand their case sensitivity behaviour means that “cross-platform” Java applications can break if they are run on a non-windows platform where opening files is case sensitive (unlike on windows)

lamontcg · 4 years ago

Then they made poor APIs so that you have to do this to get it correct:

https://docs.microsoft.com/en-gb/archive/blogs/twistylittlep...

In *nix at least you can call execve or other APIs that take a char *argv[] and the whole problem is largely solved and you don't need to quote things.

anarazel · 4 years ago

I just wish they had a decent way to execute programs with arguments that might include spaces. But no, every program can do argument delineation differently.

drdeca · 4 years ago

I know that at least like, idk like 3-5 years ago, when I had gotten a new windows laptop (windows 7 or 8 I think), setting the main account to have the name "" (without the quotes), caused some problems with the basic functioning, including, I think, with some pre-installed programs,

So, some things were still being handled not quite right (whether that's because it shouldn't be allowed to be the username, or because programs should handle it being in the path, I'm not sure, but probably one of those.)

billti · 4 years ago

And then to really mess you up and ensure you handle parens properly, threw “(x86)” into the mix. (A real pain on some REPLs as well as dealing with environment variables).

vesinisa · 4 years ago

Except for programs that were too old / obscure to fix I guess. I think at least the Symbian Development Kit was such that builds would fail with strange errors unless you installed it in any other path than the default immediate subdirectory of C:\, let alone under "Program Files".

gattilorenz · 4 years ago

Funny, in the Italian Win9x it is C:\Programmi, which I always thought was more convenient because of the lack of spaces :)

cerved · 4 years ago

Sure. Microsoft only ever ships features

8bitsrule · 4 years ago

At one time there was no number 0. Half of binary was missing.

antihero · 4 years ago

Shame it wasn't

> C:\P̷̧̽r̸̬͘ŏ̵̮g̷̜͘r̸̦̋a̴͎̒m̶̲̈́ ̷̠̉F̵͇̈ĩ̴̫l̶̨͗ë̵̦s̸͚͆\

zaphirplane · 4 years ago

There was a short path name IIRC like prog~1

zerr · 4 years ago

Could you please link the reference?

henrikschroder · 4 years ago

C:\PROGRA~1

Easy fix!

Izkata · 4 years ago

> Pro tip: rename your development directory (or even better: the workspace path in CI) to put a space and/or special characters in it.

A former co-worker changed his name in our auth system to include an apostrophe, so that whenever we handled names wrong he'd find it.

floatingatoll · 4 years ago

I set my nickname to U+FFFD at one point in one work system, resulting in a variety of bug reports and concerned emails. I think I dropped it since it was generating false reports from people who didn't check what character the page contained before reporting it.

enragedcacti · 4 years ago

To have such thoughtful coworkers. On an old team I had two coworkers named Chris and once in a blue moon when they reviewed each other's code master would start crashing because one of them accidentally left in an absolute path starting with "/home/chris/".

ajmurmann · 4 years ago

A related tip for CI: set the system time zone to one where your work hours already fall on a different day than UTC. Really helped with getting failures earlier than 4pm PST.
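
To make the failure mode concrete, here is a small Python sketch. The function name is made up, and a fixed UTC-8 offset stands in for PST; it only shows when the local calendar date and the UTC date disagree, which is the window in which date-handling bugs surface.

```python
from datetime import datetime, timedelta, timezone


def local_date_differs_from_utc(instant_utc: datetime, utc_offset_hours: int) -> bool:
    """Return True if the calendar date at the given fixed UTC offset differs from the UTC date."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return instant_utc.astimezone(local_zone).date() != instant_utc.date()


# 17:30 at UTC-8 is already 01:30 the next day in UTC.
late_afternoon = datetime(2021, 11, 17, 1, 30, tzinfo=timezone.utc)
# 12:00 at UTC-8 is 20:00 the same day in UTC.
midday = datetime(2021, 11, 16, 20, 0, tzinfo=timezone.utc)

print(local_date_differs_from_utc(late_afternoon, -8))  # True
print(local_date_differs_from_utc(midday, -8))          # False
```

Running CI in a zone with a large positive offset moves that window into normal working hours.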

reaperducer · 4 years ago

One of the systems I built is being used by a group of younger people. I included an emoji in the superuser account name, just to make sure it would work. And to remind me to think more broadly about user input.

ygra · 4 years ago

I used to have a space in my user name and even contemplated adding a bit of non-1252 Unicode. You find a lot of issues, but unfortunately often in tools you have little control over and end up not being able to work effectively at times. It ended up being more frustrating than helpful.

ridaj · 4 years ago

Áčçëñts hęlp tóø

qwertox · 4 years ago

I add a Japanese character into any .py, .js and .html file to ensure that Unicode is working properly through the entire chain. Mostly in form of a variable which gets passed along, even in URL parameters.

fernandotakai · 4 years ago

my test accounts always have emojis + accents + other weird characters.

it keeps everybody on their toes lol.

curuinor · 4 years ago

the proper name of the glorious sultan of slack, j. r. "bob" dobbs, has the quotation marks and therefore is a great subject for this

geoduck14 · 4 years ago

Oh, I like this!

soheil · 4 years ago

Obligatory xkcd https://xkcd.com/327/

josteink · 4 years ago

> it's been astounding the number of bugs that have appeared over the years triggered by spaces and other unusual characters in file names

If you consider spaces “unusual” I would say you haven’t encountered a single average user in your lifetime. Spaces in file-names is the single most common thing people have, outside programming environments.

As an x-plat developer, the only platforms where I (still) regularly encounter these kinds of bugs are platforms where solving problems through scripting is common, like Linux, where the primary means of operation is through stringly-typed statements getting parsed and processed in an untyped fashion. It's not very reliable.

On Windows people more often use “real APIs” (because scripting doesn't really work as well), but then these problems just go away.

Pros and cons, I guess.

SAI_Peregrinus · 4 years ago

It's especially funny that it affects Linux so much. Most file systems allow everything except `/` and NULL in file names. Early AT&T UNIX even allowed NULLs! POSIX shells use the IFS variable to perform field splitting, and it defaults to <space>, <tab>, and <newline>. The choice to perform field splitting by default (particularly with spaces in the default IFS set) has caused no end of headaches for developers and users.
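
A loose illustration in Python rather than shell: `str.split()` is used here as a stand-in for unquoted expansion under the default IFS (it splits on a slightly wider set of whitespace, so treat it as an approximation), and `shlex` shows what quoting buys you.

```python
import shlex

filename = "my config file.txt"

# Roughly what an unquoted expansion does: the one file name becomes three words.
naive_words = f"cat {filename}".split()
print(naive_words)

# Quoting the name first keeps it as a single word when the line is parsed shell-style.
quoted_command = f"cat {shlex.quote(filename)}"
print(shlex.split(quoted_command))
```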

jeffwask · 4 years ago

It doesn't even have to be complex, often basic automation tasks fail with spaces and special characters. Honestly, treating a file system like a natural language processor is a bad idea. Besides at this point with how digital we have all become who can't understand...

thisismyconfig.txt vs this is my config.txt or this_is_my_config.txt

...I've forced myself to stop using spaces, special characters, and even caps. They are all constructs that provide minimal value for the extra complexity.

long_time_gone · 4 years ago

> thisismyconfig.txt vs this is my config.txt or this_is_my_config.txt

Just wondering, what is the readability of this for people who are dyslexic?

rch · 4 years ago

I'm similar, but I would like to support labels intended for humans, along with various translations, as metadata on top of e.g. filesystem path components.

Too · 4 years ago

Why stop there. A computer works more efficiently with numbers rather than strings, so let’s just give each file a number instead of a string. Besides, at this point with how digital we have all become who can’t understand… But wait, that already exists and is called an inode.

A file system has a human interface and a computer interface. Don’t mix them. Let users give file names in whichever way they please.

400thecat · 4 years ago

> treating a file system like a natural language processor is a bad idea

could you please explain what you mean by that?

mwcampbell · 4 years ago

My favorite filename special character bug was when I implemented CD ripping in 2005, and one of our beta testers ripped a CD with a song called "Have You Ever?". My code wasn't prepared to filter out the question mark on Windows.

mixmastamyk · 4 years ago

I just hit the one where an album folder ends in a period. Rsync copies every time because the period is dropped by the filesystem silently. :-/
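
For both of these cases, a minimal sanitizer sketch in Python. The function name and the `_` replacement are arbitrary choices, and it deliberately ignores reserved device names such as CON or NUL, so it is a starting point rather than a complete rule set.

```python
import re

# Characters Windows does not allow in file names, plus ASCII control characters.
_WINDOWS_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_for_windows(name: str, replacement: str = "_") -> str:
    """Replace forbidden characters and strip the trailing dots/spaces Windows drops."""
    cleaned = _WINDOWS_FORBIDDEN.sub(replacement, name)
    cleaned = cleaned.rstrip(". ")
    # Never return an empty name.
    return cleaned or replacement


print(sanitize_for_windows("Have You Ever?"))
print(sanitize_for_windows("Greatest Hits Vol."))
```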

wongarsu · 4 years ago

> Pro tip: rename your development directory

I changed my username to not contain a space because it was too annoying to deal with all the random dev tools breaking. The worst offender was probably npx on Windows [1] (resolved after four years by deprecating npx), but it was far from the only one (though the JS ecosystem was somehow the worst in this regard of all languages I worked with).

1: https://github.com/zkat/npx/issues/100

kermire · 4 years ago

Same, even I had to rename my user folder to not have a space because so many tools were breaking.

sysadm1n · 4 years ago

> other unusual characters in file names

Saw a few hacks where malware authors used the RTL feature (which is baked into Windows) to obfuscate file extensions. It looked like .exe.innocuous-document.docx, but was actually .docx.innocuous-document.exe

masklinn · 4 years ago

This exact vulnerability in most modern code editors just made the rounds, allowing smuggling malicious code right through review.
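
A small Python sketch of a check for this. The set below covers the Unicode bidirectional control characters I'm aware of, and the file name is a constructed example, not one from a real sample.

```python
# Bidirectional formatting characters that can reorder how text is displayed.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
    "\u200e", "\u200f",                                # LRM, RLM
}


def find_bidi_controls(text: str) -> list:
    """Return (index, code point) pairs for every bidi control character in `text`."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in BIDI_CONTROLS]


# After the override, "xcod.exe" is rendered right-to-left, so it looks like "exe.docx".
suspicious = "innocuous-document\u202excod.exe"
print(find_bidi_controls(suspicious))
print(suspicious.endswith(".exe"))
```

The same scan works on source files, which is where the code-review variant of the trick hides.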

shane_b · 4 years ago

My Mac is formatted case sensitive when the default is case insensitive. This will also catch a ton of import related bugs.

League of legends doesn’t run until I sed files for instance.

memsom · 4 years ago

I once returned a printer because the Mac driver and support software expected and enforced case insensitive access and basically couldn't install properly on my case-sensitive HFS+ volume. It half installed and blatantly just didn't work in any way when installed.

deckard1 · 4 years ago

I have coworkers on Mac that write node/JS code. Every once in a while I'd pull down the latest code and it wouldn't run. I'm on Linux.

Sure enough, they had SomeFile and were importing Somefile and it works fine on Mac but not on Linux (which, of course, is what our production servers use). It amazes me that "works fine on my machine" is still a thing when I definitely worked at companies that solved this back in the 2000s. It was solved. It was done. Then devs became enamored with running everything locally. Even dozens of microservices or databases. Even though JS is fairly isolated, you still have NPM packages that need to be built against the local OS and C/C++ libraries and compilers, etc., which has also caused issues in the past.
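
One way to catch the SomeFile/Somefile mismatch even on a case-insensitive file system is to compare against the directory listing instead of asking whether the path exists. A Python sketch, with a made-up helper name:

```python
import os
import tempfile
from pathlib import Path


def exists_with_exact_case(root: Path, relative: str) -> bool:
    """True only if every component of `relative` appears under `root` with identical case."""
    current = root
    for part in Path(relative).parts:
        try:
            entries = os.listdir(current)
        except OSError:
            # Missing directory, or a file where a directory was expected.
            return False
        if part not in entries:
            return False
        current = current / part
    return True


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "SomeFile.js").write_text("// module")
    print(exists_with_exact_case(root, "SomeFile.js"))
    print(exists_with_exact_case(root, "Somefile.js"))
```

The listing returns names as stored, so the wrong-case lookup fails on a Mac the same way it would on Linux.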

mdaniel · 4 years ago

I also enjoyed doing that, but had to make a DMG just for Steam because it straight-up refuses to run on a case sensitive FS (that's true on Windows, also, which I suspect is how we all got here). I think the most recent Steam versions either caught wind of my trickery or -- more likely -- run something from $HOME/Library/SomethingOrOther and thus the work-around no longer works

When I got a new Mac, I just gave up and acquiesced to the case-retentive world :-(

dunham · 4 years ago

Circa Y2k, I learned that the OSX Palm Pilot software didn't work with case-sensitive file systems. I've since given up and stuck with the default. (I'm anti-case folding in general, because of the ambiguity.)

achn · 4 years ago

I maintain a similar system, where a variety of companies submit files that get processed through multiple services - it is astounding how ridiculous people’s naming of files can be; spaces are the least concerning!

rossy · 4 years ago

> anything with subprocesses

I'm begging software developers to stop using subprocess APIs that take a string argument (system(), child_process.exec(), Process.Start(string)) and start using subprocess APIs that take an array of arguments (execvp(), child_process.execFile(), Process.Start(string, IEnumerable<string>).)
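
The same idea in Python, as a minimal sketch: `subprocess.run` with a list passes each element to the child as one argument, with no shell in between. The helper name is made up, and the child is just another Python process that echoes its arguments back.

```python
import subprocess
import sys
from typing import List


def echo_args(args: List[str]) -> List[str]:
    """Run a child Python that prints each argument it received on its own line."""
    child_code = "import sys; print('\\n'.join(sys.argv[1:]))"
    result = subprocess.run(
        [sys.executable, "-c", child_code, *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


# Spaces, an apostrophe and parentheses arrive untouched, with no quoting on our side.
print(echo_args(["my file.txt", "it's (x86).log"]))
```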

lifthrasiir · 4 years ago

While I agree that we should do this in the ideal world, doing so will inevitably break other necessary tools so it is unworkable for me :(

cduzz · 4 years ago

And add an emoji, a character in a right to left language ( א) and perhaps 太. Maybe italicize one of those too...

cerved · 4 years ago

Spaces are a pain in the ass when you're using CLI so I'd rather enforce a no space policy

reayn · 4 years ago

Most shells will behave just fine if you put a quote (single or double) before anything that has a space.

A small extra step but something you get used to if you spend a lot of time in the cli.

redwall_hp · 4 years ago

I don't know if it's still a problem, but it used to break Python virtualenv badly. If your working directory had a space anywhere in the path, it would throw a huge fit and not work. Which is problematic when the expected name for a Mac's boot drive is "Macintosh HD" (if you ever had a reason to run a virtualenv outside of your home directory).

kitkat_new · 4 years ago

Pro tip2: Use std lib path processing utilities
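
For instance, in Python that would be `pathlib`. A small sketch with made-up paths and a made-up helper; the pure path classes are used so it behaves the same on any OS.

```python
from pathlib import PurePosixPath, PureWindowsPath


def object_path_for(source: PurePosixPath, out_dir: PurePosixPath) -> PurePosixPath:
    """Map a source file to an object file in `out_dir`, keeping the stem intact."""
    return out_dir / source.with_suffix(".o").name


src = PurePosixPath("/home/me/my project/src/main file.c")
print(object_path_for(src, PurePosixPath("/tmp/build dir")))

win = PureWindowsPath(r"C:\Program Files (x86)\Some App\app.exe")
print(win.parts)
# Forward and backward slashes parse to the same Windows path.
print(PureWindowsPath("C:/Program Files/App") == PureWindowsPath(r"C:\Program Files\App"))
```

No string splitting or manual separator handling is involved, so spaces and parentheses never get a chance to matter.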

WalterBright · 4 years ago

Sometimes / works as a path separator in Windows, sometimes it doesn't. It's not predictable.

I never use / on Windows as a result.

ygra · 4 years ago

The only common place where it doesn't work is in CMD for executing programs and as arguments for built-in commands. Everything else goes directly to the relevant APIs which don't care about / or \.

These days using CMD instead of PowerShell should be rare enough and PowerShell certainly doesn't mind the slashes.

InfiniteRand · 4 years ago

It's easy to tell users to make a folder with no spaces if you're setting up a global path, however if you have an application that runs in user directories things can become painful fast. Changing your user name is a pain and can leave things inconsistent, but having to handle all the variations in people's names with spaces, punctuation, international characters, can just be mind boggling.

uberswe · 4 years ago

I did something similar on accident. I used to keep all my development work synced with Dropbox and I had a work and a personal account. So any of my own projects would have /Dropbox (Personal)/ in the path which did catch some bugs. Dropbox renamed my folder to "Dropbox (Personal)" automatically when connecting a work account.

franga2000 · 4 years ago

More importantly than your source files, put your testing data on such a path as well. Nobody uses absolute paths in testing so it doesn't matter how many spaces your absolute path has if your input is "./tests/file1". Put those files in a folder with spaces too and throw in a unicode character for good measure.

BiteCode_dev · 4 years ago

> Pro tip: rename your development directory (or even better: the workspace path in CI) to put a space and/or special characters in it.

The problem with that is that YOUR code may handle it, but your tooling may not. If my code formatter breaks on spaces, I'm not going to change the formatter.

ChrisSD · 4 years ago

You could submit a PR to their repo.

idatum · 4 years ago

Somewhat related to injecting unusual characters, in my experience in localization efforts:

Inject a Turkish 'I'. I don't know how to type or paste it here, but picture an English lower case 'i' that is upper case. It is a splendid way among many to shake out some loc bugs.

gus_massa · 4 years ago

From https://en.wikipedia.org/wiki/%C4%B0

ygra · 4 years ago

That would only shake out anything if you'd also test in a Turkish locale, wouldn't it? Since Unicode casing rules are locale-dependent and en-US doesn't care much about dotless i or dotted i.
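
It depends on the runtime. As a data point, Python's string methods ignore the locale entirely, and the two Turkish letters still trip up naive code there. A short sketch (the comparison helper is made up):

```python
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I


def naive_equal_ignoring_case(a: str, b: str) -> bool:
    """The kind of comparison that quietly assumes English casing rules."""
    return a.lower() == b.lower()


# Lowercasing the dotted capital yields two code points: 'i' plus a combining dot.
print(len(DOTTED_CAPITAL_I.lower()))
# Uppercasing the dotless small letter yields a plain ASCII 'I'.
print(DOTLESS_SMALL_I.upper() == "I")
# So a name starting with the dotted capital never matches its ASCII spelling.
print(naive_equal_ignoring_case(DOTTED_CAPITAL_I + "stanbul", "istanbul"))
```

Length changes and failed round trips like these show up without any Turkish locale; locale-aware runtimes add a second class of surprises on top.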

ralphc · 4 years ago

Late '90s I worked on Java software that got installed on several Unix platforms, including Linux for IBM mainframes. When you deal with the default en/de-coding of Unicode to EBCDIC you never have trouble with Java byte encodings ever again.

Spooky23 · 4 years ago

Someone should provide the OneDrive/SharePoint people some of this religion.

Mysterious character requirements that do not conform with Microsoft’s OS limits, limits on the fully qualified pathname length, etc.

Foobar8568 · 4 years ago

Let's not forget carriage returns in filenames within apps...

wldcordeiro · 4 years ago

Even capitalization is a pain in the ass thanks to how OSes treat file names. I pretty much stick with either `file-name.ext` or `file_name.ext` exclusively now.

AlfeG · 4 years ago

Today I learned that you cannot install Tailscale on Windows if the installer is inside a path with non-Latin chars.

qwertox · 4 years ago

In that case, be thorough and insert a Chinese and an Arabic character to enforce a Unicode check.

agumonkey · 4 years ago

See the recent article about unicode invisible glyphs in JavaScript or bash.

Naming freedom needs a stdlib module

5faulker · 4 years ago

For those purposes I've found hyphen to be a nice substitute.

echelon · 4 years ago

Better solution: only allow ASCII, maybe dashes, and up to twelve characters. Problem solved.

Enforce this in LDAP.

Strict convention is better than flexibility and predicting obscure edge cases that can fail.

pimterry · 4 years ago

In my case, and for many people writing desktop software, and for absolutely everybody writing open-source tools or libraries, unfortunately you can't control the environment.

Non-ASCII paths are extremely common (e.g. the user's home directory on Windows, for the large majority of users outside the English-speaking world) and spaces, punctuation and weirder characters will definitely happen when you least expect it.

Yes if you can avoid it then absolutely that's great, but I don't think most people can.

It's also not usually very difficult to deal with, as long as you actually spot the issue in the first place.

reaperducer · 4 years ago

> only allow ASCII, maybe dashes, and up to twelve characters. Problem solved

...and only hire people from the exact same background as you, who will never have unusual characters or accents in their name. And also make sure not to have any users who aren't exactly like you, and conform to this very narrow requirement. Surely, excluding 90% of the world won't hurt revenue in any way.

mikepurvis · 4 years ago

Ugh, we have the 15 character Active Directory limit now with hostnames, and a previous IT administration has imposed a convention that every name had to follow [prod|dev]-[ph|vm]-[service]-[nn]. So basically every production service is prod-vm-owtf-01— you get exactly four characters to actually describe what the machine does. Works great when the service is "jira" or "wiki", but there are a lot that are pretty mystical-sounding, like jkns, jwrk, cntr, hrbr, etc, where you kind of just have to know.

MayeulC · 4 years ago

Ah, that's the enterprise edition.

But then your program will crash hard and unexpectedly when a user decides to save under "~/house plans" or ~/Téléchargements.

I think it's better to exercise this in CI, that's what CI is for.

chris_wot · 4 years ago

And yet OneDrive won't allow for spaces before or after a file name.

alx__ · 4 years ago

I spent hours trying to figure out why an entire folder suddenly stopped syncing. Turns out I accidentally added a hidden space to the end of a folder name.

dheera · 4 years ago

Or not, which when bugs crop up will teach the businessy types to stop putting spaces in their filenames.

macintux · 4 years ago

The beatings will continue until morale improves?

Spaces are very useful for readability.

KronisLV · 4 years ago

> Pro tip: rename your development directory (or even better: the workspace path in CI) to put a space and/or special characters in it.

This will also break any code in external tools that are called during the builds of your application and do not handle spaces correctly for whatever reason, thus making it so that you won't be able to successfully finish the build.

Then again, you probably shouldn't be relying on technologies like that, but when you're struggling to keep an old enterprise system alive, causing yourself more problems is not necessarily what you should do.

Still a good idea in most cases, though.

    # Rename all files in a directory
    rn() {
        rename "s/ /-/g" *
        rename "s/_/-/g" *
        rename "s/–/-/g" *
        rename "s/://g" *
        rename "s/$//g" *
        rename "s/$//g" *
        rename "s/\[//g" *
        rename "s/\]//g" *
        rename 's/"//g' *
        rename "s/'//g" *
        rename "s/,//g" *
        rename "y/A-Z/a-z/" *
        rename "s/---/--/g" *
        rename "s/-‎--/--/g" *
    }

I'm hardly afraid but I just think it's poor ergonomics. Same as the move from

    xset m 0 0

to

    xinput --set-prop 'pointer:Logitech USB Receiver' 'libinput Accel Profile Enabled' 0, 1

Everything seems to be going this way in Linux land. Longer names, harder to type names, camelcase names, spaces... I'm looking forward to an OS that treats command line ergonomics as a first class feature and where camelcase & spaces are verboten.

ansible · 4 years ago

Well, if you think that's bad, behold the recent trend in network interface names on Linux.

We started out with 'eth0', 'eth1', etc. Which adapter was which could change when adding and removing a network card. That was bad, so that prompted the evolution.

Now we have 'enp1s0', 'enp0s31f6', 'enp13s0' and many similar variations. These are supposedly more stable across device changes. As it turns out, they weren't.

But wait, there is more! Now we have the "predictable names" scheme that produces interface names that are even longer, and not even slightly easier to remember.

Read about the whole sorry saga here:

https://wiki.debian.org/NetworkInterfaceName

I do get that it is not an easy problem to solve, especially in the face of removable network interfaces (like USB Ethernet / WLAN). But surely this is not the best we can do.

foxfluff · 4 years ago

I was actually ranting about this on IRC last night (yeah now my laptop has two enp* interfaces and enx[MAC])..

One thing I like about OpenBSD is that buses are scanned and drivers probe in order and there's no race between drivers coming up. Unless your hardware is physically tampered with or broken, all interfaces come up with the same name across reboots. Linux isn't like that (even if you don't touch your hardware, interfaces could swap across reboots), so you need to do something about it.

As is typical on Linux, the default is unergonomic and if you want something nice, you're on your own to make it so.

If you already have userspace daemons responsible for device insertion and naming, it really wouldn't have been so hard for it to e.g. automatically add a config file / database entry for each interface the first time it is seen. So the devices that came up as eth0 and eth1 are still eth0 and eth1 on the next boot; if I unplug eth0 and add a new card, the new one would be eth2 because eth0 is still reserved for the first card I had.
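
The reservation logic itself is tiny. A toy Python sketch of it, assuming interfaces are keyed by MAC address and the registry is a plain dict (a real daemon would persist it to a file; the function name and addresses are made up):

```python
from typing import Dict


def assign_interface_name(registry: Dict[str, str], mac: str, prefix: str = "eth") -> str:
    """Return the reserved name for `mac`, reserving the next free one on first sight."""
    if mac not in registry:
        # Entries are never removed, so the registry size is the next free index.
        registry[mac] = f"{prefix}{len(registry)}"
    return registry[mac]


registry: Dict[str, str] = {}
assign_interface_name(registry, "aa:aa:aa:aa:aa:01")  # first card
assign_interface_name(registry, "aa:aa:aa:aa:aa:02")  # second card
# The first card is unplugged and a new one added: the old entry stays reserved.
new_name = assign_interface_name(registry, "aa:aa:aa:aa:aa:03")
print(new_name)
```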

nocman · 4 years ago

Missed the 's', it's:

https://wiki.debian.org/NetworkInterfaceNames

ReleaseCandidat · 4 years ago

> These are supposedly more stable across device changes.

No. These are stable across reboots. The old eth? weren't. And yes, that had been a PITA.

account42 · 4 years ago

If network interfaces were files we could just have both short names and stable names, like what we have for block devices.

martin-t · 4 years ago

I find this attitude misguided. More descriptive names are more ergonomic for things you only use rarely but they need to be combined with much better autocompletion than most shells provide by default.

foxfluff · 4 years ago

You state that as if that were objective.. but that's not my subjective experience at all. Somehow I have a hard time remembering these long names, (is it --conf or --config or --config-file or --config-path? -c would've done it for me. --set or --set-prop or --set-property or --prop or --property?), and I need to look them up in a man page anyway, and I make more typos typing them, and shell completion rarely works well if at all. I also find it harder to read and edit long lines that wrap.

Somehow these short letters stick much better for me, and the effort for finding them in the manual is the same, although in case of extra complexity as with xinput, it's even worse with the long names. I don't use either command often, but it's hard to forget xset m. The only thing I remember about xinput is that it's a horribly long litany of things which I need to look up every time, and the syntax still feels weird.

omnicognate · 4 years ago

Spaces don't make anything more descriptive, they just cause completely unnecessary quoting and escaping hassle.

The amount of time that has been wasted by Windows using "C:\Program Files" instead of "C:\Program_Files" far outweighs any highly questionable aesthetic benefit IMO.

Too · 4 years ago

Short option for interactive terminal. Long option in automation.

I’ll be damned if I have to remember or lookup what -n means to some obscure program, when reading someone else’s script. Exception given for super common tools where everybody knows like ls -la.

With the disclaimer that shell scripts, especially ls, aren’t exactly suitable for reliable automation in the first place.

formerly_proven · 4 years ago

Cue nmcli (CLI for Gnome's NetworkManager) which uses UUIDs for everything and (at least a while ago) did not accept partial-but-unique UUIDs. Basically goes "nmcli connection up 5095665a-d82c-4ae6-8964-283623387941".

apricot · 4 years ago

By this point, I'm pretty sure there are people at gnome who compete to see who will make the stupidest suggestion that gets put in production.

gertlex · 4 years ago

Weird, I haven't had to do this. Most(/all?) connections have nice names you can see with `nmcli c`... and so I can do `nmcli c up id DroidNet` and that's pretty dang nice. Pretty sure this worked with Ubuntu 14.04 (though, nmcli has gotten much more featureful since then)

(The ability to shorthand connection->c and similar is great, too; obviously not unique to nmcli)

prionassembly · 4 years ago

apt-get install nmtui # it's better

Dudeman112 · 4 years ago

I could infer a lot about the second and what those params mean and what they do.

The first one is some magical incantation.

foxfluff · 4 years ago

Sure. One could also make "move-down-one-line" be the incantation to move the cursor down a line in vi, but I prefer j.

Ergonomics isn't all about making everything self-descriptive for someone seeing the thing for the first time. It's about making things comfortable to actually use. If it's so long and complicated that you can't even remember how to do it, it's not very comfortable to use. Even if I could remember, xset m 0 0 is still far more comfortable.

And fwiw you still don't know what 0, 1 in accel profile do; you need to look that up or take a wild guess, and if you want to use that command, you'll also have to know how to look up the device because chances are yours is not the same as mine. So it's not any less magical in the end, just more verbose.

The "cool" thing about the xinput command is that you don't even find accel profile in the man page. You gotta look elsewhere if you want to understand what it is and what it does and what the parameters are.

xset m? Yes, that is documented in the man page.

zsmi · 4 years ago

Another interpretation is:

On the first, you think you know what it does, but you're not sure. So maybe it gets looked up.

On the second, you know you don't know what it does. So you know to look it up.

Personally, I'll take the second. Assumptions during debugging are dangerous things.

eloisius · 4 years ago

But which case should software interfaces optimize for? Ergonomics of someone who uses a tool frequently, or interpretability for casual by-standers of some out-of-context shell command?

apricot · 4 years ago

The problem is we're optimizing for "easy to learn" rather than "easy to use".

foxfluff · 4 years ago

That may be a part of the problem but honestly I don't feel like all these new crazy interfaces are easy to learn either. I mean how do you come up with the litany xinput calls for? You need to understand the syntax for specifying a device. You need to know that you're to set a libinput property, and you need to know the name of that property, and it's not documented in xinput man page, and of course you need to know the values to pass which again are not documented in xinput man page. You can play with --list-props and then take your search elsewhere because it is completely opaque and doesn't explain what the properties actually do.

I suspect the number of people who figured all that out without having to find it by googling / arch wiki / whatever is very very low.

Now I'm not gonna say xset is the easiest interface to figure out, but the syntax for setting mouse acceleration is right there in the synopsis, and if you search down the man page, you'll learn a little more (and also if you just run xset without arguments, it'll tell you how to set mouse acceleration). It might not be the best designed tool but it's something I learned back in the day as a teenager just by looking at the man page.

I think the real issue is that people nowadays are designing these interfaces to be consumed by interactive configuration tools, GUI apps, and desktop environments; they're more dynamic, more complex, more flexible, but not easier to figure out, not for you on the command line. The command line is just a last resort. Second class citizen if you will.

jjoonathan · 4 years ago

In a world of broken promises and tool churn, minimizing tooling investment isn't laziness, it's a defense mechanism.

This is a lesson I had to learn the hard way, multiple times.

deckard1 · 4 years ago

On some level it makes sense. The problem with the command line is familiarity.

How often do you reach for iptables? If you're like myself, and most home/desktop users, then probably once in a blue moon to set it up and then you leave it alone. But a system admin? Maybe they touch it a few times a week or month. Every time I use iptables I have to relearn how Linux networking works.

Similarly, the xset/xinput thing. When I need those tools I just create a script or throw it in .bashrc. I adjust the settings once and will not touch them again for a couple years. It makes sense to have long parameters that are readable. I can look at my .bashrc and see exactly what device is getting adjusted.

nomorecommas · 4 years ago

Long option names are more descriptive, more easily distinguished, and easier to remember. Your shell should be intelligent enough to provide tab completion for option names, assuming it is configured to.

Angostura · 4 years ago

> Long option names are ... easier to remember ... Your shell should be intelligent enough to provide tab completion

They are so easy to remember that you need to configure your shell to remember them for you?

forgotmypw17 · 4 years ago

>Your shell should be intelligent enough to provide tab completion for option names, assuming it is configured to.

Wait, are you saying that I need to change my shell or config to make up for another tool's poor design?

No, thanks.

Jiro · 4 years ago

Long option names are more difficult to remember because a long option name can be spelled multiple ways and it is difficult to remember which spelling is correct.

kaba0 · 4 years ago

IMO, PowerShell got it right. Yeah, its syntax is strange, but it has standard flag usage with proper autocomplete, and you can shorten any flag the way you want (e.g. fuzzy match) as long as it is unambiguous.
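For comparison, accepting shortened long options is not unique to PowerShell. A minimal sketch with Python's argparse, which by default accepts any unambiguous prefix of a long option (the tool and option names below are made up for illustration):

```python
import argparse

# Hypothetical options, loosely modelled on mouse settings; not a real tool.
parser = argparse.ArgumentParser(prog="mouse-config")
parser.add_argument("--acceleration", type=float, default=1.0)
parser.add_argument("--threshold", type=int, default=0)

# argparse accepts any unambiguous prefix of a long option (allow_abbrev=True).
args = parser.parse_args(["--acc", "2.5", "--thr", "4"])
print(args.acceleration, args.threshold)
```

This is prefix matching rather than fuzzy matching, and an ambiguous prefix is rejected with an error.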

throw10920 · 4 years ago

These changes are meant to make it easier to read and understand command-line incantations (and to make them more explicit, which is always good), because the command-line paradigm, being text-based, imposes an unavoidable trade-off between ergonomics and understandability/ease-of-use. It sounds like you prefer ergonomics - although I wouldn't be surprised if most users would prefer ease-of-use.

Of course, if one doesn't write a CLI to begin with, this trade-off doesn't exist - you can have your cake and eat it too.

skohan · 4 years ago

What's wrong with camelCase? It's easier to type than snake

thrwyoilarticle · 4 years ago

There's a tendency away from snake_case and towards kebab-case in things you interact with via CLI. Even more so towards nocase.

Programs like Powershell eschew ease of use in CLI for readability in scripts.

chrismorgan · 4 years ago

camelCase is objectively harder to read than snake_case or kebab-case, though familiarity can mitigate that.

Pxtl · 4 years ago

imho, the fundamental problem is using space as a delimiter. Also, case-sensitivity is a disaster for ergonomics.

If you had comma-delimiting like in an algol-derived language, you wouldn't need to quote things with spaces.

edit: also, code is read more times than it is written, so optimizing for readability over brevity is generally a good move.

zibzab · 4 years ago

I've a feeling you will hate powershell

akersten · 4 years ago

Needlessly long parameter/command names and the bizarre insistence on capital letters are the #1 and #2 reasons I detest PowerShell. Like GP, I resent that Linux tools are moving in that direction.

mtift · 4 years ago

I have an overly-aggressive function in my .bashrc to rename all files in the current directory:

I use this all the time, especially when I download files.

tgbugs · 4 years ago

Word of warning from hard experience: rn is a really dangerous thing to name a function because it is one char away from rm.

spurgu · 4 years ago

One char away also physically on the keyboard (maybe that's what you meant?).

post-it · 4 years ago

Looks like it's typically run without any arguments, so it's probably fine.

itsbenweeks · 4 years ago

Agree. Having this function exit if any arguments are passed to it seems like a good safety measure.

TheSkyHasEyes · 4 years ago

ren would be better than rn. :)

eyelidlessness · 4 years ago

Note to self: snag “notTerseAtAllMoreVerboseIdentifiersForGreatGood.js” on NPM

donio · 4 years ago

https://github.com/dharple/detox is a nice tool for this. Sane defaults but configurable.

In addition to CLI I use it from emacs dired-mode too:

    (defun my-dired-detox ()
      (interactive)
      (dired-do-shell-command "detox" nil (dired-get-marked-files))
      (revert-buffer))

I bind it to "_" in dired-mode.

OskarS · 4 years ago

Overly aggressive is right! I don't know if this is genius or deranged! I'm leaning towards genius and stealing the idea.

By the way: what's your beef with en dashes? I mean, if it was "everything should be 'HYPHEN-MINUS' (U+002D)", then fine, but why specifically en dashes and not em dashes?

michaelt · 4 years ago

> By the way: what's your beef with en dashes?

Of all the changes in that list, removing the character that doesn't appear on a standard keyboard seems like the least controversial...

I totally agree that for some people, this could be a terrible command to have around. However, I know that it has been working for me for 8+ years or so. I almost always run it in my ~/Downloads folder on files that I don't really care about. I download a lot of academic papers and books, and this just saves me a lot of time putting files in the format I like: author--paper-title.pdf. And that's part of the reason why I make all of the dashes the same, so if I'm opening something by an author, I can easily autocomplete and not have to remember how to make other sorts of dashes on the command line.

niccl · 4 years ago

I use this snippet, to change spaces to underscore for directories and files in the current directory and below. Haven't made it a function yet, but should. I got it from stack overflow or somewhere, but no attribution. Thanks to whoever did it first:

   find . -depth -name '* *' | while IFS= read -r f ; do mv -i "$f" "$(dirname "$f")/$(basename "$f"|tr ' ' _)" ; done

Thanks to all the comments in this thread, I now have "sudo apt install rename detox" in my install script, and:

    normalize_names() {
        rename "s/-/_/g" *
        detox -s lower *
    }

in my .bashrc.

I've thrown some edge cases at it, and it handles them super well. It deals with consecutive "_", removes leading garbage, normalizes Unicode, and even prevents naming conflicts by opting out early.

Thank you.
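For anyone who wants to see roughly what this kind of cleanup involves, here is a minimal Python sketch. It is not what detox or rename actually do internally; `normalize_name` and its rules are assumptions chosen to mirror the behaviour described above (fold accents, lowercase, collapse separators, drop leading garbage).

```python
import re
import unicodedata

def normalize_name(name: str) -> str:
    """Hypothetical cleanup: ASCII-fold, lowercase, and collapse separators."""
    # Decompose accented letters, then drop the combining marks (e.g. é -> e).
    decomposed = unicodedata.normalize("NFKD", name)
    folded = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Replace every run of characters outside [a-z0-9.] with one underscore.
    cleaned = re.sub(r"[^a-z0-9.]+", "_", folded.lower())
    # Avoid underscores hugging a dot, e.g. "title_.pdf" -> "title.pdf".
    cleaned = re.sub(r"_*\._*", ".", cleaned)
    # Strip leading/trailing underscores.
    return cleaned.strip("_")

print(normalize_name("  Ünïcode  File.TXT"))
```

Note that a function like this maps many different inputs to the same output, so any caller still has to check for conflicts before renaming.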

l0b0 · 4 years ago

If you're a developer you're doing yourself a big disservice by not learning how to deal with special characters.

I agree. I am a developer and I know how to deal with special characters. But this isn't something I use professionally. I just prefer not to have to deal with special characters in the pdfs, m4as, txts, and other files that I use on a daily basis. When I write papers, I'll write ū or Ñ or ç or whatever (incidentally, I have a lot of shortcuts in my .vimrc for those). I would not say I am "afraid" to use spaces in filenames, but I get a certain satisfaction storing academic papers in the author--paper-title.pdf format and my notes in author--paper-title.md because it helps me find things.

mrzool · 4 years ago

You might be interested in detox:

https://github.com/dharple/detox

theshowmustgo · 4 years ago

Nice but how do you prevent overwrites? What about directories/folders and the files in that directory/folder?

I have:

  Movie Bla (2020)
    Movie Bla (2020).mp4

But also:

  Movie_Bla_(2020)
    Movie_Bla_(2020).mp4
    Movie_Bla_(2020).srt

Would not like to lose files like the srt.
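One way to approach both questions (overwrites, and directories plus their contents) is to walk the tree bottom-up and refuse to touch anything whose target name already exists. A minimal Python sketch under those assumptions; `underscore_spaces` is a hypothetical helper, not part of any tool mentioned in this thread:

```python
import os

def underscore_spaces(root: str) -> list[str]:
    """Rename 'a b' -> 'a_b' under root; return the paths skipped due to conflicts."""
    skipped = []
    # topdown=False visits children before their parent directory, like find -depth.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            if " " not in name:
                continue
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, name.replace(" ", "_"))
            if os.path.lexists(dst):
                skipped.append(src)  # never overwrite or merge; leave it alone
            else:
                os.rename(src, dst)
    return skipped
```

With the layout above, the file inside "Movie Bla (2020)" gets renamed, but the directory itself is reported as skipped because "Movie_Bla_(2020)" already exists, so the .srt is untouched. The existence check and the rename are not atomic, so this is only safe when nothing else is modifying the tree at the same time.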

Yeah, sometimes I end up renaming things I don't want to, but it really doesn't happen all that often. And sometimes I throw caution to the wind, add some excitement to my life, and rename a bunch of files (not for anything professional) in some really old directory and hope I don't break anything. But I'm not aiming for perfect with this comment. I just mentioned in another comment, but the vast majority of times I run this is in my ~/Downloads folder on files I don't really worry about breaking.

rename will stop and output an error.

Tempest1981 · 4 years ago

Surely you must run into conflicts now and then?

nybble41 · 4 years ago

That's the most beautiful part! After running this script there are no more conflicts, because it just silently overwrites all but one version of the "cleaned" filename.

(Also—that entire function is super inefficient and could be replaced with a single invocation of "rename".)

I wonder if rename has an -e flag like sed. It might be worth baking this into one monolithic regex if you call this often

IX-103 · 4 years ago

You missed ~ You really don't want to create a directory named "~".....

TacticalCoder · 4 years ago

Define "space". Is the Hangul filler we talked about yesterday a spacing character? Is the zero-width non-breaking space a spacing character? What about the typographic spacing characters?
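To make the question concrete, here is a small Python sketch that asks the Unicode database how it classifies a few of these characters. The selection of characters is just an illustration; the point is that "looks blank" and "is classified as a space" are different things.

```python
import unicodedata

candidates = {
    "SPACE": "\u0020",
    "NO-BREAK SPACE": "\u00a0",
    "EM SPACE": "\u2003",
    "ZERO WIDTH NO-BREAK SPACE": "\ufeff",
    "HANGUL FILLER": "\u3164",
}

for label, ch in candidates.items():
    # category() gives the Unicode general category; "Zs" is "space separator".
    print(f"U+{ord(ch):04X} {label}: category={unicodedata.category(ch)}, "
          f"isspace={ch.isspace()}")
```

The Hangul filler is classified as a letter and the zero-width no-break space as a format character, so a check based on `str.isspace()` or the "Zs" category catches neither.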

You should better be very afraid of using spaces in filenames.

You should do everything you can to support them but you have to know you'll invariably encounter countless cases where you'll have this or that tool that won't work properly with them.

I still live in a world where I cannot name a song from the french group L'impératrice with an eacute in the filename or my car's media system will display garbage (it's running QNX and I don't know which filesystem).

FWIW, and it should be food for thought, every single Git repository in the world contains a pre-commit hook sample (disabled by default but it's there) that enforces that every committed file in the repo is named using a subset of ASCII characters.

Every Git repository in the world has that example: let that sink in.
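As far as I understand it, the check in that sample hook amounts to rejecting newly added file names that contain anything outside printable ASCII. A rough Python equivalent of the idea (a sketch, not the hook's actual code; the `staged` list is made up):

```python
def non_ascii_names(paths):
    """Return the paths containing anything outside printable ASCII (0x20-0x7E)."""
    return [p for p in paths if any(not (0x20 <= ord(c) <= 0x7E) for c in p)]

# Hypothetical list of staged file names.
staged = ["README.md", "docs/L'impératrice.txt", "notes/a b.txt"]
print(non_ascii_names(staged))
```

Note that a plain space is printable ASCII, so a check like this does nothing about spaces in names.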

selfhoster11 · 4 years ago

> FWIW, and it should be food for thought, every single Git repository in the world contains a pre-commit hook sample (disabled by default but it's there) that enforces that every committed file in the repo is named using a subset of ASCII characters.

I use Git for documents too, not only code. Why shouldn't I use my native language?

numpad0 · 4 years ago

Tab completion doesn’t work well for languages that require an IME. That is one reason why I don’t.

Non-ASCII characters cause annoying, hard-to-fix problems. If you're willing to deal with that, kudos. Personally I don't find it worthwhile.

chungy · 4 years ago

> I still live in a world where I cannot name a song from the french group L'impératrice with an eacute in the filename or my car's media system will display garbage (it's running QNX and I don't know which filesystem).

I have an Android phone and I tell MusicBrainz Picard to save all files with ASCII-only names and Windows-compatible names for the ones that get sent over to the phone. Basically for this reason. Sometimes it's players on Android itself, but even more frequently, whatever bluetooth radio I'm connected to freaking out with non-ASCII characters.

torstenvl · 4 years ago

What do you mean, display garbage?

kingcharles · 4 years ago

You get all those space characters working and then some jerk comes along and uploads a file like this: ŗ̶̧̢͓̳͍͙͔̳̻̥͉̭͓̫̟͍̞̭͉͓͉̮̹͍͚̳̹̬͉͚̰͈̘̐̊̾̈̀̒͒̀͛̓̋̔͊̏͘̚ę̴̨̛̣͙̤̟̬̩̟͙͖̥̹̱̱̊͑͗̇̇͛̆̈́̃͋̓̀̔̍̍̌̐͊̎̓̅̀̕ͅģ̴̹̜̘͍̱̑͐̉̌̐̄̊͛̎́̐̌̅̈́͂͑̈́̋̔͂̊̊̒̒̔͛͆̚͘̕͠e̶̙͕̫̳̘͐̾́̑͆̓͂̿͊̊̍͛͐̌̆͗̌̅̅̔͊̂͛͗̅̕͝͝͝͝x̵̢̧̦̫͖̝̥̹͓̬͖̤̩͚̝̫̋̃̅̈́̆͋̌͑́̎̈́̊̾͒̀̒̎̓͛͊̿̓͊̀̍͐̆̚͝͝-̴̨̮̯͖͖̠̜̲̪͕̘͈͖̮̈́̓̐̃́̅̄̏́̍̉̐͌́̔̓̄͋͗̐̕͜͝ţ̴̢̧̖̗͖̞̮̫̦̼̝̺̼̱̳͓͉̜̟̤̲͖̻͙́̌̈̌̈͆̾̄͊̿̏̓͗̈́̕͜ͅh̶̢̧̨̥̭̼̟̣͖̯̗̤̖̙͉͕̙͎̰̠̝̖͈̻͙̪̮̘̯̻̼͕͓̖̣͈̽́͊̎͐͌̆̍̎̏̿͐̒́͋͑̍̿̎͆̑͆̄͂̀͐̄͑̀͗̿̽̎̾̊̕͝͝͝͝͝ͅi̴͚͈͍̫̮̝̣͖͉͓̯̠̙̭̟̖̘̾̓̄̈́̒̏̽̆̉̿͛̀́̃̋̒̈́͋̂̇̈́͛̕͜͠͠͝ͅs̶͇̖̳̞͉̱̞͓̖͔͔͍̗͇̖̮̹̅͊̔͋͊̈́̎̐̆̋̒̀̍̕͜ͅ.̴̧͎͇̰͉̼̱̰̦̟̑̋̏͌̍͊͑̄̀͌́̆̓͛̒̆̾̉͐̄̂̈́͆̒̃͗̐̂̎̈́̈͛̿́͛̾̚͘͜͝͝ͅȩ̷̡̲̪̱̪̥̳͍̼̰̘̗̹͙͙͓̣̟̩̥̥̖̠̪̮̹̞̥̻͎͖͍̯̂͑̏̑̆̍͋̎͛̅̑̑̏̎̓̀̓̒̈́͊͌̀̈́̒̌͐͂͛̊̍̐͂́̔̌̾͐̈́̋̇̏̚͜͝͝͝͠ͅx̶̧̛͚̗̜̪͍͖̘̙͎͚͇͙̬̱̟̭͓̺̙͍̖̱͚̣̘̪̭͔͔̮͎̬̪̤̹̟͔̩͍̬͕͔̩͐̈́̒̂͛̂̈̀̿̍̔̓̓̀̃̍͆̈́̍̓̌͐̈́̾̇̎̑͌͒̄̆̿̍͆̅͗͆͘͠͝͝ͅͅͅe̷̢̡̡̨̧̛͕͚̬̮̞̥̼͍͔̝̟̝̯͈̟̥͖̱̹̣̩̼̩̅̌͌̑̎̐̀̽̏́͐̋̏̎̎͛͌̀̊͊͒̑͌̎̎̑͊̌̉͆̾̚͘̚͜͠͠͠͝͝ͅͅͅ

Liquix · 4 years ago

For anyone who is curious (and acolytes of Zalgo): "In Unicode, character rendering does not use a simple character cell model where each glyph fits into a box with given height. Combining marks may be rendered above, below, or inside a base character. So you can easily construct a character sequence, consisting of a base character and “combining above” marks, of any length, to reach any desired visual height, assuming that the rendering software conforms to the Unicode rendering model."

[https://stackoverflow.com/questions/6579844/how-does-zalgo-t...]
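If you need to detect this kind of input rather than admire it, one option is to measure the longest run of combining marks. A minimal Python sketch, assuming that "mark" means any character in a Unicode M* general category; `longest_mark_run` is a hypothetical helper:

```python
import unicodedata

def longest_mark_run(text: str) -> int:
    """Length of the longest run of consecutive combining marks in text."""
    longest = run = 0
    for ch in text:
        # Categories Mn, Mc and Me are the combining/enclosing marks.
        run = run + 1 if unicodedata.category(ch).startswith("M") else 0
        longest = max(longest, run)
    return longest

# "e" followed by three combining marks (acute, diaeresis, dot below), then "x".
print(longest_mark_run("e\u0301\u0308\u0323x"))
```

Ordinary text in most languages has short runs, so a threshold on this number is a cheap heuristic. Where to put the threshold is a judgment call.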

Loughla · 4 years ago

This legitimately made me laugh out loud in my office.

The characters reach up off the screen as I reply to this. They overlay the comment above you. Amazing. How?

aasasd · 4 years ago

Until now, I haven't actually thought of what would happen if zalgotext occurred anywhere other than a web browser. Looking forward to the five minutes of fun with the file manager and whatnot.

jagged-chisel · 4 years ago

768 characters is too long for macOS it seems. (References online say HFS+ has a limit of 255 UTF-16 characters. Didn't find anything for APFS immediately... edit: same for APFS)
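A limit like that depends on what gets counted. A quick Python sketch comparing code points, UTF-16 code units, and UTF-8 bytes for the same name (which of these a given filesystem actually enforces is an assumption to verify per filesystem):

```python
def utf16_units(name: str) -> int:
    """Number of UTF-16 code units needed to store name."""
    return len(name.encode("utf-16-le")) // 2

name = "e" + "\u0301" * 300   # one visible letter, 300 combining accents
print(len(name), utf16_units(name), len(name.encode("utf-8")))
```

A name that renders as a single character can still blow past a 255-unit limit.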

dang · 4 years ago

Please don't Zalgo on HN. It's enough to speak its name.

hnuser847 · 4 years ago

Honest question - what the heck are those characters?

quantified · 4 years ago

Glad you didn’t choose a sequence that crashes my browser.

meepmorp · 4 years ago

regex this, bravo

branko_d · 4 years ago

I have an uneasy feeling whenever I see a path parameter declared as string. Path is not a string - it's a sequence of path components and should be treated as such by our APIs. A path should be parsed once - on user input - and then used in its "sequence form" throughout the software stack.

And "path component" is not an arbitrary string either - e.g. appending a path component to the path should first require converting/parsing the string into the path component, and only if that's successful appending it to the path.
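A minimal Python sketch of that idea, assuming POSIX-style relative paths. `Component` and `parse_path` are hypothetical names, and the validation rules are one possible choice rather than a standard:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Component:
    """One validated path component (hypothetical type)."""
    value: str

    def __post_init__(self):
        if self.value in ("", ".", "..") or "/" in self.value or "\0" in self.value:
            raise ValueError(f"invalid path component: {self.value!r}")

def parse_path(raw: str) -> tuple[Component, ...]:
    """Parse a relative POSIX-style path once, at the input boundary."""
    return tuple(Component(part) for part in raw.split("/") if part)

print(parse_path("my docs/report final.pdf"))
```

Spaces need no special handling here because nothing ever re-splits the components on whitespace. Appending untrusted input goes through `Component(...)`, so a string like "../etc" is rejected instead of silently changing the path's meaning.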

jerf · 4 years ago

"Path is not a string - it's a sequence of path components and should be treated as such by our APIs."

For maximum correctness, you want to turn it into a file handle as soon as possible, and do all operations through the variations of the file functions that end in "at", like: https://linux.die.net/man/2/openat

The downside of this approach is that you still technically have to carry the path around with you if you ever want to present it back to the user, because once you have a directory handle, you can get back to the root directory easily enough by following parent links and seeing what directories you end up in, but that may not be what the user "thinks" the path is, and they want to see their path, not a canonicalized one. And they're mostly right. And it's not easy to correctly track changes to their intended path from this basis either.

Basically, I don't know of a really solid, 100% correct way to handle this with any reasonable degree of effort.
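A small sketch of the handle-based style in Python, whose `os.open` accepts a `dir_fd` argument corresponding to openat(2). It assumes a Unix-like system; the temporary directory and file name are made up. It keeps the user's original path string around purely for display:

```python
import os
import tempfile

user_path = tempfile.mkdtemp(prefix="dir with spaces ")  # what the user "typed"

# Resolve the directory once; later operations go through the descriptor.
dir_fd = os.open(user_path, os.O_RDONLY | os.O_DIRECTORY)
try:
    # dir_fd= makes os.open behave like openat(2): the name is relative to dir_fd.
    fd = os.open("notes.txt", os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600, dir_fd=dir_fd)
    with os.fdopen(fd, "w") as f:
        f.write("hello")
    fd = os.open("notes.txt", os.O_RDONLY, dir_fd=dir_fd)
    with os.fdopen(fd) as f:
        content = f.read()
finally:
    os.close(dir_fd)

print(f"read {content!r} via handle; display path is still {user_path!r}")
```

If the directory is renamed after `dir_fd` is opened, the operations still hit the same directory, while `user_path` goes stale. That is exactly the display problem described above.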

jmull · 4 years ago

> For maximum correctness, you want to turn it into a file handle as soon as possible

That's not right. You want to resolve a file/folder path to a file/folder at the exact point it makes sense.

It's a problem if you're using a path when you wanted the file. The file can be switched/modified out from underneath you.

It's also a problem if you've got the file when you only wanted a reference. Now you can't simply switch/modify the file independent of the reference. E.g., maybe you want config file changes to take effect immediately and transparently.

You can also have the hybrid case, e.g., where you want the folder directly, but have a relative path to a file that is resolved late.

If you're unsure, I'd err on the side of late resolution.

"you want to turn it into a file handle as soon as possible"

But no sooner.

For example, I've run into problems where I'm configuring program A server to talk to file location B... but I don't have access to file location B. But the client-side library for talking to the server tries to convert location B into a file handle and then freaks out because I can't access it. When I don't want to access it. I want that program to serve it.

If it was using simple "path" objects that didn't confirm that I have access to the path, everything would be hunky dory. But because it tried to convert it into a file handle unnecessarily, I get blocked.

tmerr · 4 years ago

Another inconvenience with this approach is that you can keep thousands of paths in memory no problem. But thousands of FDs may cause you to exceed per-process limits.
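On Unix-like systems you can check the limit in question directly. A short Python sketch using the standard `resource` module (not available on Windows):

```python
import resource

# Soft limit is what the process hits first; hard limit is the ceiling it may raise to.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limit: soft={soft}, hard={hard}")
```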

globular-toast · 4 years ago

This goes for most instances of user input. Timestamps is the other common one people get wrong. I've even seen programs that pass around timestamps as strings in multiple formats and as integers (Unix time).

BoorishBears · 4 years ago

This is why I get stressed out when I see paths turned into special objects encoding separators and such.

It tells me the path is living for way too long compared to the file handle.

I only want to see path-specific objects if we're modifying the path, and even then I want that to happen as late as possible.

aspaceman · 4 years ago

Why not just hold onto both? The users representation and the file handle. Only ever "display" the representation, while you do all operations on the handle. (Not trying to be sarcastic, just curious).

doesn't this lock the file?

dahfizz · 4 years ago

> I have an uneasy feeling whenever I see a path parameter declared as string. Path is not a string

I guess that depends on what you mean by "string". `open` and `fopen` need a char* path to open a file. Whatever fancy Path abstraction you use eventually becomes a char* string, because that's what the kernel needs.
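Python shows the same layering: a `pathlib.Path` is a structured object, but it gets flattened before it reaches the OS. A minimal sketch (the path itself is made up):

```python
import os
from pathlib import Path

p = Path("Program Files") / "My App" / "config.ini"

# os.fspath() unwraps the object to str; os.fsencode() gives the bytes form
# used for the C-level call on POSIX systems.
as_str = os.fspath(p)
as_bytes = os.fsencode(p)
print(type(p).__name__, repr(as_str), repr(as_bytes))
```

So both views hold: the abstraction exists at the API level, and a flat string (or byte string) is what crosses the system-call boundary.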

VWWHFSfQ · 4 years ago

yeah. it's a string.

anyfoo · 4 years ago

Strings following certain rules are entirely valid representations of paths, just like sequences of path components in the chosen language/framework are. Similarly, the sequences of bits that make up the sequences of your language/framework in memory are an entirely valid representation of said sequences of components.

Yes, paths have structure, but saying "a path is not a string" is equivalent of saying "C source code is not a string". Both are strings, and both are something else, represented by strings according to rules. Different internal representations have different advantages and disadvantages. I fully agree that for things such as "adding components" an internal sequence/list representation is better, but strings can pass arbitrary IPC or even ABI boundaries much easier for example. (And you wouldn't bat an eye for example when you see FQDNs like "www.google.com" passed as a string instead of as ["www","google","com"] because the string representation works pretty well.)

fouric · 4 years ago

C source code and paths are both representable by strings, true, but the fact that they're not actually strings is still important, because most people don't know that, and in the case of paths that leads to a lot of edge cases (in the case of source code it leads to a bunch of inefficient and weak tooling, which isn't quite as bad).

Because neither are strings, their native representation shouldn't be such - it should be something structured, and only when necessary (IPC, FFI, serdes) be serialized into a string representation. This would save people a lot of time and effort.

POSIX filenames in general (as opposed to the much smaller "portable filename character set") allow every byte except 0x2F (/) and 0x00 (NUL). That means file names can include line feeds, backspaces, EOF, etc.

"This is `a

perfectly vali'd.\010! file name\377, despite the weirdness"
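This is easy to try. A Python sketch that creates a file with a similar name in a temporary directory, assuming a typical Linux filesystem (other filesystems may reject or rewrite such names); the invalid-UTF-8 byte from the example above is left out to keep it portable:

```python
import os
import shlex
import tempfile

# Backtick, two newlines, an apostrophe and a backspace, all in one name.
name = "This is `a\n\nperfectly vali'd.\b! file name"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, name)
    with open(path, "w") as f:
        f.write("ok")
    listed = os.listdir(tmp)

    # NUL is rejected by Python before the OS ever sees the name.
    try:
        open(os.path.join(tmp, "bad\0name"), "w")
        nul_rejected = False
    except ValueError:
        nul_rejected = True

print(listed == [name], nul_rejected)
print(shlex.quote(name))  # one safe way to quote it for a POSIX shell
```

Anything that parses `ls` output line by line will see this single file as several entries, which is why NUL-delimited output (`find -print0`, `xargs -0`) exists.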

naikrovek · 4 years ago

things like this are why the Unix philosophy is so bad.

text processing is hard if you must support Unicode, and that means every Unix command line tool must implement or employ a text processor to handle input. it would be much easier if objects were passed back and forth. PowerShell got this right.

ourmandave · 4 years ago

A lot of my stuff is cross platform so making filenames portable means avoiding spaces.

Ironically, even NASA doesn't like space.

https://www.nas.nasa.gov/hecc/support/kb/portable-file-names...

Touché my friend, had a good laugh

hardwaresofton · 4 years ago

I am also that age, and kebab-case is the best case for filenames.

2021-01-01-some-important-document.pdf gives me the warm fuzzies. On the off chance that some more differentiation is needed, throw in an underscore and a whole new world opens up

tambourine_man · 4 years ago

I go one step further: 2021-11-11_client_project-name.ext

2021-11-11_client_projectName.ext is also OK. But underscore separates fields, hyphens for space replacement.
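A nice side effect of "underscore separates fields" is that such names are trivially machine-parseable. A small Python sketch, assuming the layout date_client_project.ext; the second file name and `parse_name` are made up for illustration:

```python
from datetime import date

def parse_name(filename: str) -> dict:
    """Split 'YYYY-MM-DD_client_project-name.ext' into fields (assumed layout)."""
    stem, _, ext = filename.rpartition(".")
    day, client, project = stem.split("_", 2)
    return {
        "date": date.fromisoformat(day),
        "client": client,
        "project": project.replace("-", " "),
        "ext": ext,
    }

names = ["2021-11-11_client_project-name.ext", "2021-02-03_acme_site-redesign.ext"]
print(sorted(names))  # ISO dates sort chronologically as plain strings
print(parse_name(names[0]))
```

The camelCase variant would parse the same way, but it loses the ability to recover the spaces in the project name.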

I see and applaud your use of the underscore there, but I must reject the premise!

work/client/project/2021-11-11-file.ext is more or less how I lay stuff out. I’d say client/project is a folder level distinction (arguably dates too).

[EDIT] Realistically most of the stuff under <project> is git repos and I usually make a “home” repo where I keep org files for tracking hours, notes, and resources related to the engagement.

zajio1am · 4 years ago

> But underscore separates fields, hyphens for space replacement

But why not the other way, hyphen-minus for separating fields and underscore for space replacement? That seems to me more consistent with how underscores and dashes are used.

Maybe you mean `2021-11-11_client_project-name_v2_final.ext`

pluc · 4 years ago

this is the way

jonnycomputer · 4 years ago

I've recently shifted sharply toward the dash from the underscore. I find it more readable, and it doesn't require the shift key. However, I do find it useful to use underscores to create groups, e.g. test-001_2021-10-11.log. Including hours, minutes, seconds is still awkward.

discreteevent · 4 years ago

There's a customer for everything. I've just never liked the aesthetics of the underscore. Also if your underscored thing gets put in some document and then underlined the underscores can become invisible.

FpUser · 4 years ago

Brother in arms. I just posted similar thing below.

Burn the witch!

> kebab-case

I hadn't heard that before and I love it.

sodapopcan · 4 years ago

If you hadn't heard kebab-case called that before there's a chance you haven't heard SCREAMING_SNAKE_CASE called that before, and I couldn't live with myself if I didn't let you know.

Asraelite · 4 years ago

Google considers it too violent apparently. In one of their recent changes to their style guide, they started recommending "dash-case" instead.

https://developers.google.com/style/word-list#letter-k

Same. I had tears in my eyes from laughing. For some inexplicable reason it seems incredibly funny.

I'm of the opinion that kebab-case is the best case for all identifiers, because it's easy to read and to type. As always, Lispers were right all along.

ModernMech · 4 years ago

Kebab case is the often overlooked benefit of prefix notation and semantic white space in programming languages. Honestly the best case of all cases imo.

kibwen · 4 years ago

One glorious day we'll accept programming languages that require spaces around infix arithmetic operators so that we can make kebab case a reality!

jerry1979 · 4 years ago

I found that some_document_2021-01-01_v03.pdf works best because it keeps the same document next to its other versions alphabetically, keeps them in date order, and keeps them in a sub-day version order.

Raineer · 4 years ago

In my work, today's date would be 21K11, to save space over the longer date.

onychomys · 4 years ago

Are you working in some embedded system with tiny memory space or something? What's the use of saving one character? Just make it YYMMDD!

blackboxlogic · 4 years ago

How do you distinguish 21K111 and 21K111?

jaclaz · 4 years ago

As a side note, in the good ol' times of ISO9660 level 1-4 and the various mkisofs parameters, an underscore _ which is a CAPITAL -, may have given issues, only for the record/as a curiosity:

https://web.archive.org/web/20151007005513/http://www.911cd....

P.S. should anyone want to see/run the actual batch, a copy has been uploaded here:

http://reboot.pro/index.php?showtopic=18962&page=29#entry204...

ur-whale · 4 years ago

> 2021-01-01

Yes on the date format.

Saves you so much time.

hnburnsy · 4 years ago

I don't bother with the century or the dashes, saves time...

211111_foobar_v1.txt

I am old enough that I still save before printing. I think it was Lotus 123 that engrained it for me.

zz865 · 4 years ago

Agreed on dates ordering problem but 20210101 is so much easier to type.

ravel-bar-foo · 4 years ago

This used to be my default, and then I used Matlab, and "-" was interpreted as subtraction.

I use this style:

2021-01-01_what-happened_who-did-it_possible-reason

Joker_vD · 4 years ago

One of the main reasons why Windows used "Program Files" and "Documents and Settings" was to force the programs (and programmers) to deal with paths with spaces. And you know, for the most part it kinda, more or less worked out although of course even today you will find programs that ask you to install them in a folder without spaces in the path.

dale_glass · 4 years ago

And that was a good idea, if only Microsoft also fixed the CreateProcess function, Windows would be somewhat sane in this regard. But somehow nobody seemed to think of it. Seriously, look at it:

https://docs.microsoft.com/en-us/windows/win32/api/processth...

The arguments are a single string. So you want to pass parameters with spaces in them? You've got to add quotes and stuff all of that into a single string. Instead of doing it in a more sane manner, like oh, the arguments to main().
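To illustrate the "stuff it all into a single string" step, Python's standard library ships a helper, `subprocess.list2cmdline`, that flattens an argument list into one command line using the Microsoft C runtime's quoting rules. A minimal sketch (the arguments are made up, and the helper runs on any platform):

```python
import subprocess

argv = ["notepad.exe", "My file.txt", "plain"]

# list2cmdline() is the helper subprocess uses on Windows to flatten an
# argument list into the single string that CreateProcess expects.
command_line = subprocess.list2cmdline(argv)
print(command_line)
```

This only round-trips if the receiving program parses its command line with the same rules, which is not guaranteed for every Windows program.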

IiydAbITMvJkqKf · 4 years ago

The root cause is that argv isn't a first-class citizen like on linux, but an abstraction. The kernel only cares about a single string argument. If you use main instead of WinMain, the CRT will transform the single string into an argv for you.

Oh and cmd.exe uses a different escaping scheme than the CRT.

maintaining backwards compatibility means maintaining silly decisions, and Microsoft does both.

toyg · 4 years ago

The main culprit for space issues is stuff relying on BAT or CMD files, where escaping variables seems to be a black art.

Sadly such set includes loads of Java programs. If only SUN had shipped a standard way to generate isolated exe files in 1998... but they worked under the presumption that you'd have a JVM already there, because distributing that monster was difficult in dialup times, so you could just hand people a jar; and the enterprise market did not care, since they had webapp servers. Sadly it's an "optimization" that became obsolete very quickly but wasn't rectified until it was too late (java 9+).

> The main culprit for space issues is stuff relying on BAT or CMD files, where escaping variables seems to be a black art.

Actually it isn't, just use double quotes and add a '~'. It's just about the only thing batch files handle better than shell scripts. set "VARIABLE=%~PATH"

alerighi · 4 years ago

That annoys me every time I use a Windows system. It was a terrible decision, especially since neither the command prompt nor the new PowerShell accepts a backslash before a space the way bash does; you have to quote the whole path! I get that most users on Windows don't use the shell, but as a developer I use it a lot, and every time it's a pain (no wonder they added WSL to Windows after the failure of PowerShell...)

rashil2000 · 4 years ago

Why would they accept a backslash? Backslash is a path separator on Windows. In most Windows programs, you don't even need to escape the space - arguments can contain spaces and it will understand it, like `notepad My file.txt`

The escape character on PowerShell is backtick, and on cmd it is caret. You don't need to quote everything.

Rerarom · 4 years ago

VFAT and stuff like that actually provided alternate names like PROGRA~1

beardyw · 4 years ago

Yes, I was doing code to quickly read FAT folders (on a micro controller) and got to the bit about filenames more than 8.3. I decided my life was too short (and processing time) to go and sort out what the "real" file name is. Enforced 8.3 as a requirement!

makecheck · 4 years ago

They may have thought that would happen but I saw just as much stuff end up in C:\Windows or \Users or (always my favorite) those “Documents” that are really just “whatever random crap every app wants to put there”.

Avalaxy · 4 years ago

Yet in Microsoft's own cmd tool I need to put quotes around my path if I want to refer to any files/folders below those folders.