Readit News
12_throw_away · a year ago
I wish there were even more of this sort of thing, even though I'm an anti-braces zealot - it's crazy to me that, here in 2024, code is still so tightly linked to specific textual representations. Maybe there's an alternate reality where IDEs and development tooling got better and better post-smalltalk/lisp, instead of ... whatever we have now. Maybe we'd have editors and viewers where you could configure the syntax however you wanted. And sure, it would be persisted to disk in some canonical representation, maybe text, maybe a DB ... but you'd never need to worry about it, because all your tooling (VCS, diffs, refactoring tools, etc) would work with the AST, and you'd never need to worry about tabs or spaces ever again.
tkuraku · a year ago
I think having a single syntax is the best way. I don't want to have to look at python and have to parse spaces, curly braces, square braces, etc. It all being standard is really helpful. With a standard autoformatter like black or gofmt, even though I might not choose all the options, the uniformity is super valuable.
_the_inflator · a year ago
I agree. To me ESlint in the JavaScript domain is an unsung hero.

Code is so incredibly hard to mentally grasp and every mental overload should be omitted to reflect on the logic.

There is a reason why there is one code basis and this should always be curated by a linter to uniformly enforce a standard.

There is still plenty of room for style and code organization.

I witnessed first hand many trench wars around seemingly small things like curly brackets in IF statements dealing with the question of one white space or none, because it appealed to personal preferences and before ESlint people would go to great length reformatting hundreds of LoCs just to get their right feeling of code syntax.

Weird. And the git diff was massive, as were the code reviews.

(Un)Happy times. :D

dTal · a year ago
Your justification for a single syntax is "I don't want to have to look at python and have to parse spaces, curly braces, square braces, etc." But the comment you just replied to said:

>Maybe we'd have editors and viewers where you could configure the syntax however you wanted

I took this to mean that, in this fantasy universe, you could make any source file look however you want. Like tabs vs spaces and pure html vs html-with-css, this is about separating meaning from presentation. Is there a good reason to force the same visual representation on everyone?

sverhagen · a year ago
This may be controversial, but I find 2-character indented Javascript, coming from my 4-character indented Java, for the projects I work on, as much of a shock as moving between braces and Python. Particularly if it has been subjected to an opinionated formatter that just doesn't match my personal style. Completely unreadable, I mumble under my breath. And then my eyes adjust to it for a bit, and after an... hour? (maybe) of coding in the (ahum) "hostile" style, it's familiar (again) and we're on our way.
VirusNewbie · a year ago
> I don't want to have to look at python and have to parse spaces, curly braces, square braces, etc.

But if the syntax was separate from the underlying representation, couldn't you just have your editor open it the way you want?

thejohnconway · a year ago
The whole point is that you don't have to parse different syntax, because you choose your syntax at the editor level.
ok_computer · a year ago
+1 for ruff for python as a formatter & linter that allows single quotes without shaming you.

I’m all for keeping consistent flat utf-8 files. I’d hate for my ultra simpleminded, possible to pen test with a pen and paper, python and sql code to be wrapped in a god awful xml or json or proprietary db markup and object hierarchical model of what code should look like.

Like I imagine trying to check in a jupyter notebook but worse.

For instance tabs vs spaces was decided and text editors accommodated this and despite what may be someone’s personal preference a uniform decision was made.

Humans can learn. We should use accessible formats and push for standards and keep those readable.

phaedrus · a year ago
I like to draw the comparison that in an alternate universe where the ASCII .txt file format had simply included a color byte with every letter byte, we'd now be having coding format guidelines that also go into great detail prescribing manual syntax highlighting. The fact we don't and avoided endless bike shedding over syntax highlighting, instead leaving it to automatic tools and the presentation layer not the code itself, is purely an historical accident. Well, what does that say about other code formatting details which programmers currently do waste time manually doing and worrying about?
pseudosavant · a year ago
I had never considered the bikeshedding that would occur if there was a color byte… We struggle with spaces versus tabs, how many spaces, braces (or not), semi-colon (or not).

It would definitely be the norm that there were languages that were dark or light mode, and both sides would be convinced they were right.

philipov · a year ago
I had never considered that, but you're absolutely right. I would totally be the sort of person to manually idiosyncratically syntax highlight my code if it were feasible. I'm sure glad I don't feel compelled to do that...
zzo38computer · a year ago
Some programming languages such as ColorForth require colours in order to work, that the meaning of a word depends on what colour it is written with.
arp242 · a year ago
There's always someone who mentions this, and it just doesn't work.

Do you group things like:

  do_stuff()
  more_stuff()

  other_stuff()
  moar_function()
Or:

  do_stuff()
  more_stuff()
  other_stuff()
  moar_function()
Or:

  do_stuff()
  more_stuff()
  other_stuff()

  moar_function()
Grouping code by a single blank line can make a big difference. This is a bit of a silly example, but there's tons of not-so-silly examples. You can't really represent that in an AST.
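For what it's worth, this is easy to check with Python's standard-library `ast` module. Picking Python and the `ast.parse` / `ast.unparse` round trip (Python 3.9+) is an assumption made for illustration, not something from the comment. A minimal sketch showing that the blank-line grouping does not survive a trip through the tree:

```python
import ast

# Source with a deliberate blank line separating two groups of calls.
source = (
    "do_stuff()\n"
    "more_stuff()\n"
    "\n"
    "other_stuff()\n"
    "moar_function()\n"
)

tree = ast.parse(source)

# Regenerate text from the tree alone (requires Python 3.9+).
round_tripped = ast.unparse(tree)
print(round_tripped)

# The tree stores statements, not layout, so the blank line is dropped.
print("blank line in original:   ", "\n\n" in source)
print("blank line after unparse: ", "\n\n" in round_tripped)
```

The nodes do carry `lineno` and `end_lineno`, so the gap is recoverable only for as long as the original text positions are kept alongside the tree.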

Line length is another. infinite line length doesn't work as screens aren't infinitely long. Automatic wrapping doesn't work because you want to break at specific points. An example is something like:

  window = XCreateWindow(display, XRootWindow(display, screen),
      center_x, center_y, size, size,
      4,              // border width
      vinfo.depth,
      CopyFromParent, // class
      vinfo.visual,
      CWColormap | CWBorderPixel | CWBackPixel | CWOverrideRedirect,
      &window_attr);
Cramming as much as possible on as few lines as possible is just not going to work well.

There's tons of cases.

That's why no one does it. Because it just won't work. Everyone will hate it.

wruza · a year ago
> Grouping code … You can't really represent that in an AST.

<visual-group>…</visual-group>

Also, I wish that at least in current editors there would be a way to render \n\n as a half-height line. Full-height empty lines are too bold.

> Automatic wrapping doesn't work because you want to break at specific points

You really want to have a set of buttons that switch between:

  - a line of arguments
  - a block of arguments
  - a line/block of only non-default valued arguments
  - arguments sorted by name
  - …
And a hint on a default representation, with some default heuristics.
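A rough idea of what such "buttons" could do, sketched in Python with the standard `ast` module. The function name `render_call` and the three style names are hypothetical, invented for this sketch; it only handles a single call expression and is nowhere near a real editor feature:

```python
import ast

def render_call(source: str, style: str = "line", indent: str = "    ") -> str:
    """Re-render one call expression in the requested (hypothetical) view style."""
    call = ast.parse(source, mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("expected a single call expression")

    func = ast.unparse(call.func)
    positional = [ast.unparse(arg) for arg in call.args]
    keywords = [
        f"{kw.arg}={ast.unparse(kw.value)}" if kw.arg else f"**{ast.unparse(kw.value)}"
        for kw in call.keywords
    ]

    if style == "sorted":
        # Only keyword arguments can be reordered without changing meaning.
        keywords = sorted(keywords)
        style = "block"

    args = positional + keywords
    if style == "line":
        return f"{func}({', '.join(args)})"
    if style == "block":
        body = "".join(f"{indent}{arg},\n" for arg in args)
        return f"{func}(\n{body})"
    raise ValueError(f"unknown style: {style}")

call_text = "make_window(display, root, width=640, height=480, border=4)"
print(render_call(call_text, "line"))
print(render_call(call_text, "block"))
print(render_call(call_text, "sorted"))
```

The stored form stays the same in all three cases; only the rendering changes, which is the separation being argued for here.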

> How does it know that x, y, width, and height are semantically grouped and best put on the same line? It doesn't. (from the other comment)

Look higher, parameters can be grouped at the declaration level. <params><related-params name=“coords”>…</>…</>. Now you can render a nice frame around these in block mode. Or not, depending on local renderer settings.

> Everyone will hate it.

There’s always a way to make everyone hate something, especially if the solution is clueless about its problem. It doesn’t mean it should be done this way. Experiment and evolution could make it work, we just have to let people try instead of dismissing it so confidently.

otteromkram · a year ago
I agree about horizontal space adding to readability and I use that method fairly often.

That said, if you use classes, descriptive names, and appropriate comments, it won't matter how you group because your code will be self explanatory.

Finally, with today's wide-screen monitors on desktop, line length is less of a worry. Problem only arises when reading code on mobile devices.

The last workplace I was at had a soft wrap around 80 characters, but we upped that to 100 when functions and methods became almost vertical.

svieira · a year ago
Two points:

1. Breaking at specific points is something that can be specified by the pretty-printer of the _viewer_ you are using. Think of existing auto-formatters and imagine that they're working over the view instead of the persisted form.

2. The AST can have pointers into advisory data (or the other way around, if desired, the "program data" can include the AST, but also other things as well) to note that there is an anonymous region here (C# and friends already have conventions for this _for the source code_ that Visual Studio understands - look at `#region` comments). This would let viewers choose their preferred representation for a region.
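A small sketch of the second point, using Python's `ast` module as a stand-in. The idea of deriving "groups" from line-number gaps, and the names `statement_groups` and `render`, are assumptions made for this example; a real system would presumably store regions explicitly instead of inferring them:

```python
import ast

def statement_groups(source: str) -> list[list[str]]:
    """Split top-level statements into groups wherever the source left a gap."""
    groups: list[list[str]] = []
    current: list[str] = []
    previous_end = None
    for node in ast.parse(source).body:
        # A jump of more than one line means something (here: a blank line)
        # sat between the two statements. Comments would create gaps too.
        if previous_end is not None and node.lineno - previous_end > 1:
            groups.append(current)
            current = []
        current.append(ast.unparse(node))
        previous_end = node.end_lineno
    if current:
        groups.append(current)
    return groups

def render(groups: list[list[str]], separator: str = "\n\n") -> str:
    """Each viewer picks its own way of showing a group boundary."""
    return separator.join("\n".join(group) for group in groups)

source = "do_stuff()\nmore_stuff()\n\nother_stuff()\nmoar_function()\n"
groups = statement_groups(source)
print(render(groups))                            # blank line between groups
print(render(groups, separator="\n# ----\n"))    # a different viewer preference
```

The grouping is kept as data next to the code, and the blank line becomes just one possible way of displaying it.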

foolswisdom · a year ago
Well, a few days ago there was a post about the Unison language, where code is stored in AST form and text is just a representation.

https://news.ycombinator.com/item?id=40882133

KMag · a year ago
I've long dreamt of a system that compiles to native code, but stores a compressed SSA form (similar to SafeTSA or LLVM bitcode) in the binary for efficient runtime re-optimization based on profiling, somewhat similar to current Android Runtimes. One could then have several levels of debugging symbols, one that gives names to local variables represented by CFG nodes, and another that adds a compressed diff between some standardized decompiler output and the original source.

You could then decompile to some alternative syntax, but you'd lose any idiosyncratic formatting represented by the compressed diff.

__MatrixMan__ · a year ago
Last time I looked, [Unison code] -> [entry in the AST DB] was a one-way process. Adding a function means writing it (with whatever style you like) and seeing if what you wrote constitutes a new function or an existing one. You can't fluff db entries back up in to human friendly code.

I don't see why it couldn't be done though, I think it just hasn't been a priority. Heck, you could have 100 different users collaborating in 100 different "languages", and so long as they serialized to the same AST and back, none of them would ever have to see the atrocious syntax which the other users prefer. Their editors and browsers could just render everything according to their users' preferences.
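A tiny illustration of "serializing to the same AST", using Python's own parser as a stand-in (this says nothing about how Unison does it). Two people with different formatting tastes write the same program, and the trees compare equal because `ast.dump` leaves out line and column positions by default:

```python
import ast

# Same program, two formatting preferences (spacing, quote style, blank lines).
alice = "total = (1+2)*3\nname = 'x'\n"
bob = 'total = (1 + 2) * 3\n\n\nname = "x"\n'

alice_tree = ast.dump(ast.parse(alice))
bob_tree = ast.dump(ast.parse(bob))

print("text identical:", alice == bob)
print("tree identical:", alice_tree == bob_tree)
```

This only covers layout differences within one grammar. Entirely different surface languages would each need their own parser and printer targeting a shared tree.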

Edit: it appears that Unison has an issue for this feature: https://github.com/unisonweb/unison/issues/499

Varriount · a year ago
I can understand braces (to an extent...). What really confuses me are new languages that still require semicolons at the end of expressions/statements.
TwentyPosts · a year ago
Speaking from a Rust-perspective, having semicolons at the end of statements makes perfect sense and is a brilliant design decision.

Note that I said 'statements', not 'expressions'.

A lot of the confusion here (and maybe yours, too) stems from this difference. In Rust, (almost) everything is an expression by default, and you turn it into a statement by adding a semicolon. This allows you (and the type checker) to very neatly distinguish between expressions and statements, which is great. It's a very nice and elegant approach imo.

stkdump · a year ago
What confuses me is languages that can split statements over multiple lines as long as certain conditions are met (such as the break occurs inside braces). I'd rather have a semicolon at the end of the statement to make it more explicit.
voidUpdate · a year ago
I like semicolons, because it lets me use the brackets style I prefer (allman) instead of whatever the language devs think I should use. This is a really big issue for me trying to use Go, as they automatically insert a semicolon to the end of every line that doesn't end with a {, so the language forces you to use K&R, which I really dislike reading
layer8 · a year ago
In general, what helps parsers for disambiguation also tends to help human readers for disambiguation. In addition, some amount of redundancy helps to prevent errors when (for example) two statements were intended but they are parsed as one, or vice versa. Furthermore, consistency helps avoid errors, such as always ending a statement with a semicolon and not only when it would otherwise be ambiguous.
IcyWindows · a year ago
Why is that so much more different from requiring '\n' instead of ';'?
smsm42 · a year ago
Non-human-readable primary representation may be not the best idea though. Would create a lot of friction when trying to process the data. Text is largely simple and obvious (yes I know there are complications but in 99% of cases you can ignore them without too much trouble) but DBs and ASTs are not. People read texts, but they can't read ASTs, at least not without assistance. So it'd be hard to use.
skissane · a year ago
> Non-human-readable primary representation may be not the best idea though.

What if the AST is persisted as S-expressions, but then you have a different syntax to edit it? Algol-ish, Pascal-ish, C-ish, Python-ish: choose your poison (or even support multiple poisons and let the developer pick the one they prefer?)

This was actually the original plan with Lisp. Lisp was originally supposed to have two syntaxes, S-expressions and M-expressions, with M-expressions being Algol-like. However, the implementation of M-expressions was delayed, and people got so used to using S-expressions directly, they decided M-expressions were unnecessary and they were never implemented in mainstream Lisp. They were implemented in the Lisp 2 project, but that ended up being an evolutionary dead-end; various attempts at the idea have happened since but none of them really took off.

Lisp purists will argue M-expressions are unnecessary and S-expressions are all you need. However, S-expressions can make the language more foreboding to complete beginners, and even among experienced programmers, a decent percentage find them seriously off-putting. Maybe if the M-expression idea had been pursued more seriously, Lisp might be more popular today.

maccard · a year ago
It’s all still bytes under the hood, albeit standard bytes. Personally I’ve not found the ability to open my css files in Word beneficial!

Many editors are already altering what is stored on disc before presenting it to you - type annotations, code folding, git info. I think we could do a lot if our default storage was the semantic representation of the code.

bsza · a year ago
So much opportunity for anti-competitive behaviour. The "canonical representation" of C# could just be a memory dump of Visual Studio. Your code is held hostage forever. Everything you create belongs to Microsoft. You have to drink a verification can to keep using your IDE. Basically Unity/Adobe/etc, but for code. No thanks.
neonsunset · a year ago
https://github.com/dotnet/dotnet there, the entire source code of runtime, sdk, roslyn, etc.

And you will have easier time getting your changes merged into dotnet/runtime than into Python.

quasarj · a year ago
You just described a literal nightmare D:
signaru · a year ago
Maybe no longer true today, but with the right tools, C# and VB.NET used to be auto translatable from each other and IL can be decompiled to either.
hermitdev · a year ago
It was only ever true for "clr safe" code, or a subset of C#. In particular, since VB.NET didn't/doesn't have unsigned types, not all C# could be expressed in VB.NET, even after decompiling from IL. (Not sure what happens if you try and decompile to VB.NET code that uses unsigned, for example).
Vegenoid · a year ago
Seems like the biggest downside would be that sharing code becomes much harder. Currently it is easy to share code in as small of a portion as you'd like. If the canonical representation is an AST, it opens up a lot of problems around sharing pieces of a program. This seems like a very substantial downside.

Even simply "sharing" within your own systems, like copying blocks into notes or another program, would be a lot harder. Maybe I'm not knowledgeable enough here and this wouldn't be as thorny as it seems.

hombre_fatal · a year ago
Also, I really don't want to have to load any and all code into a local editor just to view it and reason about it.
eiffel31 · a year ago
You can (sort of) do all of that with the Eclipse Modeling Framework[0].

Your AST is what EMF calls a "model". By default the "backend" and ecosystem surrounding EMF is skewed towards Java for historical reasons, but there have been some prototypes with other languages as well. You can serialize your AST in any way you like, although by default it relies on XMI files. You can implement your own textual concrete syntax, or rely on a database. The EMF ecosystem has tools for implementing textual or "graphical" concrete syntaxes. You can combine them (e.g. usually a specific subset of your AST gets edited in a certain way that's best for your targetted end users). The ecosystem also has tools for performing comparisons and plugging them into your editing means.

Of course all of this tooling requires a lot more work than an LSP server.

[0]: https://eclipse.dev/modeling/emf/

Deleted Comment

crabbone · a year ago
I believe, this was an idea in ALGOL, not sure which iteration.

I think, the reason it was never implemented was that more translation = more complicated debugging. It also means that programmers have a more distorted and incomplete model of the program they are writing, i.e. more bugs.

NB. Lisp, as originally envisioned by McCarthy, had one more translation layer (the translated version had square brackets instead of the parentheses), but it didn't take off for, basically, the same reason.

So... while I understand the benefits you see from doing what you suggest, I think that at the same time the downside makes this not worth pursuing.

LeFantome · a year ago
In the .NET universe, you can mechanically convert between C# and VisualBasic. This is more or less done by going through CIL ( Common Intermediate Language - .NET assembly essentially ). So, it is more or less what you are saying.

.NET decompilers are common. I have built a few toy languages and compilers on .NET. For one of them, I could decompile CIL into my language. So, I could view .NET libraries from other sources in my language.

I think this is essentially the same idea you are proposing.

It only works if the languages are similar though. Going between F# and C# does not always work as well for example.

heresie-dabord · a year ago
> you could configure the syntax however you wanted. And sure, it would be persisted to disk in some canonical representation, maybe text, maybe a DB ... but you'd never need to worry about it, because all your tooling (VCS, diffs, refactoring tools, etc) would work with the AST, and you'd never need to worry about tabs or spaces ever again.

You are describing an entire industry of IDE Smell with an IDE monoculture.

turndown · a year ago
Programs are tightly coupled to textual representations because a compiler is a textual representation transformer. If you deviate from the accepted textual form then the compiler is generally clueless to do anything - that is, it can’t read your intent.
harimau777 · a year ago
That actually exists for C. It's pretty neat.

https://en.wikipedia.org/wiki/Indent_(Unix)

hsbauauvhabzb · a year ago
First we gotta solve the tabs/spaces dilemma, maybe then we can start to tackle the rest.

Edit: I do agree and find your AST suggestion profound though!

layer8 · a year ago
People should recognize that tabs vs. spaces is more a question of editing (e.g.: What happens when you press Backspace behind an indentation? Does the caret move at uniform speed on key repeat or does it suddenly jump in places?) than of stored representation. You could even have an editor/viewer that lets you choose the display-width of spaces at the start of lines.
moonlion_eth · a year ago
There are tabs people and there are infidels
TOGoS · a year ago
This is what Unison is supposed to do. As far as I know there is currently only one (Haskell-like) textual representation, but the program is stored in a binary representation of the AST, and once you've entered your code, it's all operations on that AST from then on, and adding different frontends should be not difficult.
griftrejection · a year ago
Code is linked to textual representations because code... is text? The entire point is that there are algorithms to go from "print()" to something that actually makes the computer work. How else could it even possibly work?

I think we don't have any sort of flexible AST sort of thing because they're mostly not necessary. The hard problems of programming don't usually have much to do with syntax.

And if you're going to downvote, kindly explain why, thanks. I just want to know exactly how this thing is supposed to work...

rgovostes · a year ago
The parent’s post describes how it could work.

I may recall incorrectly but AppleScript may be an example: some file formats are serialized ASTs. The editor displays it as textual code. A downside of this is that you can’t save a syntactically invalid file.
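As a concrete answer to "how else could it even possibly work", here is a minimal Python sketch in which the program never exists as text: the tree is built directly from `ast` node objects, compiled, and run. Python is only a convenient stand-in here, and the variable names are made up for the example:

```python
import ast

# Build the tree for `result = len("hello")` by hand, with no source text.
tree = ast.Module(
    body=[
        ast.Assign(
            targets=[ast.Name(id="result", ctx=ast.Store())],
            value=ast.Call(
                func=ast.Name(id="len", ctx=ast.Load()),
                args=[ast.Constant(value="hello")],
                keywords=[],
            ),
        )
    ],
    type_ignores=[],
)

# compile() requires position info on every node; fill in placeholder values.
ast.fix_missing_locations(tree)

namespace = {}
exec(compile(tree, filename="<ast>", mode="exec"), namespace)
print(namespace["result"])

# Text is just one possible view, generated on demand (Python 3.9+).
print(ast.unparse(tree))
```

Text then becomes something an editor produces from the tree when a human wants to look at it, which is roughly what the parent comment describes.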

eiffel31 · a year ago
Code is as much Text as Vector graphics is Text.

Sure it's an intuitive way of representing your data. Is it the most appropriate though? See an example [0] about using Projectional Editing in order to use mathematical notations for formulas.

[0]: http://voelter.de/data/pub/gemoc2014-voelterLisson-MPSNotati...

RHSeeger · a year ago
> code... is text

There are visual programming languages that chain together blocks, instead of raw text.

> The hard problems of programming don't usually have much to do with syntax.

I guess it depends on how you define hard. You are clearly talking about "a singular issue that needs to be solved", which really only affects a single developer / team and, to a lesser extent, those that use that solution. But if you consider something like syntax, you're now talking about something that has a much smaller impact _per developer_, but has that impact on _every_ developer. The syntax issue may have a much larger impact overall.

rthnbgrredf · a year ago
Thanks to LLMs, we are quite close to achieving this. I can write code in Python (sometimes even plain english), and GPT can convert it to Go or even Haskell if I like. The conversion is accurate 95% of the time on the first attempt in my use cases, and I expect this to improve further with more powerful models in the near future.
lgas · a year ago
You can take it a step further, LLMs can already "execute" arbitrary non-existent languages, with non-existent data. Here are a couple of examples using a tool I wrote[1]:

    % echo "nums 1 10 | filter even | to_words | map uppercase" | refab imagine 
    TWO
    FOUR
    SIX
    EIGHT
    TEN

    % echo "with file '/tmp/top-ten-most-populous-cities.txt' do; cities = read; cities.each { |city| (city.name, city.utc_offset) }" | refab imagine
    Tokyo, 9
    Delhi, 5.5
    Shanghai, 8
    São Paulo, -3
    Mumbai, 5.5
    Mexico City, -6
    Beijing, 8
    Osaka, 9
    Cairo, 2
    New York, -5
    
For what it's worth, the tool isn't specialized for this, 'imagine' is just one of many prompts it can execute.

Of course the execution is non deterministic and at the moment only works for simple things, but you can imagine as LLMs get more capable and more integrated with tools this will matter less and less.

[1] https://github.com/lgastako/refab

hsbauauvhabzb · a year ago
I hope we never work together. 95% accurate will cause problems that you won’t notice when you inevitably get lazy.
digging · a year ago
I largely agree, but I don't think the current experience is the right one.

I recently started writing a game in Godot. I don't know GodotScript, and I've found I don't like it very much in trying to learn. I turned to aider.chat to see if I could describe the functions, data structures, and systems I wanted and have it write them. I also tried writing in a more familiar language (...one with braces...) and having it translate those files.

It does pretty well, but it doesn't feel like software engineering. It's too hands-off and doesn't activate the same neurons. All the problem-solving and puzzle-solving is gone, and the successes are quite boring, and the failure modes are more irritating even if they're necessarily quicker to solve.

It's a weird experience. I'm moving so, so much faster than I would have on my own, but I don't enjoy it. It feels like cheating - I'm not actually ashamed of what I'm doing but I also won't take credit for writing the code.

However, what I'm getting at is this: If I could write the code in a syntax or even language that I prefer and have copilot or whatever translate it in near-real-time (without active prompting), that would be the best of both worlds. I'd still be a little sad at myself if I didn't learn the new language, but I also think this method would facilitate learning better than what I'm doing with aider (because I could see what my code turns into as I'm writing it, and learn that "translation").

Dead Comment

randomtoast · a year ago
I can confirm that it is a suitable use case for GPTs. I do GPT-assisted programming language design and experimentation. In some cases, GPT-4 can even generate a basic interpreter that allows me to test my new language.

Here is an example of GPT's output for Python with braces that was generated after just spending 10 seconds for the prompt:

  def preprocess_braces(code: str) -> str:
      lines = code.split('\n')
      processed_lines = []
      indent_level = 0
      indent_str = '    '  # 4 spaces for indentation

      for line in lines:
          stripped_line = line.strip()
        
          # Check for opening brace
          if stripped_line.endswith('{'):
              processed_lines.append(indent_str * indent_level + stripped_line[:-1].strip() + ':')
              indent_level += 1
          # Check for closing brace
          elif stripped_line == '}':
              indent_level -= 1
          else:
              processed_lines.append(indent_str * indent_level + stripped_line)
    
      return '\n'.join(processed_lines)

  # Example usage:
  code_with_braces = """
  def example_function() {
      if True {
          print("Hello, world!")
      }
      for i in range(5) {
          print(i)
      }
  }
  """


  processed_code = preprocess_braces(code_with_braces)
  exec(processed_code)  # Runs the transformed code, which defines example_function() but does not call it

  print("Processed Code:\n", processed_code)

necovek · a year ago
This is a great discussion with many a differing point of view.

To some, significant indentation is better.

Others — too used to braces — miss them dearly in Python.

Next ones, vie for the non-text source code, something to get us past these discussions altogether (editors working on .pyc files directly?).

For programs to be maintained, they need to be read, understood and improved. One—often undervalued—skill in programming is to write beautiful code, because that is more art than craft. And unfortunately, tools like Black prohibit the true artists from expressing themselves clearly with code formatting too. And to those, white-space or braces matters on a different level, and everything else is attempting to make up excuses for why one is better than other.

And while conceptual operations we do on the code seem simple on the surface, devising an editing tool that would do semantic operations on the AST is fricking hard and likely to be very non-ergonomic. Look at all the attempts to make code refactoring tooling: it's so crazily complex and confusing that it's simpler to just go and grep for a string and fix anything you find.

As long as it's faster to use regular editing operations to shuffle code around, indent or unindent it (or wrap it with braces), tweak one thing here or there, simple text editors will mostly rule the world of programming.

kazinator · a year ago
Python encodes structure in only one way, using indentation. There is no redundant signal that can be checked for a discrepancy that could indicate a problem. GCC and Clang can diagnose when indentation is "misleading" because it doesn't match what the braces are saying.

Python's choice of representation is such that two different Python programs show zero differences under a white-space-suppressed diff.

The white-space suppressed diff is a useful tool for comparing programs when some sections of code have changed indentation but are otherwise the same; yet we cannot rely on it if we are using Python.

Python's syntax design is objectively poor on several purely technical points. In its favor, there are only handwaving pop psych arguments.

xigoi · a year ago
> There is no redundant signal that can be checked for a discrepancy that could indicate a problem.

What’s the problem with having no redundancy? By the same logic, we should require numerals to be spelled out in words so the compiler can check if the programmer didn’t accidentally write a different number.

  def fahrenheit_to_celsius(x): {
      return (x - 32 [thirty two]) * 5 [five] / 9 [nine];
  }

arcxi · a year ago
>The white-space suppressed diff is a useful tool for comparing programs when some sections of code have changed indentation but are otherwise the same; yet we cannot rely on it if we are using Python.

yes, because in Python two programs with different indentation are not the same. what's the problem that you can't use tools intended for code without significant whitespace with code in Python?

necovek · a year ago
> Python encodes structure in only one way, using indentation. There is no redundant signal that can be checked for a discrepancy that could indicate a problem. GCC and Clang can diagnose when indentation is "misleading" because it doesn't match what the braces are saying.

I've seen both misplaced braces (akin to mis-indenting blocks in Python), indentation not matching braces and other problems of the same sort in non-Python code. Readers of the code would misunderstand the code when badly indented, and might introduce bad braces as well (not everybody uses automatic formatters and linters either, esp as they will sometimes "quickly" edit code in their non-usual dev environment). Not to mention that some "braced" languages allow having single-line blocks without braces.

It's also not true that this is the only way Python encodes structure: new blocks generally only start with a ":", and you've got control flow keywords that allow for new blocks to start. One could argue that's a great feature disallowing you from introducing confusing spacing without actually having a new block started.
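A quick way to see this is to hand the compiler an indented line that no `:` introduced. The helper name `indentation_error_message` is made up for this sketch; `compile` and `IndentationError` are standard Python:

```python
def indentation_error_message(source: str):
    """Return the compiler's complaint about indentation, or None if it compiles."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as exc:
        return exc.msg
    return None

# Extra indentation with no block opener before it is rejected outright.
stray_indent = "x = 1\n    y = 2\n"
print(indentation_error_message(stray_indent))

# The same indentation after a line ending in ':' is a normal block.
proper_block = "if True:\n    y = 2\n"
print(indentation_error_message(proper_block))
```

So the `:` and the indentation do act as two signals that have to agree, at least when a block is opened.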

While I am fond of applying many of double-entry-accounting principles in programming to increase trust in what we write, I believe that's much better done with unit-tests, which can more clearly demonstrate the expectations for any code and read more like documentation.

Do you think all syntax in programming languages should have some sort of extra validation built-in? Eg. you should type in a constant twice (declare it as `const int a = 5` and then you have to set the value later as `a = 5` or you get an error?)?

As I said above, people will always find an "objective" excuse why their preference is better. But I've seen bad-block-boundaries in Python as much as I've seen it in other languages which use explicit block boundaries like braces (and I've done more Python over the last ~20 years). I've heard this argument a gazillion times, but hundreds of bugs due to that have simply failed to materialize while working on large projects with tens and hundreds of people.

Getting indentation right is _really_ not that hard, just like getting braces right is not that hard. I've yet to find someone who prefers their code to not be indented at all, and only rely on braces — at least not in a team setting where you can't simply reformat all of it.

sverhagen · a year ago
> two different Python programs show zero differences under a white-space-suppressed diff.

This must surely be a bad thing? How would this signal an unfortunate indentation change between:

  if foo:
    bar()
  bar2()
And:

  if foo:
    bar()
    bar2()
???
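
A rough way to check this, assuming that "white-space-suppressed" means ignoring leading whitespace (the helper `strip_leading_ws` is hypothetical and only mimics such a diff): the two snippets compare equal as text but parse to different trees.

```python
import ast

A = "if foo:\n    bar()\nbar2()\n"
B = "if foo:\n    bar()\n    bar2()\n"

def strip_leading_ws(source: str) -> list[str]:
    """Mimic a diff that ignores leading whitespace on every line."""
    return [line.lstrip() for line in source.splitlines()]

# Identical once indentation is ignored...
same_text = strip_leading_ws(A) == strip_leading_ws(B)
# ...but bar2() sits outside the `if` in A and inside it in B.
same_tree = ast.dump(ast.parse(A)) == ast.dump(ast.parse(B))

print(same_text, same_tree)
```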

thoutsnark · a year ago
I strongly agree with your main point that Python's syntax design is objectively poor, but contend that even the grace you seek to extend to it is wrong.

>Python encodes structure in only one way, using indentation

Except when you write a multiple line string using """ notation.

Or when you put things inside parentheses, which I have seen as the preferred solution for method chaining.

Point is, Python claims indentation is all that is needed, and then very quickly breaks its own rule.
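
A small sketch of both cases (the snippet and its variable names are invented for illustration): inside parentheses and inside a triple-quoted string, the interpreter accepts arbitrary indentation.

```python
# The chained calls and the string body are deliberately mis-indented.
source = '''
result = (
"abc"
        .upper()
  .lower()
)
text = """first
      second"""
'''

namespace = {}
exec(compile(source, "<example>", "exec"), namespace)

print(namespace["result"])
print(repr(namespace["text"]))
```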

Too · a year ago
Most diff tools default to not suppressing leading whitespace. Trailing whitespace should be automatically removed and enforced by your code formatting tools, leaving no reason to suppress it.

As someone who has been diffing endless amounts of python through various diff tools, this is totally a non issue.

mjevans · a year ago
go fmt is really nice, in that there is A standard for the language which is just included. No more arguments, just do what the included tooling desires.

Python... I'd love if it shipped with a formatter that converted indents to braces, and then had an option for expressing indent as spaces (with number of spaces per indent) OR tabs (same, default 1); then still kept the braces.

sigzero · a year ago
I really hate the "tab width = 8" though.
duncan-donuts · a year ago
I think people are actually searching for Lisp far more than we are willing to admit.
darby_nine · a year ago
I don't understand why we don't make the aesthetic aspects of syntax (e.g. block delimitation) a feature of the editor rather than the source of truth for the code. For all unix profited from text I think we have the tooling necessary to move our storage and editors beyond it, and it's been obvious the entire time it comes with non-trivial liability converting to and from more reliably structured representations.
williamdclt · a year ago
I think it boils down to “text is good _enough_” (80% of the value, 10% of the complexity if even that), and it’s a format that’s incredibly interoperable. You can use tools that aren’t just language agnostic, but are not even programming-specific: notepads, grep, git, sed…
darby_nine · a year ago
I agree completely; I just think on a gut-level a solution would be validated with use despite the small marginal gain over text in complexity of tooling. Among RAGs, TreeSitter, and the success of LSPs, I think there's room here to synthesize some improvements.

While we're on the topic, if we store only syntactically valid programs, we can express diffs in terms of semantic refactoring rather than textual changes. This would enable stuff like preserving refactoring across merges, thereby bypassing conflicts that would arise under text merges. There are limits to this of course as you can still come up with conflicts, but anything to ameliorate the nightmare of manually fixing a textual merge.
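
As a very rough sketch of the first step toward that (this only compares trees, it is not a semantic diff or merge tool, and `same_program` is a made-up helper): Python's standard `ast` module already lets you ignore layout and comments when deciding whether two versions differ.

```python
import ast

def same_program(a: str, b: str) -> bool:
    """Compare two sources by syntax tree, ignoring layout and comments."""
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))

original = "total = price * (1 + tax)\n"
# Same expression, wrapped across lines with a comment added.
reformatted = "total = (\n    price\n    * (1 + tax)  # add tax\n)\n"
# Looks similar as text, but the grouping (and so the meaning) changed.
changed = "total = price * 1 + tax\n"

print(same_program(original, reformatted))
print(same_program(original, changed))
```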

cranium · a year ago
"Code-as-Text" is way too universal and deeply ingrained to change in the short to medium term. I've worked a bit with low code / no code platforms and everything becomes a mess: little to no version control, search is bad, no import or possibility to generate components programmatically,...

However, I'm totally with you on having the editor show the code as you'd like. As much as I don't like tabs, at least the user could choose their preferred width for indentation. (A less disruptive Python-with-braces could be the editor showing braces but converting to spaces behind the scene.)
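
A display-only sketch of that parenthetical, assuming the standard `tokenize` module is an acceptable stand-in for what an editor would do (the function name `show_with_braces` is hypothetical, and trailing comments on block headers are not handled): the file on disk stays indented, and braces are added only in the rendered view.

```python
import io
import tokenize

def show_with_braces(source: str) -> str:
    """Render indentation-based blocks with display-only braces."""
    lines = source.splitlines()
    opens = set()      # line numbers whose statement opens a block
    closes = {}        # line number -> closing-brace lines shown before it
    indents = [""]     # stack of active indentation strings
    last_statement_row = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NEWLINE:
            last_statement_row = tok.start[0]
        elif tok.type == tokenize.INDENT:
            indents.append(tok.string)
            opens.add(last_statement_row)
        elif tok.type == tokenize.DEDENT:
            indents.pop()
            closes.setdefault(tok.start[0], []).append(indents[-1] + "}")
    out = []
    for number, line in enumerate(lines, start=1):
        out.extend(closes.pop(number, []))
        out.append(line + " {" if number in opens else line)
    for row in sorted(closes):  # blocks still open at end of file
        out.extend(closes[row])
    return "\n".join(out)

example = "if foo:\n    bar()\n    if baz:\n        qux()\nbar2()\n"
print(show_with_braces(example))
```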

quectophoton · a year ago
> However, I'm totally with you on having the editor show the code as you'd like.

Such editors would remove some categories of bikeshedding, but would add brand new categories of bikeshedding.

* Why did/didn't you add an empty line (EDIT: or whatever no-op visual equivalent) after that `if` block? My editor needs blocks to be separated a certain way to be able to display them nicely grouped in logical blocks! This representation of AST is so limited that we can't store such differences in style, we need editors that work at some super-AST level!

* What do you mean my code is an unreadable mess with hundreds of operations? All I see in my editor is a nice single operation "copy fields from class X to struct Y". If your editor can't detect such an obvious thing from the AST and display it nicely, then find a better one.

* Some codebases not even bothering with functions because the founding engineers use a specific IDE that just lets them group code arbitrarily (no no that's not reinventing functions, that's what progress looks like!), and you can't use your favorite editor because that's not in the scope of that editor, so the editor's author politely tells you to use that other editor you dislike if you really need such a thing.

* I can probably come up with more scenarios if I spend more time thinking about it. And there's probably also more petty scenarios that I can't even imagine.

I'm not saying such editors wouldn't be nice. Everyone has their own preferences, and some people might work better this way. Just, don't expect them to reduce the amount of petty bikeshedding in projects, much less eliminate it.

Nihilartikel · a year ago
I'm kind of with you here.

Lispy languages have structural editing tools that make it a lot like working directly on an AST. It's a delight when you get used to it.

The spacing/linebreaks are all just auto formatted and mostly an afterthought. It would only be one step further to present the code with the user's choice of block start/end sequences.

darby_nine · a year ago
Working with paredit was certainly one of the main ingredients for structural editing (as opposed to textual editing) to click with me. I've been watching TreeSitter with great interest.
kevincox · a year ago
The freedom that text provides is both a blessing and a curse. Even small things like adding "paragraph break" lines in a run of code can make it more readable. Of course you can encode a limited number of these "style" features into your source AST but much like code formatters there is always a limit to helpful style that is preserved and irrelevant stuff that is rendered to each programmer's preference.
islandert · a year ago
Would the JVM ecosystem almost be a working example of this? Since there are a variety of languages with editor integration that all compile down to the same byte code, it feels pretty close to what you’re describing.
kevincox · a year ago
I believe that JVM bytecode is too low level for a source format. At the very least you would need to preserve some form of comments. Also I think JVM locals lose their names during compilation.
darby_nine · a year ago
The JVM languages are pretty damn close!
kriiuuu · a year ago
Unison would fit here

Deleted Comment

quasarj · a year ago
You can have my plain text when you pry it from my cold dead fingers!
Gibbon1 · a year ago
I think text-based programming languages are a local minimum that's hard to climb out of. Partly because most languages aren't designed around being represented by structured data. And no one has any modern experience with that.

But do imagine a world where changing the name of a field in a struct results in a single diff message, 'struct foo field bar changed to baz'. And where a change set to a library can be mechanically applied to code that depends on it and it just works.

voiper1 · a year ago
Would VS Code support a plugin to _display_ python with brackets, but open/save to disk in standard indented format?
morkalork · a year ago
I always open these threads with the hope that someone will mention Fortress but so far I'm just disappointed.
DonHopkins · a year ago
Python has always fully supported C style braces out of the box: you simply have to put a # before them:

    if foo == 1: # {
        print("one foo")
    # }
It also supports PASCAL style begin/end:

    if foo == 1: # begin
        print("one foo")
    # end
And it's fully internationalized, even supporting mixing multiple languages:

    if foo == 1: # beginnen
        print("one foo")
    # 終了
In fact, you can even mix styles:

    if foo == 1: # begin
        print("one foo")
    # }
Or optionally leave one of them out, or spell them any way you like, if you feel like it:

    if foo == 2:
        print("one foo")
    # fi
Even verbose COBOL style:

    if foo == 1: # perform conditional foo value check
        print("one foo")
    # end perform conditional
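
For anyone who wants to verify the claim, a quick sketch with the standard `ast` module (the variable names are made up): the parser discards comments, so the "braced" versions, mismatched styles included, parse to exactly the same tree as the plain one.

```python
import ast

plain = "if foo == 1:\n    print('one foo')\n"
braced = "if foo == 1: # {\n    print('one foo')\n# }\n"
mixed = "if foo == 1: # begin\n    print('one foo')\n# }\n"

# Comments never reach the syntax tree, so all three dumps are equal.
trees = {ast.dump(ast.parse(src)) for src in (plain, braced, mixed)}
print(len(trees))
```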

gary_0 · a year ago
Braces, maybe. But I've always felt that semicolons at the end of lines were just noise. I can tell it's the end of the line because it's the end of the line. The few cases of ambiguity that happen can be solved by common sense (in the programmer or in the parser).
nomel · a year ago
Semicolons are already supported in Python, for the cases where it might be needed (one liners on the command line), just not required.

    import sys; print(sys.executable)
    x = 5; # this is fine

darby_nine · a year ago
> But I've always felt that semicolons at the end of lines were just noise.

I felt this way until I engaged in code with very horizontal coding styles. Semicolons make it extremely easy to visually break up sequential statements from expressions consuming multiple lines.

gary_0 · a year ago
> code with very horizontal coding styles

I'm not sure what you mean. Could you give an example?

voidUpdate · a year ago
The end of the line isn't always the end of the line, such as if there is a chain of methods and you are limited to 80 columns. Then you can just wrap on a `.`, and tell the language where you actually want the line to end.
ilyagr · a year ago
trealira · a year ago
Haskell also inserts semicolons, but it has a different rule, which I think is a bit interesting.

If the current line starts in the same column that the previous expression started in, then a semicolon is inserted at the beginning of the current line. There are also some messy rules about opening and closing braces being inserted.

So, this:

  do expr1
     expr2
     do expr3
        expr4
Ends up being parsed as though it were written like this:

  do { expr1
     ; expr2
     ; do { expr3
          ; expr4
          }
     }
If the next line is indented, it's just taken as a continuation of the previous line's expression, so nothing needs to happen.

[0]: https://amelia.how/posts/parsing-layout.html

xpl · a year ago
Also, JavaScript.
pasc1878 · a year ago
For another way of removing indentation needed for structure.

Make python a lisp - indentation is just the number of brackets.

Hy - http://hylang.org

xigoi · a year ago
Lisp-style parentheses are still better than brackets because at least they don’t waste vertical space.

  (if x
    (if y
      (if z
        foo)))
versus

  if (x) {
    if (y) {
      if (z) {
        foo;
      }
    }
  }

kazinator · a year ago
Braces can be formatted in a Lisp-inspired style!

Try it like this:

  if (x) 
  { if (y)
    { if (z)
      { foo; } } }
The rules are a little different. We have two-space indentation, like what is favored in Lisp, but there is a space after the opening brace. For symmetry, we put spaces between the closing ones. It works best if we don't "cuddle" the opening brace but put it on a new line.

There is a rhyme and reason to it, and consistency.

I've written some C programs that way, though not recently and can't point to any online examples.

However, I also wrote, and maintain, the Yacc grammar file in TXR Lisp in this style (just the actions in the grammar portion). For whatever reason, it works well in grammar files.

https://www.kylheku.com/cgit/txr/tree/parser.y

pasc1878 · a year ago
Brackets are these () - so lisp

{} are braces

This is English English

blharr · a year ago
I feel like this is a great idea, but as with other lisps, it looks really unfamiliar and loses the simple readability that python has for me.
perrygeo · a year ago
> it looks really unfamiliar

Swahili looks unfamiliar to me. Because I didn't grow up in East Africa. Don't conflate subjective familiarity with objective simplicity - any "unfamiliar" concept only reflects on you, not your subject.

p5a0u9l · a year ago
Glad that auto formatting tools like black and ruff (at least in Python world) are increasingly becoming the norm. It’s really nice to not think about whitespace or humor these silly arguments.
Ringz · a year ago
“Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.”

Antoine de Saint-Exupéry

Unfortunately, we still live in an era where humans have to adapt to technology rather than the other way around. From my perspective, this is particularly true for programming.

For me, Python is a good (though not ideal) mix of simple syntax and power. The language is characterized by low redundancy, meaning it uses fewer unnecessary characters like semicolons or curly braces to mark the end of a line. If a human can recognize the end of a line without special characters, then the compiler should be able to as well.

As someone with ADHD, I find it particularly difficult not to get distracted by these and other superfluous details. These small distractions add up and can become very burdensome. Interestingly, I found it easier to program in Assembler and Modula than in languages like C++ (MSVC), PHP, or JavaScript – at least as long as the projects were small.

Even a brief look at Rust’s syntax causes me almost physical discomfort, no matter how great, powerful, and useful the language may be.

For this reason, I almost exclusively use the terminal for emails, calendar, and programming, even though complex GUIs can simplify some tasks.

Although Python is not perfect in terms of syntax, it offers a good balance. Perhaps one day, before the perfect programming language exists, we will be able to use AI and ML to explain to the computer what a program should do with simple language (better than ChatGPT right now), just like Captain Picard. In fantasy, a few letters, punctuation marks, and some grammar is all that is needed. This may lead to inaccuracies in human-to-human communication, but that does not mean the same problems must occur in communication with an intelligent compiler.

Making syntax as “human-readable” as possible should always be the highest priority. We could unlock so much potential this way.

masklinn · a year ago
> For me, Python is a good (though not ideal) mix of simple syntax and power. The language is characterized by low redundancy, meaning it uses fewer unnecessary characters like semicolons or curly braces to mark the end of a line. If a human can recognize the end of a line without special characters, then the compiler should be able to as well. […] Even a brief look at Rust’s syntax

That’s a shame because you’re missing that these sigils actually have meaning there, because the semantics of the language are completely different: Python is statements-based with very limited scoping (global and function), Rust is expression based with block scoping.

As a result, blocks (paired braces) are a way to pack multiple statements into an expression e.g.

    let v = {
        let a = thing1();
        let b = thing2();
        thing3(a, b)
    };
And `;` is not an alias for end-of-line, it’s a separator for statements.

And not aliasing end-of-line to end-of-statement is relevant to Rust being expression-oriented: it’s very common for expressions to span multiple lines, and in that case Python requires either wrapping the entire thing in parentheses or escaping the EOL with `\`.
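
For reference, a minimal sketch of the two Python options mentioned here (the variable names and values are made up for illustration):

```python
price, tax, shipping = 100, 20, 5

# Option 1: wrap the whole expression in parentheses.
total_parens = (
    price
    + tax
    + shipping
)

# Option 2: escape each end-of-line with a backslash.
total_backslash = price \
    + tax \
    + shipping

print(total_parens, total_backslash)
```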

Could you find other ways to do this? Sure, but then you have to make other tradeoffs e.g. wrap everything in matching symbols à la lisp, or make statements into special cases à la Haskell.

mavhc · a year ago
Wrapping 99% of things in {} seems worse than wrapping 1% of things in ()
xigoi · a year ago
> the semantics of the language are completely different: Python is statements-based with very limited scoping (global and function), Rust is expression based with block scoping.

This is a red herring. Haskell, CoffeeScript, Nim, Lean, etc. are expression-oriented and use indentation like Python, while C(++), Java, JavaScript, etc. are statement-oriented and use braces.

Ringz · a year ago
> And `;` is not an alias for end-of-line, it’s a separator for statements.

And in Python \n is not an alias for end-of-line, it’s a separator for statements.
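
One way to see this, sketched with the standard `tokenize` module (the sample source is invented): the tokenizer emits `NEWLINE` only where a statement ends, and a different token, `NL`, for a line break inside parentheses.

```python
import io
import tokenize

# Two statements; the second one spans two physical lines.
source = "a = 1\nb = (2 +\n     3)\n"

kinds = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type in (tokenize.NEWLINE, tokenize.NL)
]
print(kinds)
```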

wqtrez · a year ago
That looks very clean, though the deviation from OCaml takes a minute to get used to.
__jonas · a year ago
> Making syntax as “human-readable” as possible should always be the highest priority. We could unlock so much potential this way.

I think there is a balance to strike here.

I often like to work with code by cutting and pasting sections around and then hitting the format hotkey to align everything.

I enjoy the guarantee that as long as the syntax is correct, it doesn't matter how I type out the code because I'll just hit the formatter hotkey immediately afterwards and it will apply the correct indentation and lay it all out nicely

Obviously that's impossible in Python and it makes working with it really frustrating to me, it feels so delicate, I almost don't want to touch the code because I'm always accidentally changing the indentation, it's a real limit of this "human-readable" syntax in my opinion.

Ringz · a year ago
That’s definitely possible in Python. Maybe not if you’re using the simplest text editor. But regardless of whether it’s NVIM or Visual Studio Code, you can easily jump to the end of a line, press Return, and the code will be inserted with the correct indentation. RUFF or BLACK do the rest, and they do it quite intelligently.
xigoi · a year ago
> Obviously that's impossible in Python and it makes working with it really frustrating to me, it feels so delicate, I almost don't want to touch the code because I'm always accidentally changing the indentation

Skill issue. I just press my “paste with correct indentation” hotkey and move on.

madeofpalk · a year ago
> Unfortunately, we still live in an era where humans have to adapt to technology rather than the other way around. From my perspective, this is particularly true for programming.

I think it's interesting that you say this, and point it towards Python.

Personally, with Python's significant whitespace, I feel more constrained writing code in a style that I prefer, with the computer requiring me to adapt to it, compared to other languages. I see code with braces and semi-colons more freeing because I get more control over the line structure.

At the end of the day, it's all stylistic personal preference. Python isn't the evolutionary ideal form for programming languages, it's just what some people prefer.

wqtrez · a year ago
The Saint-Exupéry quote applies to Lisp, may in some manner apply to Python 2.7, but certainly not to Python 3.12.

Many people here argue that braces do provide some structure that helps in understanding and navigating the program. Python files over 100 lines with a lot of if-statements become syntactically unreadable.

So taking them away does not help.

Generally, minimalism (except for the Lisp-style one) is not always good. Python is called executable pseudo-code. Do academics use it to specify algorithms?

No, most still use some form of Pascal/Algol style syntax, which conveys the meaning much better.

Ringz · a year ago
> The Saint-Exupéry quote applies to Lisp, may in some manner apply to Python 2.7, but certainly not to Python 3.12.

Is the syntax not about 80% the same? And isn't most of that "same simple syntax" commonly used daily?

> Many people here argue that braces do provide some structure that helps in understanding and navigating the program. Python files over 100 lines with a lot of if-statements become syntactically unreadable.

That's probably true. But poor programming discipline and syntax that eases reading problematic code also contribute to this issue.

> So taking them away does not help.

For me it actually promotes better coding style and hygiene.

> Generally, minimalism (except for the Lisp-style one) is not always good.

The quote was about perfectionism, not minimalism.

kazinator · a year ago
There is already nothing left to take away before you've made anything, so just don't start the project and you have perfection.