lucumo · a year ago
Grepping for symbols like function names and class names feels so anemic compared to using a tool that has a syntactic understanding of the code. Just "go to definition" and "find usages" alone reduce the need for text search enormously.

For the past decade-plus I have mostly only searched for user facing strings. Those have the advantage of being longer, so are more easily searched.

Honestly, posts like this sound like the author needs to invest some time in learning about better tools for his language. A good IDE alone will save you so much time.

laserbeam · a year ago
Scenarios where an IDE with full syntactic understanding is better:

- It's your day to day project and you expect to be working in it for a long time.

Scenarios where grepping is more useful:

- Your language has #ifdef or equivalent syntax which does conditional compilation making syntactic tools incomplete.

- You just opened the project for the first time.

- It's in a language you don't daily drive (you write backend but have to delve into frontend code, it's a 3rd party library, it's configuration files, random json/xml files or data)

- You're editing or searching through documentation.

- You haven't even downloaded the project and are checking things out in github (or some similar site for your project).

- You're providing remote assistance to someone and you are not at your main development machine.

- You're remoting via SSH and have access to code there (say it's a python server).

Yes, an IDE will save you time daily driving. But there's no reason to sabotage all the other use cases.

emn13 · a year ago
Further important (to me) scenarios that also argue for greppability:

- greppability does not preclude IDE or language server tooling; there are often special cases where only certain, e.g. context-dependent, usages matter, and sometimes grep is the easiest way to find those.

- projects that include multiple languages, such as for instance the fairly common setup of HTML, JS, CSS, SQL, and some server-side language.

- performance in scenarios with huge amounts of code, or where you're searching very often (e.g. in each git commit for some amount of history)

- ease of use across repositories (e.g. a client app, a spec, and a server app in separate repos).

I treat greppability as an almost universal default. I'd much rather have code in a "weird" naming style in some language but have consistent identifiers across languages, than have normal-style-guide default identifiers in each language but differing identifiers across languages.

If code "looks weird", that's often actually a _benefit_ in such cases, not a downside - most serialization libraries I use for this kind of stuff tend to do a lot of automagic mapping that can break in ways that are sometimes hard to detect at compile time if somebody renames something, or sometimes even just for a casing change or type change. Having a hint as to this fragility immediately at a glance, even in dynamically typed languages, is sometimes a nice side-effect.

Very speculatively, I wouldn't be surprised if AI coding tools can deal with consistent names better than context-dependent ones too; greppability is likely not merely about the tool grep itself.

And the best part is that there's almost no downside; it's not like you need to pick either a language server, IDE or grep - just use whatever is most convenient for each task.

popinman322 · a year ago
Grep is also useful when IDE indexing isn't feasible for the entire project. At past employers I worked in monorepos where the sheer size of the index caused multiple seconds of delay in intellisense and UI stuttering; our devex team's preferred approach was to better integrate our IDE experience with the build system such that only symbols in scope of the module you were working on would be loaded. This was usually fine, and it works especially well for product teams, but it's a headache when you're doing cross-cutting work (e.g. for infrastructure projects/overhauls).

We also had a livegrep instance that we could use to grep any corporate repo, regardless of where it was hosted. That was extremely useful for investigating failures in build scripts that spanned multiple repositories (e.g. building a Go sidecar that relies on a service config in the Java monorepo).

lolinder · a year ago
> It's your day to day project and you expect to be working in it for a long time.

I don't think we need to restrict the benefits quite that much—if it's a project that isn't my day-to-day but is in a language I already have set up in my IDE, I'd much prefer to open it up in my IDE and use jump to definition and friends than to try to grep and hope that the developers made it greppable.

Going further, I'd equally rather have plugins ready to go for every language my company works in and use them for exploring a foreign codebase. The navigation tools all work more or less the same, so it's not like I need to invest effort learning a new tool in order to benefit from navigation.

> Yes, an IDE will save you time daily driving. But there's no reason to sabotage all the other usecases.

Certainly don't sabotage, but some of these suggestions are bad for other reasons that aren't about grep.

For example: breaking the naming conventions of your language in order to avoid remapping is questionable at best. Operating like that binds your business logic way too tightly to the database representation, and while "just return the db object" sounds like a good optimization in theory, I've never not regretted having frontend code that assumes it's operating directly on database objects.

cxr · a year ago
- You're fully aware that it would be better to be able to use tooling for $THING, but tooling doesn't exist yet or is immature.
joe-six-pack · a year ago
You forgot massive codebases. Language servers really struggle with anything on the order of the Linux kernel, FreeBSD, or Chromium.
jollyllama · a year ago
>It's your day to day project and you expect to be working in it for a long time.

Bold of everyone here to assume that everyone has a day-to-day project. If you're a consultant, or you're otherwise switching projects on a month-to-month basis, greppability is probably the top metric, second only to unit-test coverage.

beeboobaa3 · a year ago
> - Your language has #ifdef or equivalent syntax which does conditional compilation making syntactic tools incomplete.

You need a better IDE.

> - You just opened the project for the first time.

Go grab a coffee

> - It's in a language you don't daily drive

Jetbrains all products pack, baby.

> - You haven't even downloaded the project and are checking things out in github (or some similar site for your project).

On GitHub, press `.` to open it in a web-based vscode. Download it & open it in your IDE while you are doing this.

> - You're remoting via SSH and have access to code there (say it's a python server).

Don't do this. Check the git hash that was deployed and checkout the code locally.

jvanderbot · a year ago
> - Your language has #ifdef or equivalent syntax which does conditional compilation making syntactic tools incomplete.

LSP-based tools are generally fine with this, as long as compile_commands.json or an equivalent is available. A purely syntactic understanding is an incomplete solution; I suspect GP meant LSP.

Many of those other caveats are non-issues once LSPs are widespread. Even GitHub has LSP-like go-to-def/go-to-ref, though it's not perfect.

codedokode · a year ago
"Go to definition" often doesn't work in dynamic languages like Python without type hints; it might not work when the code is dynamically generated.
umanwizard · a year ago
> Your language has #ifdef or equivalent syntax which does conditional compilation making syntactic tools incomplete.

Your other points make sense, but in this case, at least for C/C++, you can generate a compile_commands.json that will let clangd interpret your code accurately.

If building with make just do `bear -- make` instead of `make`. If building with cmake pass `-DCMAKE_EXPORT_COMPILE_COMMANDS=1`.

gpderetta · a year ago
- you just switched branch/rebased and the index is not up to date.

- the project is large enough that the IDE can't cope.

- you want to also match comments, commented out code or in-project documentation

- you want fuzzy search and match similarly named functions

I use clangd integration in my IDE all the time, but often brute force is the right solution.

gregjor · a year ago
I abandoned VSCode and went back to vim + ctags + ripgrep after a year with the most popular IDE. I miss some features but it didn’t give me a 10x or even 1.5x improvement in my own work along any dimension.

I attribute that mostly to my several decades of experience with vi(m) and command line tools, not to anything inherently bad about VSCode.

What counts as “better” tools has a lot of subjectivity and circumstances implied. No one set of tools works for everyone. I very often have to work over ssh on servers that don’t allow installing anything, much less Node and npm for VSCode, so I invest my time in the tools that always work everywhere, for the work I do.

The main project I’ve worked on for the last few years has a little less than 500,000 lines of code. VSCode’s LSP fairly often takes a few seconds to update its indexes. Running ctags over the same code takes about a second, and I can control when that happens. vim has no delays at all, and ripgrep can search all of the files in a second or two.

kelnos · a year ago
I have similar feelings... I still use IntelliJ IDEA for JVM languages, but for C, Rust, Go, Python, etc., I've been using vim for years (decades?), and that's just how I prefer to write code in those languages. I do have LSP plugins installed in vim for the languages I work in, and do have a key sequence mapped for jump-to-definition... but I still find myself (rip)grepping through the source at least as often as I j-t-d, maybe more often.
wrasee · a year ago
Did you consider Neovim? You get the benefit of vim while also being able to mix in as much LSP tooling as you like. The tradeoff is that it takes some time to set up, although that is getting easier.

That won’t make LSP go any faster though. There’s still something interesting in the fact that ripgrepping every line in the codebase can be faster than a dedicated tool.

joe-six-pack · a year ago
VSCode is not an IDE, it's an extensible text editor. IDEs are integrated (it's in the name) and get developed as a whole. I'm 99% certain that if you were forced to spend a couple of months in a real IDE (like IDEA or Rider), you would not want to go back to vim, or any other text editor. Speaking as a long time user of both.
heisenbit · a year ago
A good IDE can be so much better iff it understands the code. However, this requires the IDE to understand the project structure, dependencies, etc., which can take considerable effort. In a codebase with many projects employing several different languages, it becomes hard to reach, and maintain, the state where the IDE understands everything.
amichal · a year ago
And an IDE would also fail to find references for most of the cases described in the article: name composition/manipulation, naming consistency across language barriers, and flat namespaces in serialization. And file/path folder naming seems to be irrelevant to the smart IDE argument. "Naming things is hard"
carlmr · a year ago
And especially in large monorepos anything that understands the code can become quite sluggish. While ripgrep remains fast.

A kind of in-between I've found for some search-and-replace actions is comby (https://comby.dev/). Having a matching-braces feature is a godsend for doing certain kinds of replacements properly.

brain5ide · a year ago
I think the first sentence of the author counters your comment. What you described works best in a familiar codebase where the organizing principles have been maintained well and are familiar to the reader and the tools are just the extension of those organizing principles. Even then a deviation from those rules might produce gaps in understanding of what the codebase does.

And grep cuts right through that in a pretty universal way. What the post describes are just ways to not work against grep to optimize for something ephemeral.

ricardo81 · a year ago
Agree. Not just because it's unfamiliar code, you can also get a feel for how the program/programmer(s) structured the whole thing.
zarzavat · a year ago
Go to definition and find usages only work one symbol at a time. I use both, but I still use global find/replace for groups of symbols sharing the same concept.

For example if I want to rename all “Dog” (DogModel, DogView, DogController) symbols to “Wolf”, find/replace is much better at that because it will tell me about symbols I had forgotten about.
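As a rough sketch of that workflow with plain command-line tools (the file and class names are made up, and this assumes GNU sed for the in-place flag):

```shell
# Rename a whole family of symbols (Dog* -> Wolf*) with find/replace.
rm -rf /tmp/renamedemo && mkdir -p /tmp/renamedemo && cd /tmp/renamedemo
printf 'class DogModel {}\nclass DogView {}\nclass DogController {}\n' > dog.ts

# Step 1: review every candidate first -- this is where a global
# search surfaces symbols you had forgotten about:
grep -rn 'Dog[A-Za-z]*' .

# Step 2: apply the rename across all matching files (GNU sed):
grep -rl 'Dog' . | xargs sed -i 's/Dog/Wolf/g'
grep -rn 'Wolf[A-Za-z]*' .
```

The review step is the point: a symbol-at-a-time rename never shows you the whole family at once.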

sandermvanvliet · a year ago
Jetbrains ReSharper (and Rider) is smart enough to handle these things. It’ll suggest renames across other symbols, even ones that merely have related names.
f1shy · a year ago
For that use case I think you can use tree-sitter[1]: you can find Dog.* but only where it is a variable name, for example, avoiding replacements inside of, say, string literals.

[1] https://www.youtube.com/watch?v=MZPR_SC9LzE

turboponyy · a year ago
There's no reason they have to work one symbol at a time - that's just a missing feature in your language server implementation.

Some language servers support modifying the symbols in contexts like docstrings as well.

gugagore · a year ago
I am familiar with the situation you describe, and it's a good point.

However, it does suggest that there is an opportunity for factoring "Dog" out in the code, at least by namespacing (e.g. Dog.Model).

citrin_ru · a year ago
Not everything you need to look for is a language identifier. I often grep for configuration option names in the code to see what an option actually does - sometimes it is easy, sometimes there are too many matches, and sometimes the name cannot be found at all because it is composed in the code from separate parts, each too generic to search for on its own. It's not hard to make config options greppable, but some coders just don't care about this property.
sauercrowd · a year ago
strongly disagree here. This works if:

- your IDE/language server is performant

- all the tools are fully set up

- you know how to query the specific semantic entity you're looking for (remembering shortcuts)

- you are only interested in a single specific semantic entity (mixing entities is rarely supported)

I don't map out projects in terms of semantics, I map out projects in files and code. That makes querying intuitive, and I can easily compose queries that match the specificity of what I care about (e.g. I might want to find a `Server` but show classes, interfaces and abstract classes alike).

For the specific toolchain I'm using - typescript - the symbol search is also unusable once it hits a certain project size, it's just way too slow for it to be part of my core workflow

underdeserver · a year ago
Unfortunately in larger codebases or dynamic languages these tools are just not good enough today. At least not those I and my employers have tried.

They're either incomplete (you don't get ALL references or you get false references) or way too slow (>10 seconds when rg takes 1-2).

Recommendations are most welcome.

jimmaswell · a year ago
Only thing I can recommend is using C# (obviously not always possible). Never had an issue with these functions in Visual Studio proper no matter how big the project.
leni536 · a year ago
I can't use an IDE on my entire git history, but git can grep.
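For example (a throwaway repo; the file and symbol are hypothetical):

```shell
# Plain git can search both the working tree and history,
# with no IDE index required.
rm -rf /tmp/gitdemo && mkdir -p /tmp/gitdemo && cd /tmp/gitdemo
git init -q
printf 'function getAddressById() {}\n' > app.js
git add app.js
git -c user.email=demo@example.com -c user.name=demo commit -qm 'add lookup'

# Search the checked-out tree:
git grep -n 'getAddressById'

# Find every commit that added or removed the string (pickaxe):
git log -S 'getAddressById' --oneline

# Search an arbitrary past revision without checking it out:
git grep 'getAddressById' HEAD -- '*.js'
```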
aa-jv · a year ago
On the flipside, IDE's can turn you into lazy, inefficient programmers by doing all the hand-holding for you.

If your feelings are anemic when tasked with doing a grep, it's because you have lost a very valuable skill by delegating it to a computer. There are some things the IDE is never going to be able to find - lest it become the development environment - so keeping your grep fu sharpened is wise beyond the decades.

(Disclaimer: 40 years of software development, and vim+cscope+grep/silversearcher are all I really need, next to my compiler..)

throwaway2037 · a year ago

    > lazy... programmers
Since when was that a bad thing? Since time immemorial, it has been hailed as a universal good for programmers to be lazy. I'm pretty sure Larry Wall has lots of jokes about this on Usenet.

Also, I can clearly remember switching from vim/emacs to Microsoft Visual Studio (please, don't throw your tomatoes just yet!). I was blown away by IntelliSense. Suddenly, I was focusing more on writing business logic, and less time searching for APIs.

winwang · a year ago
I count the IDE and stuff like LSP as natural extensions of the compiler. For sure I grep (or equivalent) for stuff, but I highly prefer statically typed languages/ecosystems.

At the end of the day, I'm here to solve problems, and there's no end to them -- might as well get a head start.

lucumo · a year ago
> If your feelings are anemic

I'm not feeling anemic. The tool is anemic, as in, underpowered. It returns crap you don't want, and doesn't return stuff you do want.

My grep-fu is fine. It's a perfectly good tool if you have nothing better. But usually you do have something better.

Using the wrong tool to make yourself feel cool is stupid. Using the wrong tool because a good tool could make you lazy shows a lack of respect for the end result.

high_na_euv · a year ago
Leveraging technology is good thing
HdS84 · a year ago
Huh? I have an old hand-powered drill from my Grandpa in my workshop. I used it once, for fun. For all other tasks I use a powered drill. Same for IDEs: they help you refactor and reason about code - both properties I value. Sure, I could print the code and use a highlighter, but I'm not Grandpa.
phyrex · a year ago
This breaks down at scale and across languages. All the FAANGs make heavy use of the equivalent of grepping in their code base
db48x · a year ago
True, but IDEs are fragile tools. Sometimes you want to fall back to simpler tools that will always work, and grep is not fragile.
cxr · a year ago
The basis of this article (and its forebear "Too DRY - The Grep Test"[1]) is that grep is fragile. It's just fragile in a way that's different from the way that IDEs are fragile.

1. <http://jamie-wong.com/2013/07/12/grep-test/>

kelnos · a year ago
Even with IDEs, I find that I grep through source trees fairly often.

Sometimes it's because I don't completely trust the IDE to find everything I'm interested in (justifiably; sometimes it doesn't). Sometimes it's because I'm not looking to dive into the code and do serious work on it; I'm just doing a quick drive-by check/lookup for something. Sometimes it's because I'm ssh'd into another machine and I don't have the ability to easily open the sources in an IDE.

a_e_k · a year ago
I've come to really like language servers for big personal and work projects where I already have my tools configured and tuned for efficiently working with it.

But being able to grep is really nice when trying to figure out something out about a source tree that I don't yet have set up to compile, nor am I a developer of. I.e., I've downloaded the source for a tool I've been using pre-built binaries of and am now trying to trace why I might be getting a particular error.

kragen · a year ago
posts like this sound like the author routinely solves harder problems than you do, because the solutions you suggest don't work in the cases the post is about. we've had 'go to definition' since 01978 and 'find usages' since 01980, and you should definitely use them for the cases where they work
mjr00 · a year ago
From the article,

- dynamically built identifiers: 100% correct, never do this. It breaks both text search and symbol search, and results in complete garbage code. I had to deal with bugs in early versions of docker-compose because of this.

- same name for things across the stack? Shouldn't matter, just use find usages on `getAddressById`. Also an easy way to bait yourself, because database fields aren't 1:1 with front-end fields in anything but the simplest of CRUD webshit.

- translation example: the fundamental problem is using strings as keys when they should be symbols. Flat vs nested is irrelevant here because you should be using neither.

- react component example: As I mentioned in another comment, trivially managed with Find Usages.

Nothing in here strikes me as "routinely solves harder problems," it's just standard web dev.

brooke2k · a year ago
with all due respect, it sounds like you have the privilege of working in some relatively tidy codebases (and I'm jealous!)

with a legacy codebase, or a fork of a dependency that had to be patched which uses an incompatible buildsystem, or any C/C++/obj-c/etc that heavily uses the preprocessor or nonstandard build practices, or codebases that mix lots of different languages over awkward FFI boundaries and so on and so forth -- there are so many situations where sometimes an IDE just can't get you 100% of the way there and you have to revert to grepping to do any real work

that being said, I don't fully support the idea of handcuffing your code in the name of greppability, but I think dismissing it as a metric under the premise that IDEs make grepping "obsolete" is a little bit hasty

lucumo · a year ago
> with all due respect, it sounds like you have the privilege of working in some relatively tidy codebases (and I'm jealous!)

I wish, but no. I've found people will make a mess of everything. Which is why I don't trust solutions that rely on humans having more discipline, like what this article advocates.

In any situation where grep is your last saviour, you cannot rely on the greppability of the code. You'll have to check and double check everything, and still accept the risk of errors.

groby_b · a year ago
Working on a 32MLOC project, text search is still the quickest way to find a hook that gets you to the deeper investigation. From there, finding definitions/usage definitely matters.

You can maybe skip the greppability if the code base is of a size that you can hold the rough shape and names in your head, but a "get a list of things that sound like they might be related to my problem" operation is still extremely helpful. And it's also worth keeping in mind that greppability matters to onboarding.

Does that mean it should be an overriding design concern? No. But it does mean that if it's cheap to build greppable, you probably should, because it's a net positive.

jmmv · a year ago
Sure, if you have the luxury of having a functional IDE for all of your code.

You can't imagine how much faster I was than everybody else at answering questions about a large codebase just because I knew how to use ripgrep (on Windows). "Knowing how to grep" is a superpower.

wglb · a year ago
A bit on the other side of the argument, I use grep plus find plus some shell work to do source code analysis for security reviews. grep doesn't really understand the syntax of languages, and that is mostly OK.

I've used this technique on auditing many code bases including the C family, perl, Visual Basic, C# and SQL.

With this sort of tool, I don't need to look for language-particular parsers--so long as the source is in a text file, this works well.

PhilipRoman · a year ago
IDEs are cool and all, but there is no way I'm gonna let VSCode index my 80GB yocto tmp directory. Ctags can crunch the whole thing in a few minutes, and so can grep.

Plus there are cases where grep is really what you need, for example after updating a particular command line tool whose output changed, I was able to find all scripts which grepped the output of the tool in a way that was broken.

EasyMark · a year ago
It seems like the law of diminishing returns; while I'm sure in a few cases this characteristic of a code writing style is extremely useful, it cuts into other things such as readability and conciseness. Fewer lines can mean fewer bugs, within reason, if you aren't in lisp and are using more than 3 parentheses, you might want to split it up because the compiler/JIT/interpreter is going to anyway.
k__ · a year ago
Honestly, in my 18 years of software development, I haven't "grepped" code once.

I only use grep to filter the output of CLI tools.

For code, I use my IDE or repository features.

yCombLinks · a year ago
Do you use the find feature in your IDE? I.e. not find-by-reference, just text matching? That's the same as greppability.
umvi · a year ago
Interface-heavy languages break IDEs. In .NET at least, "go to definition" jumps you to the interface definition which you probably aren't interested in (vs. the specific implementation you are trying to dig into). Also with .NET specifically XAML breaks IDE traceability as well.
hyperpape · a year ago
I can run rg over my project faster than I can do anything in my IDE. Both tools have their places.
IshKebab · a year ago
Definitely true when you can use static typing.

Unfortunately sometimes you can't, and sometimes you can but people can't be arsed, so this is still a consideration.

mihaaly · a year ago
"A good IDE"

I am also waiting for world peace! ; )

ilrwbwrkhv · a year ago
I tried a good IDE recently: Jetbrains IntelliJ and Webstorm, considered the top dog of IDEs. I was working on a TypeScript project which uses npm link to symlink another local project into the node_modules of the current project.

The great IDEs IntelliJ and Webstorm stopped autosuggesting completions from the symlinked project.

Open up Sublime Text again. Worked perfectly. That is why Jetbrains and their behemoth IDEs are utter shite.

Write your code to have symmetry and make it easy to grep.

71bw · a year ago
>I tried a good IDE recently: Jetbrains IntelliJ

Having dealt with IntelliJ for 3 years due to education stuff - I laughed out loud here. Even VS is better than IDEA.

jakub_g · a year ago
Your observation does not help with the majority of the points in the article. How do you find all usages of a parameter value literal?
CrimsonRain · a year ago
By not using literals everywhere. All literals are defined somewhere (start of function, class etc) as enums or vars and used.

Just because I have 20 usages of 'shipping_address' doesn't mean I'll have this string 20 times in different places.

Grep has its place and I often need to grep code base which have been written without much thoughts towards DX. But writing it nicely allows LSP to take over.
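A tiny sketch of what that looks like (hypothetical Python file): the literal appears once, and the constant name carries every usage.

```shell
rm -rf /tmp/litdemo && mkdir -p /tmp/litdemo && cd /tmp/litdemo

cat > orders.py <<'EOF'
SHIPPING_ADDRESS = "shipping_address"  # single greppable definition

def read(row):
    return row[SHIPPING_ADDRESS]

def write(row, value):
    row[SHIPPING_ADDRESS] = value
EOF

# One hit for the string literal...
grep -c '"shipping_address"' orders.py

# ...and the constant name finds every usage:
grep -c 'SHIPPING_ADDRESS' orders.py
```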

troupo · a year ago
This is what the article starts with: "Even in projects exclusively written by myself, I have to search a lot: function names, error messages, class names, that kind of thing."

All of that is trivial to search for with a tool that understands the language.

mjr00 · a year ago
> Honestly, posts like this sound like the author needs to invest some time in learning about better tools for his language. A good IDE alone will save you so much time.

Completely agreed. The React component example in the article is trivially solvable with any modern IDE; right click on the class name, "Find Usages" (or use the appropriate hotkey, of course). Trying to grep for a class name when you could just do that is insane.

I mainly see this from juniors who don't know any better, but as seen in this thread and the article, there are also experienced engineers who are stubborn and refuse to use tools made after 1990 for some reason.

gpderetta · a year ago
I worked on codebases large enough where enabling autocomplete/indexing would lock the IDE and cause the workstation to swap hard.
gregjor · a year ago
> experienced engineers who are stubborn and refuse to use tools made after 1990 for some reason.

Before calling people stubborn or assuming they got left behind out of ignorance, consider your assumptions. 40+ years experience, senior in both experience and age at this point. Long-term vim + command line tools user.

Do you have any evidence that shows "A good IDE alone will save you so much time?" Have you seen studies comparing productivity or code quality or any metric written by people using IDEs vs those using a plain editor with grep?

By "so much faster" what do you mean exactly? I have decades of experience with vim + ctags + grep (rg these days, because I don't want to get called a stubborn stick in the mud). I can find and change things in large codebases pretty fast. I used VSCode for a year on the same codebases and I didn't feel "so much faster," and I committed to it and watched numerous how-to videos and learned the tool well enough to train other programmers on it. No 10x improvement, not even 1.5x. For most tasks I would call it close to the same in terms of time taken to write code. After getting burned a couple times with "Replace symbol" in VSCode I stopped trusting it. After noticing the LSP failed to find some references I trusted it less. I know grep/ack/rg/ctags aren't perfect, but I also know their weaknesses and how to work with them to get them to do what I want. After a year I went back to vim + ctags + rg.

We might have more productive (and friendly) interactions as programmers if we remembered that not everyone works the same way, or on the same kind of code and projects. What we call "best practices" or "modern tools" largely come down to familiarity, received wisdom, opinion, and fashion -- almost never from rigorous metrics and testing. You like your IDE? Great! I like my tools too. Would either of us get "so much faster" using a different set of tools? Probably not. Trying to find the silver bullet that reduces accidental complexity in software development presents an ongoing challenge, but history shows that editors and IDEs don't do much because if they did programmers today would outperform old guys like me by 10x in a measurable way.

At the last full-time job I had, at an educational software company with 30+ programmers, everyone used Eclipse. My first day I got a new desktop with two big monitors, Eclipse installed, ready to go. I installed vim and the CLI subversion client and some other stuff and worked from the command line, as I usually do. I left one of the monitors off, I don't need that much screen space, and I don't have Twitter and Facebook and other junk running on a second monitor all day like most of the other people did. I got made fun of, old man using old tools. Then once a week, like clockwork, Eclipse would auto-install some updates and everyone came to a halt trying to resolve plugin version conflicts, getting the team in sync. Hours and hours wasted regularly just getting the IDE to work. That didn't affect me, I never opened Eclipse. Watching the other programmers it seemed really slow. So just maybe Eclipse could jump to a definition faster than vim + ctags (I doubt it), but amortized over a month Eclipse all by itself wasted more time than anyone possibly saved with the more powerful tool. Anecdote, I know, but I've seen this play out in similar ways at more than one shop.

Just last year a new hire at a place I freelance for spent days trying to get Jetbrains PHPStorm working on a shared remote dev server. Like VSCode it runs a heavy process on the server (including the LSP). Unlike VSCode, PHPStorm can actually kill the whole server, wasting everyone's time and maybe losing work. I have never seen vim or grep bring a whole server down. I could add up how much "faster" PHPStorm might turn out compared to vim, but it will have to recoup the days lost trying to get it to work at all first.

skrebbel · a year ago
The second point here made me realize that it'd be super useful for a grep tool to have a "super case insensitive" mode which expands a search for, say, "FooBar|first_name" to something like /foo[-_]?bar|first[-_]?name/i, so that any camel/snake/pascal/kebab/etc case will match. In fact, I struggle to come up with situations where that wouldn't be a great default.
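A minimal sketch of the idea with plain grep -E (the file contents are made up):

```shell
# One pattern covering camelCase, snake_case, PascalCase and
# kebab-case spellings of the same identifier.
rm -rf /tmp/casedemo && mkdir -p /tmp/casedemo && cd /tmp/casedemo
printf 'fooBar\nfoo_bar\nFooBar\nfoo-bar\nfoobaz\n' > names.txt

# "FooBar" expands to /foo[-_]?bar/i -- four of the five lines match:
grep -Eic 'foo[-_]?bar' names.txt
```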
msmolkin · a year ago
Hey, I just created a new tool called Super Grep that does exactly what you described.

I implemented a format-agnostic search that can match patterns across various naming conventions like camelCase, snake_case, PascalCase, kebab-case. If needed, I'll integrate in space-separated words.

I've just published the tool to PyPI, so you can easily install it using pip (`pip install super-grep`), and then you just run it from the command line with `super-grep`. You can let me know if you think there's a smarter name for it.

Source: https://www.github.com/msmolkin/super-grep

dang · a year ago
You should post this as a Show HN! But maybe wait a while (like a couple weeks or something) for the current thread to get flushed out of the hivemind cache.

If you do, email a link to hn@ycombinator.com and we'll put it in the second-chance pool (https://news.ycombinator.com/pool, explained at https://news.ycombinator.com/item?id=26998308), so it will get a random placement on HN's front page.

rldjbpin · a year ago
pretty cool and to me a better approach than the prescriptive advice from the OP. to me the crux of the argument is to make the code more readable from a popular tool. but if this can be well-integrated into common ide (or even grep perhaps), it would take away most of the argument down to personal preference.
skrebbel · a year ago
wow this is so cool!! it feels super amazing to dump a random idea on HN and then somebody makes it! i'm installing python as we speak just so i can use this.
crazygringo · a year ago
Adding to that, I'm often bitten trying to search for user strings because they're split across lines to adhere to 80 characters.

So if I'm trying to locate the error message "because the disk is full" but it's in the code as:

  ... + " because the " + 
    "disk is full")
then it will fail.

So really, combining both our use cases, what would be great is to simply search for a given case-insensitive alphanumeric string in files that skips all non-alphanumeric characters.

So if I search for:

  Foobar2
it would match all of:

  FooBar2
  foo_bar[2]
  "Foo " + \
    ("bar 2")
  foo.bar.2
And then in the search results, even if you get some accidental hits, you can be happy knowing that you didn't miss anything.

lathiat · a year ago
These are both of the problems I regularly have. The first one I immediately saw when reading the title of this submissionw as the "super case insensitive" that I often see when working on Go Codebases particularly when using a combination of Go Classes and YAML or JSON. Also happens with command line arguments being converted to variables.

But the string split thing you mentioned happens a lot when searching for OpenStack error messages in Python that is often split across lines like you showed. My current solution is to randomly shift what I'm searching for, or try pick the most unique line.

Groxx · a year ago
fwiw I pretty frequently use `first.?name` - the odds of it matching something like "FirstSname" are low enough that it's not an issue, and it finds all cases and all common separators in one shot.

(`first\S?name` is usually better, by ignoring whitespace -> better ignores comments describing a thing, but `.` is easier to remember and type so I usually just do that)

hnben · a year ago
> "super case insensitive"

lets say someone would make a plugin for their favorite IDE for this kind of search. How would the details look like?

To keep it simple, lets assume we just do the super-case-insensitivity, without the other regex condition. Lets say the user searches for "first_name" and wants to find "FirstName".

one simple solution would be to have a convention where a word starts or ends, e.g. with " ". So the user would enter "first name" into the plugin's search field. The plugin turns it into "/first[-_]?name/i" and gives this regexp to the normal search of the IDE.

another simple solution would be to ignore all word boundaries. So when the user enters "first name", the regexp would become "/f[-_]?i[-_]?r[-_]?s[-_]?t[-_]?n[-_]?a[-_]?m[-_]?e[-_]?/i". Then the search would not only be super-case-insensitive, but super-duper-case-insensitive. I guess the biggest downside would be, that this could get very slow.

I think implementing a plugin like this would be trivial for most IDEs, that support plugins.

Am I missing something?

skrebbel · a year ago
Hm I'd go even simpler than that. Notably, I'd not do this:

> So the user would enter "first name" into the plugin's search field.

Why wouldn't the user just enter "first_name" or "firstName" or something like that? I'm thinking about situations like, you're looking at backend code that's snake_cased, but you also want it to catch frontend code that's camelCased. So when you search for "first_name" you automagically also match "firstName" (and "FirstName" and "first-name" and so on). I wouldn't personally introduce some convention that adds spaces into the mix, I'd simply convert anything that looks snake/kebab/pascal/camel-cased into a regex that matches all 4 forms.

Could even be as stupid as converting "first_name" or "firstName", or "FirstName" etc into "first_name|firstname|first-name", no character classes needed. That catches pretty much every naming convention right? (assuming it's searched for with case insensitivity)

__MatrixMan__ · a year ago
Shame on me for jumping past the simple solutions, but...

If you're going that far, and you're in a context which probably has a parser for the underlying language ready at hand, you might as well just convert all tokens to a common format and do the same with the queries. So searches for foo-bar find strings like FooBar because they both normalize to foo_bar.

Then you can index by more than just line number. For instance you might find "foo" and "bar" even when "foo = 6" shows up in a file called "bar.py" or when they show up on separate lines but still in the same function.

inanutshellus · a year ago
IIUC, you're not missing anything though your interpretation is off from mine*. He wasn't saying it'd be hard, he was saying it should be done.

* my understanding was simply that the regex would (A) recognize `[a-z][A-Z]` and inject optional _'s and -'s between... and (B) notice mid-word hyphens or underscores and switch them to search for both.

marcosdumay · a year ago
The best way would be to make an escape code that matches zero or one punctuation.

So you's search for "/first\_name/i".

kiitos · a year ago
It would be a mistake to try to solve this problem with regexes.
WizardClickBoy · a year ago
This reminds me of the substitution mode of Tim Pope's amazing vim plugin [abolish](https://github.com/tpope/vim-abolish?tab=readme-ov-file#subs...)

Basically in vim to substitute text you'd usually do something with :substitute (or :s), like:

:%s/textToSubstitute/replacementText/g

...and have to add a pattern for each differently-cased version of the text.

With the :Subvert command (or :S) you can do all three at once, while maintaining the casing for each replacement. So this:

textToSubstitute

TextToSubstitute

texttosubstitute

:%S/textToSubstitute/replacementText/g

...results in:

replacementText

ReplacementText

replacementtext

User23 · a year ago
The Emacs replace command[1] defaults to preserving UPCASE, Capitalized, and lowercase too.

[1] https://www.gnu.org/software/emacs/manual/html_node/emacs/Re...

WizardClickBoy · a year ago
Also just realised while looking at the docs it works for search as well as replacement, with:

:S/textToFind

matching all of textToFind TextToFind texttofind TEXTTOFIND

But not TeXttOfFiND.

Golly!

boxed · a year ago
I think Nim has this?
archargelod · a year ago
Nim comes bundled with a `nimgrep` tool [0], that is essentially grep on steroids. It has `-y` flag for style insensitive matching, so "fooBar", "foo_bar" and even "Foo__Ba_R" can be matched with a simple "foobar" pattern.

The other killer feature of nimgrep is that instead of regex, you can use PEG grammar [1]

  [0] - https://nim-lang.github.io/Nim/nimgrep.html
  [1] - https://nim-lang.org/docs/pegs.html

adammarples · a year ago
Fzf?
setopt · a year ago
Fuzzy search is not the same. For instance, it might by default match not only “FooBar” and “foo_bar” but also e.g. “FooQux(BarQuux)”, which in a large code base might mean hundreds of false positives.
dominicrose · a year ago
Let's say you have a FilterModal component and you're using it like this: x-filter-modal

Improving the IDE to find one or the other by searching for one or the other is missing the point or the article, that consistency is important.

I'd rather have a simple IDE and a good codebase than the opposite. In the example that I gave the worst thing is that it's the framework which forces you do use these two names for the same thing.

skrebbel · a year ago
My point is that if grep tools were more powerful we wouldn't need this very particular kind of consistency, which gives us the very big benefit of being allowed to keep every part of the codebase in its idiomatic naming convention.

I didn't miss the point, I disagreed with the point because I think it's a tool problem, not a code problem. I agree with most other points in the article.

VoxPelli · a year ago
I advocate for greppability as well – and in Swedish it becomes extra fun – as the equivalent phrase in Swedish becomes "grep-bar" or "grep-barhet" and those are actual words in Swedish – "greppbar" roughly means "understandable", "greppbarhet" roughly means "the possibility to understand"
sshine · a year ago
How many other UNIX commands did the Swedes adopt into their language?

I know that they invented "curl". Do you tar xfz?

scbrg · a year ago
We do tar, for xfz I think you have to look to the Slavic languages :)

Anyway, to answer your question:

  $ grep -Fxf <(ls -1 /bin) /usr/share/dict/swedish 
  ack
  ar
  as
  black
  dialog
  dig
  du
  ebb
  ed
  editor
  finger
  flock
  gem
  glade
  grep
  id
  import
  last
  less
  make
  man
  montage
  pager
  pass
  pc
  plog
  red
  reset
  rev
  sed
  sort
  sorter
  split
  stat
  tar
  test
  transform
  vi
:)

[edit]: Ironically, grep in that list is not the same word as the one OP is talking about. That one is actually based on grepp, with the double p. grep means pitchfork.

tripzilch · a year ago
I learned from bash.org that "tar -xzvf" is in German accent for "xtract ze vucking files".
lukan · a year ago
As far as I understood, it was part of the language before.

The german equivalent of the word would be probably "greifbar". Being able to hold something, usually used metaphorically.

elygre · a year ago
Could I suggest that greppbarhet is more precisely translated as “the ability of being understood”?

(Norwegian here. Our languages are similar, but we miss this one.)

medstrom · a year ago
Norwegian still translates grep as "grip"/"grab". I always thought of grepping as reaching in with a hand into the text and grabbing lines. That association is close at hand (insert lame chuckle) for German and English speakers too.
psychoslave · a year ago
So, at the extrem opposite of the esoteric "general regular expression print" that grep stands for with few ever knowing it?
vanschelven · a year ago
Begreppelijk (begrijpelijk) in Dutch

Deleted Comment

Cthulhu_ · a year ago
or "Grijpbaar" (grabbable)
octocop · a year ago
And we also have "begrepp", which is also a spin on content and understanding it's content.
majewsky · a year ago
Oh, that's like German "begreifen", no? (Which means "to grok".)
TeMPOraL · a year ago
Which is ironic, given that the article is about making it easier to use grep in order to avoid having to understand anything.
bob88jg · a year ago
Nah, you've got it backwards. The article isn't about dodging understanding - it's about making it way easier to spot patterns in your code. And that's exactly how you start to really get what's going on under the hood. Better searching = faster learning. It's like having a good map when you're exploring a new city
layer8 · a year ago
Graspability. ;)

More customarily: intelligibility.

mettamage · a year ago
greppbarhet

Grijpbaarheid

I never saw grep as grijp

I guess I do now

(Dutch btw)

adpirz · a year ago
I've seen some pretty wild conditional string interpolation where there were like 3-4 separate phrases that each had a number of different options, something akin to `${a ? 'You' : 'we'} {b ? 'did' : 'will do' } {c ? 'thing' : 'things' }`.

When I was first onboarding to this project, I was tasked with updating a component and simply tried to find three of the words I saw in the UI, and this was before we implemented a straightforward path-based routing system. It took me far too long just to find what I was going to be working on, and that's the day I distinctly remember learning this lesson. I was pretty junior, but I'd later return to this code and threw it all away for a number of easily greppable strings.

ctxc · a year ago
Tangential: I love it when UIs say "1 object" and "2 objects". Shows attention to detail.

As opposed to "1 objects" or "1 object(s)". A UI filled with "(s)", ughh

gnuvince · a year ago
I like the more robotic "Objects: 1" or "Objects: 2", since it avoids the pluralization problems entirely (e.g., in French 0 is singular, but in English it's plural; some words have special when pluralized, such as child -> children or attorney general -> attorneys general). And related to this article, it's more greppable/awkable, e.g. `awk /^Objects:/ && $2 > 10`.
ajuc · a year ago
Fun fact - I had to localize this kind of logic to my language (Polish). I realized quickly it's fucked up.

This is roughly the logic:

    function strFromNumOfObjects(n) {
      if (n === 1) {
          return "obiekt";
      }
      let last_digit = (n%10);
      let penultimate_digit = Math.trunc((n%100)/10);
      if ((penultimate_digit == 0 || penultimate_digit >= 2) && last_digit > 1 && last_digit <= 4) {
          return "obiekty";
      }
      return "obiektów";
    }
Basically pluralizing words in Polish is a fizz-buzz problem :) In other Slavic languages it should be similar BTW

petepete · a year ago
Moreso when it's not tripped up by "1 sheeps" or "1 diagnoses".
nox101 · a year ago
Sounds like you're going to have a bad time

https://www.foo.be/docs/tpj/issues/vol4_1/tpj0401-0013.html

JoshTriplett · a year ago
This is the reason many coding styles and tools (including the Linux kernel coding style and the default Rust style as implemented in rustfmt) do not break string constants across lines even if they're longer than the desired line length: you might see the string in the program's output, and want to search for the same string in the code to find where it gets shown.
knodi123 · a year ago
My team drives me bonkers with this. They hear the general principle "really long lines of code are bad", but extrapolate it to "no characters shall pass the soft gutter no matter what".

Even if you have, say, 5 sequential related structs, that are all virtually identical, all written on one line so that the similarities and differences are obvious at a mere glance... Then someone comes through and touches my file, and while they're at it, "fix" the line that went 2 characters past the 80 mark by reformatting the 4th struct to span several lines. Now when you see that list of structs, you wonder "why is this one different?" and you have to read carefully to determine, nope, it just contained one longer string. Or god forbid the reformat all the structs to match, turning a 1-page file into 3 pages, and making it so you have to read and understand each element of each struct just to see what's going on.

If I could have written the rule of thumb, I would have said "No logic or control shall happen after the end of the gutter." But if there's a paragraph-long string on one line- who cares?? We all have a single keystroke that can toggle soft-wrap, and the odds that you're going to need to know anything about that string other than "it's a long string" are virtually nil.

Sorry. I got triggered. :-)

BigJono · a year ago
Yep this triggers the fuck out of me too. It drives me absolutely insane when I'm taking the time and effort to write good test cases that use inline per test data that I've taken the time to format so it's nice and readable for the next person, then the next person comes along, spends 30 seconds writing some 2 line rubbish to hit a code coverage metric, then spends another 60 seconds adding a linter rule that blows all the test data out to 400 lines of unreadable dogshit that uses only the left 15% of screen real estate.
yas_hmaheshwari · a year ago
My team also had a similar thing in place. I am saving this article in my pocket saves, so that I can give "proofs" of why this is better

From Zen of Python: ``` Special cases aren't special enough to break the rules. Although practicality beats purity. ``` https://peps.python.org/pep-0020/

arp242 · a year ago
This is why autoformatters that frob with line endings are just terrible and fundamentally broken.

I'm fairly firmly in the "wrap at 80" camp by the way; but sometimes a tad longer just makes sense. Or shorter for that matter: forced removal of line breaks is just as bad.

edflsafoiewq · a year ago
This is world autoformatters have wrought. The central dogma of the autoformatter is that "formatting" is based on dumb syntactic rules with no inflow of imprecise human judgements.
EasyMark · a year ago
I have been places where we allow long strings, but other things aren’t allowed and generally 80 to 100 char limits otherwise. I like 100 for c++/java and 80 for C. If it gets much longer than that (not being strings) then it’s time for a rethink in most cases, grouping/scoping symbols are getting too deep. I’m sure other languages may or may not have that as a reasonable argument. It is just a rule of thumb though.
bobbylarrybobby · a year ago
If I recall, rustfmt had a bug where long string literals (say, over 120 chars or so — or maybe if it was that the string was long enough to extend beyond the gutter when properly indented?) would prevent formatting of the entire file they were in. Has this been fixed?
JoshTriplett · a year ago
Not the whole file, but sufficiently long un-line-breakable code in a complex statement can cause rustfmt to give up on trying to format that statement. That's a known issue that needs fixing.
db48x · a year ago
Rust and Javascript and Lisp all get extra points because they put a keyword in front of every function definition. Searching for “fn doTheThing” or “defun do-the-thing” ensures that you find the actual definition. Meanwhile C lacks any such keyword, so the best you can do is search for the name. That gets you a sea of callers with the declarations and definitions mixed in. Some C coding conventions have you split the definition into two lines, first the return type on a line followed by a second line that starts with the function name. It looks ugly, but at least you can search for “^doTheThing” to find just the definition(s).
koito17 · a year ago
Golang has a similar property as a side-effect of the following design decision.

  ... the language has been designed to be easy to analyze and can be parsed without a symbol table
Taken from https://go.dev/doc/faq

The "top-level declarations" in source files are exactly: package, import, const, var, type, func. Nothing else. If you're searching for a function, it's always going to start with "func", even if it's an anonymous function. Searching for methods implemented by a struct similarly only needs one to know the "func" keyword and the name of the struct.

Coming from a background of mostly Clojure, Common Lisp, and TypeScript, the "greppability" of Go code is by far the best I have seen.

Of course, in any language, Go included, it's always better to rely on static analysis tools (like the IDE or LSP server) to find references, definitions, etc. But when searching code of some open source library, I always resort to ripgrep rather than setting up a development environment, unless I found something that I want to patch (which in case I set up the devlopment environment and rely on LSP instead of grep to discover definitions and references).

vitus · a year ago
I'm not so sure about greppability in the context of Go. At least at Google (where Go originates, and whose style guide presumably has strong influence on other organizations' use of the language), we discourage "stuttering":

> A piece of Go source code should avoid unnecessary repetition. One common source of this is repetitive names, which often include unnecessary words or repeat their context or type. Code itself can also be unnecessarily repetitive if the same or a similar code segment appears multiple times in close proximity.

https://google.github.io/styleguide/go/decisions#repetitive-...

(see also https://google.github.io/styleguide/go/best-practices#avoid-...)

This is the style rule that motivates the sibling comment about method names being split between method and receiver, for what it's worth.

I don't think this use case has received much attention internally, since it's fairly rare at Google to use grep directly to navigate code. As you suggest, it's much more common to either use your IDE with LSP integration, or Code Search (which you can get a sense of via Chromium's public repository, e.g. https://source.chromium.org/search?q=v8&sq=&ss=chromium%2Fch...).

madeofpalk · a year ago
The culture of single letter variables in golang, at least in the codebases I've seen, undoes this.
eptcyka · a year ago
Golang gets zero points from me because function receivers are declared between func and the name of the function. God ai hate this design choice and boy am I glad I can use golsp.
tuetuopay · a year ago
Go is horrible due to the absence of specific "interface implementation" markers. Gets pretty hard to find where or how a type implements an interface.
remram · a year ago
In golang you get `func (someName someType) funcname`, so it's much less greppable than languages using `func funcname`
bryanrasmussen · a year ago
JavaScript has multiple ways to define a function so you sort of lose that getting the actual definition benefit.

on edit: I see someone discussed that you can grep for both arrow functions and named function at the same time and I suppose you can also construct a query that handles a function constructor as well - but this does not really handle curried functions or similar patterns - I guess at that point one is letting the perfect become the enemy of the good.

Most people grepping know the code base and the patterns in use, so they probably only need to grep for one type of function declaration.

zarzavat · a year ago
C is so much worse than that. Many people declare symbols using macros for various reasons, so you end up with things like DEFINE_FUNCTION(foo) {. In order to get a complete list of symbols you need to preprocess it, this requires knowing what the compiler flags are. Nobody really knows what their compiler flags are because they are hidden between multiple levels of indirection and a variety of build systems.
skissane · a year ago
> C is so much worse than that. Many people declare symbols using macros for various reasons, so you end up with things like DEFINE_FUNCTION(foo) {.

That’s not really C; that’s a C-based DSL. The same problem exists with Lisp, except even worse, since its preprocessor is much more powerful, and hence encourages DSL-creation much more than C does. But in fact, it can happen with any language - even if a language lacks any built-in processor or macro facility, you can always build a custom one, or use a general purpose macro processor such as M4.

If you are creating a DSL, you need to create custom tooling to go along with it - ideal scenario, your tools are so customisable that supporting a DSL is more about configuration than coding something from scratch.

db48x · a year ago
Yes, the usefulness of macros always has to be balanced against their cost. I know of only one codebase that does this particular thing though, Emacs. It is used to define Lisp functions that are implemented in C.
CGamesPlay · a year ago
Not JavaScript. Cool kids never write “function” any more, it’s all arrow functions. You can search for const, which will typically work, but not always (could be a let, var, or multi-const intializer).
pjerem · a year ago
Yes but that’s an anti pattern. Arrow functions aren’t there to look cool, they’re how you define lambdas / anonymous functions.

Other than that, functions should be defined by the keyword.

lispisok · a year ago
Am I the only one who hates arrow functions?
albedoa · a year ago
I want to talk to the developer who considers greppability when deciding whether to use the "function" keyword but requires his definitions to be greppable by distancing them from their call locations. I just have a few questions for him.
spartanatreyu · a year ago
Yes JavaScript.

You can search for both: "function" and "=>" to find all function expressions and arrow function expressions.

All named functions are easily searchable.

All anonymous functions are throw away functions that are only called in one place so you don't need to search for them in the first place.

As soon as an anonymous function becomes important enough to receive a label (i.e. assigning it to a variable, being assigned to a parameter, converting to function expression), it has also become searchable by that label too.

Pxtl · a year ago
You can still look for `(funcname)\s*=` can't you? I mean it's not like functions get re-declared a lot.
supriyo-biswas · a year ago
You can still search for `<keyword> = \(.*\) => `, albeit it's a bit cumbersome.
eddieh · a year ago
I used to define functions as `funcname (arglist)`

And always call the function as `funcname(args)`

So definitions have a space between the name and arg parentheses, while calls do not. Seemed to work well, even in languages with extraneous keywords before definitions since space + paren is shorter than most keywords.

Now days I don’t bother since it really isn’t that useful especially with tags or LSP.

I still put the return type on a line of its own, not for search/grep, but because it is cleaner and looks nice to me—overly long lines are the ugliest of coding IMO. Well that and excessive nesting.

wruza · a year ago
Meanwhile C lacks any such keyword, so the best you can do is search for the name. That gets you a sea of callers with the declarations and definitions mixed in

That’s why in my personal projects I follow classic “type\nname” and grep with “^name\>”.

looks ugly

Single line definitions with long, irregular type names and unaligned function names look ugly. Col 1 names are not only greppable but skimmable. I can speedscroll through code and still see where I am.

skywal_l · a year ago
Yet you reply to an article that defines functions as variables, which I've seen a lot of developers do usually for no good reason at all.

To me, that's a much common and worse practice with regards to greppability than splitting identifiers using string which I haven't seen much in the wild.

mav3ri3k · a year ago
Although in rust, function like macros make it super hard to trace code. I like them when I am writing the code and hate then when I have to read others macros.
wpollock · a year ago
In the bygone days of ctags, C function definitions included a space before opening parenthesis, while function calls never had that space. I have a hard time remembering that modern coding styles never have that space and my IDE complains about it. (AFAIK, the modern gtags doesn't rely on that space to determine definitions.) Even without *tags, the convention made it easy to grep for definitions.
mzs · a year ago
space after builtin was recommended instead:

  if (x == 0) { ...
  sizeof (buf);
  return (-1);
  exit(0);

drewg123 · a year ago
In terms of C, that's one reason I prefer the BSD coding style:

int

foo(void) { }

vs the Linux coding style:

int foo(void) { }

The BSD style allows me to find function definitions using git grep ^foo.

marcosdumay · a year ago
Those also make your language easier to parse, and to read.

Many people insist that IDEs make the entire point moot, but that's the kind of thing that make IDEs easier to write and debug, so I disagree.

johannes1234321 · a year ago
One thing which works for C is to search something like `[a-z] foo\(.+\) \{` assuming that spacing matches the coding style, often the shorter form `[a-z] foo\(` works well, which tries to ensure there is a type definition and bin assignment or something before name. Then there is only a handful false positives.
veltas · a year ago
For most functions ^\S.*name( will find declarations and definitions.

Most of us use exuberant ctags to allow jumping to definitions.

dan-robertson · a year ago
Not sure this is very true for Common Lisp. Classic example are accessor functions where the generic function is created by whichever class is defined first and the method where the class is defined. Other macros will construct new symbols for function names (or take them from the macro arguments).
db48x · a year ago
That’s true, but I regard it as fairly minor. Accessor functions don't have any logic in them, so in practice you don’t have to grep for them. But it can be confusing for new players, since they don't know ahead of time which ones are accessors and which are not.
f1shy · a year ago
Still you can extend the concept without a lot of work, couldn't you?
akira2501 · a year ago
> so the best you can do is search for the name

This is why in C projects libs go in "lib/" and sources go in "src/". If your header files have the same directory structure as libs, then "include/" is a also a decent way to find definitions.

kazinator · a year ago
C has "classical" tooling like Cscope and Exuberant Ctags. The stuff works very well, except on the odd weird code that does idiotic things that should not be done with preprocessing.

Even for Lisp, you don't want to be grepping, or at least not all the time for basic things.

For TXR Lisp, I provide a program that will scan code and build (or add to) your tags file (either a Vim or Emacs compatible one).

Given

  (defstruct point ()
    x
    y)
it will let your editor jump to the definition of point, x and y.

throwawayffffas · a year ago
> Meanwhile C lacks any such keyword

It's a hassle. But not the end of the world.

I usually search for "doTheThing\(.+?\) \{" first.

If I don't get a hit, or too many hits I move to "doTheThing\([^\)]*?\) \{" and so on.

leogout · a year ago
Javascript is a bit trickier i think nowadays with the fat arrow notation : const myFunc = () => console. log("can't find me :p");
fsckboy · a year ago
C, starting with K&R, has all declarations and definitions on lines at the left margin, and little else. this is easy to grep for.
bionsystem · a year ago
Doesn't cscope fit this usecase ?
hgomersall · a year ago
Though glob imports in rust can hide a source, so those should be avoided.
mre · a year ago
Exactly. I wrote an entire blog post about that: https://corrode.dev/blog/dont-use-preludes-and-globs/
semiinfinitely · a year ago
python also!
jsjohnst · a year ago
Python is the only one mentioned that “actually works” without endless exceptions to the rule in the normal case. The ones mentioned (Rust/Javascript/Lisp/Go) all have specific syntax that is commonly enough used which makes it harder to search. Possible, absolutely, but still harder.
andersa · a year ago
Do people really use text search for this rather than an IDE that parses all of the code and knows exactly where each declaration is, able to instantly jump to them from a key press on any usage...? Wild.
iamwil · a year ago
Yes. Not everyone uses or likes an IDE. Also, when you lean on an IDE for navigation, there is a tendency to write more complicated code, since it feels easy to navigate, you don't feel the pain.
akritid · a year ago
Looks fine (subjective), and there is also ctags.
darepublic · a year ago
There is arrow syntax with JS.
sva_ · a year ago
People don't use LSP?
gregjor · a year ago
That’s right, not everyone uses an LSP. Nothing wrong with LSPs, very useful tools. I use ripgrep, or plain grep if I have to, far more often than an LSP.

Working with legacy code — the scenario the author describes — I often can’t install anything on the server.

menaerus · a year ago
LSP doesn't always work without issues on large C and C++ codebases, which is why one needs to fall back to grep techniques.
suprjami · a year ago
> Meanwhile C lacks any such keyword, so the best you can do is...

...use source code tagging or LSP.

gregjor · a year ago
ctags.

Dead Comment

jampekka · a year ago
Rust, though, does lose some of those points by more or less forcing[1] snake_case. It's really annoying to navigate bindings that are converted from camelCase.

I don't care which case is used. It's a trivial, superficial thing, and tribal zealotry about it doesn't reflect well on the language and community.

[1] The warnings can be turned off, but in some cases that requires ugly hacks, and the community seems actively hostile to making it easier.

kibwen · a year ago
The Rust community is no more zealous about naming conventions than any other language community that has them. Perhaps you're arguing against the concept of naming conventions in general, but that's not a Rust thing: every language of the past 20 years suggests naming conventions, if for no other reason than that every language ships a standard library which itself needs to follow some convention. Turning off the warnings emitted by the Rust compiler takes two lines of code, either at the root of the crate or in the crate manifest.
dblotsky · a year ago
Hard agree with the idea of greppability, but hard disagree about keeping names the same across boundaries.

I think the benefit of having one symbol exist in only one domain (e.g. “user_request” only showing up in the database-handling code, where it’s used 3 times, and not in the UI code, where it might’ve been used 30 times) reduces more cognitive load than is added by searching for 2 symbols instead of 1 common one.

plorkyeran · a year ago
I’ve also found that I sometimes really like it when I grep for a symbol and hit some mapping code. Just knowing that some value goes through a specific mapping layer, and is then never mentioned again until the spot where it’s read, often answers my question by itself. Without the mapping code there’d be no occurrences of the symbol in the current code base at all, and I’d have no clue which external source it’s coming from.
runevault · a year ago
Probably depends on how your system is structured. If you know you only want to look in the DB code, hopefully it is either all together, or there is something about the folder naming pattern you can take advantage of to limit where you search.

The upside to doing it this way is that it makes your grepping more flexible: you can either search one part of the codebase to see, say, only the DB code, or search everything to see all the DB and UI code using the concept.

gregjor · a year ago
I have mixed thoughts on this too. Fortunately grep (rg in my case) easily handles it:

rg -i 'foo.?bar' finds all of foo_bar, fooBar, and FooBar.
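For the curious, the same pattern can be checked in JavaScript regex form (the names are just examples):

```javascript
// A case-insensitive "foo.?bar" spans snake_case, camelCase,
// PascalCase, SCREAMING_SNAKE_CASE, and the glued-together form.
const names = [
  "foo_bar",
  "fooBar",
  "FooBar",
  "FOO_BAR",
  "foobar",
  "food_bazaar", // should NOT match: more than one char between foo and bar
];

const crossCase = /foo.?bar/i;
const hits = names.filter((n) => crossCase.test(n));
// hits contains the first five names; ".?" allows at most one character
// between "foo" and "bar", so "food_bazaar" is rejected.
```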

Noumenon72 · a year ago
Not to mention the readability hit from identifiers like foo.user_request in JavaScript, which triggers both linters and my own sense of language convention.
emn13 · a year ago
Both of those are easy to fix. You'll adapt quickly if you pick a different convention.

Additionally, I find that in practice such "unusual" code is actually beneficial - it often makes it easy to see at a glance that the code is somehow in sync with some external spec. Especially when it comes to implicit usages such as in (de)serialization, noticing that quickly is quite valuable.

I'd much rather trash every language's coding conventions than use subtly different names for objects serialized and shared across languages. It's just a pain.

amingilani · a year ago
I agree that code searchability is a good thing, but I disagree with those examples. They intentionally increase the chance of errors.

Maybe there's an alternative way to achieve what the author set out to do, but increasing searchability at the cost of brittleness isn't it for me.

In this example:

  const getTableName = (addressType: 'shipping' | 'billing') => {
    return `${addressType}_addresses`
  }

The input string and output are coupled. If you add string conditionals as the author did, you introduce the chance of a mismatch between the input and output.

  const getTableName = (addressType: 'shipping' | 'billing') => {
    if (addressType === 'shipping') {
      return 'shipping_addresses'
    }
    if (addressType === 'billing') {
      return 'billing_addresses'
    }
    throw new TypeError('addressType must be billing or shipping')
  }

Similarly, flattening dictionaries for readability introduces the chance of a random typo making our lives hell. A single typo in the repetitions below will be awful.

  {
    "auth.login.title": "Login",
    "auth.login.emailLabel": "Email",
    "auth.login.passwordLabel": "Password",
    "auth.register.title": "Login",
    "auth.register.emailLabel": "Email",
    "auth.register.passwordLabel": "Password",
  }

Typos aren’t unlikely. In a codebase I work with, we have a perpetually open ticket about how ARTISTS is mistyped as ATRISTS in a similarly flat enum.

The issue can’t be solved easily because the enum has since been copied across several codebases. But the ticket has a counter for the number of developers who independently discovered the bug, and it’s in the mid two digits.

Noumenon72 · a year ago
Typos are find-and-fix-once, while unsearchability is a maintenance burden forever.

I don't think coupling variable names by making sure they contain the same strings is the best way to show they're related, compared to an actual map from address type to table name. There might be a lot of things called 'shipping' in my app, only some of which are coupled to `shipping_addresses`.
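A minimal sketch of that map approach, assuming the article's hypothetical shipping/billing tables:

```javascript
// Both table names remain greppable as literals, and an unmapped
// address type fails loudly instead of silently producing a
// mismatched string.
const TABLE_NAMES = {
  shipping: "shipping_addresses",
  billing: "billing_addresses",
};

function getTableName(addressType) {
  const table = TABLE_NAMES[addressType];
  if (table === undefined) {
    throw new TypeError(
      `addressType must be one of: ${Object.keys(TABLE_NAMES).join(", ")}`
    );
  }
  return table;
}
```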

Shouldn't a linter be able to catch that there is no enum member called MyEnum.ATRISTS, or is it not an actual enum?

peeters · a year ago
> The input string and output are coupled. If you add string conditionals as the author did, you introduce the chance of a mismatch between the input and output.

I think it depends on whether the repetition is accidental or intrinsic. Does the table name happen to contain the address type as a prefix, or does it intrinsically have to? Greppability aside, when things are incidentally related, it's often better to repeat yourself to not give the wrong impression that they're intrinsically related. Conversely, if they are intrinsically related (i.e. it's an invariant of the system that the table name starts with the address type as a prefix) then it's better for the code to align with that.
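As a hedged sketch of the intrinsic case: if the "<type>_addresses" prefix really is an invariant of the system, stating the rule once in a helper documents that the coupling is intentional (names here are hypothetical):

```javascript
// The "<type>_addresses" rule lives in exactly one place, so the code
// states the invariant rather than repeating a coincidence.
const ADDRESS_TYPES = ["shipping", "billing"];

function tableFor(addressType) {
  if (!ADDRESS_TYPES.includes(addressType)) {
    throw new TypeError(`unknown addressType: ${addressType}`);
  }
  return `${addressType}_addresses`; // the invariant, stated once
}
```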

ctxc · a year ago
Agree with you.

What happens when translation files get too big and you want to split them and send only the relevant parts? Like sending only the auth keys when the user is unauthenticated?

`return translations[auth][login]` is no longer possible.

Or just imagine you want to iterate through `auth` keys. _shudders_
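A sketch of what that splitting looks like with flat keys: nested indexing is gone, but prefix filtering still works (key names hypothetical):

```javascript
// Flat translation keys, as in the article's example.
const translations = {
  "auth.login.title": "Login",
  "auth.login.emailLabel": "Email",
  "profile.title": "Profile",
};

// Selecting "only the auth keys" becomes a prefix filter rather than
// a nested lookup like translations["auth"].
const authOnly = Object.fromEntries(
  Object.entries(translations).filter(([key]) => key.startsWith("auth."))
);
// authOnly keeps the two "auth.login.*" entries and drops "profile.title"
```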

usrusr · a year ago
Entrenched typos like ATRISTS are actually a greppability goldmine. Chances are the codebase contains plenty of correctly spelled references to people who make art, but ATRISTS can only be the one from that enum.

I certainly would not suggest deliberately mistyping, but there are places where the benefit is approaching the cost. Certain log messages can absolutely benefit from subtle letter garbling that retains readability while adding uniqueness.

kaelwd · a year ago
REFERER moment.