Grepping for symbols like function names and class names feels so anemic compared to using a tool that has a syntactic understanding of the code. Just "go to definition" and "find usages" alone reduce the need for text search enormously.
For the past decade-plus I have mostly only searched for user-facing strings. Those have the advantage of being longer, so they are more easily searched.
Honestly, posts like this sound like the author needs to invest some time in learning about better tools for his language. A good IDE alone will save you so much time.
Scenarios where an IDE with full syntactic understanding is better:
- It's your day to day project and you expect to be working in it for a long time.
Scenarios where grepping is more useful:
- Your language has #ifdef or equivalent syntax that does conditional compilation, making syntactic tools incomplete.
- You just opened the project for the first time.
- It's in a language you don't daily drive (you write backend but have to delve into frontend code; it's a 3rd-party library, it's configuration files, it's random JSON/XML files or data).
- You're editing or searching through documentation.
- You haven't even downloaded the project and are checking things out on GitHub (or some similar site for your project).
- You're providing remote assistance to someone and you are not at your main development machine.
- You're remoting via SSH and have access to the code there (say it's a Python server).
Yes, an IDE will save you time daily driving. But there's no reason to sabotage all the other use cases.
Further important (to me) scenarios that also argue for greppability:
- greppability does not preclude IDE or language server tooling; there are often special cases where only certain, e.g. context-dependent, usages matter, and sometimes grep is the easiest way to find those.
- projects that include multiple languages, such as the fairly common setup of HTML, JS, CSS, SQL, and some server-side language.
- performance in scenarios with huge amounts of code, or where you're searching very often (e.g. in each git commit for some amount of history)
- ease of use across repositories (e.g. a client app, a spec, and a server app in separate repos).
I treat greppability as an almost universal default. I'd much rather have code in a "weird" naming style in some language but with consistent identifiers across languages, than have style-guide-default identifiers in each language but differing identifiers across languages. If code "looks weird", that's often actually a _benefit_ in such cases, not a downside: most serialization libraries I use for this kind of stuff do a lot of automagic mapping that can break in ways that are hard to detect at compile time if somebody renames something, or sometimes even just changes a casing or a type. Having a hint of this fragility visible at a glance, even in dynamically typed languages, is a nice side effect. Very speculatively, I wouldn't be surprised if AI coding tools deal with consistent names better than context-dependent ones too; greppability is likely not merely about the tool grep.
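To make the tradeoff concrete, a minimal sketch (type and field names invented): the DTO deliberately keeps the wire-format snake_case key even though camelCase is idiomatic in TypeScript, so a single grep hits the SQL, the JSON payload, and this code.

```
// The "weird" naming is the point: this name matches the DB column and
// the JSON key exactly, so grepping "shipping_address" finds every layer.
interface OrderDto {
  shipping_address: string;
}
```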
And the best part is that there's almost no downside; it's not like you need to pick either a language server, IDE or grep - just use whatever is most convenient for each task.
Grep is also useful when IDE indexing isn't feasible for the entire project. At past employers I worked in monorepos where the sheer size of the index caused multiple seconds of delay in intellisense and UI stuttering; our devex team's preferred approach was to better integrate our IDE experience with the build system such that only symbols in scope of the module you were working on would be loaded. This was usually fine, and it works especially well for product teams, but it's a headache when you're doing cross-cutting work (e.g. for infrastructure projects/overhauls).
We also had a livegrep instance that we could use to grep any corporate repo, regardless of where it was hosted. That was extremely useful for investigating failures in build scripts that spanned multiple repositories (e.g. building a Go sidecar that relies on a service config in the Java monorepo).
> It's your day to day project and you expect to be working in it for a long time.
I don't think we need to restrict the benefits quite that much—if it's a project that isn't my day-to-day but is in a language I already have set up in my IDE, I'd much prefer to open it up in my IDE and use jump-to-definition and friends than to try to grep and hope that the developers made it greppable.
Going further, I'd equally rather have plugins ready to go for every language my company works in and use them for exploring a foreign codebase. The navigation tools all work more or less the same, so it's not like I need to invest effort learning a new tool in order to benefit from navigation.
> Yes, an IDE will save you time daily driving. But there's no reason to sabotage all the other usecases.
Certainly don't sabotage, but some of these suggestions are bad for other reasons that aren't about grep.
For example: breaking the naming conventions of your language in order to avoid remapping is questionable at best. Operating like that binds your business logic way too tightly to the database representation, and while "just return the db object" sounds like a good optimization in theory, I've never not regretted having frontend code that assumes it's operating directly on database objects.
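For illustration, a minimal sketch of the remapping boundary this argues for, with invented names: the two naming conventions meet in exactly one greppable function, and the frontend never sees raw rows.

```
// Database row, named after the schema.
interface UserRow {
  first_name: string;
}

// Domain object, named after the application's conventions.
interface User {
  firstName: string;
}

// The single place where the two representations are coupled.
function userFromRow(row: UserRow): User {
  return { firstName: row.first_name };
}
```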
> It's your day to day project and you expect to be working in it for a long time.
Bold of everyone here to assume that everyone has a day-to-day project. If you're a consultant, or you're otherwise switching projects on a month-to-month basis, greppability is probably the most important metric after unit-test coverage.
> - Your language has #ifdef or equivalent syntax which does conditional compilation making syntactic tools incomplete.
LSP-based tools are generally fine with this (as long as compile_commands.json or an equivalent is available). A purely syntactic understanding is an incomplete solution; I suspect GP meant LSP.
Many of those other caveats are non-issues once LSPs are widespread. Even GitHub has LSP-like go-to-def/go-to-ref, though it's not perfect.
> Your language has #ifdef or equivalent syntax which does conditional compilation making syntactic tools incomplete.
Your other points make sense, but in this case, at least for C/C++, you can generate a compile_commands.json that will let clangd interpret your code accurately.
If building with make just do `bear -- make` instead of `make`. If building with cmake pass `-DCMAKE_EXPORT_COMPILE_COMMANDS=1`.
I abandoned VSCode and went back to vim + ctags + ripgrep after a year with the most popular IDE. I miss some features but it didn’t give me a 10x or even 1.5x improvement in my own work along any dimension.
I attribute that mostly to my several decades of experience with vi(m) and command line tools, not to anything inherently bad about VSCode.
What counts as “better” tools has a lot of subjectivity and circumstances implied. No one set of tools works for everyone. I very often have to work over ssh on servers that don’t allow installing anything, much less Node and npm for VSCode, so I invest my time in the tools that always work everywhere, for the work I do.
The main project I've worked on for the last few years has a little less than 500,000 lines of code. VSCode's language server fairly often takes a few seconds to update its indexes. Running ctags over the same code takes about a second, and I can control when that happens. vim has no delays at all, and ripgrep can search all of the files in a second or two.
I have similar feelings... I still use IntelliJ IDEA for JVM languages, but for C, Rust, Go, Python, etc., I've been using vim for years (decades?), and that's just how I prefer to write code in those languages. I do have LSP plugins installed in vim for the languages I work in, and do have a key sequence mapped for jump-to-definition... but I still find myself (rip)grepping through the source at least as often as I j-t-d, maybe more often.
Did you consider Neovim? You get the benefit of vim while also being able to mix in as much LSP tooling as you like. The tradeoff is that it takes some time to set up, although that is getting easier.
That won’t make LSP go any faster though. There’s still something interesting in the fact that a ripgrep of every line in the codebase can still be faster than a dedicated tool.
VSCode is not an IDE, it's an extensible text editor. IDEs are integrated (it's in the name) and get developed as a whole. I'm 99% certain that if you were forced to spend a couple of months in a real IDE (like IDEA or Rider), you would not want to go back to vim, or any other text editor. Speaking as a long time user of both.
A good IDE can be so much better iff it understands the code. However, this requires the IDE to understand the project structure, dependencies, etc., which can take considerable effort. In a codebase with many projects employing several different languages, it becomes hard to reach, and then maintain, the state where the IDE understands everything.
And an IDE would also fail to find references in most of the cases described in the article: name composition/manipulation, naming consistency across language barriers, and flat namespaces in serialization. And file/folder naming seems to be irrelevant to the smart-IDE argument. "Naming things is hard"
And especially in large monorepos anything that understands the code can become quite sluggish. While ripgrep remains fast.
A kind of in-between I've found for some search and replace action is comby (https://comby.dev/). Having a matching braces feature is a godsend for doing some kind of replacements properly.
I think the author's first sentence counters your comment.
What you described works best in a familiar codebase where the organizing principles have been maintained well and are familiar to the reader, and where the tools are just an extension of those organizing principles. Even then, a deviation from those rules might produce gaps in understanding of what the codebase does.
And grep cuts right through that in a pretty universal way. What the post describes are just ways to not work against grep to optimize for something ephemeral.
Go to definition and find usages only work one symbol at a time. I use both, but I still use global find/replace for groups of symbols sharing the same concept.
For example if I want to rename all “Dog” (DogModel, DogView, DogController) symbols to “Wolf”, find/replace is much better at that because it will tell me about symbols I had forgotten about.
For that use case I think you can use treesitter[1]: you can find Dog.* but only where it's a variable name, for example, avoiding replacements inside of, say, string literals.
Not everything you need to look for is a language identifier. I often grep for configuration option names in the code to see what an option actually does. Sometimes it's easy, sometimes there are too many matches, and sometimes the name can't be found at all because it's composed in the code from separate parts, each too common to search for on its own. It's not hard to make config options greppable, but some coders just don't care about this property.
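A sketch of the difference, with an invented option name: in the first form the full option name never appears in the source, so grepping for the string you saw in the config file finds nothing.

```
declare const config: Record<string, string>;

// Ungreppable: "server_max_connections" is assembled at runtime.
const section = "server";
const a = config[`${section}_max_connections`];

// Greppable: the exact literal from the config file appears in the code.
const b = config["server_max_connections"];
```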
Strongly disagree here. This works if:
- your IDE/language server is performant
- all the tools are fully set up
- you know how to query the specific semantic entity you're looking for (remembering shortcuts)
- you are only interested in a single specific semantic entity - mixing entities is rarely supported
I don't map out projects in terms of semantics, I map out projects in files and code. That makes querying intuitive, and I can easily compose queries that match the specificity of what I care about (e.g. I might want to find a `Server` but show classes, interfaces and abstract classes alike).
For the specific toolchain I'm using - TypeScript - the symbol search is also unusable once a project hits a certain size; it's just way too slow to be part of my core workflow.
Only thing I can recommend is using C# (obviously not always possible). Never had an issue with these functions in Visual Studio proper no matter how big the project.
On the flipside, IDEs can turn you into lazy, inefficient programmers by doing all the hand-holding for you.
If your feelings are anemic when tasked with doing a grep, it's because you have lost a very valuable skill by delegating it to a computer. There are some things the IDE is never going to be able to find - lest it become the development environment - so keeping your grep fu sharpened is wise beyond the decades.
(Disclaimer: 40 years of software development, and vim+cscope+grep/silversearcher are all I really need, next to my compiler..)
Since when was that a bad thing? Since time immemorial, it has been hailed as a universal good for programmers to be lazy. I'm pretty sure Larry Wall has lots of jokes about this on Usenet.
Also, I can clearly remember switching from vim/emacs to Microsoft Visual Studio (please, don't throw your tomatoes just yet!). I was blown away by IntelliSense. Suddenly, I was focusing more on writing business logic, and less time searching for APIs.
I count the IDE and stuff like LSP as natural extensions of the compiler. For sure I grep (or equivalent) for stuff, but I highly prefer statically typed languages/ecosystems.
At the end of the day, I'm here to solve problems, and there's no end to them -- might as well get a head start.
I'm not feeling anemic. The tool is anemic, as in, underpowered. It returns crap you don't want, and doesn't return stuff you do want.
My grep-fu is fine. It's a perfectly good tool if you have nothing better. But usually you do have something better.
Using the wrong tool to make yourself feel cool is stupid. Using the wrong tool because a good tool could make you lazy shows a lack of respect for the end result.
Huh? I have an old hand-powered drill from my Grandpa in my workshop. I used it once for fun. For all other tasks I use a powered drill.
Same for IDEs.
They help you refactor and reason about code - both properties I value.
Sure, I could print it out and use a highlighter, but I'm not Grandpa.
The basis of this article (and its forebear "Too DRY - The Grep Test"[1]) is that grep is fragile. It's just fragile in a way that's different from the way that IDEs are fragile.
Even with IDEs, I find that I grep through source trees fairly often.
Sometimes it's because I don't completely trust the IDE to find everything I'm interested in (justifiably; sometimes it doesn't). Sometimes it's because I'm not looking to dive into the code and do serious work on it; I'm just doing a quick drive-by check/lookup for something. Sometimes it's because I'm ssh'd into another machine and I don't have the ability to easily open the sources in an IDE.
I've come to really like language servers for big personal and work projects where I already have my tools configured and tuned for working with them efficiently.
But being able to grep is really nice when trying to figure something out about a source tree that I don't yet have set up to compile, and am not a developer of. E.g., I've downloaded the source for a tool I've been using pre-built binaries of and am now trying to trace why I might be getting a particular error.
posts like this sound like the author routinely solves harder problems than you do, because the solutions you suggest don't work in the cases the post is about. we've had 'go to definition' since 01978 and 'find usages' since 01980, and you should definitely use them for the cases where they work
- Dynamically built identifiers: 100% correct, never do this. It breaks both text search and symbol search and results in complete garbage code. I had to deal with bugs in early versions of docker-compose because of this.
- Same name for things across the stack? Shouldn't matter, just use Find Usages on `getAddressById`. It's also an easy way to bait yourself, because database fields aren't 1:1 with front-end fields in anything but the simplest of CRUD webshit.
- Translation example: the fundamental problem is using strings as keys when they should be symbols (see the sketch after this list). Flat vs. nested is irrelevant here because you should be using neither.
- React component example: as I mentioned in another comment, trivially managed with Find Usages.
Nothing in here strikes me as "routinely solves harder problems," it's just standard web dev.
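For the translation point, a minimal sketch of keys-as-symbols in TypeScript (all names invented): the key space becomes a type, so a typo is a compile error rather than a silent missing translation, and every key stays greppable as a single identifier.

```
const messages = {
  authLoginTitle: "Log in",
  authLoginButton: "Submit",
} as const;

// A typo in a key is now a compile error, and every usage of
// "authLoginTitle" is findable by both grep and Find Usages.
type MessageKey = keyof typeof messages;

const t = (key: MessageKey): string => messages[key];

t("authLoginTitle");   // OK
// t("authLognTitle"); // rejected by the compiler
```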
with all due respect, it sounds like you have the privilege of working in some relatively tidy codebases (and I'm jealous!)
with a legacy codebase, or a fork of a dependency that had to be patched which uses an incompatible buildsystem, or any C/C++/obj-c/etc that heavily uses the preprocessor or nonstandard build practices, or codebases that mix lots of different languages over awkward FFI boundaries and so on and so forth -- there are so many situations where sometimes an IDE just can't get you 100% of the way there and you have to revert to grepping to do any real work
that being said, I don't fully support the idea of handcuffing your code in the name of greppability, but I think dismissing it as a metric under the premise that IDEs make grepping "obsolete" is a little bit hasty
> with all due respect, it sounds like you have the privilege of working in some relatively tidy codebases (and I'm jealous!)
I wish, but no. I've found people will make a mess of everything. Which is why I don't trust solutions that rely on humans having more discipline, like what this article advocates.
In any situation where grep is your last saviour, you cannot rely on the greppability of the code. You'll have to check and double check everything, and still accept the risk of errors.
Working on a 32MLOC project, text search is still the quickest way to find a hook that gets you to the deeper investigation. From there, finding definitions/usage definitely matters.
You can maybe skip the greppability if the code base is of a size that you can hold the rough shape and names in your head, but a "get a list of things that sound like they might be related to my problem" operation is still extremely helpful. And it's also worth keeping in mind that greppability matters to onboarding.
Does that mean it should be an overriding design concern? No. But it does mean that if it's cheap to build greppable, you probably should, because it's a net positive.
Sure, if you have the luxury of having a functional IDE for all of your code.
You can't imagine how much faster I was than everybody else at answering questions about a large codebase just because I knew how to use ripgrep (on Windows). "Knowing how to grep" is a superpower.
A bit on the other side of the argument, I use grep plus find plus some shell work to do source code analysis for security reviews. grep doesn't really understand the syntax of languages, and that is mostly OK.
I've used this technique when auditing many codebases, including the C family, Perl, Visual Basic, C# and SQL.
With this sort of tool, I don't need to look for language-particular parsers--so long as the source is in a text file, this works well.
IDEs are cool and all, but there is no way I'm gonna let VSCode index my 80GB yocto tmp directory. Ctags can crunch the whole thing in a few minutes, and so can grep.
Plus there are cases where grep is really what you need, for example after updating a particular command line tool whose output changed, I was able to find all scripts which grepped the output of the tool in a way that was broken.
It seems like a case of diminishing returns; while I'm sure this characteristic of a code-writing style is extremely useful in a few cases, it cuts into other things such as readability and conciseness. Fewer lines can mean fewer bugs, within reason; if you aren't in Lisp and are using more than 3 parentheses, you might want to split the expression up, because the compiler/JIT/interpreter is going to anyway.
Interface-heavy languages break IDEs. In .NET at least, "go to definition" jumps you to the interface definition which you probably aren't interested in (vs. the specific implementation you are trying to dig into). Also with .NET specifically XAML breaks IDE traceability as well.
I tried a good IDE recently: JetBrains IntelliJ and WebStorm. Considered the top dog of IDEs. I was working on a TypeScript project which uses npm link to symlink another local project into the node_modules of the current project.
The great IDEs IntelliJ and WebStorm stopped autosuggesting completions from the symlinked project.
Opened up Sublime Text again. Worked perfectly. That is why JetBrains and their behemoth IDEs are utter shite.
Write your code to have symmetry and make it easy to grep.
By not using literals everywhere.
Every literal is defined once somewhere (at the start of the function, the class, etc.) as an enum or a variable, and that definition is what gets used.
Just because I have 20 usage of 'shipping_address' doesn't mean I'll have this string 20 times in different places.
Grep has its place, and I often need to grep codebases that were written without much thought toward DX. But writing code this way allows the LSP to take over.
This is what the article starts with: "Even in projects exclusively written by myself, I have to search a lot: function names, error messages, class names, that kind of thing."
All of that is trivial to search for with a tool that understands the language.
> Honestly, posts like this sound like the author needs to invest some time in learning about better tools for his language. A good IDE alone will save you so much time.
Completely agreed. The React component example in the article is trivially solvable with any modern IDE: right-click on the class name, "Find Usages" (or use the appropriate hotkey, of course). Trying to grep for a class name when you could just do that is insane.
I mainly see this from juniors who don't know any better, but as seen in this thread and the article, there are also experienced engineers who are stubborn and refuse to use tools made after 1990 for some reason.
> experienced engineers who are stubborn and refuse to use tools made after 1990 for some reason.
Before calling people stubborn or assuming they got left behind out of ignorance, consider your assumptions. 40+ years experience, senior in both experience and age at this point. Long-term vim + command line tools user.
Do you have any evidence that shows "A good IDE alone will save you so much time?" Have you seen studies comparing productivity or code quality or any metric written by people using IDEs vs those using a plain editor with grep?
By "so much faster" what do you mean exactly? I have decades of experience with vim + ctags + grep (rg these days, because I don't want to get called a stubborn stick in the mud). I can find and change things in large codebases pretty fast. I used VSCode for a year on the same codebases and I didn't feel "so much faster," and I committed to it and watched numerous how-to videos and learned the tool well enough to train other programmers on it. No 10x improvement, not even 1.5x. For most tasks I would call it close to the same in terms of time taken to write code. After getting burned a couple times with "Replace symbol" in VSCode I stopped trusting it. After noticing the LSP failed to find some references I trusted it less. I know grep/ack/rg/ctags aren't perfect, but I also know their weaknesses and how to work with them to get them to do what I want. After a year I went back to vim + ctags + rg.
We might have more productive (and friendly) interactions as programmers if we remembered that not everyone works the same way, or on the same kind of code and projects. What we call "best practices" or "modern tools" largely come down to familiarity, received wisdom, opinion, and fashion -- almost never from rigorous metrics and testing. You like your IDE? Great! I like my tools too. Would either of us get "so much faster" using a different set of tools? Probably not. Trying to find the silver bullet that reduces accidental complexity in software development presents an ongoing challenge, but history shows that editors and IDEs don't do much because if they did programmers today would outperform old guys like me by 10x in a measurable way.
At the last full-time job I had, at an educational software company with 30+ programmers, everyone used Eclipse. My first day I got a new desktop with two big monitors, Eclipse installed, ready to go. I installed vim and the CLI subversion client and some other stuff and worked from the command line, as I usually do. I left one of the monitors off, I don't need that much screen space, and I don't have Twitter and Facebook and other junk running on a second monitor all day like most of the other people did. I got made fun of, old man using old tools. Then once a week, like clockwork, Eclipse would auto-install some updates and everyone came to a halt trying to resolve plugin version conflicts, getting the team in sync. Hours and hours wasted regularly just getting the IDE to work. That didn't affect me, I never opened Eclipse. Watching the other programmers it seemed really slow. So just maybe Eclipse could jump to a definition faster than vim + ctags (I doubt it), but amortized over a month Eclipse all by itself wasted more time than anyone possibly saved with the more powerful tool. Anecdote, I know, but I've seen this play out in similar ways at more than one shop.
Just last year a new hire at a place I freelance for spent days trying to get Jetbrains PHPStorm working on a shared remote dev server. Like VSCode it runs a heavy process on the server (including the LSP). Unlike VSCode, PHPStorm can actually kill the whole server, wasting everyone's time and maybe losing work. I have never seen vim or grep bring a whole server down. I could add up how much "faster" PHPStorm might turn out compared to vim, but it will have to recoup the days lost trying to get it to work at all first.
The second point here made me realize that it'd be super useful for a grep tool to have a "super case insensitive" mode which expands a search for, say, "FooBar|first_name" to something like /foo[-_]?bar|first[-_]?name/i, so that any camel/snake/pascal/kebab/etc case will match. In fact, I struggle to come up with situations where that wouldn't be a great default.
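As a rough sketch of how such a mode could work (helper name invented): split the query on case humps and separators, then rejoin the parts with optional -/_ in between and match case-insensitively.

```
// Turn "FooBar", "foo_bar" or "foo-bar" into one permissive pattern.
function superCaseInsensitive(query: string): RegExp {
  const parts = query
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")  // split camelCase humps
    .split(/[-_\s]+/)                        // split snake/kebab separators
    .filter(Boolean)
    .map((p) => p.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")); // escape regex chars
  return new RegExp(parts.join("[-_]?"), "i");
}

superCaseInsensitive("FooBar").test("foo_bar");       // true
superCaseInsensitive("first_name").test("firstName"); // true
```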
Hey, I just created a new tool called Super Grep that does exactly what you described.
I implemented a format-agnostic search that can match patterns across various naming conventions like camelCase, snake_case, PascalCase, and kebab-case. If needed, I'll add support for space-separated words.
I've just published the tool to PyPI, so you can easily install it using pip (`pip install super-grep`), and then you just run it from the command line with `super-grep`. You can let me know if you think there's a smarter name for it.
You should post this as a Show HN! But maybe wait a while (like a couple weeks or something) for the current thread to get flushed out of the hivemind cache.
pretty cool, and to me a better approach than the prescriptive advice from the OP. to me the crux of the argument is making the code more readable from a popular tool, but if this can be well integrated into common IDEs (or even grep itself), it would reduce most of the argument to personal preference.
wow this is so cool!! it feels super amazing to dump a random idea on HN and then somebody makes it! i'm installing python as we speak just so i can use this.
Adding to that, I'm often bitten trying to search for user strings because they're split across lines to adhere to 80 characters.
So if I'm trying to locate the error message "because the disk is full" but it's in the code as:
... + " because the " +
"disk is full")
then it will fail.
So really, combining both our use cases, what would be great is to simply search for a given case-insensitive alphanumeric string in files that skips all non-alphanumeric characters.
So if I search for:
Foobar2
it would match all of:
FooBar2
foo_bar[2]
"Foo " + \
("bar 2")
foo.bar.2
And then in the search results, even if you get some accidental hits, you can be happy knowing that you didn't miss anything.
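A minimal per-line sketch of that idea (names invented; assumes Node.js): normalize both the query and each line down to lowercase alphanumerics before matching. Strings split across lines, like the examples above, would additionally need a sliding window over adjacent lines.

```
import { readFileSync } from "node:fs";

// Drop everything that isn't a letter or digit, then lowercase.
const normalize = (s: string) => s.toLowerCase().replace(/[^a-z0-9]/g, "");

// Return 1-based line numbers whose normalized text contains the query,
// so "Foobar2" matches foo_bar[2], FooBar2, foo.bar.2, ...
function superSearch(path: string, query: string): number[] {
  const needle = normalize(query);
  return readFileSync(path, "utf8")
    .split("\n")
    .flatMap((line, i) => (normalize(line).includes(needle) ? [i + 1] : []));
}
```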
These are both problems I regularly have. The first one I thought of immediately when reading the title of this submission was the "super case insensitive" issue, which I often hit when working on Go codebases, particularly when using a combination of Go structs and YAML or JSON. It also happens with command line arguments being converted to variables.
But the string-split thing you mentioned happens a lot when searching for OpenStack error messages in Python, which are often split across lines like you showed. My current solution is to randomly shift what I'm searching for, or to try to pick the most unique line.
fwiw I pretty frequently use `first.?name` - the odds of it matching something like "FirstSname" are low enough that it's not an issue, and it finds all cases and all common separators in one shot.
(`first\S?name` is usually better, by ignoring whitespace -> better ignores comments describing a thing, but `.` is easier to remember and type so I usually just do that)
Let's say someone made a plugin for their favorite IDE for this kind of search. What would the details look like?
To keep it simple, let's assume we just do the super-case-insensitivity, without the other regex condition. Let's say the user searches for "first_name" and wants to find "FirstName".
one simple solution would be a convention for where a word starts or ends, e.g. " ". The user would enter "first name" into the plugin's search field; the plugin turns it into "/first[-_]?name/i" and hands this regexp to the IDE's normal search.
another simple solution would be to ignore word boundaries entirely. When the user enters "first name", the regexp would become "/f[-_]?i[-_]?r[-_]?s[-_]?t[-_]?n[-_]?a[-_]?m[-_]?e[-_]?/i". Then the search would not only be super-case-insensitive, but super-duper-case-insensitive. I guess the biggest downside is that this could get very slow.
I think implementing a plugin like this would be trivial for most IDEs that support plugins.
Hm I'd go even simpler than that. Notably, I'd not do this:
> So the user would enter "first name" into the plugin's search field.
Why wouldn't the user just enter "first_name" or "firstName" or something like that? I'm thinking about situations like, you're looking at backend code that's snake_cased, but you also want it to catch frontend code that's camelCased. So when you search for "first_name" you automagically also match "firstName" (and "FirstName" and "first-name" and so on). I wouldn't personally introduce some convention that adds spaces into the mix, I'd simply convert anything that looks snake/kebab/pascal/camel-cased into a regex that matches all 4 forms.
Could even be as stupid as converting "first_name" or "firstName", or "FirstName" etc into "first_name|firstname|first-name", no character classes needed. That catches pretty much every naming convention right? (assuming it's searched for with case insensitivity)
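A quick sketch of that conversion (helper name invented): split the query into words, then emit the snake, concatenated, and kebab forms as one alternation.

```
function caseAlternation(query: string): string {
  const words = query
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // split camelCase humps
    .toLowerCase()
    .split(/[-_\s]+/)
    .filter(Boolean);
  return [words.join("_"), words.join(""), words.join("-")].join("|");
}

caseAlternation("firstName"); // "first_name|firstname|first-name"
```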
Shame on me for jumping past the simple solutions, but...
If you're going that far, and you're in a context which probably has a parser for the underlying language ready at hand, you might as well just convert all tokens to a common format and do the same with the queries. So searches for foo-bar find strings like FooBar because they both normalize to foo_bar.
Then you can index by more than just line number. For instance you might find "foo" and "bar" even when "foo = 6" shows up in a file called "bar.py" or when they show up on separate lines but still in the same function.
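A sketch of that normalization step (helper name invented): fold every identifier to snake_case before indexing, so FooBar, fooBar and foo-bar all land under the same key.

```
// Fold common identifier styles down to snake_case.
const toSnakeCase = (id: string) =>
  id
    .replace(/([a-z0-9])([A-Z])/g, "$1_$2") // FooBar -> Foo_Bar
    .replace(/[-\s]+/g, "_")                // foo-bar -> foo_bar
    .toLowerCase();

toSnakeCase("FooBar");  // "foo_bar"
toSnakeCase("foo-bar"); // "foo_bar"
```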
IIUC, you're not missing anything, though your interpretation is off from mine*. He wasn't saying it'd be hard; he was saying it should be done.
* my understanding was simply that the regex would (A) recognize `[a-z][A-Z]` and inject optional _'s and -'s between... and (B) notice mid-word hyphens or underscores and switch them to search for both.
Nim comes bundled with a `nimgrep` tool [0], which is essentially grep on steroids. It has a `-y` flag for style-insensitive matching, so "fooBar", "foo_bar" and even "Foo__Ba_R" can be matched with a simple "foobar" pattern.
The other killer feature of nimgrep is that instead of regex, you can use PEG grammar [1]
Fuzzy search is not the same. For instance, it might by default match not only “FooBar” and “foo_bar” but also e.g. “FooQux(BarQuux)”, which in a large code base might mean hundreds of false positives.
Let's say you have a FilterModal component and you're using it like this: x-filter-modal
Improving the IDE to find one or the other by searching for one or the other misses the point of the article: that consistency is important.
I'd rather have a simple IDE and a good codebase than the opposite.
In the example that I gave, the worst thing is that it's the framework which forces you to use these two names for the same thing.
My point is that if grep tools were more powerful we wouldn't need this very particular kind of consistency, which gives us the very big benefit of being allowed to keep every part of the codebase in its idiomatic naming convention.
I didn't miss the point, I disagreed with the point because I think it's a tool problem, not a code problem. I agree with most other points in the article.
I advocate for greppability as well – and in Swedish it becomes extra fun – as the equivalent phrase in Swedish becomes "grep-bar" or "grep-barhet" and those are actual words in Swedish – "greppbar" roughly means "understandable", "greppbarhet" roughly means "the possibility to understand"
We do tar, for xfz I think you have to look to the Slavic languages :)
Anyway, to answer your question:
$ grep -Fxf <(ls -1 /bin) /usr/share/dict/swedish
ack
ar
as
black
dialog
dig
du
ebb
ed
editor
finger
flock
gem
glade
grep
id
import
last
less
make
man
montage
pager
pass
pc
plog
red
reset
rev
sed
sort
sorter
split
stat
tar
test
transform
vi
:)
[edit]: Ironically, grep in that list is not the same word as the one OP is talking about. That one is actually based on grepp, with the double p. grep means pitchfork.
Norwegian still translates grep as "grip"/"grab". I always thought of grepping as reaching in with a hand into the text and grabbing lines. That association is close at hand (insert lame chuckle) for German and English speakers too.
Nah, you've got it backwards. The article isn't about dodging understanding - it's about making it way easier to spot patterns in your code. And that's exactly how you start to really get what's going on under the hood. Better searching = faster learning. It's like having a good map when you're exploring a new city
I've seen some pretty wild conditional string interpolation where there were like 3-4 separate phrases that each had a number of different options, something akin to `${a ? 'You' : 'we'} {b ? 'did' : 'will do' } {c ? 'thing' : 'things' }`.
When I was first onboarding to this project, I was tasked with updating a component and simply tried to find three of the words I saw in the UI, and this was before we implemented a straightforward path-based routing system. It took me far too long just to find what I was going to be working on, and that's the day I distinctly remember learning this lesson. I was pretty junior, but I'd later return to this code and threw it all away for a number of easily greppable strings.
I like the more robotic "Objects: 1" or "Objects: 2", since it avoids the pluralization problems entirely (e.g., in French 0 is singular, but in English it's plural; some words have special forms when pluralized, such as child -> children or attorney general -> attorneys general). And related to this article, it's more greppable/awkable, e.g. `awk '/^Objects:/ && $2 > 10'`.
This is the reason many coding styles and tools (including the Linux kernel coding style and the default Rust style as implemented in rustfmt) do not break string constants across lines even if they're longer than the desired line length: you might see the string in the program's output, and want to search for the same string in the code to find where it gets shown.
My team drives me bonkers with this. They hear the general principle "really long lines of code are bad", but extrapolate it to "no characters shall pass the soft gutter no matter what".
Even if you have, say, 5 sequential related structs that are all virtually identical, all written on one line so that the similarities and differences are obvious at a mere glance... Then someone comes through and touches my file and, while they're at it, "fixes" the line that went 2 characters past the 80 mark by reformatting the 4th struct to span several lines. Now when you see that list of structs, you wonder "why is this one different?" and you have to read carefully to determine that, nope, it just contained one longer string. Or god forbid they reformat all the structs to match, turning a 1-page file into 3 pages and making it so you have to read and understand each element of each struct just to see what's going on.
If I could have written the rule of thumb, I would have said "No logic or control shall happen past the end of the gutter." But if there's a paragraph-long string on one line, who cares?? We all have a single keystroke that can toggle soft-wrap, and the odds that you're going to need to know anything about that string other than "it's a long string" are virtually nil.
Yep this triggers the fuck out of me too. It drives me absolutely insane when I'm taking the time and effort to write good test cases that use inline per test data that I've taken the time to format so it's nice and readable for the next person, then the next person comes along, spends 30 seconds writing some 2 line rubbish to hit a code coverage metric, then spends another 60 seconds adding a linter rule that blows all the test data out to 400 lines of unreadable dogshit that uses only the left 15% of screen real estate.
My team also had a similar rule in place. I am saving this article in my Pocket saves so that I can give "proof" of why this is better.
From Zen of Python:
```
Special cases aren't special enough to break the rules.
Although practicality beats purity.
```
https://peps.python.org/pep-0020/
This is why autoformatters that frob with line breaks are just terrible and fundamentally broken.
I'm fairly firmly in the "wrap at 80" camp by the way; but sometimes a tad longer just makes sense. Or shorter for that matter: forced removal of line breaks is just as bad.
This is the world autoformatters have wrought. The central dogma of the autoformatter is that "formatting" is based on dumb syntactic rules with no inflow of imprecise human judgement.
I have been at places where long strings are allowed, but other things aren't, with 80 to 100 char limits generally applying otherwise. I like 100 for C++/Java and 80 for C. If a line (not being a string) gets much longer than that, then in most cases it's time for a rethink: the grouping/scoping symbols are getting too deep. I'm sure other languages may or may not have that as a reasonable argument. It is just a rule of thumb, though.
If I recall, rustfmt had a bug where long string literals (say, over 120 chars or so, or maybe it was strings long enough to extend beyond the gutter when properly indented?) would prevent formatting of the entire file they were in. Has this been fixed?
Not the whole file, but sufficiently long un-line-breakable code in a complex statement can cause rustfmt to give up on trying to format that statement. That's a known issue that needs fixing.
Rust and Javascript and Lisp all get extra points because they put a keyword in front of every function definition. Searching for “fn doTheThing” or “defun do-the-thing” ensures that you find the actual definition. Meanwhile C lacks any such keyword, so the best you can do is search for the name. That gets you a sea of callers with the declarations and definitions mixed in. Some C coding conventions have you split the definition into two lines, first the return type on a line followed by a second line that starts with the function name. It looks ugly, but at least you can search for “^doTheThing” to find just the definition(s).
The "top-level declarations" in source files are exactly: package, import, const, var, type, func. Nothing else. If you're searching for a function, it's always going to start with "func", even if it's an anonymous function. Searching for methods implemented by a struct similarly only needs one to know the "func" keyword and the name of the struct.
Coming from a background of mostly Clojure, Common Lisp, and TypeScript, the "greppability" of Go code is by far the best I have seen.
Of course, in any language, Go included, it's better to rely on static analysis tools (like the IDE or LSP server) to find references, definitions, etc. But when searching the code of some open source library, I always resort to ripgrep rather than setting up a development environment, unless I find something that I want to patch (in which case I set up the development environment and rely on the LSP instead of grep to discover definitions and references).
I'm not so sure about greppability in the context of Go. At least at Google (where Go originates, and whose style guide presumably has strong influence on other organizations' use of the language), we discourage "stuttering":
> A piece of Go source code should avoid unnecessary repetition. One common source of this is repetitive names, which often include unnecessary words or repeat their context or type. Code itself can also be unnecessarily repetitive if the same or a similar code segment appears multiple times in close proximity.
This is the style rule that motivates the sibling comment about method names being split between method and receiver, for what it's worth.
I don't think this use case has received much attention internally, since it's fairly rare at Google to use grep directly to navigate code. As you suggest, it's much more common to either use your IDE with LSP integration, or Code Search (which you can get a sense of via Chromium's public repository, e.g. https://source.chromium.org/search?q=v8&sq=&ss=chromium%2Fch...).
Golang gets zero points from me because function receivers are declared between func and the name of the function. God I hate this design choice, and boy am I glad I can use golsp.
Go is horrible due to the absence of specific "interface implementation" markers. Gets pretty hard to find where or how a type implements an interface.
JavaScript has multiple ways to define a function, so you sort of lose that benefit of finding the actual definition.
on edit: I see someone discussed that you can grep for both arrow functions and named functions at the same time, and I suppose you can also construct a query that handles a function constructor as well - but this does not really handle curried functions or similar patterns. I guess at that point one is letting the perfect become the enemy of the good.
Most people grepping know the code base and the patterns in use, so they probably only need to grep for one type of function declaration.
C is so much worse than that. Many people declare symbols using macros for various reasons, so you end up with things like DEFINE_FUNCTION(foo) {. In order to get a complete list of symbols you need to preprocess it, this requires knowing what the compiler flags are. Nobody really knows what their compiler flags are because they are hidden between multiple levels of indirection and a variety of build systems.
> C is so much worse than that. Many people declare symbols using macros for various reasons, so you end up with things like DEFINE_FUNCTION(foo) {.
That’s not really C; that’s a C-based DSL. The same problem exists with Lisp, except even worse, since its preprocessor is much more powerful, and hence encourages DSL-creation much more than C does. But in fact, it can happen with any language - even if a language lacks any built-in processor or macro facility, you can always build a custom one, or use a general purpose macro processor such as M4.
If you are creating a DSL, you need to create custom tooling to go along with it - in the ideal scenario, your tools are so customisable that supporting a DSL is more about configuration than coding something from scratch.
Yes, the usefulness of macros always has to be balanced against their cost. I know of only one codebase that does this particular thing though, Emacs. It is used to define Lisp functions that are implemented in C.
Not JavaScript. Cool kids never write “function” any more, it’s all arrow functions. You can search for const, which will typically work, but not always (could be a let, var, or multi-const intializer).
I want to talk to the developer who considers greppability when deciding whether to use the "function" keyword but requires his definitions to be greppable by distancing them from their call locations. I just have a few questions for him.
You can search for both: "function" and "=>" to find all function expressions and arrow function expressions.
All named functions are easily searchable.
All anonymous functions are throw away functions that are only called in one place so you don't need to search for them in the first place.
As soon as an anonymous function becomes important enough to receive a label (i.e. assigning it to a variable, being assigned to a parameter, converting to function expression), it has also become searchable by that label too.
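To illustrate with invented names, both forms end up greppable, just by different patterns:

```
// Found by grepping for "function doTheThing":
function doTheThing(): void {}

// Found by grepping for the label it was given, e.g. "doTheOtherThing =":
const doTheOtherThing = (): void => {};
```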
I used to define functions as `funcname (arglist)`
And always call the function as `funcname(args)`
So definitions have a space between the name and arg parentheses, while calls do not. Seemed to work well, even in languages with extraneous keywords before definitions since space + paren is shorter than most keywords.
Nowadays I don't bother, since it really isn't that useful, especially with tags or an LSP.
I still put the return type on a line of its own, not for search/grep, but because it is cleaner and looks nice to me—overly long lines are the ugliest of coding IMO. Well that and excessive nesting.
> Meanwhile C lacks any such keyword, so the best you can do is search for the name. That gets you a sea of callers with the declarations and definitions mixed in
That’s why in my personal projects I follow classic “type\nname” and grep with “^name\>”.
> looks ugly
Single line definitions with long, irregular type names and unaligned function names look ugly. Col 1 names are not only greppable but skimmable. I can speedscroll through code and still see where I am.
Yet you reply to an article that defines functions as variables, which I've seen a lot of developers do, usually for no good reason at all.
To me, that's a much more common and worse practice with regards to greppability than splitting identifiers using strings, which I haven't seen much in the wild.
Although in Rust, function-like macros make it super hard to trace code. I like them when I am writing the code and hate them when I have to read others' macros.
In the bygone days of ctags, C function definitions included a space before opening parenthesis, while function calls never had that space. I have a hard time remembering that modern coding styles never have that space and my IDE complains about it. (AFAIK, the modern gtags doesn't rely on that space to determine definitions.) Even without *tags, the convention made it easy to grep for definitions.
One thing which works for C is to search for something like `[a-z] foo\(.+\) \{`, assuming the spacing matches the coding style. Often the shorter form `[a-z] foo\(` works well; it tries to ensure there is a type definition, and not an assignment or something, before the name. Then there are only a handful of false positives.
Not sure this is very true for Common Lisp. A classic example is accessor functions, where the generic function is created by whichever class is defined first, and the method where the class is defined. Other macros will construct new symbols for function names (or take them from the macro arguments).
That’s true, but I regard it as fairly minor. Accessor functions don't have any logic in them, so in practice you don’t have to grep for them. But it can be confusing for new players, since they don't know ahead of time which ones are accessors and which are not.
This is why in C projects libs go in "lib/" and sources go in "src/". If your header files have the same directory structure as libs, then "include/" is a also a decent way to find definitions.
C has "classical" tooling like Cscope and Exuberant Ctags. The stuff works very well, except on the odd weird code that does idiotic things that should not be done with preprocessing.
Even for Lisp, you don't want to be grepping, or at least not all the time for basic things.
For TXR Lisp, I provide a program that will scan code and build (or add to) your tags file (either a Vim or Emacs compatible one).
Given
(defstruct point ()
  x
  y)
it will let your editor jump to the definition of point, x and y.
Python is the only one mentioned that "actually works" without endless exceptions to the rule in the normal case. The others mentioned (Rust/Javascript/Lisp/Go) all have specific syntax, commonly enough used, that makes them harder to search. Possible, absolutely, but still harder.
Do people really use text search for this rather than an IDE that parses all of the code and knows exactly where each declaration is, able to instantly jump to them from a key press on any usage...? Wild.
Yes. Not everyone uses or likes an IDE. Also, when you lean on an IDE for navigation, there is a tendency to write more complicated code: since it feels easy to navigate, you don't feel the pain.
That’s right, not everyone uses an LSP. Nothing wrong with LSPs, very useful tools. I use ripgrep, or plain grep if I have to, far more often than an LSP.
Working with legacy code — the scenario the author describes — I often can’t install anything on the server.
Rust though does lose some of those points by more or less forcing[1] snake_case. It's really annoying to navigate bindings which are converted from camelCase.
I don't care which case is used. It's a trivial superficial thing, and tribal zealotry about such doesn't reflect well on the language and community.
[1] The warnings can be turned off, but in some cases it requires ugly hacks, and the community seems to be actively hostile to making it easier.
The Rust community is no more zealous about naming conventions than any other language which has naming conventions. Perhaps you're arguing against the concept of naming conventions in general, but that's not a Rust thing, every language of the past 20 years suggests naming conventions if for no other reason than every language provides a standard library which needs to follow some sort of naming conventions itself. Turning off the warnings emitted by the Rust compiler takes two lines of code, either at the root of the crate or in the crate manifest.
Hard agree with the idea of greppability, but hard disagree about keeping names the same across boundaries.
I think the benefit of having one symbol exist in only one domain (e.g. “user_request” only showing up in the database-handling code, where it’s used 3 times, and not in the UI code, where it might’ve been used 30 times) reduces more cognitive load than is added by searching for 2 symbols instead of 1 common one.
I’ve also found that I sometimes really like when I grep for a symbol and hit some mapping code. Just knowing that some value goes through a specific mapping layer and then is never mentioned again until the spot where it’s read often answers the question I had by itself, while without the mapping code there’d just be no occurrences of the symbol in the current code base and I’d have no clue which external source it’s coming from.
Probably depends on how your system is structured. If you know you only want to look in the DB code, hopefully it's either all in one place or there's a folder naming pattern you can use to limit where you search.
The upside of doing it this way is that it makes your grepping more flexible: you can search just one part of the codebase to see, say, the DB code, or search everything to see all the DB and UI things using the concept.
Not to mention the readability hit from identifiers like foo.user_request in JavaScript, which triggers both linters and my own sense of language convention.
Both of those are easy to fix. You'll adapt quickly if you pick a different convention.
Additionally, I find that in practice such "unusual" code is actually beneficial - it often makes it easy to see at a glance that the code is somehow in sync with some external spec. Especially when it comes to implicit usages such as in (de)serialization, noticing that quickly is quite valuable.
I'd much rather trash every languages' coding conventions than use subtly different names for objects serialized and shared across languages. It's just a pain.
I agree that code searchability is a good thing but I disagree with those examples. They intentionally increase the chance of errors.
Maybe there’s an alternative way to achieve what the author set out but increasing searchability at the cost of increasing brittleness isn’t it for me.
The input string and output are coupled. If you add string conditionals as the author did, you introduce the chance of a mismatch between the input and output.
const getTableName = (addressType: 'shipping' | 'billing') => {
  if (addressType === 'shipping') {
    return 'shipping_addresses'
  }
  if (addressType === 'billing') {
    return 'billing_addresses'
  }
  throw new TypeError('addressType must be billing or shipping')
}
Similarly, flattening dictionaries for readability introduces the chance of a random typo making our lives hell. A single typo in such repetitions will be awful.
Typos aren’t unlikely. In a codebase I work with, we have a perpetually open ticket about how ARTISTS is mistyped as ATRISTS in a similarly flat enum.
The issue can’t be solved easily because the enum is now copied across several codebases. But the ticket has a counter for the number of developers that independently discovered the bug and it’s in the mid two digits.
Typos are find-and-fix-once, while unsearchability is a maintenance burden forever.
I don't think coupling variable names by making sure they contain the same strings is the best way to show they're related, compared to an actual map from address type to table name. There might be a lot of things called 'shipping' in my app, only some of which are coupled to `shipping_addresses`.
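A sketch of that alternative, reusing the invented table names from the snippet above: the relation is stated once as data, both string literals stay fully greppable, and the compiler rejects unknown address types.

```
const TABLE_BY_ADDRESS_TYPE = {
  shipping: 'shipping_addresses',
  billing: 'billing_addresses',
} as const;

type AddressType = keyof typeof TABLE_BY_ADDRESS_TYPE; // 'shipping' | 'billing'

const getTableName = (addressType: AddressType) =>
  TABLE_BY_ADDRESS_TYPE[addressType];
```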
Shouldn't a linter be able to catch that there is no enum member called MyEnum.ATRISTS, or is it not an actual enum?
> The input string and output are coupled. If you add string conditionals as the author did, you introduce the chance of a mismatch between the input and output.
I think it depends on whether the repetition is accidental or intrinsic. Does the table name happen to contain the address type as a prefix, or does it intrinsically have to? Greppability aside, when things are incidentally related, it's often better to repeat yourself to not give the wrong impression that they're intrinsically related. Conversely, if they are intrinsically related (i.e. it's an invariant of the system that the table name starts with the address type as a prefix) then it's better for the code to align with that.
What happens when translation files get too big and you want to split and send only relevant parts?
Like send only auth keys when user is unauthenticated?
`return translations[auth][login]` is no longer possible.
Or just imagine you want to iterate through `auth` keys. _shudders_
Entrenched typos like ATRISTS are actually a greppability goldmine. Chances are there are other occurrences of pluralized people-who-make-art in the codebase, but ATRISTS is unambiguously the one from that enum.
I certainly would not suggest deliberately mistyping, but there are places where the benefit is approaching the cost. Certain log messages can absolutely benefit from subtle letter garbling that retains readability while adding uniqueness.
We also had a livegrep instance that we could use to grep any corporate repo, regardless of where it was hosted. That was extremely useful for investigating failures in build scripts that spanned multiple repositories (e.g. building a Go sidecar that relies on a service config in the Java monorepo).
I don't think we need to restrict the benefits quite that much: if it's a project that isn't my day-to-day but is in a language I already have set up in my IDE, I'd much prefer to open it up in my IDE and use jump-to-definition and friends than to try to grep and hope that the developers made it greppable.
Going further, I'd equally rather have plugins ready to go for every language my company works in and use them for exploring a foreign codebase. The navigation tools all work more or less the same, so it's not like I need to invest effort learning a new tool in order to benefit from navigation.
> Yes, an IDE will save you time daily driving. But there's no reason to sabotage all the other usecases.
Certainly don't sabotage, but some of these suggestions are bad for other reasons that aren't about grep.
For example: breaking the naming conventions of your language in order to avoid remapping is questionable at best. Operating like that binds your business logic way too tightly to the database representation, and while "just return the db object" sounds like a good optimization in theory, I've never not regretted having frontend code that assumes it's operating directly on database objects.
Bold of everyone here to assume that everyone has a day-to-day project. If you're a consultant, or you're otherwise switching projects on a month-to-month basis, greppability is probably the top metric, second only to unit test coverage.
You need a better IDE.
> - You just opened the project for the first time.
Go grab a coffee
> - It's in a language you don't daily drive
Jetbrains all products pack, baby.
> - You haven't even downloaded the project and are checking things out in github (or some similar site for your project).
On GitHub, press `.` to open it in a web-based vscode. Download it & open it in your IDE while you are doing this.
> - You're remoting via SSH and have access to code there (say it's a python server).
Don't do this. Check the git hash that was deployed and checkout the code locally.
LSP-based tools are generally fine with this, as long as compile_commands.json or equivalent is available. A purely syntactic understanding is an incomplete solution; I suspect GP meant LSP.
Many of those other caveats are non-issues once LSPs are widespread. Even GitHub has LSP-like go-to-def/go-to-ref, though it's not perfect.
Your other points make sense, but in this case, at least for C/C++, you can generate a compile_commands.json that will let clangd interpret your code accurately.
If building with make just do `bear -- make` instead of `make`. If building with cmake pass `-DCMAKE_EXPORT_COMPILE_COMMANDS=1`.
- the project is large enough that the IDE can't cope.
- you want to also match comments, commented out code or in-project documentation
- you want fuzzy search and match similarly named functions
I use clangd integration in my IDE all the time, but often brute force is the right solution.
I attribute that mostly to my several decades of experience with vi(m) and command line tools, not to anything inherently bad about VSCode.
What counts as “better” tools has a lot of subjectivity and circumstances implied. No one set of tools works for everyone. I very often have to work over ssh on servers that don’t allow installing anything, much less Node and npm for VSCode, so I invest my time in the tools that always work everywhere, for the work I do.
The main project I’ve worked on for the last few years has a little less than 500,000 lines of code. VSCode’s LSP fairly often takes a few seconds to update its indexes. Running ctags over the same code takes about a second, and I can control when that happens. vim has no delays at all, and ripgrep can search all of the files in a second or two.
That won’t make LSP go any faster though. There’s still something interesting in the fact that a ripgrep of every line in the codebase can still be faster than a dedicated tool.
A kind of in-between I've found for some search and replace action is comby (https://comby.dev/). Having a matching braces feature is a godsend for doing some kind of replacements properly.
And grep cuts right through that in a pretty universal way. What the post describes are just ways to not work against grep to optimize for something ephemeral.
For example if I want to rename all “Dog” (DogModel, DogView, DogController) symbols to “Wolf”, find/replace is much better at that because it will tell me about symbols I had forgotten about.
[1] https://www.youtube.com/watch?v=MZPR_SC9LzE
Some language servers support modifying the symbols in contexts like docstrings as well.
However, it does suggest that there is an opportunity for factoring "Dog" out in the code, at least by namespacing (e.g. Dog.Model).
I don't map out projects in terms of semantics, I map out projects in files and code. That makes querying intuitive, and I can easily compose queries that match the specificity of what I care about (e.g. I might want to find a `Server` but I want to show classes, interfaces, and abstract classes).
For the specific toolchain I'm using - typescript - the symbol search is also unusable once it hits a certain project size, it's just way too slow for it to be part of my core workflow
They're either incomplete (you don't get ALL references or you get false references) or way too slow (>10 seconds when rg takes 1-2).
Recommendations are most welcome.
If your feelings are anemic when tasked with doing a grep, it's because you have lost a very valuable skill by delegating it to a computer. There are some things the IDE is never going to be able to find, lest it become the development environment, so keeping your grep fu sharpened is wise beyond the decades.
(Disclaimer: 40 years of software development, and vim+cscope+grep/silversearcher are all I really need, next to my compiler..)
Also, I can clearly remember switching from vim/emacs to Microsoft Visual Studio (please, don't throw your tomatoes just yet!). I was blown away by IntelliSense. Suddenly, I was focusing more on writing business logic, and less time searching for APIs.
At the end of the day, I'm here to solve problems, and there's no end to them -- might as well get a head start.
I'm not feeling anemic. The tool is anemic, as in, underpowered. It returns crap you don't want, and doesn't return stuff you do want.
My grep-fu is fine. It's a perfectly good tool if you have nothing better. But usually you do have something better.
Using the wrong tool to make yourself feel cool is stupid. Using the wrong tool because a good tool could make you lazy shows a lack of respect for the end result.
1. <http://jamie-wong.com/2013/07/12/grep-test/>
Sometimes it's because I don't completely trust the IDE to find everything I'm interested in (justifiably; sometimes it doesn't). Sometimes it's because I'm not looking to dive into the code and do serious work on it; I'm just doing a quick drive-by check/lookup for something. Sometimes it's because I'm ssh'd into another machine and I don't have the ability to easily open the sources in an IDE.
But being able to grep is really nice when trying to figure something out about a source tree that I don't yet have set up to compile, nor am I a developer of. I.e., I've downloaded the source for a tool I've been using pre-built binaries of and am now trying to trace why I might be getting a particular error.
- dynamically built identifiers: 100% correct, never do this. It breaks both text search and symbol search and results in complete garbage code. I had to deal with bugs in early versions of docker-compose because of this.
- same name for things across the stack? Shouldn't matter, just use find usages on `getAddressById`. Also easy way to bait yourself because database fields aren't 1:1 with front-end fields in anything but the simplest of CRUD webshit.
- translation example: the fundamental problem is using strings as keys when they should be symbols (see the sketch after this list). Flat vs nested is irrelevant here because you should be using neither.
- react component example: As I mentioned in another comment, trivially managed with Find Usages.
Nothing in here strikes me as "routinely solves harder problems," it's just standard web dev.
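For what it's worth, a sketch of the symbols-over-strings point from the translation bullet above (a hypothetical shape, not from the article):
// keys are plain nested identifiers, not dotted strings
const messages = {
  auth: {
    login: { title: 'Login', emailLabel: 'Email', passwordLabel: 'Password' },
    register: { title: 'Register', emailLabel: 'Email', passwordLabel: 'Password' },
  },
} as const

// find-usages, rename, and typo detection now come from the compiler
const title = messages.auth.login.title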
With a legacy codebase, or a fork of a dependency that had to be patched and uses an incompatible build system, or any C/C++/Obj-C/etc. code that heavily uses the preprocessor or nonstandard build practices, or codebases that mix lots of different languages over awkward FFI boundaries, and so on and so forth -- there are so many situations where an IDE just can't get you 100% of the way there and you have to revert to grepping to do any real work.
That being said, I don't fully support the idea of handcuffing your code in the name of greppability, but I think dismissing it as a metric under the premise that IDEs make grepping "obsolete" is a little hasty.
I wish, but no. I've found people will make a mess of everything. Which is why I don't trust solutions that rely on humans having more discipline, like what this article advocates.
In any situation where grep is your last saviour, you cannot rely on the greppability of the code. You'll have to check and double check everything, and still accept the risk of errors.
You can maybe skip the greppability if the code base is of a size that you can hold the rough shape and names in your head, but a "get a list of things that sound like they might be related to my problem" operation is still extremely helpful. And it's also worth keeping in mind that greppability matters to onboarding.
Does that mean it should be an overriding design concern? No. But it does mean that if it's cheap to build greppable, you probably should, because it's a net positive.
You can't imagine how much faster I was than everybody else at answering questions about a large codebase just because I knew how to use ripgrep (on Windows). "Knowing how to grep" is a superpower.
I've used this technique on auditing many code bases including the C family, perl, Visual Basic, C# and SQL.
With this sort of tool, I don't need to look for language-particular parsers--so long as the source is in a text file, this works well.
Plus there are cases where grep is really what you need, for example after updating a particular command line tool whose output changed, I was able to find all scripts which grepped the output of the tool in a way that was broken.
I only use grep to filter the output of CLI tools.
For code, I use my IDE or repository features.
Unfortunately sometimes you can't, and sometimes you can but people can't be arsed, so this is still a consideration.
I am also waiting for world peace! ; )
The great IDEs IntelliJ and Webstorm stopped autosuggesting completions from the symlinked project.
Opened up Sublime Text again. Worked perfectly. That is why Jetbrains and their behemoth IDEs are utter shite.
Write your code to have symmetry and make it easy to grep.
Having dealt with IntelliJ for 3 years due to education stuff, I laughed out loud here. Even VS is better than IDEA.
Just because I have 20 usages of 'shipping_address' doesn't mean I'll have this string 20 times in different places.
Grep has its place and I often need to grep codebases which have been written without much thought towards DX. But writing it nicely allows the LSP to take over.
All of that is trivial to search for with a tool that understands the language.
Completely agreed. The React component example in the article is trivially solvable with any modern IDE; right click on the class name, "Find Usages" (or use the appropriate hotkey, of course). Trying to grep for a class name when you could just do that is insane.
I mainly see this from juniors who don't know any better, but as seen in this thread and the article, there are also experienced engineers who are stubborn and refuse to use tools made after 1990 for some reason.
Before calling people stubborn or assuming they got left behind out of ignorance, consider your assumptions. 40+ years experience, senior in both experience and age at this point. Long-term vim + command line tools user.
Do you have any evidence that shows "A good IDE alone will save you so much time?" Have you seen studies comparing productivity or code quality or any metric written by people using IDEs vs those using a plain editor with grep?
By "so much faster" what do you mean exactly? I have decades of experience with vim + ctags + grep (rg these days, because I don't want to get called a stubborn stick in the mud). I can find and change things in large codebases pretty fast. I used VSCode for a year on the same codebases and I didn't feel "so much faster," and I committed to it and watched numerous how-to videos and learned the tool well enough to train other programmers on it. No 10x improvement, not even 1.5x. For most tasks I would call it close to the same in terms of time taken to write code. After getting burned a couple times with "Replace symbol" in VSCode I stopped trusting it. After noticing the LSP failed to find some references I trusted it less. I know grep/ack/rg/ctags aren't perfect, but I also know their weaknesses and how to work with them to get them to do what I want. After a year I went back to vim + ctags + rg.
We might have more productive (and friendly) interactions as programmers if we remembered that not everyone works the same way, or on the same kind of code and projects. What we call "best practices" or "modern tools" largely come down to familiarity, received wisdom, opinion, and fashion -- almost never from rigorous metrics and testing. You like your IDE? Great! I like my tools too. Would either of us get "so much faster" using a different set of tools? Probably not. Trying to find the silver bullet that reduces accidental complexity in software development presents an ongoing challenge, but history shows that editors and IDEs don't do much because if they did programmers today would outperform old guys like me by 10x in a measurable way.
At the last full-time job I had, at an educational software company with 30+ programmers, everyone used Eclipse. My first day I got a new desktop with two big monitors, Eclipse installed, ready to go. I installed vim and the CLI subversion client and some other stuff and worked from the command line, as I usually do. I left one of the monitors off, I don't need that much screen space, and I don't have Twitter and Facebook and other junk running on a second monitor all day like most of the other people did. I got made fun of, old man using old tools. Then once a week, like clockwork, Eclipse would auto-install some updates and everyone came to a halt trying to resolve plugin version conflicts, getting the team in sync. Hours and hours wasted regularly just getting the IDE to work. That didn't affect me, I never opened Eclipse. Watching the other programmers it seemed really slow. So just maybe Eclipse could jump to a definition faster than vim + ctags (I doubt it), but amortized over a month Eclipse all by itself wasted more time than anyone possibly saved with the more powerful tool. Anecdote, I know, but I've seen this play out in similar ways at more than one shop.
Just last year a new hire at a place I freelance for spent days trying to get Jetbrains PHPStorm working on a shared remote dev server. Like VSCode it runs a heavy process on the server (including the LSP). Unlike VSCode, PHPStorm can actually kill the whole server, wasting everyone's time and maybe losing work. I have never seen vim or grep bring a whole server down. I could add up how much "faster" PHPStorm might turn out compared to vim, but it will have to recoup the days lost trying to get it to work at all first.
I implemented a format-agnostic search that can match patterns across various naming conventions like camelCase, snake_case, PascalCase, kebab-case. If needed, I'll integrate in space-separated words.
I've just published the tool to PyPI, so you can easily install it using pip (`pip install super-grep`), and then you just run it from the command line with `super-grep`. You can let me know if you think there's a smarter name for it.
Source: https://www.github.com/msmolkin/super-grep
If you do, email a link to hn@ycombinator.com and we'll put it in the second-chance pool (https://news.ycombinator.com/pool, explained at https://news.ycombinator.com/item?id=26998308), so it will get a random placement on HN's front page.
So if I'm trying to locate the error message "because the disk is full" but it's in the code as:
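(say, hypothetically, something like this sketch, with the phrase split across concatenated fragments:)
// hypothetical sketch: the phrase is broken across string fragments,
// so a line-based search for "because the disk is full" never matches
const message = 'cannot save your changes ' +
  'because the disk ' +
  'is full'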
then it will fail. So really, combining both our use cases, what would be great is to simply search for a given case-insensitive alphanumeric string in files, skipping all non-alphanumeric characters.
So if I search for the alphanumeric-only form of the phrase, it would match all of the variants. And then in the search results, even if you get some accidental hits, you can be happy knowing that you didn't miss anything. But the string split thing you mentioned happens a lot when searching for OpenStack error messages in Python, which are often split across lines like you showed. My current solution is to randomly shift what I'm searching for, or try to pick the most unique line.
(`first\S?name` is usually better: by excluding whitespace it avoids matching prose comments describing the thing, but `.` is easier to remember and type, so I usually just do that)
Let's say someone made a plugin for their favorite IDE for this kind of search. What would the details look like?
To keep it simple, let's assume we just do the super-case-insensitivity, without the other regex condition. Let's say the user searches for "first_name" and wants to find "FirstName".
One simple solution would be to have a convention for where a word starts or ends, e.g. with " ". So the user would enter "first name" into the plugin's search field. The plugin turns it into "/first[-_]?name/i" and gives this regexp to the normal search of the IDE.
Another simple solution would be to ignore all word boundaries. So when the user enters "first name", the regexp would become "/f[-_]?i[-_]?r[-_]?s[-_]?t[-_]?n[-_]?a[-_]?m[-_]?e[-_]?/i". Then the search would not only be super-case-insensitive, but super-duper-case-insensitive. I guess the biggest downside would be that this could get very slow.
I think implementing a plugin like this would be trivial for most IDEs that support plugins.
Am I missing something?
> So the user would enter "first name" into the plugin's search field.
Why wouldn't the user just enter "first_name" or "firstName" or something like that? I'm thinking about situations like, you're looking at backend code that's snake_cased, but you also want it to catch frontend code that's camelCased. So when you search for "first_name" you automagically also match "firstName" (and "FirstName" and "first-name" and so on). I wouldn't personally introduce some convention that adds spaces into the mix, I'd simply convert anything that looks snake/kebab/pascal/camel-cased into a regex that matches all 4 forms.
Could even be as stupid as converting "first_name" or "firstName" or "FirstName" etc. into "first_name|firstname|first-name", no character classes needed. That catches pretty much every naming convention, right? (assuming it's searched for with case insensitivity)
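A minimal sketch of that conversion (hypothetical helper name; it only handles the four conventions mentioned):
// sketch: normalize camelCase/PascalCase/snake_case/kebab-case queries
// into one case-insensitive pattern with optional separators
function greppablePattern(query: string): RegExp {
  const words = query
    .replace(/([a-z0-9])([A-Z])/g, '$1 $2') // break camel/Pascal humps
    .split(/[-_\s]+/)                        // split on '-', '_' and spaces
    .filter(Boolean)
  return new RegExp(words.join('[-_]?'), 'i')
}

greppablePattern('first_name').test('FirstName') // true
greppablePattern('firstName').test('first-name') // true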
If you're going that far, and you're in a context which probably has a parser for the underlying language ready at hand, you might as well just convert all tokens to a common format and do the same with the queries. So searches for foo-bar find strings like FooBar because they both normalize to foo_bar.
Then you can index by more than just line number. For instance you might find "foo" and "bar" even when "foo = 6" shows up in a file called "bar.py" or when they show up on separate lines but still in the same function.
* my understanding was simply that the regex would (A) recognize `[a-z][A-Z]` and inject optional _'s and -'s between... and (B) notice mid-word hyphens or underscores and switch them to search for both.
So you'd search for "/first\_name/i".
Basically in vim to substitute text you'd usually do something with :substitute (or :s), like:
:%s/textToSubstitute/replacementText/g
...and have to add a pattern for each differently-cased version of the text.
With the :Subvert command (or :S) you can do all three at once, while maintaining the casing for each replacement. So given these three forms in the buffer:
textToSubstitute
TextToSubstitute
texttosubstitute
running:
:%S/textToSubstitute/replacementText/g
...results in:
replacementText
ReplacementText
replacementtext
[1] https://www.gnu.org/software/emacs/manual/html_node/emacs/Re...
:S/textToFind
matches all of: textToFind, TextToFind, texttofind, TEXTTOFIND.
But not TeXttOfFiND.
Golly!
The other killer feature of nimgrep is that instead of regex, you can use a PEG grammar [1]
Improving the IDE to find one or the other by searching for one or the other is missing the point of the article, that consistency is important.
I'd rather have a simple IDE and a good codebase than the opposite. In the example that I gave, the worst thing is that it's the framework which forces you to use these two names for the same thing.
I didn't miss the point, I disagreed with the point because I think it's a tool problem, not a code problem. I agree with most other points in the article.
I know that they invented "curl". Do you tar xfz?
Anyway, to answer your question:
:) [edit]: Ironically, grep in that list is not the same word as the one OP is talking about. That one is actually based on grepp, with the double p. grep means pitchfork.
The german equivalent of the word would be probably "greifbar". Being able to hold something, usually used metaphorically.
(Norwegian here. Our languages are similar, but we miss this one.)
More customarily: intelligibility.
Grijpbaarheid
I never saw grep as grijp
I guess I do now
(Dutch btw)
When I was first onboarding to this project, I was tasked with updating a component and simply tried to find three of the words I saw in the UI, and this was before we implemented a straightforward path-based routing system. It took me far too long just to find what I was going to be working on, and that's the day I distinctly remember learning this lesson. I was pretty junior, but I'd later return to this code and threw it all away for a number of easily greppable strings.
As opposed to "1 objects" or "1 object(s)". A UI filled with "(s)", ughh
This is roughly the logic:
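A sketch of the standard rule (hypothetical form names; it's the same shape gettext's Plural-Forms entry uses for Polish):
// sketch: which plural form Polish uses for a count n
// 1 plik / 2-4 pliki (but not 12-14) / 5+ plików
function polishPluralForm(n: number): 'one' | 'few' | 'many' {
  if (n === 1) return 'one'
  const mod10 = n % 10
  const mod100 = n % 100
  if (mod10 >= 2 && mod10 <= 4 && (mod100 < 12 || mod100 > 14)) return 'few'
  return 'many'
}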
Basically pluralizing words in Polish is a fizz-buzz problem :) In other Slavic languages it should be similar, BTW. https://www.foo.be/docs/tpj/issues/vol4_1/tpj0401-0013.html
Even if you have, say, 5 sequential related structs, that are all virtually identical, all written on one line so that the similarities and differences are obvious at a mere glance... Then someone comes through and touches my file, and while they're at it, "fix" the line that went 2 characters past the 80 mark by reformatting the 4th struct to span several lines. Now when you see that list of structs, you wonder "why is this one different?" and you have to read carefully to determine, nope, it just contained one longer string. Or god forbid they reformat all the structs to match, turning a 1-page file into 3 pages, and making it so you have to read and understand each element of each struct just to see what's going on.
If I could have written the rule of thumb, I would have said "No logic or control shall happen after the end of the gutter." But if there's a paragraph-long string on one line- who cares?? We all have a single keystroke that can toggle soft-wrap, and the odds that you're going to need to know anything about that string other than "it's a long string" are virtually nil.
Sorry. I got triggered. :-)
From the Zen of Python: "Special cases aren't special enough to break the rules. Although practicality beats purity." https://peps.python.org/pep-0020/
I'm fairly firmly in the "wrap at 80" camp by the way; but sometimes a tad longer just makes sense. Or shorter for that matter: forced removal of line breaks is just as bad.
The "top-level declarations" in source files are exactly: package, import, const, var, type, func. Nothing else. If you're searching for a function, it's always going to start with "func", even if it's an anonymous function. Searching for methods implemented by a struct similarly only needs one to know the "func" keyword and the name of the struct.
Coming from a background of mostly Clojure, Common Lisp, and TypeScript, the "greppability" of Go code is by far the best I have seen.
Of course, in any language, Go included, it's always better to rely on static analysis tools (like the IDE or LSP server) to find references, definitions, etc. But when searching the code of some open source library, I always resort to ripgrep rather than setting up a development environment, unless I find something that I want to patch, in which case I set up the development environment and rely on LSP instead of grep to discover definitions and references.
> A piece of Go source code should avoid unnecessary repetition. One common source of this is repetitive names, which often include unnecessary words or repeat their context or type. Code itself can also be unnecessarily repetitive if the same or a similar code segment appears multiple times in close proximity.
https://google.github.io/styleguide/go/decisions#repetitive-...
(see also https://google.github.io/styleguide/go/best-practices#avoid-...)
This is the style rule that motivates the sibling comment about method names being split between method and receiver, for what it's worth.
I don't think this use case has received much attention internally, since it's fairly rare at Google to use grep directly to navigate code. As you suggest, it's much more common to either use your IDE with LSP integration, or Code Search (which you can get a sense of via Chromium's public repository, e.g. https://source.chromium.org/search?q=v8&sq=&ss=chromium%2Fch...).
on edit: I see someone discussed that you can grep for both arrow functions and named function at the same time and I suppose you can also construct a query that handles a function constructor as well - but this does not really handle curried functions or similar patterns - I guess at that point one is letting the perfect become the enemy of the good.
Most people grepping know the code base and the patterns in use, so they probably only need to grep for one type of function declaration.
That’s not really C; that’s a C-based DSL. The same problem exists with Lisp, except even worse, since its preprocessor is much more powerful, and hence encourages DSL-creation much more than C does. But in fact, it can happen with any language - even if a language lacks any built-in processor or macro facility, you can always build a custom one, or use a general purpose macro processor such as M4.
If you are creating a DSL, you need to create custom tooling to go along with it; in the ideal scenario, your tools are so customisable that supporting a DSL is more about configuration than coding something from scratch.
Other than that, functions should be defined by the keyword.
You can search for both: "function" and "=>" to find all function expressions and arrow function expressions.
All named functions are easily searchable.
All anonymous functions are throw away functions that are only called in one place so you don't need to search for them in the first place.
As soon as an anonymous function becomes important enough to receive a label (i.e. assigning it to a variable, being assigned to a parameter, converting to function expression), it has also become searchable by that label too.
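A tiny sketch of that progression (hypothetical names):
const items = [{ id: 'a' }, { id: 'b' }]

// throwaway: called in exactly one place, nothing worth searching for
items.map(item => item.id)

// promoted to a label: now greppable as "toId"
const toId = (item: { id: string }) => item.id
items.map(toId)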
And always call the function as `funcname(args)`
So definitions have a space between the name and arg parentheses, while calls do not. Seemed to work well, even in languages with extraneous keywords before definitions since space + paren is shorter than most keywords.
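For illustration, a sketch of that convention (hypothetical function name):
// definition: space before the parameter list, so searching "getUser (" finds only this
function getUser (id: string) {
  return { id, name: 'example user' }
}

// call sites: no space, so searching "getUser(" finds only usages
const user = getUser('42')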
Nowadays I don’t bother since it really isn’t that useful, especially with tags or LSP.
I still put the return type on a line of its own, not for search/grep, but because it is cleaner and looks nice to me—overly long lines are the ugliest of coding IMO. Well that and excessive nesting.
That’s why in my personal projects I follow classic “type\nname” and grep with “^name\>”.
> looks ugly
Single line definitions with long, irregular type names and unaligned function names look ugly. Col 1 names are not only greppable but skimmable. I can speedscroll through code and still see where I am.
To me, that's a much more common and worse practice with regards to greppability than splitting identifiers using strings, which I haven't seen much in the wild.
int
foo(void) { }
vs the Linux coding style:
int foo(void) { }
The BSD style allows me to find function definitions using git grep ^foo.
Many people insist that IDEs make the entire point moot, but that's the kind of thing that make IDEs easier to write and debug, so I disagree.
Most of us use Exuberant Ctags to allow jumping to definitions.
This is why in C projects libs go in "lib/" and sources go in "src/". If your header files have the same directory structure as libs, then "include/" is a also a decent way to find definitions.
Even for Lisp, you don't want to be grepping, or at least not all the time for basic things.
For TXR Lisp, I provide a program that will scan code and build (or add to) your tags file (either a Vim or Emacs compatible one).
Given
it will let your editor jump to the definition of point, x and y. It's a hassle. But not the end of the world.
I usually search for "doTheThing\(.+?\) \{" first.
If I don't get a hit, or too many hits I move to "doTheThing\([^\)]*?\) \{" and so on.
Working with legacy code — the scenario the author describes — I often can’t install anything on the server.
...use source code tagging or LSP.
I don't care which case is used. It's a trivial superficial thing, and tribal zealotry about such doesn't reflect well on the language and community.
[1] The warnings can be turned off, but in some cases it requires ugly hacks, and the community seems to be actively hostile to making it easier.
I think the benefit of having one symbol exist in only one domain (e.g. “user_request” only showing up in the database-handling code, where it’s used 3 times, and not in the UI code, where it might’ve been used 30 times) reduces more cognitive load than is added by searching for 2 symbols instead of 1 common one.
The upside to doing it this way is it makes your grepping more flexible by allowing you to either only search the one part of the codebase to see say DB code or see all the DB and UI things using the concept.
rg -i 'foo.?bar' finds all of foo_bar, fooBar, and FooBar.
Additionally, I find that in practice such "unusual" code is actually beneficial - it often makes it easy to see at a glance that the code is somehow in sync with some external spec. Especially when it comes to implicit usages such as in (de)serialization, noticing that quickly is quite valuable.
I'd much rather trash every languages' coding conventions than use subtly different names for objects serialized and shared across languages. It's just a pain.