Readit News
sqs · a year ago
Sourcegraph CEO here. We made our main internal codebase (for our code search product) private. We did this to focus. It added a lot of extra work and risk to have stuff be open source and public. We gotta stay focused on building a great code search/intelligence product for our customers.

That's what ultimately lets us still do plenty of things for devs and the OSS community:

(1) Our super popular public code search is at https://sourcegraph.com/search, which is the same product customers use internally on their own codebases. We spend millions of dollars annually on this public instance with almost 1M OSS repositories to help out everyone using OSS (and we love when they like it so much they bring it into their company :-).

(2) We also still have a ton of open-source code, like https://sourcegraph.com/github.com/sourcegraph/cody (our code AI tool).

BTW, if any founders out there are wondering whether they should make their own code open-source or public, happy to chat! Email in profile. I think it could make sense for a lot of companies, but more so for infrastructure products or client tools, not so much for full server-side end-user applications.

quantumwoke · a year ago
Been a fan of sourcegraph since 2016 or so, it's been exciting to watch the pivots along the way. That being said, the loss of transparency here is pretty sad, speaking as a large FOSS repo owner. What were the main factors apart from risk that went into the decision?
sqs · a year ago
Thanks for being a fan. And I understand it's a bummer to not have our code be public and open-source anymore. Sorry.

It's a bunch of reasons that add up. I'll give some more details for anyone curious.

(And I know that despite these reasons, lots of HNers probably wish it was not so. I agree! I too wish for a world where all companies could have their code be public and open source.)

- We have a lot of tech around large-scale code graph, indexing, etc., stuff that is very differentiated and hard to build. We were starting to put some of this in separate private repositories and link them in at build time, but that was complex. It added a lot of code complexity, risked bugs, and slowed us down, and if a lot of the awesome stuff was private anyway, what was the point?

- As we've been building Cody (https://cody.dev), our code AI tool, we've seen a LOT more abuse. That's what happens when you offer any free tier of a product with LLM inference. We had to move a lot more of our internal backend abuse logic to private repositories, and it added code complexity to incorporate that private stuff in at build time.

- It confused devs and customers to have 2 releases: an open-source release with less scaley/enterprisey features, and an enterprise release. It was a pain to migrate from one to the other (GitLab also felt this pain with their product) because the open-source build had a subset of the DB schema and other things. It was confusing to have a free tier on the enterprise release (lots of people got that mixed up with the open-source release), and it made our pricing and packaging complex so that lots of our time was spent helping customers understand what is paid and what isn't.

- There were actually very very few companies that were going to pay but then decided to use the open-source version and not pay us. A lot of people probably assume that's why we made this move, but it's not. I think this is because people like the product and see value in it, including all the large-scale code nav/search features that are in our enterprise version.

- Although very very few companies used our open-source version to avoid paying us, we did see it cause a lot of annoyance for devs who were asked by their management to try cloning our product or to research our codebase to give their procurement team ammunition to negotiate down our price. This honestly was just a waste of everyone's time.

- If we got a ton of contributions (we never really solicited any), then it might've changed the calculus. Sourcegraph is an end-user application that you use at work (and when fun-coding, but the primary revenue model is for us to charge companies). For various reasons, end-user server-side applications just don't get nearly as many contributions. Maybe it's because you'd need to redeploy your build for a bunch of other users at your company, not just yourself. Maybe it's because they necessarily entail UX, frontend, and scaling stuff, in addition to just adding new features.

- We heard from people who left GitHub that people at GitHub were frequently monitoring our repository to get wind of our upcoming features and launches. Someone from GitHub told me his "job is to clone Sourcegraph". Since then, they obviously deprioritized their code search to re-found GitHub on AI, so we're not seeing this threat anymore. But I didn't love giving Microsoft an unfair advantage, especially since GitHub products are not open source either.

- Since we made our code non-open-source, we've been able to pursue a lot more big partnerships (e.g., with cloud providers and other distribution partners and resellers). This is a valuable revenue stream that helps us make a better product overall. Again, because Sourcegraph is an end-user application with a UI that devs constantly use and care about, we never really had the MongoDB/Redis/CockroachDB risk of AWS/GCP/Azure just deploying our stuff and cutting us out. We're not protecting from downside here, but we are enjoying the upside because now those kinds of distribution partnerships are viable for us. To give a specific example, within ~2 months of making our code non-open-source last year, we signed a $1M+ ARR deal through a distribution partner that would not have happened if our code was open source. This is not our biggest annual deal, but it's still really nice!

We are totally focused on building the best code search/intelligence and appreciate all our customers and all the feedback here. Hope this helps explain a bit more where we're coming from!

rapnie · a year ago
> (1) Our super popular public code search is at https://sourcegraph.com/search,

Correction: Public code on GitHub.

This looks to be restricted to searching GitHub only. Even though it had "context:global" in the query string, every hit came from GitHub, and I saw none from GitLab, Codeberg, SourceHut, or other self-hosted forges (e.g. Forgejo).

cqqxo4zV46cp · a year ago
I’m sure there are 50 other ways you could categorise all the code that it searches. Nobody said that it exhaustively searches all available open-source code. I’m sure you know that that’s an impossible claim. This isn’t a correction at all. It is, at best, an elaboration. Certainly not worthy of the snark you’re giving. The reality is that GitHub hosts >99% of all open-source source code that anyone really cares about. If you have some philosophical issue with it, that’s fine, but don’t shoot the messenger by attacking individuals.
depr · a year ago
I hope code search will one day be offered at a lower price, so small/medium sized companies can use the product. I'll never be able to convince someone to buy it when it's 3 or more times as expensive as source code hosting, and would in many cases be the most expensive SaaS product per developer seat that the company uses. But it's a great product.
prepend · a year ago
I feel the same way. It’s really interesting and provides cool insights. But it seems hard to explain to myself to spend more on that than GitHub or IDEs.

I’d like to hear more about the value customers get out of it as I wonder if it’s just groups with unlimited budget.

beyang · a year ago
This is in the cards and thank you for the feedback! (Sourcegraph CTO here)
0x1ch · a year ago
$9 to $20 per seat seems pretty average in the grand scheme of SaaS price modelling. I don't work in software development, though; I work in IT.
adhamsalama · a year ago
Why not go the SQLite way? Open source but don't accept external contributions. Literally just dump the code.
cryptonector · a year ago
> Open source but don't accept external contributions.

That's not the key to the SQLite model.

The key to the SQLite model is that their 100% code coverage testsuite is proprietary. You can't credibly fork SQLite3 w/o a similar testsuite because everyone knows that SQLite3 has that testsuite and so it's simply better unless your fork has one too.

This works very well for SQLite because it is the single most widely used piece of software ever. And that is because it solves such an important and universal problem (a local DB RDBMS) with such convenience (embedded, server-less).

The reason that SQLite does not accept contributions is not so much that they don't want to, but that contributors can't contribute changes to the proprietary testsuite, and writing those is harder than writing the contribution, therefore contributions impose a big tax on the SQLite dev team that they prefer not to pay.

Very few other pieces of software have similar universal usage/applicability/convenience stories. Therefore it's not easy to apply the SQLite model to all the things.

Sourcegraph could have a business source-available option. If all you want is to be able to debug problems and/or make contributions and you're a paying customer, then why would that not be enough? SQLite is essentially source-available given that you can neither contribute to nor credibly fork it.

breck · a year ago
I think the term the industry needs to embrace is "Early Source": https://breckyunits.com/earlySource.html

Make everything public domain, fully open source, just delayed by N years.

ezekg · a year ago
There is a term for this, no? https://opensource.org/dosp
mort96 · a year ago
Why?
dudeinjapan · a year ago
Will be lovely to have the source N years after AGI terminates humanity.
a_t48 · a year ago
The open/closed decision is weighing on my mind right now. Our main competition is an open source product, so it feels like it will be a tough sell to not also have the core of our product (a robotics framework) be free. I might shoot you an email.
sqs · a year ago
Cool! I’d love to chat.
BaculumMeumEst · a year ago
This thread reminded me to finally try Cody, I've been bouncing on and off Copilot for a few months. I wish I knew how good this was sooner, and I had no idea there was a generous free tier.
wesleyyue · a year ago
If you're open to trying new AI coding assistants, would love if you can give https://double.bot a try! (note: I'm one of the creators) The main philosophical difference is that we are more expensive and are trying to build the best copilot with the technology possible at any given time. For example, we serve a larger, more accurate, and more modern autocomplete model, but it does cost more to serve. We also do a lot of somewhat novel work in getting the details right, like improving the autocomplete model to never screw up closing brackets, and always auto-close them as if you typed them.
jdorfman · a year ago
If you (or anyone here) are an open source maintainer, please sign up for free Cody Pro credits https://sourcegraph.com/supporting-open-source
cryptonector · a year ago
For businesses, open source is a business tool. Open source can be a goal, naturally, but for-profit entities have a duty to be profitable (or grow, plowing profits into building). I think there's no shame in saying this. You should not need to be elliptical in your public statements about this move. Everyone knows that this is about protecting your ability to monetize the product, and so it should be, and everyone knows this sort of move comes eventually.
bpmooch · a year ago
> (1) Our super popular public code search is at https://sourcegraph.com/search, which is the same product customers use internally on their own codebases. We spend millions of dollars annually on this public instance with almost 1M OSS repositories to help out everyone using OSS (and we love when they like it so much they bring it into their company :-).

If open source wasn't a current marketing fad, you would spend the same amount on other things. You're not doing it because you love open source.

hud_dev · a year ago
| Sourcegraph CEO here.

Seems like you need to get back to your job of CEOing and leave the public outreach to the folks whose job it actually is? If you haven't fired them all? Any publicity person worth their salt will tell you: shut up. Don't talk. Leave it to the professionals. You're making everything worse.

benreesman · a year ago
Your product is really cool. Sometimes it makes sense to iterate in this or that repo.

Obligatory: “Victory has defeated you.”

cxr · a year ago
Yet another person conflating the concepts of publishing code under an open source license and managing a project in public.
nearlyepic · a year ago
It has to be disingenuous, right? These concepts aren't complicated. I wish they would just say "we want to make more money" and stop polluting open-source discourse.
mort96 · a year ago
Huh, in what way does publishing a source tarball alongside a release introduce a lot of work, risk and distraction? Your explanation makes literally no sense.

EDIT: I implore the downvoters to think about this for a second. You can, actually, publish source code for a project without also committing to providing support and documentation and testing across a variety of systems. Publishing a tarball takes very little time and effort.

collingreen · a year ago
Doing a great job on an open source codebase requires a higher level of polish, testing, design, ux, documentation, architecture, and general forethought than internal tools just like any internal vs self serve product.

Only solving your own problems on your own hardware while being able to rely on your own well-informed team to bridge the gaps sounds much much faster and easier to me.

sixhobbits · a year ago
I used to always point to Sourcegraph as a company that really understood dev culture and what it took to make devs happy, so this slow transition has definitely been painful to watch.

Just yesterday someone asked for an example of a public roadmap for a technical product, so I spent some time looking for Sourcegraph's, only to find out that they've also made most of their docs private. The public handbook was an amazing resource before, now it's been moved to Notion, and most of the interesting bits are links to private Google documents (which they used to do only for financial documents and other stuff that obviously needed to stay private).

Sad!

iknownthing · a year ago
I interviewed with them once, they strung me along for about 6 months then ghosted me.
mdaniel · a year ago
As a counterpoint, they scheduled me within days and I left the office with an offer letter

I'm cognizant that company culture is not one fixed thing, so maybe they're way different when you interacted with them versus when I did, I'm just saying I had the opposite experience so I doubt it's a trend

MzHN · a year ago
They also recently(?) silently destroyed[1] their public search index at sourcegraph.com/search. Since GitHub only recently got a working search and even that is behind login, I used to search a lot using Sourcegraph. It even supported searching GitLab.

Now it seems that all GitLab repos are gone from the index and a huge number of GitHub repos as well. If I can't trust the search I'll just have no choice but to fall back to GitHub.

It's a shame since their index was at some point even better than GitHub's own, although GitHub seems to have caught up.

[1] https://community.sourcegraph.com/t/most-public-repos-no-lon...

sqs · a year ago
We still have tons of repositories searchable at https://sourcegraph.com/search, almost a million. We did cull lots of non-GitHub repositories and repositories with fewer stars. It was very complex to keep up with millions of repositories due to GitHub rate limits and scaling. We tried to keep as many as possible while still being able to focus on making a good product for customers (our biggest customer has ~600k repositories).

We're still spending millions of dollars annually to offer public code search, so our intent is certainly not to "destroy" it! If you have repositories you want us to add that are below the star threshold, please post at https://community.sourcegraph.com/t/most-public-repos-no-lon....

elashri · a year ago
Most academic open source projects, except the big names in scientific computing, will not be searchable if you rely on stars as a criterion.
MzHN · a year ago
I appreciate that it is a free service and thank you for the time it worked for me.

At the same time I am a bit sad to see my use cases break. I often resort to more advanced code search when I have really obscure problems, for which the answers might be some old GitHub (or GitLab) repositories. I'm less interested in up-to-date information for those, so a stale index is better than no index for me.

But I can also feel the pain of working with GitHub and GitLab and their rate limits and such.

cryptonector · a year ago
Can you make star count and participant count part of the search? `stars:>99` could be a search term to limit the search to repositories with at least 100 stars.
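To make the idea concrete, here is a rough sketch of how a `stars:` term could be parsed and applied to a list of repositories. This is only an illustration: the `stars:` syntax, the `Repo` type, and both function names are hypothetical, and none of it is taken from Sourcegraph's actual query language or code.

```python
import re
from dataclasses import dataclass


@dataclass
class Repo:
    """Hypothetical repository record; not a Sourcegraph type."""
    name: str
    stars: int


# Hypothetical term syntax: stars:>N, stars:>=N, stars:<N, stars:<=N, stars:N
_STARS_TERM = re.compile(r"^stars:(>=|<=|>|<)?(\d+)$")


def parse_stars_term(term):
    """Turn a term like 'stars:>99' into a predicate over star counts."""
    match = _STARS_TERM.match(term)
    if match is None:
        raise ValueError(f"not a stars term: {term!r}")
    op = match.group(1) or "=="
    n = int(match.group(2))
    checks = {
        ">": lambda s: s > n,
        ">=": lambda s: s >= n,
        "<": lambda s: s < n,
        "<=": lambda s: s <= n,
        "==": lambda s: s == n,
    }
    return checks[op]


def filter_repos(repos, term):
    """Keep only the repositories whose star count satisfies the term."""
    keep = parse_stars_term(term)
    return [r for r in repos if keep(r.stars)]
```

With this sketch, `stars:>99` keeps a repository with exactly 100 stars and drops one with 99, which matches the "at least 100 stars" reading above.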
IceWreck · a year ago
> (our biggest customer has ~600k repositories)

I'm wondering what kind of company has (or needs) 600k repos.

notpushkin · a year ago
https://grep.app/ is another good one. Not sure how many repos they index though.
Alifatisk · a year ago
It says half a million git repos on the main page.
speedgoose · a year ago
It's a bit sad. I forked ~~the last~~ an open-source version some time ago[0]. I removed the telemetry, disabled updates, removed the proprietary code, made a docker image, and implemented some lightweight oauth2/oauth2-proxy authentication.

I plan to keep it running behind Oauth2-Proxy for a long time. It has been very reliable software and because it's behind a supposedly secure proxy, I don't feel bad about not updating it.

[0] https://github.com/SINTEF/sourcegraph

notpushkin · a year ago
Thank you for this!

I think 5.0.6 is the last open source version though. Have you considered updating? (Not sure how viable it would be – seems they've moved quite a few things around)

speedgoose · a year ago
Oh, my memory failed me. I don't know when I will have time to update, but that sounds like something that could be done!
cdchn · a year ago
This is awesome thank you for this.
alin23 · a year ago
Damn, I use Sourcegraph so much for my reverse engineering efforts on macOS. They index all those private framework symbols that people extract on every macOS release, and allow searching for headers and even how they are called by other developers that were ahead of me.

A big part of https://lunar.fyi exists thanks to Sourcegraph search. Even now I'm using it to find a way to enable the second monitor on M3 MacBooks without needing to close the lid [1].

I really hope this is not a sign of them taking back the ability to search in the future.

[1] https://alinpanaitiu.com/blog/turn-off-macbook-display-clams...

sqs · a year ago
Glad you use Sourcegraph! I remember that blog post and thought it was awesome. I am the Sourcegraph CEO, and we haven't changed anything about our public code search at https://sourcegraph.com/search. That's the same product tons of customers use for their internal code, and our public code search is a really important way for us to dogfood, iterate fast, etc.

We just made our own internal codebase private.

jlokier · a year ago
> I am the Sourcegraph CEO, and we haven't changed anything about our public code search at https://sourcegraph.com/search.

But in this other comment (https://news.ycombinator.com/item?id=41298516), you said you have changed public search in two significant ways:

> We did cull lots of non-GitHub repositories and repositories with fewer stars.

Removing low-star repos (and non-GitHub high-star repos) affects users who are looking for obscure or hard-to-find information that's not found anywhere in "popular" repos. I think most of my searches on GitHub (or via Google) are for things in repos with zero stars.

> If you have repositories you want us to add that are below the star threshold [..]

How would I go about finding which repos to request, if my objective is to search the "long tail" for information? That seems like I would need an automated search engine first, to discover the repos :-)

If I found the repo containing specific, obscure or hard-to-find information I was looking for, what would I gain from writing to Sourcegraph asking to add that one repo? By the time I've found the right repo, I've probably found the information I'm going to get from it. Future searches will likely need a different repo, one I don't know about yet. Perhaps that's the nature of long tail searches.

alin23 · a year ago
Ok, so glad to hear that from you directly! Thank you for all the value you’ve put out there for free!

About the codebase part, I don’t have any need for it so I’m not affected by this, but I wonder if it was possible to keep the current state of the code frozen in a public repository and only make private the future work.

That’s how I did it on Lunar, that’s also how the BetterDisplay dev did, it was a good compromise so as to not steal anything that was already free. But of course we don’t have the same business model or licensing needs so I’m pretty sure I’m missing something.

The way I did it is:

- freeze the public code to a new branch “lunar3”
- make a private repo LunarPro which works exactly like the previous Lunar repo
- but on every commit the private repo syncs the code in an encrypted form to the public repo

That way, permalinks remain valid, everything that was free and accessible before is still available in the future and the branch serves as a “compilable” state without any encrypted files.

But again, I’m just one and you’re many, it might get hard to maintain this structure in a team. And some people might still find things to complain about. I know it was that way for me.

welder · a year ago
> I really hope this is not a sign of them taking back the ability to search in the future.

Searching repos seems to be unchanged:

https://sourcegraph.com/search

EMIRELADERO · a year ago
Straight-up making all dev work private is very weird and perplexing. Why would their business model (which they have had for some time, mind you) require not only a proprietary/"open core" license, which I would understand, but complete secrecy around source code itself? What business goal couldn't be accomplished with licensing restrictions alone? And is that difference in potential income generated by this new secret-requiring business model so big that it justifies throwing away the entire "open nature" of the company that has been a core value for most of its existence?
jsiepkes · a year ago
I've seen this multiple times with companies. Another example which went fully closed in an instant is ForgeRock (OpenAM, etc.). Usually it happens when management caves in to complaints from sales, who will complain that being open makes selling the product hard. In the end they will probably find out it's just the sales people's "excuse du jour" and even after closing the source they still don't hit their targets.
jsiepkes · a year ago
The Software Heritage project has archived most of their repos, including the main one [1]. The last crawl seems to be from mid-July 2024.

[1] https://archive.softwareheritage.org/browse/origin/directory...

sunaookami · a year ago
According to the article, this is the current public snapshot: https://github.com/sourcegraph/sourcegraph-public-snapshot
iddan · a year ago
I wonder, of all the people commenting here, how many relied on Sourcegraph and how many actually paid for it. Running an open-source company is hard, just like running any company is. Sometimes you understand there are things you just can't give out for free, and that's part of maturing as a company.
CAP_NET_ADMIN · a year ago
My company looked into paying Sourcegraph many times in the past, but they were prohibitively expensive every time we checked.

It's 49 USD per user per month for Code Search, like what the hell man? It's more than twice as expensive as GitHub Enterprise. Almost twice the cost of GitLab Premium.

At some point it was 100 USD per month per dev. I also remember it being "Starts from 5k USD per year"; you can find some quotes for that in old submissions regarding Sourcegraph going open, closed, open and closed again.

kstrauser · a year ago
That's so often the case. I was recently looking at supply chain security / SBOM software. "Coincidentally", 3 different vendors with 3 very different products quoted us the exact same annual price for the features we wanted, and that price was on the order of magnitude of "hire someone to do this manually full-time".

There are IMO too many companies that have no tier between Free and Enterprise. I understand the desire to focus on a small number of whales, but can't help feeling like that's leaving money on the table from all the smaller companies who'd be willing to pay something in the middle.

pjmlp · a year ago
100% this.

Devs have to learn the hard way to behave like other professionals. Want nice things to stay around?

Pay for the tools.

hk__2 · a year ago
This tool is $49/user/mo. That’s more than the price I pay for a single 12-core + 64GB RAM server!

Edit: Ah, and it’s 50 users minimum, so the starting price is $2450/mo.
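The arithmetic, as a quick sketch. The $49 price and the 50-seat minimum are the figures quoted in this comment; the assumption that a smaller team is simply billed for the minimum is mine, and the function name is made up for illustration.

```python
PRICE_PER_USER_USD = 49  # per user per month, as quoted in this thread
MIN_SEATS = 50           # minimum seat count mentioned above


def monthly_cost(seats, price=PRICE_PER_USER_USD, min_seats=MIN_SEATS):
    """Monthly bill in USD, assuming small teams are billed for the minimum."""
    billed_seats = max(seats, min_seats)
    return billed_seats * price
```

Under that assumption, a 10-person team and a 50-person team both pay the same floor of 50 x $49 per month.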

marcinzm · a year ago
You mean pay for your own tools and then get fired for circumventing the corporate security policies on what tools you can use?
sunaookami · a year ago
Paying doesn't guarantee anything. There are tons of examples of devs selling out even though their program/SaaS is paid.
josephcsible · a year ago
Pretending to embrace open source while you're getting a foothold and then abandoning it as soon as you become successful isn't "maturing". It's pulling the ladder up behind you.