I'm gonna get hated on for this, but I don't think "give back" is an open source concept.
I'm not aware of any Open Source license, or Free license for that matter, that has a give-back clause. Source code is made available to -users-, not prior authors.
Some Open Source licenses (MIT, BSD, etc.) permit use in proprietary code with little more than simple attribution.
Those developers chose that license for a reason, and I've got no problem with commercial entities using that code.
There is a valid argument to be made about the training of models on GPL code. (Argument in the sense that there are two sides to the coin.) On the one hand, we happily train humans on GPL code. Those humans can then write their own functions, but for trivial functions they're gonna look a lot like GPL source.
If the AI is regurgitating GPL code as-is, then that's a problem, not dissimilar to a student or employee regurgitating the same code.
But this argument is about Free software licenses, not really (most?) Open Source licenses.
Either way, OSS/Free is not about "giving back", it's about giving forward.
In the specific case here of Copilot making money, I'd say:
A) you're allowed to make money from Free/OSS code. B) no one is forcing you to use this feature.
> I'm not aware of any Open Source license, or Free license for that matter, that has a give-back clause. Source code is made available to -users-, not prior authors.
In essence, copyleft licenses are exactly that. They oblige the author of a derived work to publish the changes to all users under the same terms. The original authors tend to be users. So, a license which would grant this directly to the original authors would end up providing the same end result since the original authors would be both allowed to and reasonably expected to distribute the derived work to their users as well.
This aligns with the reason why some people publish their work under copyleft licenses: You get my work, for free, and the deal is that if you find and fix bugs then I get to benefit from those fixes by them flowing back to me. Obviously as long as you only use them privately you are not obliged to anything, the copyleft author gives you that option, but once you publish any of this, we all share the results.
That's the spirit here and trying to argue around that with technicalities is disingenuous. That's what Copilot does since it ignores this deal.
The whole Free Software movement started with something really similar to "right to repair": a firmware bug in a printer whose software was proprietary.
Free Software is about being in control of software you use. The spirit was never "contribute back to GNU", the spirit was always "if you take GNU software, you can't make it non-free". Those GNU devs at the time just wanted a good and actually free/libre OS, that would remain free no matter who distributed it.
You are projecting the expectations of modern-day devs, formed in a world of heavily social development thanks to GitHub, onto that era.
You might claim that GP was relying on the technicalities of the licences, but you can actually check the whole FSF philosophy, and you'll note that it aligns perfectly with "giving forward", not "giving back".
Free Software is about user's freedom. Not dev rights, or politeness, etc.
Now obviously, some devs picked copyleft licenses with the purpose of improving their own software from downstream changes (Linus states that is the reason he picked the GPL), but that's a nice side effect, not the purpose. Which, of course, gets confused now that popular social sharing platforms like GitHub are around.
I'm not sure I agree with this as a general point of view.
Speaking generally, I'm not sure that one can claim
>> The original authors tend to be the users
There are endless forks of, say, Emacs, and I expect RMS is not a user of any of them.
Of course RMS is free to inspect the code of all of them, separate out bug fixes from features, and retroactively apply them to his build. But I'm not seeing anything in any license that requires a fork to "push" bug fixes to him.
>> This aligns with the reason why some people publish their work under copyleft licenses: You get my work, for free, and the deal is that if you find and fix bugs then I get to benefit from those fixes by them flowing back to me.
I think you are reading terms into the license that simply don't exist. I agree a lot of programmers -believe- this is how Free Software works, and many do push bug fixes upstream, but that's orthogonal to Free Software principles, and outside the terms of the license.
>> That's the spirit here and trying to argue around that with technicalities is disingenuous.
Licenses are matters of law, not spirit. The original post is about this "spirit". My thesis is that he, and you, are inferring responsibilities that are simply not in the license. This isn't a technicality, it goes to the very heart of Free Software.
You'd think so, but there's also a good chunk of copyleft code that's just "here's our source code, go figure out how to deploy lol".
You can try to fork it into something workable, but that can sometimes literally mean trying to figure out what the actual deployment process is and what weird tweaks were done to the deploying device beforehand. In addition, forking those projects is also unworkable if the original has pretty much enterprise-speed development. At best you get a fork that's years out of date where the maintainer is nitpicking every PR and is burnt out enough to not make it worthwhile to merge upstream patches. At worst, you get something like Iceweasel[0] where someone just releases patches rather than a full fork (and having done that a few times, it's a pain in the neck to maintain those patches).
FOSS isn't at all inherently community-minded; it can be, and it can facilitate community, but it can also be used as a way to get cheap cred from people who are naïve enough to believe the former is the only place it applies.
[0]: "Fork" of Firefox LTS by the GNU Project to strip out trademarked names and logos. It's probably one of their silliest projects in terms of relevance.
> They oblige the author of a derived work to publish the changes to all users under the same terms. The original authors tend to be users. So, a license which would grant this directly to the original authors would end up providing the same end result since the original authors would be both allowed to and reasonably expected to distribute the derived work to their users as well.
I might be wrong, but this is not how I understand the GPL [0]. Care to correct me if I am?
What I get from the license is that you have to share the code with the users of your program, not anyone else.
AFAIK you could make an Emacs fork and charge money for it. Not only that, but the source code only needs to be made available to the recipients of the software, not anyone else.
A company could have an upgraded version of a GPL tool and not share it with anyone outside the company. Theoretically employees might share the code outside, but I doubt they'd dare.
> That's the spirit here and trying to argue around that with technicalities is disingenuous.
First, I am not a lawyer, but don't licenses exist precisely for their technicalities? This is not like a law on the books, where we can consider the "letter and spirit of the law" because we know the context it was written in and for. With a written license, however, someone chooses to adopt that license and accepts its terms from an authorship point of view.
Exactly. We all benefit from sharing contributions to the same code base. I use your library, you use mine, we fix each others bugs, add features, etc... The code gets better.
I think we're in a new enough situation that we can look beyond what's legal in a license. When many of us started working on open source projects, AI was a far-off concept. Speaking for myself, I thought we'd see steady improvement in code-completion tools, but I didn't think I'd see anything like GPT-4 in my lifetime.
Licenses were written for humans working with code. We can talk about corporations as well, but when I've thought about corporations in the past, I thought about people working on code at corporations. The idea of an AI using my open source project to generate working code for someone or some corporation feels...different.
Yes, I'm talking explicitly about feelings. I know my feelings don't impact the legalities of a license. But feelings are worth talking about, especially as we're all finding the boundaries of new tools and new ways of working.
I don't agree with everything in the post, but I think this is a great conversation to be having.
> Yes, I'm talking explicitly about feelings. I know my feelings don't impact the legalities of a license.
They don't impact the current legality of a licence, but it will affect future ones.
GPL/BSD/Apache/proprietary, they are all picked for ideological concerns which all stem from feelings.
It is good to discuss these things, and it is good to recognise that these are emotionally driven.
Yes. People here seem to be forgetting that open source was a community-driven ideal first. The licenses came later, as "protection": corporations were stealing code and there was no recourse. The variety of open source licenses was created to provide a framework for the community, to fight off stealing, to keep it open. So GPT is very much "laundering" the code just like criminals "launder" money.
I agree that AI usage of code is somewhat murky with current licenses, which obviously don't mention it either way.
Free software has a principle of "freedom to run, to do whatever you wish" (freedom 0), so arguably it has already said that training an AI is OK. (We could quibble over the word "run", but gnu.org, and RMS, clearly say "freedom 0 does not restrict how you use it.")
GPL code can be used by the military to develop nuclear weapons. Given that this is a guiding principle of the FSF, it's hard to argue that the current usage is not OK.
This may seem a bit nitpicky and philosophical, but anyway: these feelings you mention are about things, and the things the feelings come from are what is most important. Feelings are never standalone; if they are, they are just moods, which are so personal it's hard to have a conversation about them.
Let's call 'the things' values. I'd say feelings are perceptions of values, and as such they invariably have a conceptual element to them. And exactly that conceptual aspect makes them suitable for conversation and sometimes even debate, insofar as they can be incorrect. We can acknowledge the subjective, emotive aspect of feelings as highly and inalienably personal, respect the individual opinion behind them and contest the implicit truth-claims all at the same time.
> I'm not aware of any Open Source license,or Free license for that matter,that has a give-back clause.
§5.c of the GPL Version 3 states explicitly:
> You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it.
As in, all modifications must be made available. Does that not meet your definition of giving back? The GPL (all variants) is one of the most widely distributed free software licenses and has an explicit "give back" clause as far as I can see it, and that is part of why some people referred to the GPL as a "cancer".
FWIW the issue I've come to have with copilot is that you're not explicitly permitted to use the suggestions for anything other than inspiration (as per their terms), there is no license given to use the code that is generated. You do so at your own risk.
>> As in, all modifications must be made available. Is that not meeting your definition of giving back?
Available to all users. Not previous authors. There may be overlap, or there may not be overlap.
Plus, I would say it's giving forward, not back. If there are public users then the original authors can become users and get the code. But there will be bug fixes and features smooshed together.
Which is why I posit that there's no "give back" concept in the license. Only "give forward".
I would say that open source as a movement sprung up from the principles of early netiquette[0]. Which themselves were built on the foundations of sharing your knowledge with your peers.
Whether you were trawling Usenet or just a presence in your local BBS scene, "teach it forward" was always a core concept. Still is. It's difficult to pay back to the person who taught you something valuable, so you can instead pay it forward by teaching the lessons - along with your own additions - to the later newcomers.
Of course the Eternal September changed the landscape. And now we can't have nice things.
I think the big driver behind the formalization of GNU was that no free C compiler existed. RMS rightly saw this as a problem and did what he thought was needed to get a universal free C compiler.
Co-pilot spits back protected expressions, not novel expressions based on ideas harvested from code. It is therefore violating the licenses of numerous free and open source projects. The author is right to be pissed.
That's not the case; there's a probability it may "spit back" the protected expression.
There's also a probability that I, as a human, "spit back" protected expressions. This could be either by pure chance or from past learning: reading the protected code and internalizing it as a solution, my subconscious forgetting I actually saw it elsewhere.
In university, students run their theses through plagiarism checkers even for novel research, because such duplication naturally occurs.
As the thought experiment goes, given infinity, a monkey with a typewriter will inevitably write Shakespeare's works.
You are correct. The problem is that the GitHub Terms of Service probably (guessing) have a clause which invalidates your license if you upload your code there. And that's exactly why you shouldn't use GitHub.
This seems to be what people imagine about it, not what it actually does, although I don’t doubt you could cherry-pick some snippet after a lot of trial and error to try to claim that it had regurgitated something verbatim. But certainly let’s see the examples.
You're allowed to make money from Free/OSS code, and plenty of companies have (Google, Amazon etc.), but they have always also at least given back something to the community to earn some good will. The situation with AI is new because it not only doesn't give anything back, it actually takes something away by threatening developers' jobs etc.
One possible problem is if Copilot gets good enough that you can rather easily sidestep the GPL (or any other license) by having Copilot implement functionality X for you instead of using a license-bound library providing X. Not only may this be questionable with regard to the license, it would also tend to reduce contributions to the library that would otherwise have been used.
It would be interesting to have a Free Software license that requires that anything which ingests the source code must be Free Software running on Free Hardware. If you train a model on such inputs, your model would need to be Free Software and all the hardware the model runs on would need to be Free Hardware. This would create a massive incentive to either not use such software in your model or to use Free Software and Free Hardware.
Taken to its logical conclusion, you could add the notion of Free Humans who are legally bound to only produce Free Ideas. One could imagine this functioning like a sort of monastic vow of charity or chastity: "a vow of silence on producing anything which is not Free (as in freedom)."
Would you take such a vow if offered 100,000 USD/year for the rest of your life (adjusted for inflation)? I would.
This idea ("make a stronger license") has come up in previous discussions of Copilot as well[0].
The problem is that the Copilot project doesn't claim to be abiding by the license(s) of the ingested code. The reply to licensing concerns was that licensing doesn't apply to their use. So unfortunately they would just claim they could ignore your hypothetical Free³ license as well.
> If the AI is regurgitating GPL code as-is, then that's a problem- not dissimilar to a student or employee regurgitating the same code.
Not "if". We know it does.
And since it doesn't show citations, you might use it and mistakenly end up making your entire software GPL, because of including copy-pasted GPL code.
Yeah on the one hand, isn't opening your source all about not really minding what happens to it after that? It's intended to be copied and used. On the other hand something about the term "laundering" kind of resonated for me. It's kind of like automated plagiarism where you spread your copying out over millions of people. But plagiarism only has meaning as an offense when the thing being copied isn't intended to be copied. But for copyright purposes is there a difference between copying exactly, and the type of blending a LLM does? I'm too confused. That feeling when you hit on something society has never thought about before.
If we go by how many people explain open source, you would be right, but if we go by how the people who actually know what their licenses are supposed to do explain it, then no. You give a license for a specific reason. One might be to allow others to copy, but there is usually a condition, and that is to leave the license information intact. If we go further towards free/libre software licenses like the GPL or AGPL, then we have more conditions. For example, if you distribute software using that code, you need to distribute the source of your software as well (stated a bit imprecisely).
If you want to get a better picture of the situation, read up on the licenses and what they do, specifically the term "copyleft".
> Yeah on the one hand, isn't opening your source all about not really minding what happens to it after that?
No! That's a gross misrepresentation of what open sourcing is. It's the offer of a deal. You publish the source code, and in return for looking at it and using it for something, I have obligations, like attribution and licensing requirements regarding derived works.
No, open source isn't about practically giving up your rights; it's about restricting use of your code and software in exactly such a way that it gives every user as much freedom as possible.
This actually has been thought about before, in the context of remixes, collages, etc. The essential question is how much of the originality of the original work(s) constitutes the originality of the new/derived work. If it is little enough, then it’s okay. The issue with AI models is that they have no way of assessing originality and tracking the transfer of originality.
The term is being used here to imply that the generated code is somehow bypassing the licensing requirements, which isn’t necessarily true, and certainly isn’t a substantiated claim.
You can read licensed code, learn from it, and then write your own code derived from that learning, without having committed a copyright violation.
You can also read licensed code, directly copy paste it into your codebase, and still not have committed a copyright violation, as long as you did so in a way that constituted fair use (which copy-pasting snippets certainly would).
There’s no copyright issue here at all, and rationally speaking there aren’t any legitimate misuse of open source concerns either. If these people were honest they’d just admit to feeling threatened by AI, but nobody would care about that, so they just try to manufacture some fake moral panic.
I agree that copyleft is more about "giving forward", and I think it's a confusion a lot of people make. Reading through the thread, I get the impression that some think as soon as one "distributes" the licensed material, original authors should get a copy. I'm extrapolating of course, but even then I feel some people would agree with that statement.
GPL, for instance, merely states that distributed sources or patches "based on" the program should be "conveyed" under the same terms. In other words, anyone who gets their hands on it will do so under the same license.
If anything, I would be worried that GitHub trained the model on publicly available but not clearly licensed code, because then it would have no license to "use" it in any way[0]. The GPL provides such a right, so there is no problem there. It would be even more worrying if the not-clearly-licensed code was in a private repository, but I think I remember reading that private repositories were not included in the training data.
However, would you consider a black-box program whose output can consistently produce verbatim, or at the very least slightly modified, copies of GPL code to be transformative? The problem does not lie in how the code is distributed but in how transformative the distributed code is. Not only does the same apply to any program besides AI-powered software, it applies to humans[1].
Given how unpredictable the output of an AI is, one should not be able to train it on GPL code if one cannot reliably guarantee it will not produce infringing code.
Perhaps I'm out of the loop on this, but I always thought the concept of open source was primarily about the opportunity for personal professional development: the ability for someone not connected with a corporation to stay relevant and continuously update their skills in a way that was not dependent on proprietary systems. That is a huge asset, not only for oneself but also for the world.
Time will tell, but the destined trend is for more devs to close-source their code, no matter what curious angle you take to justify large firms using AI to extract money, and I doubt you are working for one of them.
IMHO, just like there was a robots.txt file made for the web, there needs to be a NOAI.txt for git repos. Sorry, this repo does not permit you to ingest the code for a learning model. Seems completely reasonable.
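For what it's worth, the crawler-side check for such a marker could be tiny. This is a purely hypothetical sketch: no NOAI.txt standard exists today, so the filename and semantics here are invented by analogy with robots.txt.

```python
from pathlib import Path

def may_ingest_for_training(repo_root: str) -> bool:
    """Return True unless the repo opts out of model training via a
    hypothetical NOAI.txt marker file at its root (analogous to how
    well-behaved web crawlers honor robots.txt)."""
    return not (Path(repo_root) / "NOAI.txt").exists()
```

Like robots.txt, this would only work if trainers chose to honor it; nothing technically stops a crawler from ignoring the file.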
If we were somehow able to prevent AI models from ingesting a codebase, that would mean everyone else who wants to produce similar code would have to re-invent the wheel, wasting their time repeating work that has already been done.
All because... the person who did it first wants attribution? They want their name to be included in some credits.txt file that nobody will ever read? That's ridiculous.
People keep bringing this up. It's not as straightforward as a clause that says "you can't use this to train AI" (which is what I suspect many people think).
Licensing operates on a continuum of permissiveness. They can only relax the restrictions that you as a creator are given by default. You can't write a copyright license that adds them. You could write a legal instrument that compels and prohibits certain behavior(s), but at that point you're talking about a contract. (And there's no way to coerce anyone to agree with the contract.)
Harry Potter has even more restrictions than the GPL or any other open source license. It's "All Rights Reserved": it enjoys the maximum protections that a work can. And yet it would still be possible to feed it into an AI model, even if all of Rowling, Bloomsbury, and Scholastic didn't want you to. They don't get a say in that. Nor do open source software developers in their works, which selectively abandon some of the protections that Rowling reserves for herself and her business partners.
The only real viable path to achieve this using an IP license alone would be a React-PATENTS-like termination clause: if your company engages in any kind of AI training that uses this project as an input, then your license to make copies (including distributing modified copies) under the ordinary terms is revoked, along with permission for a huge swathe of other free/open source software owned by a bunch of other signatories, too. This is, of course, contingent upon the ability to freely copy and modify a given set of works being appealing enough to get people to abstain from the lure of building AI models and offering services based on them.
> I'm gonna get hated on for this, but I don't think "give back" is an open source concept.
You're right. It's a politeness law some people have invented.
It's also a value people have, but that's for themselves. I like contributing to OSS projects. But, as soon as it's imposed on others, and there are punishments for disobeying, it's a politeness law.
> On the one hand we happily train humans on GPL code. Those humans can then write their own functions,but for trivial functions they're gonna look a lot like GPL Source.
Exactly. People are getting mad that Microsoft is making good money while the people who made all that free software available mostly did it for free (like in no money and no recognition). It can sound unfair but that's the deal. If you didn't want people or AI to learn from your code, open source was not the right option.
> If you didn't want people or AI to learn from your code, open source was not the right option.
There's nothing wrong with other people using - learning and creating derivative works of - one's open-source code, provided they respect the terms of the license. It seems to me that the real issue is the fact that these licenses don't have enough teeth.
Most people I know who contribute to or host open source projects, me included, do this for references. And the most successful ones find a way to generate revenue. "Giving back" is a nice additional thing, but I don't know anybody who does it _primarily_ to "help the world".
If we are being honest as a community, open source developers are pretty far down the list of groups with valid grievances against this current wave of AI and how it is trained. There is at least a debatable case that these systems operate within the spirit, if not the exact letter, of typical open source licenses. That argument is much harder to make for AI trained on writing and art that is clearly copyrighted. If you have ethical questions about Copilot, you really should be against this entire crop of AI systems.
So you're suggesting that developers shut up and let the artists talk first? I'm not sure what the "you're suffering less than these other people" thing is actually intended to translate into? What do we do with that?
All software licences are based on copyright, same as writing, art, music, etc. Some software licences are permissive. Some writing is permissive (e.g. Cory Doctorow). Some music is permissive (e.g. Amanda Palmer). It entirely depends on what the author wants. The fact that more software is permissive is a good thing, right?
I entirely agree that there are ethical problems with training AI on copyrighted training data. But please let's not start gatekeeping this. We need to have a serious discussion as a culture about it, and saying "you're way down the list of victims" isn't helping.
What I agree with is that the typical open source dev, who goes "I MIT-license all my things, because I have seen it elsewhere and I don't want to think about licenses a lot", is pretty far down the list of groups of people who get to complain.
What I disagree with is the idea that they should therefore not complain, or that there could not be an AI system that does not launder code but keeps licenses in place, and does this ethically and in an honest way. I add "ethically" and "honest way" because I am sure that companies will try to find a way around being honest, if they are ever forced to add back the licenses.
In fact, artists might not be the group that grasps the impact of training on that corpus as quickly as the dev communities do. Perhaps it is exactly the devs who need to complain loudest and first, to have a signal effect.
>I'm gonna get hated on for this, but I don't think "give back" is an open source concept.
Well, I guess you already know why you may be hated for this. Anyone who has surfed HN since ~2010 would know, or should have noticed, that the definition of open source has changed over the past 10-15 years. Giving back and communities are the two predominant Open Source ideals now, along with making lots of money on top of OSS code being a somewhat contentious issue, to say the least.
But I want to sidestep the idealistic issue: I think this is more of an economic issue, one that could be attributed to the zero-interest-rate phenomenon. You now have developers (especially those from the US) who, for most if not all of their professional lives, lived in a world where money and investment were easy, comparatively speaking, and who felt they should give back when money (or should I say cash flow) wasn't an issue. When $200K total comp was supposed to be the norm for a fresh grad joining Google, and management thought $500K was barely enough and they needed to work their way to $1M, while senior developers believed that if juniors were worth $200K then asking for $1M total comp was perfectly sane, or some other extreme where everyone in the company should earn exactly the same.
If Twitter or social media are any indication, a lot of these ideals are completely gone from the conversation. Although this somehow started before the layoffs.
It is somewhat interesting to see the sociological and ideological changes that follow economic changes. But then again, economics is itself perhaps the largest field study in psychology.
> The code that was regurgitated by the model is marketed as "AI generated" and available for use for any project you want. Including proprietary ones. It's laundering open-source code. All of the decades of knowledge and uncountable hours of work is being, well, stolen. There is nothing being given back.
Leaving GitHub won't change that. OpenAI is training its models on every bit of code they can get: sourcehut, Codeberg, etc.
If it's public, they will train on it.
Also from my experience of trying to leave GitHub, you just end up having a couple of projects on your alternative platform, and everything else on GitHub.
You are still active on GitHub, probably even more than on your new alternative.
And if you want to build a community, you will quickly find out that the majority want to stick with GitHub, and leaving it can kill your project's chances of getting contributions.
Personally, if the courts decide it's fair use, that's it, I'm going back. It's the best git platform out there; GitLab doesn't even compare in free features.
However, I have been eyeing Gitea and Gitea Actions; with them, Codeberg could become a realistic choice for me.
To end with a hot take: I really hate Sourcehut.
It's hard to use, the UI is... not great, and trying to browse issues or latest commits is a nightmare.
Every time a project uses it, it's a pain to deal with.
> Also from my experience of trying to leave GitHub, you just end up having a couple of projects on your alternative platform, and everything else on GitHub.
> And if you want to build a community, you will quickly find out that the majority want to stick to GitHub, and leaving it can kill your projects chances of getting contributions.
That's a defeatist attitude and a self-fulfilling prophecy at the same time. As more and more people leave GitHub (hopefully not to go to the same alternative), it becomes less and less of a must-have. The reason these things are somewhat true today is because of the network effect, and it's precisely that effect which we must actively attempt to squash by leaving.
Parent is talking about a fundamental feature of networks. A denser and larger network has much more useful network-related features, and if one company has a significant majority of the total addressable market for a network, it's a massive ask for people to extricate themselves and rebuild a network somewhere else.
It's why Facebook is still on top even though everyone hated it for a while; YouTube is the *only* video platform; etc.
> Leaving GitHub won't change that; OpenAI is training its models on every bit of code it can get: Sourcehut, Codeberg, etc. If it's public, they will train on it.
Not every bit of code; they are respecting proprietary licenses.
When MS puts the code for Windows, Office, Azure, and everything else in front of ChatGPT, Copilot, and whatever other AI models they have, then perhaps they'll have a leg to stand on.
Otherwise, they're just being hypocritical in claiming that no injury is done by using code for training, because they refuse to train on any of their own code.
Right now it just looks like they are ripping off open source licenses without meeting the terms of those licenses.
AFAIK that has nothing to do with the license, it has to do with whether the code is public. You don't want the AI accidentally revealing proprietary non-public information (e.g. imagine someone had a secret API key in a private repo and copilot leaked it; that'd be a huge incident), so you don't train it on that information, regardless of what it's licensed under.
You could make a similar argument for not training on GPL code, but it's a lot easier to programmatically determine whether or not code is public than it is to programmatically determine what it's licensed under, particularly when you're training on massive amounts of unlabeled data. Not to mention it's way easier to delete an accidentally-added snippet of GPL code from a codebase than it is to "unleak" company secrets after they've been publicly revealed.
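The asymmetry described here is easy to see in practice: repository visibility is a single, reliable field, while the license is a best-effort guess that is often missing or wrong in bulk scrapes. A minimal sketch (the repo metadata shape here is invented for illustration, not GitHub's actual pipeline):

```python
# Sketch: filtering candidate repos for a training corpus.
# `visibility` is a reliable boolean-like field; `license` is a
# best-effort guess that is frequently absent or misdetected,
# which is why filtering on it is much harder in practice.

PERMISSIVE = {"mit", "bsd-2-clause", "bsd-3-clause", "apache-2.0"}

def trainable(repo: dict, respect_licenses: bool = False) -> bool:
    """Decide whether a repo may enter the training set."""
    if repo.get("visibility") != "public":
        return False  # never train on private code: it may leak secrets
    if not respect_licenses:
        return True  # "if it's public, they will train on it"
    # License info is often missing or wrong in bulk scrapes.
    license_id = (repo.get("license") or "").lower()
    return license_id in PERMISSIVE

repos = [
    {"name": "a", "visibility": "private", "license": "mit"},
    {"name": "b", "visibility": "public", "license": "gpl-3.0"},
    {"name": "c", "visibility": "public", "license": "mit"},
    {"name": "d", "visibility": "public", "license": None},
]

print([r["name"] for r in repos if trainable(r)])                        # ['b', 'c', 'd']
print([r["name"] for r in repos if trainable(r, respect_licenses=True)]) # ['c']
```

Note how the license-respecting path silently drops repo "d": unlabeled code has to be excluded too, which is exactly the cost a public/private filter avoids.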
> Every time a project uses it, its a pain to deal with.
Sorry, but I consider that a plus.
One of the primary problems with GitHub right now is the "drive-by" nature. Everybody is on GitHub because a bunch of idiotic big corporations made "community contribution" part of their annual review processes, so we now have a bunch of people who shouldn't be on GitHub throwing things around on there.
Putting just a touch of friction into the comment/contribute cycle is a good thing. The people who contribute then have to want to contribute.
I like Sourcehut, I'm just not a fan of the email-oriented collaboration workflow, so I don't use it. Neither is the rest of the world, if the success of GitHub is anything to go by. I get that Drew likes it, the greybeards are used to it, it works, it's adequate, and it keeps things simple, but I just never could do it. I don't like git either tbh, I grumble while I use it. IMO the perfect collaboration suite would be something like Fossil with RSS feeds for every action.
I believe the goal is to build a minimal UI for those who don't prefer email, which is fine, but email and pull requests aren't the only models here. Look how much tooling has been created to try to fit stack-based diffs on top of Git+GitHub instead of using a different platform.
I'm mostly familiar with gitlab, what does github provide for free above and beyond that? I like that I can run my gitlab pipeline on my machines and sync to a free gitlab instance. I like that I don't read about security vulnerabilities in gitlab pipelines nearly as often as github actions. I like gitlab issues as they are fairly minimal.
GitHub registry, GitHub Actions, and GitHub Codespaces are unlimited for public repos, in addition to all enterprise features.
That's without mentioning nice-to-have features like GitHub Sponsors, the For You tab, and the (arguably) more popular UI layout. It's simply a better platform for open source projects:
unlimited package registry, unlimited Actions run time, premium features unlocked, and more.
Also, the free tier on GitHub gives more for private repos too: unlimited orgs, 2,000 CI minutes, etc.
It's just plain better, and it's because Microsoft can afford to play the long game; GitLab can't anymore.
I believe he just wants to do his bit by removing his activity from github towards lowering their dominance numbers in the space. I don't think he intends to stop those LLM code models.
This whole open source thing is the biggest farce on planet Earth. Someone with good knowledge of geeks and their behaviour concocted this open source bullshit. So now talented people give their skill to the "whole" and have to beg for contributions and donations to get by. And other geeks (not suits with ties) finance the ones they sympathise with. It's ridiculous.
And faceless entities use their hard work for who knows what, but mostly to fatten up their already oversized corp and give back NOTHING.
And people, seemingly without common sense suck up to companies that rob them, and even disseminate their shiny new "free" tools.
This would be a Hugo and Nebula award-winning novel if it weren't reality.
This is such a misrepresentation of the open-source landscape. Yes, there are people working on open-source projects who beg for donations; but there also are open-source projects maintained by full-time employees (Eleventy, paid by Netlify; React, paid by Facebook; Angular, paid by Google; Next.js, paid by Vercel; Linux, paid by various companies; etc.). If a person thinks that his efforts will be better compensated elsewhere, he can always start looking for a paid job.
> So now talented people give their skill to the "whole" and they have to beg for contributions and donations to get by. And other geeks (not suits with ties) finance the ones they sympathise with. It's ridiculous.
Is it? I can't think of a single professional dev making money right now who would be making as much had they been forced to reinvent the entire tech stack they are skilled in.
If there was no open source, we'd all be making a lot less, and the state of tech would be far far smaller than it is right now.
I don't think open source per se was a mistake, but permissive licenses like Apache certainly were. They've just allowed businesses either to take things for free and profit while contributing nothing back, or to literally build a business by selling the Apache-licensed programs in the cloud.
Yikes. You sound very bitter. Is there a story behind that bitterness?
There's a wide variety of people in the open source community at large. And a wide variety of motivations for contributing. I for one am happy that open source software is a thing. It's been a net good for mankind. Sure, there are abuses, and I'm sure many things could be improved. But I'm glad it's there all the same.
I tend to disregard articles that default to the "Stochastic Parrot" argument. These tools are useful now, I don't personally care about achieving actual intelligence. I want additional utility for myself and other humans, which these provide now, at scale.
By a lot of measures many humans perform at just about the same level, including confidently making up bullshit.
This post reads like one of the "Goodbye X online video game" posts. I'll cut them some slack because this is their blog they're venting on and was likely posted here by someone else and not themselves doing some attention seeking, but meh.
Being useful and being a stochastic parrot are not mutually exclusive. And in fact I think the opposite: it's necessary to remind people what it really is, especially in this phase of enthusiasm, because I see too many people attributing some meaning, some hidden insight, and especially some innate infallibility to AI nowadays, maybe confused by the name "AI".
Right, but most arguments, including the one here, go something like "AI is a Stochastic Parrot so it's a lie and now I think it's bad and we shouldn't do it."
Which is a pretty dumb position imo. Not that I personally think these newer LLMs are a stochastic parrot, or at least not to the degree proponents of the Stochastic Parrot argument would have you believe.
> now quickly taking on the role of a general reasoning engine
And this right here is why it's important to emphasize the "stochastic parrot" fact. Because people think this is true and are making decisions based on this misunderstanding.
Since ChatGPT I've become much more aware of my own thoughts and written text. I'm now often wondering whether I'm just regurgitating the most frequently used next word or phrase or whether it could actually be described as original. Especially, for things like reacting with short answers to chat messages, I am confident that these are only reactionary answers without alternatives, which could have come from ChatGPT trained on my chat log. I feel like knowing and seeing how ChatGPT works can elevate our own thinking process. Or maybe it only is similar to awareness meditation.
> I think we’re now way past that now with LLMs now quickly taking on the role of a general reasoning engine.
No we're not, and no they are not.
An LLM doesn't reason, period. It mimics reasoning ability by stochastically choosing a sequence of tokens. A lot of the time these make sense. At other times, they don't make any sense. I recently asked an LLM:
"Mike leaves the elevator at the 2nd floor. Jenny leaves at the 9th floor. Who left the elevator first?"
It answered correctly that Mike leaves first. Then I asked:
"If the elevator started at the 10th floor, who would have left first?"
And the answer was that Mike still leaves first, because he leaves at the 2nd floor, and that's the first floor the elevator reaches. Another time I asked an LLM how many footballs fit in a coffee mug, and the conversation reached a point where the AI tried to convince me that coffee mugs are only slightly smaller than the trunk of a car.
Yes, they can also produce the correct answers to both these questions, but the fact that they can also spew such complete illogical nonsense shows that they are not "reasoning" about things. They complete sequences, that's it, period, that's literally the only thing a language model can do.
Their apparent emergent abilities look like reasoning, in the same way that Jen from "The IT Crowd" can sound like she's speaking Italian, when in fact she has no idea what she is even saying.
I think AI is here to stay (obviously) but we do need a much better permission model regarding content, whether this is the writing on your blog, your digital art, your open source code, video, audio...all of it.
The current model basically says that as soon as you publish something, others can pretty much do with it as they please under the disguise of "fair use", an aggressive ToS, or the like.
I stand by the author that the current model is parasitic. You take the sum of human-produced labor, knowledge, and intelligence without permission or compensation, centralize it with tech that maybe two companies have or can afford, and then monetize it. Worse, in a way that never even attributes or refers to the original content.
Half-quitting GitHub will not do anything; instead we need legal reform in this age of AI.
We need training permission control as none of today's licenses were designed with AI in mind. The default should be no permission where authors can opt-in per account and/or per piece of content. No content platform's ToS should be able to override this permission with a catch-all clause, it should be truly free consent.
Ideally, we'd include monetization options where conditional consent is given based on revenue sharing. I realize that this is a less practical idea as there's still no simple internet payment infrastructure, AI companies likely will have enough non-paid content to train, plus it doesn't solve the problem of them having deep pockets to afford such content, thus they keep their centralization benefits. The more likely outcome is that content producers increasingly withdraw into closed paid platforms as the open web is just too damn hostile.
I find none of this to be anti-AI, it's pro-human and pro-creator.
An important legislative step for this is that anyone creating and publishing an AI learning model needs to be able to cite their sources - in this case, a list of all the github repositories and files therein, along with their licenses.
If that is made mandatory, only then can these lists actually be checked against licenses.
There will also need to be a trial license, to establish whether an AI learning model can be considered derived from a licensed open source project - and therefore whether it falls under the license.
And finally, we'll likely get updated versions of the various OSS licenses that include a specific statement on e.g. usage within AI / machine learning.
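The mandatory source citation proposed above could be as simple as a machine-readable manifest checked against each license's terms. A hypothetical sketch (the manifest format, repo names, and "cleared licenses" policy here are all invented for illustration):

```python
import json

# Hypothetical training-source manifest, as the comment proposes:
# every repository that went into the model, with its declared license.
manifest_json = """
[
  {"repo": "github.com/example/libfoo", "license": "MIT"},
  {"repo": "github.com/example/gpltool", "license": "GPL-3.0"},
  {"repo": "github.com/example/mystery", "license": null}
]
"""

# Invented policy: licenses whose terms the model operator claims to satisfy.
CLEARED = {"MIT", "BSD-3-Clause", "Apache-2.0"}

def audit(manifest: list) -> list:
    """Return repos whose license is missing or not cleared for training."""
    return [
        entry["repo"]
        for entry in manifest
        if entry.get("license") not in CLEARED
    ]

flagged = audit(json.loads(manifest_json))
print(flagged)  # ['github.com/example/gpltool', 'github.com/example/mystery']
```

The point of making the list mandatory is exactly that this kind of audit becomes mechanical: anyone, not just the model operator, can run it.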
In the age of reposts and generative AI, "attribution" is irrelevant. Nobody cares who originally made some content, and it truly doesn't matter.
>The more likely outcome is that content producers increasingly withdraw into closed paid platforms
Nah. You didn't get paid to write that post, did you? You did it for free. People nowadays are perfectly willing to create free content, and often high quality content, sometimes anonymously, even before generative AI.
There's no need for financial incentives anymore. As content creation becomes easier, people will start creating out of intrinsic motivation - to express themselves, to influence others and to inform. It's better that way.
Restricting content so that others can't benefit from it is not pro-human or pro-creator, it's selfish and wasteful. We should get rid of licenses altogether and feed everything humanity creates into a common AI model that is available for use by everyone.
I maintain a popular OSS project whose code is hosted on GitHub [1].
The entire "GitHub doesn't give back" argument is wrong. For "free", GitHub lets me host our code, run thousands and thousands of hours of free CI (which we use aggressively), host releases and Docker images, and manage thousands of issues. Also, Copilot is free when you are eligible for it, so we are fortunate enough not to have to pay for that either.
Yes, they monetize our attention and train Copilot on the code, but the one argument that can't be used against this company is that they don't give back.
Why hasn't someone just changed the GPL license already:
"If you train an AI on this code, you must release the source code and generated neural net of that AI as open source" or something to that effect.
It won't stop it, but it will slow it down, and it seems like the right T&Cs to put on training against GPL code because it gives an advantage to open source AIs, however minor.
Aren't they claiming that it's fair use? IANAL, but wouldn't that make the licence irrelevant if training AI/ML models was found to be fair use? And if not, it's a licence violation anyway?
It will be difficult to claim fair use if training AI model is explicitly mentioned in the license, I think.
Currently GPL says:
> To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
> A "covered work" means either the unmodified Program or a work based
If in addition it would say something like "Generative AI models trained on the program source code as well as the text produced with such models is also a work "based on" the Program", then there will be little room for a fair use claim, I think.
IANAL either, but the license still applies to the end-user (the person who trained the AI) so it would seem like it would add at least 1 non-trivial license violation for that user?
Edit: I googled "fair use copyright US" and have now decided that US copyright law is stupid.
When GH devised Copilot, they could have (internally, at GH) decided to make a two-tier model, one tier trained only on unrestrictive licenses, the other bringing in more-restrictively license code too. And then offer them to the GH-using public as two different functionalities. An intelligently differentiated product line for intelligent people.
But: NOOOO.
In order to close off this possibility, which would have restricted Copilot revenue, they instead rolled out a single undifferentiated product with lots of "gee whiz!" and associated hoopla, and made sure to offer it for free for a while to suck everyone in and head off criticism.
They are really not going to care about what you put into your license file, they are just going to claim that the use of GitHub binds you to their terms of service and that this supersedes your own license. Good luck fighting that.
> This license does not grant GitHub the right to sell Your Content. It also does not grant GitHub the right to otherwise distribute or use Your Content outside of our provision of the Service, except that as part of the right to archive Your Content, GitHub may permit our partners to store and archive Your Content in public repositories in connection with the GitHub Arctic Code Vault and GitHub Archive Program.
I abandoned github the day that they (and others, including people here) started arguing that their ToS trumps your code's license. That's absurd. It's authoritarian, it's hostile, it's an act of enmity. Fuck all that bullshit. I do business with no entity, period, that treats me with that level of disdain.
Yeah, and it's nonsensical for the service: if you are working in open source, the reality is that a lot of the time you are working with software you don't own. I develop a lot of software, and the vast majority of it has been open source... but I've only ever put two projects of mine on GitHub (and one only because I was working with some other people and I essentially got outvoted ;P). And yet, if you search for my code, I'm sure you can find almost all of it on GitHub, because it was open source and other people wanted to be able to edit it or even merely redistribute it... which I'd have said is their right, but I guess not, if the ToS supposedly overrides the license on the software? Or, if this were the case, how would one expect some large/old open source project with a ton of prior contributions to be hosted on GitHub? (Such projects are normally fine because everyone has the same rights under the license, so you just all mix your code together and are happy: you don't actually need some central organization with ownership until you want to change the license, which is something many people explicitly never want to happen.) Even simpler: most of Google's code is open source--such as the Android Open Source Project, or Chromium--but they don't host it officially on GitHub... I guess it isn't OK for anyone to work on this stuff on GitHub either, right?
Yup, and this contrived conundrum is proof positive of a truth: all code is licensed before the ToS is agreed to, whether published or not. Licenses override the ToS, which means Microsoft would need to remove essentially all code on GitHub, since its ToS directly contradicts those licenses.
What it comes down to is that the Github ToS is illegal.
Absurd is communities like Elm, where all identity and package management must go through Microsoft GitHub, or you and your code can't be part of the community repository.
That's the spirit here and trying to argue around that with technicalities is disingenuous. That's what Copilot does since it ignores this deal.
Not really.
This whole Free Software movement started with something really similar to "right to repair": a firmware bug in a printer that ran proprietary software. Free Software is about being in control of the software you use. The spirit was never "contribute back to GNU"; the spirit was always "if you take GNU software, you can't make it non-free". The GNU devs at the time just wanted a good and actually free/libre OS that would remain free no matter who distributed it.
You are projecting the expectations of modern-day devs, formed after a lot of social development thanks to GitHub, onto that world.
You might claim that GP is relying on the technicalities of the licenses, but you can actually read the whole FSF philosophy and you'll note that it aligns perfectly with "giving forward", not "giving back".
Free Software is about users' freedom, not dev rights, politeness, etc. Now obviously, some devs picked copyleft licenses with the purpose of improving their own software from downstream changes (Linus states that is why he picked the GPL), but that's a nice side effect, not the purpose. Which, of course, gets confused with popular social sharing platforms like GitHub.
Speaking generally, I'm not sure that one can claim
>> The original authors tend to be the users
There are endless forks of, say, Emacs, and I expect RMS is not a user of any of them.
Of course RMS is free to inspect the code of all of them, separate out bug fixes from features, and retroactively apply them to his build. But I'm not seeing anything in any license that requires a fork to "push" bug fixes to him.
>> This aligns with the reason why some people publish their work under copyleft licenses: You get my work, for free, and the deal is that if you find and fix bugs then I get to benefit from those fixes by them flowing back to me.
I think you are reading terms into the license that simply don't exist. I agree a lot of programmers -believe- this is how Free Software works, and many do push bug fixes upstream, but that's orthogonal to Free Software principles, and outside the terms of the license.
>> That's the spirit here and trying to argue around that with technicalities is disingenuous.
Licenses are matters of law, not spirit. The original post is about this "spirit". My thesis is that he, and you, are inferring responsibilities that are simply not in the license. This isn't a technicality; it goes to the very heart of Free Software.
You can try to fork it into something workable, but that can sometimes literally mean trying to figure out what the actual deployment process is and what weird tweaks were done to the deploying device beforehand. In addition, forking those projects is also unworkable if the original has pretty much enterprise-speed development. At best you get a fork that's years out of date where the maintainer is nitpicking every PR and is burnt out enough to not make it worthwhile to merge upstream patches. At worst, you get something like Iceweasel[0] where someone just releases patches rather than a full fork (and having done that a few times, it's a pain in the neck to maintain those patches).
FOSS isn't at all inherently community-minded; it can be, and can facilitate community, but it can also be used as a way to get cheap cred from people who are naïve enough to believe the former is the only place it applies.
[0]: "Fork" of Firefox LTS by the GNU Project to strip out trademarked names and logos. It's probably one of their silliest projects in terms of relevance.
I might be wrong, but this is not how I understand the GPL [0]. Correct me if I'm wrong.
What I get from the license is that you have to share the code with the users of your program, not anyone else.
AFAIK you could make an Emacs fork and charge money for it. Not only that, but the source code only needs to be made available to the recipients of the software, not anyone else.
A company could have an upgraded version of a GPL tool and not share it with anyone outside the company. Theoretically employees might share the code outside, but I doubt they'd dare.
[0] https://www.gnu.org/software/emacs/manual/html_node/emacs/Co...
First, I am not a lawyer, but don't licenses exist precisely for their technicalities? This is not like a law on the books, where we can consider the "letter and spirit of the law" because we know the context it was written in and for. With a written license, however, someone chooses to adopt the license and accepts its terms from an authorship point of view.
I think we're in a new enough situation that we can look beyond what's legal in a license. When many of us started working on open source projects, AI was a far-off concept. Speaking for myself, I thought we'd see steady improvement in code-completion tools, but I didn't think I'd see anything like GPT-4 in my lifetime.
Licenses were written for humans working with code. We can talk about corporations as well, but when I've thought about corporations in the past, I thought about people working on code at corporations. The idea of an AI using my open source project to generate working code for someone or some corporation feels...different.
Yes, I'm talking explicitly about feelings. I know my feelings don't impact the legalities of a license. But feelings are worth talking about, especially as we're all finding the boundaries of new tools and new ways of working.
I don't agree with everything in the post, but I think this is a great conversation to be having.
They don't impact the current legality of a licence, but it will affect future ones.
GPL/BSD/Apache/proprietary, they are all picked for ideological concerns which all stem from feelings. It is good to discuss these things, and it is good to recognise that these are emotionally driven.
Free software has the principle of "the freedom to run the program as you wish" (freedom 0), so arguably it has already said that training AI is OK. (We could quibble over the word "run", but gnu.org and RMS clearly say "freedom 0 does not restrict how you use it.")
GPL code can be used by the military to develop nuclear weapons. Given that this is a guiding principle of the FSF, it's hard to argue that the current usage is not OK.
This may seem a bit nitpicky and philosophical, but anyway: the feelings you mention are about things, and the things the feelings come from are what is most important. Feelings are never standalone; if they are, they are just moods, which are so personal it's hard to have a conversation about them.
Let's call 'the things' values. I'd say feelings are perceptions of values, and as such they invariably have a conceptual element to them. And exactly that conceptual aspect makes them suitable for conversation and sometimes even debate, insofar as they can be incorrect. We can acknowledge the subjective, emotive aspect of feelings as highly and inalienably personal, respect the individual opinion behind them and contest the implicit truth-claims all at the same time.
§5.c of the GPL Version 3 states explicitly:
> You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it.
As in, all modifications must be made available. Does that not meet your definition of giving back? The GPL (all variants) is one of the most widely used of the free software licenses and has an explicit "give back" clause as far as I can see, which is part of why some people referred to the GPL as a "cancer".
FWIW the issue I've come to have with copilot is that you're not explicitly permitted to use the suggestions for anything other than inspiration (as per their terms), there is no license given to use the code that is generated. You do so at your own risk.
Available to all users, not previous authors. There may or may not be overlap.
Plus, I would say it's giving forward, not back. If there are public users, then the original authors can become users and get the code, but by then there will be bug fixes and features smooshed together.
Which is why I posit that there's no "give back" concept in the license, only "give forward".
Whether you were trawling Usenet or just a presence in your local BBS scene, "teach it forward" was always a core concept. Still is. It's difficult to pay back to the person who taught you something valuable, so you can instead pay it forward by teaching the lessons - along with your own additions - to the later newcomers.
Of course the Eternal September changed the landscape. And now we can't have nice things.
0: https://en.wikipedia.org/wiki/Etiquette_in_technology#Netiqu...
In uni, students run their theses through plagiarism checkers even when it's novel research, because incidental overlap naturally occurs.
As the thought experiment goes, given infinite time, a monkey with a typewriter will inevitably write Shakespeare's works.
The busybox authors disagree: https://busybox.net/license.html
If it's not a hard derivation, then it's difficult to prove or even notice.
Taken to its logical conclusion, you could add the notion that Free Humans are legally bound to only produce Free Ideas. One could imagine this functioning like a sort of monastic vow of charity or chastity: a "vow of silence on producing anything which is not Free (as in freedom)".
Would you take such a vow if offered 100,000 USD/year for the rest of your life (adjusted for inflation)? I would.
The problem is that the Copilot project doesn't claim to be abiding by the license(s) of the ingested code. The reply to licensing concerns was that licensing doesn't apply to their use. So unfortunately they would just claim they could ignore your hypothetical Free³ license as well.
[0]: https://news.ycombinator.com/item?id=34277352
> The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
Does GPT spit out the copyright notice when it regurgitates my code?
Not "if". We know it does.
And since it doesn't show citations, you might use its output and mistakenly end up making your entire program GPL, because it included copy-pasted GPL code.
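One way to detect the kind of regurgitation described above is plain n-gram fingerprinting: hash every overlapping token window of the licensed corpus, then flag generated output that shares a window. A toy sketch (the window size, tokenization, and corpus are arbitrary illustrative choices; real systems use more robust fingerprinting):

```python
def fingerprints(text: str, n: int = 8) -> set:
    """Hash every overlapping n-token window of the text."""
    tokens = text.split()
    return {hash(tuple(tokens[i:i + n])) for i in range(len(tokens) - n + 1)}

# Pretend this is an index built over GPL-licensed sources.
gpl_corpus = ("static int parse_flags ( const char * s ) { int flags = 0 ; "
              "while ( * s ) { ... } return flags ; }")
index = fingerprints(gpl_corpus)

def looks_copied(generated: str, index: set, n: int = 8) -> bool:
    """True if any n-token window of the output appears in the corpus index."""
    return bool(fingerprints(generated, n) & index)

verbatim = "int parse_flags ( const char * s ) { int flags = 0 ;"
original = "def parse ( line ) : return line . strip ( ) . split ( )"
print(looks_copied(verbatim, index))  # True
print(looks_copied(original, index))  # False
```

The catch, of course, is that lightly paraphrased output (renamed variables, reordered statements) slips past exact-window matching, which is why verbatim detection alone doesn't settle the licensing question.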
If you want to get a better picture of the situation, read up on the licenses and what they do, specifically the term "copyleft".
No! That's a gross misrepresentation of what open sourcing is. It's the offer of a deal: you publish the source code, and in return for looking at it and using it for something, I have obligations, like attribution and licensing requirements regarding derived works.
You can read licensed code, learn from it, and then write your own code derived from that learning, without having committed a copyright violation.
You can also read licensed code, directly copy paste it into your codebase, and still not have committed a copyright violation, as long as you did so in a way that constituted fair use (which copy-pasting snippets certainly would).
There’s no copyright issue here at all, and rationally speaking there aren’t any legitimate misuse of open source concerns either. If these people were honest they’d just admit to feeling threatened by AI, but nobody would care about that, so they just try to manufacture some fake moral panic.
GPL, for instance, merely states that distributed sources or patches "based on" the program should be "conveyed" under the same terms. In other words, anyone who gets their hands on it will do so under the same license.
If anything, I would be worried that GitHub trained itself on publicly-available but not clearly licensed code, because then it would have no license to "use" it in any way[0]. GPL provides such a right, so there is no problem there. It would be even more worrying if the not clearly licensed code was in a private repository but I think I remember reading that private repositories were not included in the training data.
However, would you consider a black-box program whose output can consistently produce verbatim, or at the very least slightly modified, copies of GPL code to be transformative? The problem does not lie in how the code is distributed but in how transformative the distributed code is. Not only does the same apply to any program besides AI-powered software, it applies to humans[1].
Given how unpredictable the output of an AI is, one should not be allowed to train it on GPL code if one cannot reliably guarantee it will not produce infringing code.
[0]: https://docs.github.com/en/site-policy/github-terms/github-t... (https://archive.ph/susi0#4-license-grant-to-us)
[1]: One such example would be how Microsoft employees allegedly prevented themselves from reading refterm source code, cf. https://github.com/microsoft/terminal/issues/10462#issuecomm...
If we were somehow able to prevent AI models from ingesting a codebase, that would mean everyone else who wants to produce similar code would have to re-invent the wheel, wasting their time repeating work that has already been done.
All because... the person who did it first wants attribution? They want their name to be included in some credits.txt file that nobody will ever read? That's ridiculous.
Licensing operates on a continuum of permissiveness. Licenses can only relax the restrictions that you as a creator are given by default; you can't write a copyright license that adds new ones. You could write a legal instrument that compels and prohibits certain behaviors, but at that point you're talking about a contract. (And there's no way to coerce anyone into agreeing to the contract.)
Harry Potter has even more restrictions than the GPL or any other open source license. It's "All Rights Reserved"; it enjoys the maximum protections that a work can. And yet it would still be possible to feed it into an AI model, even if Rowling, Bloomsbury, and Scholastic all didn't want you to. They don't get a say in that. Nor do open source software developers in their works, which selectively abandon some of the protections that Rowling reserves for herself and her business partners.
The only real viable path to achieve this using an IP license alone would be a React PATENTS-like termination clause: if your company engages in any kind of AI training that uses this project as an input, then your license to make copies (including distributing modified copies) under the ordinary terms is revoked, along with your permissions for a huge swathe of other free/open source software owned by a bunch of other signatories, too. This is, of course, contingent upon the ability to freely copy and modify a given set of works being appealing enough to get people to abstain from the lure of building AI models and offering services based on them.
You're right. It's a politeness law some people have invented.
It's also a value people have, but that's for themselves. I like contributing to OSS projects. But, as soon as it's imposed on others, and there are punishments for disobeying, it's a politeness law.
Exactly. People are getting mad that Microsoft is making good money while the people who made all that free software available mostly did it for free (as in no money and no recognition). It can sound unfair, but that's the deal. If you didn't want people or AI to learn from your code, open source was not the right option.
There's nothing wrong with other people using - learning and creating derivative works of - one's open-source code, provided they respect the terms of the license. It seems to me that the real issue is the fact that these licenses don't have enough teeth.
All software licences are based on copyright, same as writing, art, music, etc. Some software licences are permissive. Some writing is permissive (e.g. Cory Doctorow). Some music is permissive (e.g. Amanda Palmer). It entirely depends on what the author wants. The fact that more software is permissive is a good thing, right?
I entirely agree that there are ethical problems with training AI on copyrighted training data. But please let's not start gatekeeping this. We need to have a serious discussion as a culture about it, and saying "you're way down the list of victims" isn't helping.
What I disagree with is the idea that they should therefore not complain, or that there could not be an AI system that does not launder code, but keeps licenses in place and does this ethically and in an honest way. I add "ethically" and "honest way" because I am sure that companies will try to find a way around being honest, if they are ever forced to add back the licenses.
In fact, artists might not be the group that grasps the impact of training on that corpus as quickly as the dev communities. Perhaps it is exactly the devs who need to complain loudest and first, to have a signal effect.
Well, I guess you already know why you may be hated for this. Anyone who has surfed HN since ~2010 would know, or should notice, that the definition of open source has changed over the past 10-15 years. Giving back and communities are the two predominant open source ideals now, along with making lots of money on top of OSS code being a somewhat contentious issue, to say the least.
But I want to sidestep the idealistic issue; I think this is more of an economic one, which could be attributed to a zero-interest-rate phenomenon. You now have developers (especially those from the US) who, for most if not all of their professional lives, have lived in an era when money and investment were easy, comparatively speaking. The idea was that they should give back when money (or should I say cash flow) isn't an issue. When $200K total comp was supposed to be the norm for a fresh grad joining Google, and management thought $500K was barely enough and they needed to work their way to $1M, senior developers believed that if juniors were worth $200K then asking for $1M total comp was perfectly sane; or they went to some other extreme where everyone in the company should earn exactly the same.
If Twitter or social media are any indication, a lot of these ideals are completely gone from the conversation, although this somehow started before the layoffs.
It is somewhat interesting to see the sociological and ideological changes with respect to economic changes. But then again, economics itself is perhaps the largest field study in psychology.
Leaving GitHub won't change that; OpenAI is training its models on every bit of code they can get: Sourcehut, Codeberg, etc. If it's public, they will train on it.
Also from my experience of trying to leave GitHub, you just end up having a couple of projects on your alternative platform, and everything else on GitHub. You are still active on GitHub, probably even more than your new alternative.
And if you want to build a community, you will quickly find out that the majority want to stick to GitHub, and leaving it can kill your project's chances of getting contributions.
Personally, if the courts decide it's fair use, that's it, I'm going back; it's the best git platform out there, and GitLab doesn't even compare in free features. However, I have been eyeing Gitea and Gitea Actions; with them, Codeberg could become a realistic choice for me.
To end it, here's a hot take: I really hate Sourcehut.
It's hard to use, the UI is not great, and trying to browse issues or latest commits is a nightmare.
Every time a project uses it, it's a pain to deal with.
> And if you want to build a community, you will quickly find out that the majority want to stick to GitHub, and leaving it can kill your projects chances of getting contributions.
That's a defeatist attitude and a self-fulfilling prophecy at the same time. As more and more people leave GitHub (hopefully not to go to the same alternative), it becomes less and less of a must-have. The reason these things are somewhat true today is because of the network effect, and it's precisely that effect which we must actively attempt to squash by leaving.
It's why Facebook is still on top even though everyone hated it for a while; why YouTube is the only video platform, etc.
Not every bit of code, they are respecting proprietary licenses.
When MS puts the code for Windows, Office, Azure and everything else in front of ChatGPT, Copilot, whatever other AI learning model they have, then perhaps they have a leg to stand on.
Otherwise, they're just being hypocritical to claim that no injury is being done by using code for training, because they are refusing to train on any of their code.
Right now it just looks like they are ripping off open source licenses without meeting the terms of the license.
https://www.lelanthran.com/chap7/content.html
You could make a similar argument for not training on GPL code, but it's a lot easier to programmatically determine whether or not code is public than it is to programmatically determine what it's licensed under, particularly when you're training on massive amounts of unlabeled data. Not to mention it's way easier to delete an accidentally-added snippet of GPL code from a codebase than it is to "unleak" company secrets after they've been publicly revealed.
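To make the asymmetry concrete, here is a minimal sketch of the kind of heuristic a training pipeline might use to guess a file's license. Everything here (the function name, the patterns) is a hypothetical illustration, not how any real pipeline works; the point is that "is this file public?" is a property of where it came from, while "what license is it under?" depends on fuzzy matching against markers that are often simply absent.

```python
import re

# Hypothetical heuristic: look for an SPDX identifier, then fall back
# to matching a couple of well-known license phrases. Real corpora
# need far more patterns, and many files carry no marker at all.
SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*([\w.+-]+)")

def guess_license(text: str) -> str:
    match = SPDX_RE.search(text)
    if match:
        return match.group(1)
    lowered = text.lower()
    if "gnu general public license" in lowered:
        return "GPL (version unclear)"
    if "permission is hereby granted, free of charge" in lowered:
        return "MIT-like"
    return "unknown"  # the common case in unlabeled training data

print(guess_license("// SPDX-License-Identifier: GPL-3.0-or-later"))
```

Even this toy shows the problem: the "unknown" branch dominates in practice, and a per-repo LICENSE file may or may not govern any individual file.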
Sorry, but I consider that a plus.
One of the primary problems with GitHub right now is the "drive by" nature. Everybody is on Github because a bunch of idiotic big corporations made "community contribution" part of their annual review processes so we now have a bunch of people who shouldn't be on GitHub throwing things around on there.
Putting just a touch of friction into the comment/contribute cycle is a good thing. The people who contribute then have to want to contribute.
That's without mentioning nice-to-have features like GitHub Sponsors, the For You tab, and the (arguably) more popular UI layout. It's simply a better platform for open source projects.
What features is GitLab missing? I don't know, I'm curious.
And faceless entities use their hard work for who knows what, but mostly to fatten up their already oversized corporations, and give back NOTHING.
And people, seemingly without common sense, suck up to companies that rob them, and even disseminate their shiny new "free" tools.
This would be a Hugo or Nebula award-winning novel if it weren't reality.
Is it? I can't think of a single professional dev making money right now who doesn't owe it, in part, to not having to reinvent the entire tech stack they are skilled in.
If there was no open source, we'd all be making a lot less, and the state of tech would be far far smaller than it is right now.
There's a wide variety of people in the open source community at large. And a wide variety of motivations for contributing. I for one am happy that open source software is a thing. It's been a net good for mankind. Sure, there are abuses, and I'm sure many things could be improved. But I'm glad it's there all the same.
FWIW, I keep thinking about some kind of dual licensing, FOSS and something-something-royalties. (Sorry, IANAL, so haven't gotten any further.)
By a lot of measures many humans perform at just about the same level, including confidently making up bullshit.
This post reads like one of the "Goodbye X online video game" posts. I'll cut them some slack because this is their blog they're venting on and was likely posted here by someone else and not themselves doing some attention seeking, but meh.
Which is a pretty dumb position imo. Not that I personally think these newer LLMs are a stochastic parrot, or at least not to the degree proponents of the Stochastic Parrot argument would have you believe.
I think we're now way past that, with LLMs quickly taking on the role of a general reasoning engine.
And this right here is why it's important to emphasize the "stochastic parrot" fact. Because people think this is true and are making decisions based on this misunderstanding.
No we're not, and no they are not.
An LLM doesn't reason, period. It mimics reasoning ability by stochastically choosing a sequence of tokens. A lot of the time these make sense. At other times, they don't make any sense. I recently asked an LLM:
It answered correctly that Mike leaves first. Then I asked, and the answer was that Mike still leaves first, because he leaves at the 2nd floor, and that's the first floor the elevator reaches. Another time I asked an LLM how many footballs fit in a coffee mug, and the conversation reached a point where the AI tried to convince me that coffee mugs are only slightly smaller than the trunk of a car. Yes, they can also produce the correct answers to both these questions, but the fact that they can also spew such complete illogical nonsense shows that they are not "reasoning" about things. They complete sequences; that's literally the only thing a language model can do.
Their apparent emergent abilities look like reasoning, in the same way that Jen from "The IT Crowd" can sound like she's speaking Italian when in fact she has no idea what she is even saying.
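To make "stochastically choosing a sequence of tokens" concrete, here is a toy sketch of temperature sampling over next-token scores. The tokens and logits are made up for illustration; this is not how a real LLM is implemented, but the sampling step at the end of one really does look like this: plausible continuations get high probability, and nonsense keeps a nonzero probability, which is where the illogical outputs come from.

```python
import math
import random

# Sample one next token from a softmax over (made-up) logits.
# No reasoning happens anywhere in this function.
def sample_next(logits: dict[str, float], temperature: float = 1.0) -> str:
    scaled = {tok: l / temperature for tok, l in logits.items()}
    m = max(scaled.values())  # subtract max for numerical stability
    weights = {tok: math.exp(l - m) for tok, l in scaled.items()}
    total = sum(weights.values())
    tokens = list(weights)
    probs = [weights[t] / total for t in tokens]
    return random.choices(tokens, weights=probs)[0]

# "first" is the likely continuation, but "banana" can still be drawn.
logits = {"first": 2.5, "second": 0.5, "banana": -3.0}
print(sample_next(logits))
```

Raising the temperature flattens the distribution, making the nonsense token more likely; lowering it makes the output nearly deterministic.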
Guns are also useful tools because you can take them into a store and get things for free as a result. But that doesn't make it okay to do.
The current model basically says that as soon as you publish something, others can pretty much do with it as they please under the disguise of "fair use", an aggressive ToS, and the like.
I stand by the author: the current model is parasitic. You take the sum of human-produced labor, knowledge, and intelligence without permission or compensation, centralize it with tech that about two companies have or can afford, and then monetize it. Worse, in a way that never even attributes or refers to the original content.
Half-quitting Github will not do anything, instead we need legal reform in this age of AI.
We need training permission control as none of today's licenses were designed with AI in mind. The default should be no permission where authors can opt-in per account and/or per piece of content. No content platform's ToS should be able to override this permission with a catch-all clause, it should be truly free consent.
Ideally, we'd include monetization options where conditional consent is given based on revenue sharing. I realize that this is a less practical idea as there's still no simple internet payment infrastructure, AI companies likely will have enough non-paid content to train, plus it doesn't solve the problem of them having deep pockets to afford such content, thus they keep their centralization benefits. The more likely outcome is that content producers increasingly withdraw into closed paid platforms as the open web is just too damn hostile.
I find none of this to be anti-AI, it's pro-human and pro-creator.
If that is made mandatory, only then can these lists actually be checked against licenses.
There will also need to be a trial license, to establish whether an AI learning model can be considered derived from a licensed open source project - and therefore whether it falls under the license.
And finally, we'll likely get updated versions of the various OSS licenses that include a specific statement on e.g. usage within AI / machine learning.
>The more likely outcome is that content producers increasingly withdraw into closed paid platforms
Nah. You didn't get paid to write that post, did you? You did it for free. People nowadays are perfectly willing to create free content, and often high quality content, sometimes anonymously, even before generative AI.
There's no need for financial incentives anymore. As content creation becomes easier, people will start creating out of intrinsic motivation - to express themselves, to influence others and to inform. It's better that way.
Restricting content so that others can't benefit from it is not pro-human or pro-creator, it's selfish and wasteful. We should get rid of licenses altogether and feed everything humanity creates into a common AI model that is available for use by everyone.
The entire "GitHub doesn't give back" argument is wrong. For "free", GitHub lets me host our code, run thousands and thousands of hours of free CI (which we are using aggressively), host releases and Docker images, and lets us manage thousands of issues. Also, Copilot is free when you are eligible for it, so we are fortunate enough not to have to pay for it either.
Yes, they monetize our attention and train Copilot with the code, but the only argument which can't be used against this company is that they don't give back.
"If you train an AI on this code, you must release the source code and generated neural net of that AI as open source" or something to that effect.
It won't stop it, but it will slow it down, and it seems like the right T&Cs to put on training against GPL code because it gives an advantage to open source AIs, however minor.
Currently GPL says:
> To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work.
> A "covered work" means either the unmodified Program or a work based
If in addition it would say something like "Generative AI models trained on the program source code as well as the text produced with such models is also a work "based on" the Program", then there will be little room for a fair use claim, I think.
Edit: I googled "fair use copyright US" and have now decided that US copyright law is stupid.
But: NOOOO.
In order to close off this possibility, which would restrict Copilot revenue, they instead would roll out a single undifferentiated product and with lots of "gee whiz!" and associated hooplah, and be sure to offer it for free for a while to suck everyone in and head off criticism.
The real rub will be the first court precedent on whether GPTs infringe on source data IP.
Could see it going either way: fundamentally transformative or not.
What it comes down to is that the Github ToS is illegal.
wait, what?
do you have more details on this? what rights does the ToS claim that would violate an existing license?
Then if I read that and built my own understanding
Then if I used that knowledge to implement my own version of Windows that was compatible with Microsoft's and distributed it under my own license
Would that be legal?
WINE etc are built in clean room environments for good reasons.