This is a lightly edited transcript of a talk I gave at Bay Area Drupal Camp in Oakland on October 24th, 2024 to an audience of about 40 people. I will be inlining images, but you can also peruse the slides directly. There's a recording available that has audio synchronized with the slides, but this post leaves out my stumbles, uhs, ums, and other vocal pauses. (Additional context will appear like this)
Thank you all for coming. I just want to get started by asking you a little bit about yourselves. So how many of you identify as developers? (3/4 of audience raises their hands) How many of you have transcended developers? (a fifth of the room) Still alive, still doing okay? Okay. How about open source developers? (half the room) Commercial developers? (a fifth) Okay, so some of the same hands are going up. Great, and competing sessions are also great, so we won't say anything bad about them. I don't even use Playwright or, what was the other one, Cypress. I use TestCafe. I'm, like, weird.
There we go, a little bit about me — I'm... weird.
I've been in the Python ecosystem for a very long time. I started off in graduate school (2006) at UC Berkeley working on matplotlib. I got my commit rights there, so that's the community I came from, and then I became one of the leaders in IPython and the Jupyter world, a project that's been rewarded by the ACM with the Software System Award (2017).
Now, I don't normally start off my talks with like "oh look at me, I'm a big shot! Look at this big award I got" but this will be relevant, very relevant, to the talk because the ACM is the Association for Computing Machinery, as you may know. It's the professional organization for computer scientists. It was founded in 1947, so 77 years ago.
Since most of the room raised their hands and said they're developers, I'm going to get super nerdy because we're all among friends here, and talk history and things like that.
The award is a glass trapezoid sitting on its side, so it doesn't actually film very well. You can kind of maybe see that there's this reflection, anyway, but the books in the background (Frank Herbert's Dune) are foreshadowing. So you can kind of mentally place a 🌶️ chili emoji somewhere on there, if you're into that sort of thing, because the spice must flow. (audience chuckles)
The other things on the slide are that I used to work in academia at UC Berkeley, that's where I stuck around basically to work on Jupyter, and then I worked at a couple of startups and a couple of finance firms, and I'm currently working on a Jupyter notebook search engine over at spines.dev. So that's enough about me.
Contributors to this talk ( Itay Dafna, Matt Turk, Juan Nunez-Iglesias, David Nicholson, Carol Willing )
I want to thank some of my friends that helped me reason through some of these things by talking about them, in particular Matt Turk, who spent a lot of time back and forth IM-ing ideas about what to say and how to say it. So this is the first attempt at capturing this sort of thing.
But enough about them, let's talk about me again. (chuckles)
I was born in 1984, which makes me 40, the same year the Macintosh was introduced. You remember Apple's smashing-the-screen Big Brother commercial?
So that was 1984, and that was also the year of the first Hackers Conference, which was just across the Bay, and that happened because Steven Levy wrote a book, Hackers: Heroes of the Computer Revolution.
How many have read that book? (no one raises a hand) Oh that's such a good book, it's great. This is "hackers" before that word was associated with breaking into computers because people weren't even breaking into computers yet, people were busy trying to build computers to begin with.
So at that Hackers Conference, Richard Stallman first publicly and explicitly stated the idea that all software should be free, and made it clear that free refers to freedom, not price, saying that software should be freely accessible to everyone.
It was the first time he did that publicly, okay and so what's interesting about that being 40 years ago, is that if you go back 40 years in the other direction.
(I spread out my arms, turning them into a timeline, each arm measuring 40 years)
# 1944 - - - - - - 1984 - - - - - - 2024
#   \                \                \
# no computers    free software     today
With time flowing left to right, my nose is where I was born, 1984. And all the way on your right, this is where we are today. At the tip of my opposite hand, there were no computers, basically. 1944 was when the first computer, the Harvard Mark I, was built, and if you were interested in computing Bessel functions you had a computer for that. You didn't really have a generic computer yet, so this (my armspan) is all of computing history in the modern sense. This is not talking about just calculators but computers, right. And so half of that time we've spent with this idea of free software, building it and sharing it, explicitly.
Okay, so the oldest programming language that's still in use today, FORTRAN, was specified in 1954, and it took 'em another 3 years to build a compiler for it. COBOL was finished in 1959, LISP in 1960, so that's all back here. But what I'm concerned about is the future, because a tremendous amount of value has been unlocked from all of this creative tinkering that people were doing trying to scratch their own itch. The previous talk (by Tim Lehnen, CTO of the Drupal Association) mentioned the LAMP stack, right: Linux, okay, that's free software; Apache, yeah, that's free software; MySQL / MariaDB, whatever it was, that was free software too; and P stood for PHP in this world, it's Python where I'm from. All of that is free software. Linux kernel developer and maintainer Greg Kroah-Hartman, in a talk just about a month ago, said that Debian runs 70% of the world. He talks to a lot of cloud vendors. He's paid by the Linux Foundation to work on Linux, and he says at least 70% of cloud workloads run Debian, which is a non-commercial distribution, if you're not familiar, and he estimates that more than 80% run non-commercial distributions in general. So he says that Red Hat, SUSE, Ubuntu are great distros, but they're not what the world is using.
(I neglected to explicitly assert here that there would be no cloud computing without open source and free software )
And the concern is that this world that is being used right now is going to shrink and shrivel and die on the vine.
Have I painted a dark enough picture?
So the challenges that I want to talk about today are the three up here. We're going to talk about LLMs for code completion. We're going to talk about mixing paid and volunteer labor and how inherently you can't get away from it, like, that's a struggle that will always be there provided that you want to have hobbyists around. And then the last one is this relicensing of open source software that's been happening with a lot of venture-backed companies, which people have been referring to as a "rug pull": taking away a license that was previously "I can do anything with the software" to now "I can't", and what to do about that.
So let me just start with that first one and say that
GitHub Copilot, and other large language models for code completion trained on publicly available software with no regard for the licenses of that software, act as a license laundering cudgel that denigrates the work of open source developers who have contributed their code under a legal framework for how their code can be used. If their code was contributed under a copyleft license, like the GPL which Richard Stallman introduced a couple of years after that 1984 meeting, their expectation is that no user of their code will ever lose the ability to modify the software. If it was contributed under a permissive license, like the BSD or MIT licenses, the expectation was that inclusion of their code would result in an attribution to the project they contributed to. Tools like GitHub Copilot make all of those expectations of the original authors null and void, enabling theft of their work, all in the name of making it "easier" for other programmers.
Okay so I called this a pyramid scheme in the abstract and here's what I mean: normalizing plagiarism. It's also a heist. This is a picture of the Louvre, if you're not familiar with it, and it's on the scale of taking everything from the Louvre. That's what I think. Talk about "value capture", right from the previous talk.
And for a pyramid scheme, you need two things. Okay, it's loosely defined as a pyramid scheme, it's just an effective metaphor, I think. You need to have some schemers that want to corrupt other parties. They want to get more people into the game, "oh yeah, let's get them in there", and then you need somebody profiting at the top. Sort of, the higher up in the pyramid you are, the more profit you'll get, okay.
So these are the two things that I'll need to show, or that I will end up showing.
Hopefully it's self-evident but let's roll with it anyway.
So let's start with the base and work our way up.
If enough people participate in plagiarism, we expand the world of plagiarists. We normalize it. I'm just repeating myself here. Let's take a look at what the corruption pyramid looks like at its base.
Fundamentally, no one using it... or rather, the system is built in such a way that you are meant to not care where the code came from. I think that matters, and I think we should care where the code came from.
So GitHub does their little meeting, or big meeting, GitHub Universe they call it. I think it's this week or next week, across the Bay in San Francisco. And every year they release the State of Open Source as they see it. This year's survey hasn't yet come out, but in last year's survey there was a nice little banner in the middle of it telling you: "GitHub Copilot: don't fly solo. Try it, it's free. First one's free."
Okay, in the middle of their report they say "speaking of generative AI, almost a third of Open Source projects use GitHub Copilot." Well, that sounds like a lot, although you know "at least one star"? I can star my own projects. Nevermind.
Then, not five sentences later, they repeat the same claim: "open source maintainers are adopting generative AI", and then, let's take a look here, oh: "this follows our program to offer GitHub Copilot free to open source maintainers".
In 2023 there were "301 million total contributions to open source projects across GitHub". Wow, that's great, and "commercially backed projects continued to attract most of the open source contributions" — that's also interesting.
Okay, so, hmmm... I mean I know I'm a little conspiratorial here, but this is what this sounds like to me.
(laughter, Tim Lehnen sitting near the front says: "But I like cookies?!") You do like cookies? I like cookies, too. I have coffee, I don't have a cookie, right now. Wouldn't it be nice to have a cookie? I'd like a cookie...
I'm not going to convince them, the makers of these tools how to do it, but I think that there's a different way to live.
So hopefully I've made the case for sort of the base of the pyramid and the pyramid keeps wanting to perpetuate itself in the hype cycle that wishes to make the whole world feel it's normal to steal other people's work without attribution.
Okay so where's the top of the pyramid? Somebody has to be profiting from this.
Well, in the Q2 earnings report, I mean, you can read faster than I can, but basically, GitHub Copilot as of July of this year makes more money than all of the GitHub business did when Microsoft acquired it 6 years ago.
So Microsoft acquired GitHub for $7.5 billion 6 years ago. Today, GitHub Copilot makes more money for GitHub than GitHub did back then.
Moreover, that's not all. If you license Copilot Business or Copilot Enterprise directly from GitHub, this document applies to you.
The first one's okay: you get to keep your own code. That sounds great until you read the next one, which tells you that, oh by the way, you're on the hook if you're doing something wrong. If you're violating somebody's rights, that's on you, that's not on us. We're just mAkInG sUgGeStIoNs (chuckles) "I'm not stealing other people's work without giving them credit, I'm making a suggestion." So they got users' money so that the users themselves can potentially infringe others' rights.
I think there's also a secondary pyramid scheme, one that might work out for those of us in the room that are already good at developing software.
The case is that we have been, for 40 years, writing lots of code. Most code is buggy, so we've got lots of work to do. It's job security for those of us that know how to code.
I would claim that these tools that make it easier to learn how to code, they make it easier for you to get fish, they don't teach you how to fish. You get the end product faster, and there's evidence for this. I don't have the citation handy (Here's the CACM article: "The Impact of AI on Computer Science Education", about Professor Eric Klopfer's experiment in his undergraduate computer science class at MIT), but there was a study done with three groups of students who had to complete a programming project. One of the groups got to use something like ChatGPT. For the second group it was something more like Copilot, where they weren't talking back and forth while getting code suggestions, it was just inline code generation. And then the last group got to use a search engine and had to do it that way.
So guess who completed the task fastest? ChatGPT, right. Guess who was unable to, or who was the worst at figuring out how to change that same code for a follow-up task? ChatGPT.
So the search engine user group took the longest to finish their task, but they all understood what that code did, and they were able to change their code for the follow-up task.
On "Productivity"
The Communications of the ACM is a journal that gets published every month by the ACM, and in March of this year (2024) they released a little video previewing the front cover article on "Measuring GitHub Copilot's Impact on Productivity", and I was enraged, as I hopefully demonstrated earlier on. (chuckles)
I wrote a letter to the editor, the contents of which I read to you up front, and the subject line of that was "Measuring GitHub's Integrity?" or it could have also been "Measuring GitHub's impact on the erosion of trust" or "the erosion of professional ethics".
We're sitting here, I'm supposed to be an ethical professional, and we're talking about productivity.
I mean, listen, we're talking about... productivity?
Not ethics.
Not ethics.
Productivity.
We're talking... about... productivity.
Not ethics... that we go out there and project into the world, we're talking about productivity.
So they posted this video and I was fuming, and I decided I was going to do something about it: I was going to leave a comment on the video. (slight chuckle)
And something happened: not two minutes later that comment was gone, and I was like, well, what's going on here? Okay, so clearly there was some glitch in The Matrix, let me repost again. I reposted it and it was gone two minutes later, and I'm like, what's going on? Is the ACM really unable to handle any kind of criticism? And if you go to this link right now you will see that there's three comments on the video, but you have to kind of click around in order to see them, because I've been shadow banned. When I go as a logged-in user I see my comment posted on there, but none of the rest of the world gets to see it.
So what's nefarious about this? Well, I made the now classic mistake of including "as a large language model" in my commentary, because YouTube now is used to having bots writing answers to them, and you know how bots are? They always write "As a large language model I can do this but I can't do that", so I basically became a bot, as far as YouTube is concerned. (slight chuckles) So I was censored for trying to talk about this problem.
Okay, so how comfortable are we with licenses? If I throw up this picture, are we good?
(Note this chart contains an oversimplification, as there isn't one GPL license, but two versions in use, and they can be incompatible with one another, per Rob Landley's Ohio LinuxFest 2013 talk, where he uses the concrete example of Linux and Samba both being GPL licensed, but not being able to share code, because one is GPLv2 and the other is GPLv3)
We've got BSD up at the top that's a permissive license.
We've got GPL, a copyleft license, that's in the middle: it can receive works from other permissive or copyleft licenses, but the produced work must remain GPL. Some people call that the viral nature of GPL, other people call it the security of their users.
Proprietary software is out there — there's a lot of it. It can only take from the permissive licensed works, and can't take from GPL. Anything that takes GPL licensed work has to be GPL.
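(A minimal sketch of the chart as described above, for those who prefer code to pictures. The category names and the `can_incorporate` helper are made up for illustration, the GPL is treated as one license even though GPLv2 and GPLv3 can be incompatible, and the assumption that permissive works only take in permissive code is mine. This is a simplification, not legal advice.)

```python
# Simplified model of the license chart from the slide.
# All names here are illustrative assumptions, not a real library.
PERMISSIVE = "permissive"    # e.g. BSD, MIT
COPYLEFT = "copyleft"        # e.g. GPL, treated as one license here
PROPRIETARY = "proprietary"

# For each kind of resulting work: which kinds of code it may take in.
MAY_TAKE_FROM = {
    # Assumption: a work that stays permissive takes in only permissive code.
    PERMISSIVE: {PERMISSIVE},
    # GPL can receive permissive or copyleft code; the result stays GPL.
    COPYLEFT: {PERMISSIVE, COPYLEFT},
    # Proprietary can take permissive code, but not GPL code.
    PROPRIETARY: {PERMISSIVE},
}


def can_incorporate(result_kind, source_kind):
    """Return True if a work of `result_kind` may include `source_kind` code."""
    return source_kind in MAY_TAKE_FROM[result_kind]


if __name__ == "__main__":
    for result in (PERMISSIVE, COPYLEFT, PROPRIETARY):
        for source in (PERMISSIVE, COPYLEFT):
            print(f"{result} <- {source}: {can_incorporate(result, source)}")
```

Note that even the arrows that are allowed here still carry the attribution requirement of the permissive licenses, which comes up again further down.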
So what is the most popular successful open source project? It's the Linux kernel. And what license does it use? GPLv2, which is the same license that Drupal uses.
Okay, so Linux had its 91st birthday, oh-uh sorry 33rd birthday, (chuckles, someone, probably Tim again, says "You came from the future?") Yeah, hehe... Come with me if you want to live. (laughter) Yes, thank you for that.
I found out recently (via Rob Landley's Ohio LinuxFest 2013 talk) that actually the first couple of versions of Linux weren't GPL licensed. Linux was GPLed in 1992, so there was some interim where the license was just "no commercial use." It was actually explicitly against commercial use.
So the question is: did GitHub Copilot exclude GPL code from the training of the code assistant?
I don't see any head nods and I see some shaking heads. Okay, so I don't know, but the world is bigger than just GitHub, right, and so there's another project that is called the BigCode project, and luckily they did exclude GPL code from their training set.
But it's not all good and gravy, right. The spice must flow, right? I'm not going to be nice to everyone or anyone.
BigCode is an open scientific collaboration. It's supported by Hugging Face and ServiceNow Research. That's who funds them and that's where the development comes from. It trained this StarCoder... StarCoder... there's a lot of stars today, right, I guess we're going to the stars, maybe? We're all made of stars? I'm not sure. Stardust, okay? Ziggy, no? Is this thing on? (one chuckle) Okay, okay, good.
So The Stack v2 is a data set that has over 600 programming languages. That sounds impressive. It's 32 terabytes of code. Okay, and then they cite, because they're responsible, you see, they said that they're going to be "working on the responsible development", they're going to follow these principles from the Software Heritage statement on LLMs, which amongst other things says that they're here to "democratize the software creation process", okay.
Now, hmmm, be wary of people that say "democratize" anything, okay. Some have already inferred by my last name that I come from the former Soviet Union and you just, just beware, okay.
So let's take a look at the aggregate licenses in this data set
So the vast majority of the data set has no license on it, whatsoever. And of the little slice of pie, the less than a quarter of the rest of it, the majority of that is MIT.
Okay, so they did do something right, they left out GPL up front. That's great, but what's wrong is that licenses carry conditions beyond which kinds of work a derivative may be combined with: namely, the licenses stipulate attribution for the original authors of the work.
So the question is, are you in the Stack? Is your code in the Stack? They have an app for that.
So you can go to this URL and click on that, and you're welcome to type in your GitHub username. The point is that I'm in the Stack. 83 of my repositories are in the Stack.
And have they considered that not everyone's code was written on GitHub? Like, there's a bunch of code that pre-dates GitHub. What happens to that code? And like, what the hell, man. Why is it that the authors have to opt out? Why isn't opt-in the default?
Like, if only there was some way that we as developers could indicate to other people how we want our code to be used — (laughter)
oh! Oh! We do have this mechanism — it's called a license.
Alright, so Gabe Newell is the main guy behind Steam. He's a former Microsoft executive that started Valve Software and he runs it. And he had this statement, which is that "if property rights are provisional then your value of property changes accordingly", and I think the value of our property has dropped precipitously when it is no longer even attributable. Like, good job, good job, Tech World. You're managing to cannibalize your own roots, basically. We gave this thing away for free, and all we asked, most of us asked, is just attribute us, right, and you managed to still pirate that. Cool. That's going to be a wonderful future to live in.
Okay, and so ignoring licenses, I argue, could be seen as a license to kill the community. (Tearing up the social contract we operated under for the past 40 years. The erosion loop has begun. We will become part of the sediment.)
Others might say "well, that's just like, your opinion, man" because some developers are embracing this new reality.
And Rob Landley is one of them. He was actually involved in the GPL lawsuits because he was the maintainer of BusyBox, which is like a standalone micro Linux distribution that lots of embedded systems use, and he was interested in getting code back from these vendors that were shipping products, routers and things like that, without providing the code. And he's like "I want to see the code", and he saw the code and wanted nothing to do with it. He wanted the lawyers involved (Software Freedom Law Center) to stop doing the licensing wars, and they wouldn't, and so he decided to rewrite BusyBox from scratch, which he called Toybox. Using a BSD license, he actually created this Zero-Clause BSD license so you don't even need the attribution part of it, so it's completely kind of like free-for-all: "Just use it. Don't sue me." And Toybox is now shipped in every Android device and continues to be used, and he continues to work on it. I've left out some of the details of this story, there's more in the Toybox vs BusyBox talk.
There are also commercial companies that are kind of routing around the GPL in a different way by just supporting projects like LLVM that aren't GPL licensed, and that allow them to do the things that they want to do.
A friend of mine (Matt Turk) also told me that I should include the Santa Cruz Operation trial in here.
These are quotes from Jonathan Corbet's article about it. The SCO lawsuit was filed 21 years ago, but all of this is to say that this is nothing new: there's always somebody knocking, knocking on our door, trying to destroy what we have.
Okay, so what are the paths forward here?
I think it's clear that we need to provide data sets and models that train only on attribution-free licenses, like MIT Zero, like Zero-Clause BSD, or public domain. I already mentioned Toybox. SQLite is another example of a project that has one of these licenses: it's public domain. We have these projects. Just don't take everything.
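(To make that concrete, here is a minimal sketch of what such a filter could look like. The record layout, the repository names, and the `attribution_free_only` helper are hypothetical, and so is the exact allowlist: I'm assuming each file or repository in a data set carries an SPDX license identifier, and using `MIT-0`, `0BSD`, `Unlicense`, and `CC0-1.0` as stand-ins for "attribution-free or public domain". Code with no license at all is left out, since no license does not mean public domain.)

```python
# Hypothetical allowlist of SPDX identifiers that do not require attribution.
# Which licenses belong here is an assumption to be reviewed, not a ruling.
ATTRIBUTION_FREE = {"MIT-0", "0BSD", "Unlicense", "CC0-1.0"}


def attribution_free_only(records):
    """Keep only records whose license is on the attribution-free allowlist.

    Records with a missing or unknown license are dropped, not kept.
    """
    return [r for r in records if r.get("license") in ATTRIBUTION_FREE]


# Made-up example records; the repo names are placeholders.
records = [
    {"repo": "example/a", "license": "0BSD"},
    {"repo": "example/b", "license": "MIT"},           # requires attribution
    {"repo": "example/c", "license": None},            # no license: excluded
    {"repo": "example/d", "license": "GPL-2.0-only"},  # copyleft: excluded
    {"repo": "example/e", "license": "MIT-0"},
]

kept = attribution_free_only(records)
print([r["repo"] for r in kept])  # ['example/a', 'example/e']
```

The design choice that matters is the default: anything not explicitly on the list stays out, which is opt-in rather than opt-out.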
Gentoo Linux put forth a policy of not allowing any AI generated code in their distribution in March of 2024. NetBSD Foundation followed suit in May. There's hope on that front, that we can continue to work in an ethical manner.
And there's a lawsuit that was filed.
I didn't know about that, my friend David Nicholson told me about it. So I don't know how this will turn out, but I'm glad. I'm glad somebody's doing this because this isn't right.
Okay, moving on:
Oil and water
The effort and expectation differences of mixing paid and volunteer work.
Popular collaboratively developed software can attract a mixture of monetarily compensated development as well as the hobbyist unpaid labor that nourishes the software in its nascency. Business-critical gift economies don't exist, but projects and their participants vary on the spectrum between those two opposite poles, and positions shift over time. Paradoxically, this tension can be both productive and detrimental at the same time, depending on the perspective of those involved.
So let's make this a bit darker — you're used to me by now, (laughter) we're friends.
So the paid workers, I'm going to call them oil. Because, you know, oil is dirty, right, it's dark. Water's abundant, it's pristine and clean. Yeah, I'm a hippie and things are more complicated, but let's do a little story of Keith Packard.
Keith Packard is one of the main developers of the X Window System. If you've ever used Linux or the BSDs, and had a graphical user interface, you have Keith Packard, partially, to thank for that. He's also behind Fontconfig and Cairo. So the story here is in 1988, there's a problem with a bunch of the Unix vendors. Linux doesn't exist yet, right. The problem that they have is that Sun Microsystems has kind of the only working graphical system that is worth anything, and so the rest of the vendors decide to get together and work in the open, to fund this effort that will allow them to leapfrog SunView and capture the users. This is the origin of the MIT license, it comes from this project, and I linked the talk itself here.
What Keith had to say is that he would answer email from people that were paying him first, right? Makes sense. So even though it was developed out in the open, you know, if I was able to type back then, and had access to a computer, and could speak English, and lots of things, I could have sent him patches, but they would have just sat in his queue unanswered.
So that's an early example, right, it predates Linux, an early example of a success story of commercial entities working together, but even though the work was out in the open not everyone could participate in it equally and not just because they didn't have the time, but because the attention of those involved was focused on where the money was, what their job was.
Okay, so I think this tension is always going to be there. So, there's a spectrum of being paid or a volunteer, and you know, you can think of yourself like: students don't get paid, grad students do get paid a little bit, as I can tell you... (Tim: "Students pay") That's true, students do pay. Yeah, I didn't think of that, that's true. Yeah, I guess I was thinking of students like a high school student in public school, but yeah, you're right. A college student does end up paying, thank you for that. Yeah, good contribution. Uh, you can heckle me, too, you guys. (laughter) This is fine, we're all friends, it's a small crowd.
Okay, so there's also, on the spectrum, well, you can be paid, but you could be paid a lot more if you didn't do open source, for example. But if you want to do open source, maybe as part of your sliding on that scale from paid to volunteer, you say to yourself "okay, I really care about this stuff and I'd rather just make less money and continue to be able to do it". Yes, maximally, it would be great to make tons of money doing this stuff, but realistically maybe I'm okay with just, you know, paying the bills while doing something that I care about.
There are different kinds of projects that get started, right. You can have academic projects or you can have industry-led projects, like the story I started with in this section. And then there's another project that Matt Turk told me about which is that "prove yourself" project, you got to make a splash.
Talking about the technology that insists on monopolizing your attention, from a previous talk, you have projects that insist on "here's a cool thing, like it might not be useful, but it's cool" There's lots of projects out there that now we're used to that operate under the premise of "oh, you have to make a splash in order to make a contribution".
Okay, but I argue that, other than just this volunteer-paid axis there are other axes.
self <---> society
There's an axis of self versus society: is it about me or is it about what I'm taking on into the world? Some people care about intellectual curiosity, "I just want to get something out there and it happens to be that I'm doing it out in the open". Other people are like "No, I really want to make an impact in the world".
journey <---> destination
Okay, then there's the journey and destination. Do I care about the experience of what it feels like to be doing this thing, or about where we end up, the product?
Athenian <---> Spartan
Do we care about egalitarianism or do we care about hierarchy? Do we take marching orders or are we all voting? Do we all have an equal voting share, right, what does that look like?
dogmatic <---> pragmatic
Are we going to stand for ideals or are we going to get something done? Is it about how quickly we're going to do it, or does it not matter to us how quickly we do it because we care about what the thing is going to be and how it's going to get done?
silly <---> serious
Do we like to have fun? Well, what kind of fun are we talking about? Is it okay to be silly or do we prefer things to be serious, so that we can have efficiency?
courteous <---> abrasive
Are we nice or is it not about being nice? OpenBSD and the 9front communities are shining examples of this abrasive culture that nevertheless is able to stand up and get lots of things done. And I wish that people didn't take credit away from them when they assumed that you have to be courteous, right. This is controversial, yes but abrasive can be effective. You don't have to be nice. This is a very American notion that everybody has to be nice. In other cultures, they're much more comfortable with being in open conflict, and it's productive that way. You can be productive that way.
pushing <---> steering
Are we pushing or are we steering? Some people don't want to do the work, some people don't care what kind of work they do, they just want to work out.
emotional <---> intellectual
And then there's emotional and intellectual. Is this work or is this a diversion from work?
I have this tweet from 2011 when I was working on Jupyter, on a grant for Jupyter. So I was being paid out of a grant and there were lots of flyby contributors coming in, and they were donating code left and right and I was like: "If pyramids had a commit log, they could have been built by volunteers." And you might be thinking, "well, they were built by slave labor." No, they weren't built by slave labor. They were paid laborers, right. This was established in the 1990s: they found the graves of the builders of the pyramids, and they were well compensated for their work.
So it's a nice notion that maybe pyramids could have been built by volunteers, and I still think maybe they could be. It depends on what else is involved. It depends on the other axes in the project.
So, would you pitch in and help build the pyramids, if you knew that others are being paid for the work? Again, I think it depends. I don't think it's clear, because if I get to do the code stuff that I like to do, and somebody's doing the project management stuff that I hate doing, then yeah, that that seems like a fair trade. Somebody else is organizing the conference? Yeah, I can pitch in and give a talk. So it depends.
That wasn't a dig at this conference, this is a great conference. This is great, you guys are great. (chuckles) I'm just saying.
So how do you balance this paid and volunteer work axis? Sometimes, you might identify a great volunteer, and you want to bring them into the fold, right, you want to compensate them for their work. How do the other volunteers feel about that? When the money gets a little bit tighter, what happens when you kind of have to let them go? You know that they were already great to begin with, maybe they'll go away, maybe they'll stick around. It's a tough problem.
I already mentioned this game company that, I'm sure, if you've done any PC gaming, you're well aware of (Valve). But they're now supporting, as of just a couple of weeks ago, Arch Linux, a distribution that they use for their little Steam Box, err, so Steam, what is it? (audience: "Steam Deck!") Yes, the gamer nerds are helping me out, thank you so much, of which I'm one, but I'm just not a very eloquent one.
Then, there's also a recent effort that's led by the folks at Sentry called The Open Source Pledge, which is asking companies to pledge to donate at least $2,000 per internal developer per year to open source. I have links here to Chad Whitacre's blog where he writes about this, and he's been trying to advocate for this since 2017. Interestingly, the number of $2K hasn't changed (audience chuckles), but anyway ...
So, I think the tension is always there. Personally, for me, the tension is there because Jupyter recently formed a 501(c)(6) foundation, explicitly trying to seek more money from big companies. I think that this is a case of hobbyists being squeezed out. I was one of the people that voted against this.
Now, I don't want to, you know, bring my dirty Jupyter laundry over to you guys and dump it on you, but it's just an aside. Matt also pointed out to me that "given Copilot's voracious maw, it seems like it's harder to say what's a hobbyist project and what isn't since the usage of hobbyist code in commercial products is basically unknowable". I might start off as a hobbyist, do something clever, get a job offer out of it, okay, and then I get some financial support or a grant, great.
Innovation happens at the edge. Innovation happens with individual users. This is Eric von Hippel's work, which Carol Willing pointed me to. The stuff that I do on my computer is where the magic happens, right. The stuff that you all do on your computers is where the magic happens. It's not organizations that make that magic happen. Organizations provide the fuel to make that magic happen. Producers have to worry about the size of the market. When you did something cool, that's not enough. It's cool enough for you, it might be interesting, but it might not be profitable or it might not be worth pursuing as a commercial enterprise.
It can be lonely to tinker and do things by yourself, and no talk about open source would be complete without a slide about burnout. (excerpt from page 39 of the 2024 Tidelift State of the Open Source Maintainer Report)
Recall that property of oil to hold more heat. There's a lot of things that I put up with when I'm being compensated for it, monetarily. There's a four-letter word that comes to mind, and it's JIRA. (laughter).
Money can be a salve to things that we don't like doing but... you don't want to end up in a Dune world, where all the water evaporates. "Try our one size fits all miracle cure"
Okay, so what is this formula?
(1/3)Bh = V
Math Pop Quiz! ...Anyone?
(audience member: one third of the business-hour is the velocity of project?).
Okay, you're close, I'll give you a hint.
B is the size of the user base.
h is the height of entitlement (audience begins cracking chuckles)
V is the volume of outrage (laughter)
This is the formula for the rage pyramid. It's a problem in open source, it's a challenge.
All right, last thing to talk about.
Silver spoons lead to later forks. Pay the piper: relicensing trends in commercially supported source-available projects.
With increasing frequency, previously open source software projects backed by a commercial entity are shifting away from traditional licenses. Recent examples include Sentry, Terraform, Redis, and CockroachDB. Why are they doing that and what are affected users and developers doing in response?
So why are they doing that? This is a shorter section. To make money. They fear hyperscalers. They've received insufficient return on investment from trying to foster contributors.
What are we doing in response? Largely, we're forking. Note that the forks already happened when they changed their license. That was the original fork, they just happened to take the original name along with that fork.
Here's where the talk turns from being spicy into a bit of self-flagellation.
Philanthropy. This is how you should think of open source software, regardless of who is making it.
You can't be mad at a donor that you relied on for a long time who stops giving to your cause. I mean, you can get mad about anything, you can be Hulk, right? But you wouldn't begrudge a past top contributor for not being around anymore, because they wish to spend their time elsewhere. You shouldn't begrudge a business that's choosing to spend its money elsewhere.
Okay, we're all trying our best. I mean, not me, I'm not, but other people, right?
Fair Source is an alternative to closed source. Again, this is a recent development, also from Chad Whitacre and the other folks at Sentry, leading the charge here. I don't know if this will work out, but there are already eight companies adopting it.
The idea is that it transitions to being a traditional open source license after some time. I don't know if it'll work out, but people are allowed to make decisions, and make changes in how they live in the world.
And it's not all bad, because, for example, just recently, I think in August of this year, after 4 years of not being open source, Elasticsearch is back to being open source again. So sometimes they change their mind and they come back, and you can read about how they did that (Elastic Blog; TechCrunch).
Then there's other disruptive changes.
So the left-pad incident: the thing that doesn't get talked about enough with the left-pad incident is that it's sort of been painted as like "oh one developer got mad and decided to just screw everybody". And no, that's not what happened.
npm decided to side with one of their commercial users. Kik was a company and they wanted the name that this developer, who happened to also develop left-pad, had been using. And when they took away his kik ... (ahem) When they kicked him out... No? Too much? Yeah it's good? Okay, I'll leave it in.
Okay, and he says that this is not driven by logic. I'm sorry, it was driven by logic. (I got some help from the audience here as I confused myself) It's not driven by logic, anger, or greed. It's an exclusive nor? (audience chuckles) Thank you for the laugh, appreciate that.
So he says that "Left-pad was like a 'death' and 're-birth' moment for me. The part of me passionate about open-source was dead, and something new took over. Now, I'm passionate about business, marketing, running companies / teams in different ways, as much as I'm about programming."
Okay, and then the other disruptive change is the ongoing WordPress saga. But let's remind ourselves of the rage pyramid.
Rage still applies here, because previously I used the rage pyramid in defense of maintainers, and this time it's worth mentioning that commercial developers are allowed to do what they want to do. The open source licenses allow them to do it and we're allowed to fork.
With the three things that I talked about, I tried to give you a flavor of the kinds of challenges that exist in developing open source software today. The big alarm bell is this attribution theft. Then business cycles come and go, have their booms and busts, and there should always be tension between free and open source projects and the paid contributors inside of them. And finally, forking is a right: you have to adjust your expectations accordingly.
Thank you for your time and attention. [Applause]