> The team that made dataroom has stated that they did not use any of papermark’s code and that dataroom was made from scratch with inspiration from existing document sharing softwares, and that this post’s allegations of us stealing code are false. [...]
The screenshots clearly show they copied whole pages verbatim, both design and texts. The founder, Nico Laqua, basically responding with "we didn't copy _code_" and not taking any responsibility says a lot about his and his company's moral code. It might not be enough to get sued. That doesn't make it right.
I did an interview a couple years ago when Corgi was first hiring engineers. Nico and I ... did not click and it was probably the least smooth interview I've ever had despite it just being a phone screen.
I wouldn't be that surprised if Nico genuinely thinks "we didn't copy the code" is a reasonable defense. It would be a clear cut rule, and extreme "shape rotator" types often have trouble with the fuzziness of things like law. In reality, copyright infringement is often more like the porn test, you know it when you see it.
They’re not really, it’s just the YC hype cycle. The business is selling insurance to other YC startups with some AI flair. They’re not even the first YC startup to do this, a previous YC insurance startup was acquired a few years ago for ~$1bn. So, they’re worth 3x the exit of the exact same company… because of what, AI? The fact that they’re cloning other software to release SaaS products is extremely bearish. Why are they wasting their time on this? A wildly successful $3bn startup would not spend their precious resources by launching a $10/m document sending SaaS. They’ll be doing down rounds soon enough. Could you imagine Paul Graham encouraging this?
Listened to the founder on a 20VC episode talk endlessly about sleeping and showering in the office and comparing their insurance company to Alexander the Great and Napoleon.
Silicon Valley is just so disconnected from reality.
Also, I should add, they’re growing fast because they will underwrite anyone for anything. They’re one “oops our AI underwriting has been taking on far too much risk” away from disaster. That they’re demanding 7 days a week from their employees while spending their time building a dataroom product instead of, I don’t know, improving their underwriting, is a bad sign.
Normally getting insurance from a startup like Corgi would be a very bad idea because what’s to say they’ll be able to pay out claims? I assume other YC startups are happy because a) they can’t get insurance anywhere with good underwriting b) they figure YC will bail Corgi out when it goes wrong because seemingly every YC startup depends on them.
“Policyholders should be aware that certain Specialty Insurance Carriers may not be admitted insurers in the state in which the insured risk is located. Policies issued by non-admitted insurers, risk retention groups, captive insurers, and certain other Specialty Insurance Carriers may not be subject to all of the insurance laws and regulations of your state. State insurance insolvency guaranty funds may not be available for policies issued by non-admitted insurers, risk retention groups, captive insurance companies, offshore insurers, or other non-admitted Specialty Insurance Carriers. In the event of the insolvency of such a carrier, policyholders may not have access to state guaranty fund protection and may bear the risk of the carrier's inability to pay claims.”
Mostly because open source projects rarely sue. If you did this to a more litigious company there's a decent chance they would sue, and I'd give them about a 50/50 chance of winning.
Hard to say whether this would be ruled as copying the creative and artistic elements, or just the methods of operation. Copying features is fine, wholesale copying UX quickly becomes copyright infringement
He was bragging about working on weekends and comparing his shitty little insurance company to the Manhattan Project a while back. Somewhere he claimed this company/industry is the most important application of AI in the world. I have no doubt they ripped it off, this guy is not trustworthy to say the least..
I think what you mean is that functional designs aren’t protected by copyright. Of course you could patent it.
But in this case they almost exactly copied the graphic design and copied the text verbatim which would maybe infringe copyright.
Which would bring you nowhere. Unless they changed this at some point, I remember that around the time everyone was starting to use ChatGPT, OpenAI wrote in their terms that the user is responsible for the model's output. If they can do this, I expect other model providers to do this as well.
yes and besides the whole thing that is happening lets not suddenly pretend css and html are code either. There might be bad things going on but we need to maintain our standards!
Can someone give a bit more of context on this thread? I have no idea who Nico is nor what Papermark is or does.
As an aside not related to the thread: is it my perception, or are people getting more used to not only vibe coding things from existing solutions/projects but also "stealing" open source code and doing whatever the heck they want without complying morally/ethically/legally with the whole premise of open source?
I have the feeling that more than ever open source violations are flourishing everywhere without any major legal consequences.
> is it my perception, or are people getting more used to not only vibe coding things from existing solutions/projects but also "stealing" open source code and doing whatever the heck they want without complying morally/ethically/legally with the whole premise of open source?
yes. it's way easier to do now. edit -- plus a lot of new ai-only entrant devs don't understand/care that foss is about freedoms rather than free as in beer.
i work on a GPL3 library that parses a hardware audio sampler's binary data files. someone built an app so people can do "stuff" on top of my library, following GPL3 license.
someone recently posted an entirely vibe-coded clone of that app, full website with purchase links for $60 odd. completely obvious clone too; the UI was exactly the same minus the different colour scheme. no GPL3 conditions adhered to at all. mods delisted the thread. banned the clone's dev. forum community expressed their support for the original app dev. dmca takedowns were sent out. clone's website went down a few days later.
the original app dev was lucky there's only one main forum where people post things for this manufacturer, and the mods hate ai stuff too, which is kind of ironic cos the original app dev vibe codes all his stuff lol. without that forum and those mods, the original app dev would have been fucked tbh (and so would i as the GPL3 library maintainer).
centralization has benefits... without that, the only alternative i see is a mass movement where everyone goes closed source to force a conversation about respecting the work of others. we've been running on an honour/community backlash system until now.
Judges and governments are pro-business, anti-consumer and anti-citizen. Corporations are getting used to getting away with anything and everything.
"Move fast and break things" has changed from being about technology to being about the law. Uber popularized the trend, and now everybody does the same. AI breaking copyright law is just part of that trend.
With the new "laws are for losers" mentality we are in for a hard time.
When the biggest thieves are on track to trillion dollar valuations, what do you expect. Everything on the Internet is free for all now, don’t kid yourself.
If you are convinced this is a winner takes all race to ASI, and ASI results in absolute world dominance, then of course you are never going to feel restricted by current laws, especially not simple IP rules. Because the only way to make 100% sure you lose is not to play.
If you’re a business that deals in documents from external customers / partners, you use a data room like DocSend (by Dropbox) to share and receive documents with access management, analytics, auditing etc.
Papermark is an open source alternative to DocSend. Papermark is very popular, as it is a much more cost effective alternative to DocSend — self-host or hosted.
Corgi is a YC backed insurance startup that sells insurance to other YC startups. Nico is a founder. Recently they raised $100m at a ~$3bn valuation. They’re one of the darlings of YC right now, endless fawning over them.
Since insurance underwriting involves lots of documents, Corgi were paying Dropbox thousands of dollars per month for DocSend. For some reason, Corgi ostensibly formed a team of 12 to build their own DocSend alternative, called Dataroom. And Corgi decided to make it into a SaaS product, pitched as a cheaper DocSend from just $10/month, in an already crowded space.
Papermark noticed immediately that Corgi’s Dataroom used a lot of the same language and structure as Papermark’s open source product. Papermark assumed that Corgi had taken Papermark’s work without attribution. Corgi have denied it, claiming it is just a coincidence that there are word for word matches between the products.
Another YC startup, Delve, got caught doing what Corgi are accused of (and much more) which led to their removal from YC.
A startup raises ~$100m at a ~$3bn valuation and forms a team of 12 employees including their Head of Operations to build a clone of a product they pay less than $1,000/month for while they have more than 50 open roles they are hiring for.
Hmmm, yes, a very good use of available resources.
Thanks for the insight. So regarding what you explained above, is Corgi's fate supposed to be similar to Delve's? Or are those numbers so big/important for YC that they won't be banned?
Delve screwed over other YC startups, Corgi is stealing from a non-YC startup. Therefore, there will be no official censure from YC. This is a pretty well-established fact pattern historically. Just don't mess with the YC community if you want to stay in YC.
Not necessarily. Delve did a lot of bad things, but the primary reason for their removal was misrepresentations they made to other YC startups, i.e. YC startups paid them for security audits that turned out to be bunk, which caused a big headache for their customers. Basically, the rest of YC wanted them gone for causing widespread chaos.
Delve’s first drama was around copying from other startups, it was later that their betrayal came out. Corgi is currently at the copying from other startups stage… one might choose to believe there is a path they’re following rather than this being a one off.
For example, I outlined in another comment how their product is not what it seems, it is not traditional insurance, it takes advantage of an esoteric piece of insurance regulation. They’re doing very aggressive underwriting without any of the traditional insurance regulatory protections applying to them.
> it takes advantage of an esoteric piece of insurance regulation. They’re doing very aggressive underwriting without any of the traditional insurance regulatory protections applying to them.
elsewhere;
"Laqua, whose father is a lawyer for an insurance company"
From what I can tell, his argument seems to be that
1. no code was manually copied by a developer, and
2. all software in the same space copies off of each other
But the big giveaway here is the exact same layout/copywriting on both products. Telling an LLM "write this product and build a 1:1 clone" is still copying by all sensible definitions. The fact that he argues nothing was copied is ridiculous.
I guess that is at the core of Google vs Oracle: they copied the API but kept the implementation clean-room. It was definitively ruled that this was fair use. If fair use applies to something as strict as re-implementing an API, I would argue it applies to something much more elusive, like cloning UI/layout.
Suppose we take what they're saying as fact and they didn't copy and paste the code, but for all intents and purposes the LLM basically reproduced the same code based on its crawling of the repo, without respecting the license. It would make a great civil case for the courts to decide.
Their defence seems to be "well we asked an LLM to reproduce your work, so 'WE' never copied your code". Smells bad to me.
they probably need to sue to enforce this, I think this is actually going to be a larger issue than just corgi. copyright with these models really is just a mess
What I don't understand is: if a lawsuit happens, must the plaintiff produce their source code for verification? Even so, a git tree is trivial to change into some other arbitrary code even if a license violation has occurred. I also heard that, if proven, the consequence is that they would lose all revenue starting from when the violation occurred.
I wonder if this is a bigger risk/more widespread in the AI era? Could a bad actor with a copy of someone else's proprietary source code train an LLM on it and come out with code that does not show enough evidence of theft?
Since the Tweet is small enough and a lot of people aren’t reading it (Twitter links sometimes don’t work well for those without an account), I’ll quote it here:
> Hey Nico,
> It looks like you didn't vibe code your data room but stole it from Papermark's open source and enterprise-licensed code.
> We demand you take this copyright and license infringing product down immediately.
> It's not moving fast and breaking things, it's fraud.
> It makes the rest of your business questionable and the YC community look terrible.
I wonder if Nico will be feeling so cocky when Papermark gets their general counsel involved. The public Twitter shaming was clearly an attempt to resolve this without litigation, but hey, if that's how Nico truly feels, guess he gets to see what's behind door #2 (a massive bill for a legal retainer).
>I am curious if/how YC will handle this to get ahead of earning a reputation of being a den of scammers
flock is a YC company, so it's pretty clear that YC does not care about a negative reputation. as long as it makes money, apparently nothing else matters.
Many YC companies do bad things, and I guess they do so independently. There may well be repercussions for the most egregious cases, but I suspect a lot of ill-behaviour simply flies under the radar.
For example, only yesterday I got spam from a YC company, Polymath, and I replied asking where they got my details from - no response yet. Once I get something I'll make a GDPR subject access request, then a deletion request. I hope the overhead of that causes them to rethink their spamming campaign.
I have also gotten spammed by a YC startup, but they spammed an email that I use in git commits, and led with "I saw your fork of $POPULAR_PROJECT, pretty cool!" or something like that and then continued to pester me with their drip program even as I replied asking them to never email me again.
I didn't realise that one could forcibly require a competitor to disclose trade secrets.
Now, IANAL of course, but I would think this sort of mechanism would be quite gameable from both sides: (i) a wealthy competitor legally forcing a promising upstart to reveal source, or (ii) a copycat working out some kind of arrangement where the code itself is licensed to them via a shell company based overseas.
As with most legal hacks, the courts figured this one out long ago :).
If someone is trying to dig into their competitor's trade secrets via discovery, the court offers multiple ways to safeguard against that. The defendant can identify information as a trade secret and ask that it be protected in some way - for example, the documents may be restricted to "Attorneys' Eyes Only", so while the plaintiff's attorneys can review the material, the plaintiffs themselves are barred from reviewing it. Or the judge themselves may get involved in an in-camera session.
There are software engineers that specialise in source code analysis that lawyers will often use in these cases. The engineers will be given access to source code in secure environments where they're not allowed to bring any device in or out. They review, analyse, and write up a report using pen and paper, that can then be reviewed by the lawyers.
Absolutely. It was very similar to one of my first jobs: "Legal Technical Analyst". Not as much time doing deep source analysis, but basically translating things for lawyers: "So as far as this claim of copyright/plagiarism... this block here, that's CS 101 stuff, that block there, that's novel, and does x, y and z".
What's with this response in the Twitter thread??:
"This ain't what a C&D looks like. Implies you don't actually have a leg to stand on. Upload a copy of your official legal demand (from a lawyer) or I'll forever see your company as one who attempts to bully the competition in public"
The X link has screenshots where the two products have lots of identical pages. Is that IPable? Honestly don't know since I seem to use a lot of products that look like other products (LibreOffice, etc). But the pages for obscure things looking identical is kind of sus.
Yes, written verbiage is subject to copyright. UI is also subject to copyright. The degree of similarity is astounding - this is not an edge case at all.
The lack of understanding of copyright on HN does astound me, however.
This isn't a case of convergent design (OpenOffice vs. Microsoft Word), this is identical word-for-word with a simple s/room/dataroom:
> When enabled, folders uploaded to Rooms will be mirrored into 'All Documents' with the same structure.
> When disabled, all documents will be placed in a single folder named after the Room in 'All Documents'.
> This action cannot be undone.
> - All documents and folders will be permanently removed
> - All links and viewer access will be revoked
> - All analytics, audit logs, and Q&A data will be lost
> - Group permissions and branding will be deleted
Yeah, the title that the OP chose is so misleading that I think this one will need to get changed by the mods. Seitz isn't opining on the ethics of vibe coding in his tweet, he's pointing out that Corgi literally just stole Papermark's AGPL codebase and passed it off as vibe coding.
Short segments of popular works sure. Many UI pages with identical layouts and copy, essentially zero chance. The agent had access to the original code at inference time.
It's nearly word-for-word the content of the tweet. Right at the top. It isn't misleading unless you literally don't even bother to open the linked content.
Just ban users who comment without reading, I think that would go further to keep the quality of discussion high.
The number of bots/trolls responding to the title without reading the content and missing the point entirely is astounding, honestly, and I don't think any of those posts are contributing to high quality discussion. We could do without those users.
"but but but I can't/won't open twitter links" - then don't flap your yak-hole. Ignoring for a moment that the content has been reproduced in full in this thread, and another user has provided an alternative xcancel link.
Ideally yes, but we know people don't RTFA - there's a reason that initialism dates back to early Slashdot.
The paraphrase is doing a lot of heavy lifting to convert it to ragebait. Had the OP gone with something like "you didn't vibe code it, you plagiarized Papermark's open source project" (may need some editing to fit under the character limit) it would have at least been more true to the original tweet.
I know I RTFA, and I know I'm not interested in discussing things with people who don't. Maybe others feel differently, because more people is better or something. Information pollution is a serious, persistent, growing problem and I'm just not inclined to be tolerant about it anymore. Mistakes are one thing, deliberate stupidity is another.
If you come to book club without reading the book, and you derail the conversation into something completely irrelevant, you're not getting invited back.
I remember a few cases when asking an LLM to do something in the early days yielded not only the code but an author and a COPYRIGHT license.
Naturally LLM technology has moved on since then. I don't remember any recent word for word reproductions of a copyright license.
There are a lot of people lauding the technology though because it occasionally one-shots a wildly impressive example of something which...already exists.
FOSS licenses were obviously written in the spirit of sharing with humans. Some later licenses made the license less amenable for sharing with corporations because some authors didn't feel like they were being treated fairly. Some authors today have similar feelings about their code being used by Gen AI. It is perfectly fine for authors to want to place restrictions on how they want others to use their work.
> Step out of the FOSS swamp, step in to human dignity.
"Spirit" means nothing when it comes to legal - or even community - compliance. Either something is allowed, or it isn't, and if a license doesn't do everything that a user of said license desires then they should change that license. Just as licenses were made that explicitly made sharing with corporations less amenable, so should licenses re Gen AI usage. Only then is it worth making a case.
I’m old and I don’t recall FOSS being about truly free, truly open, just not for some categories of use.
In fact I seem to recall FOSS advocates denouncing licenses that put limits on who could use the software or for what purpose. This “it was always only for humans” take is new to me.
Surely "only for humans" is the obvious default given that there were no AI megacorps when these licenses were written?
Surely it's always been obvious that the person doing the sharing is the one to decide on the terms of the sharing? Maybe I want to share my cake with you but not with someone I don't like? How is that not my decision to make?
I'm absolutely fine with people having different sharing philosophies. Different licenses with different nuances are a thing. But I don't like this take that everything that was shared is automatically retconned to be included in AI training data. That's not the spirit in which I shared my stuff. Maybe that's the spirit in which you shared yours, and I respect that.
Human dignity when it comes to work and contribution is very simple:
Software developers should charge a fair price for their products from their users. That's dignified and beneficial for everybody involved. And it doesn't invite "code stealers" or anybody who wants to reap what they didn't sow.
Just like any type of work. Fair compensation is the key. Not working for free for people who don't care about you and then complain that they didn't give you anything.
Even so, what's wrong with this? They told you up front that they're going to discriminate. Students can use the code freely, businesses may struggle. People don't need to be fair.
Yo! Open Source Software works within copyright law. Your software should comply with the OSS licence you are forking/redistributing from. If you don't comply, OSS freedoms are void and it defaults back to being copyrighted material for you. Comply with licences. And enjoy the freedoms. Otherwise, you are copying from a copyrighted material. Which is illegal. Comply or write it from scratch.
Or... Be nice and ask. People tell u what to do. Don't be rude here.
I remember this Video editor software which didn't comply properly with OSS licence of FFMPEG(?). And people told author what to do. It's always cheap to be kind. Or win dumb prizes.
FOSS doesn't mean you give up all rights to your work. In this case, the software is AGPL licensed, which imposes a huge list of requirements on copies - including attribution and sharing back changes.
This person is so dangerous that if I offer them to stay in my shaded yard in the middle of the excruciating sun, they will demand that I let them take my house as well.
Has any group of workers ever "won" a long-term victory against a new technology? There are plenty of short-term concessions made in the face of powerful trade union opposition, but I can't think of any technology that was just stopped dead to appease workers with obsolete skills.
That assumes we're talking about technologies that are legal and in some way beneficial. AI is basically large-scale copyright infringement. If allowed to continue, human authors (I'm including programmers here) will eventually just stop publishing, because why feed the machine that's busy replacing you? You're not even getting paid for it, because the magic box can do the same thing you can for cheaper.
Thing is, everything AI produces is derivative; it cannot make anything truly original. Therefore widespread AI adoption will inevitably lead to scientific and cultural stagnation.
So we'll have our magic box that can perform our every wish. And we'll all be worse off for it.
The tweet was fine - it was directly addressing Corgi's claim that they had "vibe coded" DataRoom when they had copied and pasted it from Papermark. The problem is the OP chose to perform a contextectomy on the tweet and make it look like it's making a completely different argument.
Many open source licenses levy restrictions upon the acceptable use of the software. Those restrictions may include attribution requirements, up to and including a requirement to include the license when redistributing the code; they may forbid using derivative works for commercial purposes; they may require the downstream project to utilize the same license. Open source is not the same thing as "anybody can do anything they want forever."
Yup, if we take OSI as the de facto authority on the definition of open source:
> 6. No Discrimination Against Fields of Endeavor
> The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
Well, if it's my memory at fault then I apologize. My memory of the comment I replied to didn't include the initial qualifying phrase with either word choice.
Copyright violation is not theft. Your effort to create something that can be effortlessly copied conveys to you no property. Society deems it beneficial to grant a time limited monopoly on copying it to spur innovation.
Stealing a car - or anything tangible - means... the owner is very literally deprived of the benefits of owning said car/thing. Can't really say the same for a copied pattern of bits.
Copyleft is still a thing. Right to attribution is still a thing. Please, read about it and you will discover that there is a lot of nuance to the open-source code.
LLM generated code could have a very similar pattern to existing code with a stricter license that it was trained on. So, it's better to keep them to yourself instead of bothering the public.
It is, but this isn't competition. This is just copyright infringement.
Competition would be if these people created their own software, possibly innovating and improving it in the process. That would encourage Papermark to improve their own offering, and would create an environment where these businesses are economically incentivized to improve the product or service.
Nobody is incentivized to improve the software in question here. If copyright law doesn't protect anything, then improving your product is helping the competition and potentially hurting your business. Same is true if you're the people who did the infringement.
Who cares if the consumer buys it and uses it? Information is worth nothing anymore, attention is, so if they manage to capture a larger audience somehow, they win.
What do you do for a living? For most of us in the tech industry, information being worth something (because it takes creative and intellectual labor to produce) puts food on our tables.
LLMs produce about 95% of the code at my company and review about 70% of it for 3 years now. Our team has downsized from 40 to 8 people in this time. My creative labor is spent writing harnesses and wrappers. When there is enough of a data distribution on this, the LLMs will be able to do that as well.
I have saved up a buffer in funds and bonds because it's going to be over at some point when the company moves from explore to exploit.
This laissez-faire logic is insane, but I think it is telling that a lot of folks here seem to have this mindset and makes me empathize with increasingly nihilistic people.
I agree. It's a sarcasm of the new reality. What is copying vs writing from scratch? The line is blurred now, non-existent. You can ask an LLM to re-write any open source to a degree where there is no definite way to say that it's a derivative.
When everyone is using LLMs to suggest IA, build basic UIs, dump out your startup in a day, etc. everything will look the same, even the source code. There will be no way to litigate this. Does it benefit society to force two companies to make their products look different? Where’s the outrage over all basic pencils looking the same? Let the market decide which pencils it prefers.
Being a bit of a devil's advocate here. What I do not understand is whether it just looks similar, or implements the same features, or the code is actually copied and modified, i.e. the source is obviously from Papermark. I think interfaces can be copied, thinking along the lines of implementing a protocol or a feature, so that would be legit. The UI looks very similar, but if this is totally different code, then what? Is it copyright infringement on the look and feel of the Papermark brand?
Clearly it should be an issue for the investors anyway, as it “looks” like a copy in the tweet alone. It might mean this code will eventually become available for download to comply with AGPL, which in turn wipes out any moat.
> The team that made dataroom has stated that they did not use any of papermark’s code and that dataroom was made from scratch with inspiration from existing document sharing softwares, and that this post’s allegations of us stealing code are false. [...]
The screenshots clearly show they copied whole pages verbatim, both design and texts. The founder, Nico Laqua, basically responding with "we didn't copy _code_" and not taking any responsibility says a lot about his and his company's moral code. It might not be enough to get sued. That doesn't make it right.
https://x.com/nico_laqua/status/2070158170937581951
I wouldn't be that surprised if Nico genuinely thinks "we didn't copy the code" is a reasonable defense. It would be a clear cut rule, and extreme "shape rotator" types often have trouble with the fuzziness of things like law. In reality, copyright infringement is often more like the porn test, you know it when you see it.
If AI can’t make them recognize a work life balance has value then it’s easy to see they don’t believe the “force multiplier” BS they are peddling
Silicon Valley is just so disconnected from reality.
Normally getting insurance from a startup like Corgi would be a very bad idea because what’s to say they’ll be able to pay out claims? I assume other YC startups are happy because a) they can’t get insurance anywhere with good underwriting b) they figure YC will bail Corgi out when it goes wrong because seemingly every YC startup depends on them.
https://en.wikipedia.org/wiki/Risk_retention_group
“Policyholders should be aware that certain Specialty Insurance Carriers may not be admitted insurers in the state in which the insured risk is located. Policies issued by non-admitted insurers, risk retention groups, captive insurers, and certain other Specialty Insurance Carriers may not be subject to all of the insurance laws and regulations of your state. State insurance insolvency guaranty funds may not be available for policies issued by non-admitted insurers, risk retention groups, captive insurance companies, offshore insurers, or other non-admitted Specialty Insurance Carriers. In the event of the insolvency of such a carrier, policyholders may not have access to state guaranty fund protection and may bear the risk of the carrier's inability to pay claims.”
https://www.corgi.insure/disclaimers
Actually normally it’s fine because it’s rarely the startup selling insurance who’s doing the underwriting.
Corgi is more worrying because they’re (apparently) underwriting too.
A rare but sensible insurance tech startup would use external underwriters and reinsurance and provide insolvency protection.
Corgi doesn’t have any external underwriters, doesn’t have any insolvency protection, doesn’t have any reinsurance.
I think they’re bad on all 3 points, not just the underwriting?
Mostly because open source projects rarely sue. If you did this to a more litigious company there's a decent chance they would sue, and I'd give them about a 50/50 chance of winning.
Hard to say whether this would be ruled as copying the creative and artistic elements, or just the methods of operation. Copying features is fine, but wholesale copying of UX quickly becomes copyright infringement.
https://x.com/nico_laqua/status/2061130574358773852?s=20
Perhaps that’s enough for them. Legal gray area worked for Uber, AirBnB and many more.
As a consumer I’m not happy though; I don’t like incentivizing companies with such a creative approach to the law.
That would be my cynical response.
Parts of pages. Look at the screenshots. The wording is different between the pages.
As an aside not related to the thread: is it just my perception, or are people getting more used to not only vibe coding things from existing solutions/projects but also "stealing" open source code and doing whatever the heck they want, without complying morally, ethically, or legally with the whole premise of open source?
I have the feeling that more than ever open source violations are flourishing everywhere without any major legal consequences.
yes. it's way easier to do now. edit -- plus a lot of new ai-only entrant devs don't understand/care that foss is about freedoms rather than free as in beer.
i work on a GPL3 library that parses a hardware audio sampler's binary data files. someone built an app so people can do "stuff" on top of my library, following GPL3 license.
someone recently posted an entirely vibe-coded clone of that app, full website with purchase links for $60 odd. completely obvious clone too; the UI was exactly the same minus the different colour scheme. no GPL3 conditions adhered to at all. mods delisted the thread. banned the clone's dev. forum community expressed their support for the original app dev. dmca takedowns were sent out. clone's website went down a few days later.
the original app dev was lucky there's only one main forum where people post things for this manufacturer, and the mods hate ai stuff too, which is kind of ironic cos the original app dev vibe codes all his stuff lol. without that forum and those mods, the original app dev would have been fucked tbh (and so would i as the GPL3 library maintainer).
centralization has benefits... without that, the only alternative i see is a mass movement where everyone goes closed source to force a conversation about respecting the work of others. we've been running on an honour/community backlash system until now.
"Move fast and break things" has changed: it used to be about technology and it is now about the law. Uber popularized the trend, and now everybody does the same. AI breaking copyright law is just part of that trend.
With the new "laws are for losers" mentality we are in for a hard time.
Papermark is an open source alternative to DocSend. Papermark is very popular, as it is a much more cost effective alternative to DocSend — self-host or hosted.
Corgi is a YC backed insurance startup that sells insurance to other YC startups. Nico is a founder. Recently they raised $100m at a ~$3bn valuation. They’re one of the darlings of YC right now, endless fawning over them.
Since insurance underwriting involves lots of documents, Corgi were paying Dropbox thousands of dollars per month for DocSend. For some reason, Corgi ostensibly formed a team of 12 to build their own DocSend alternative, called Dataroom. And Corgi decided to make it into a SaaS product, pitched as a cheaper DocSend from just $10/month, in an already crowded space.
Papermark noticed immediately that Corgi’s Dataroom used a lot of the same language and structure as Papermark’s open source product. Papermark assumed that Corgi had taken Papermark’s work without attribution. Corgi have denied it, claiming it is just a coincidence that there are word-for-word matches between the products.
Another YC startup, Delve, got caught doing what Corgi are accused of (and much more) which led to their removal from YC.
That's like, nothing, for a company in the insurance business valued at 3b
A startup raises ~$100m at a ~$3bn valuation and forms a team of 12 employees including their Head of Operations to build a clone of a product they pay less than $1,000/month for while they have more than 50 open roles they are hiring for.
Hmmm, yes, a very good use of available resources.
https://xcancel.com/SergioGarc20223/status/20702512486962956...
Delve’s first drama was around copying from other startups, it was later that their betrayal came out. Corgi is currently at the copying from other startups stage… one might choose to believe there is a path they’re following rather than this being a one off.
For example, I outlined in another comment how their product is not what it seems, it is not traditional insurance, it takes advantage of an esoteric piece of insurance regulation. They’re doing very aggressive underwriting without any of the traditional insurance regulatory protections applying to them.
https://news.ycombinator.com/item?id=48672328
Someone might believe that their conduct + very high risk product + exposure to a large number of YC companies means they’re very similar to Delve.
Plus the founders are at the top of another funnel… Forbes 30 under 30. 30u30 is practically a kiss of death.
elsewhere; "Laqua, whose father is a lawyer for an insurance company"
lol
1. no code was manually copied by a developer, and
2. all software in the same space copies off of each other
But the big giveaway here is the exact same layout/copywriting on both products. Telling an LLM "write this product and build a 1:1 clone" is still copying by all sensible definitions. The fact that he argues nothing was copied is ridiculous.
You would be very wrong in this argument. It's extremely well-established that corporate verbiage and UI are subject to copyright.
Their defence seems to be "well we asked an LLM to reproduce your work, so 'WE' never copied your code". Smells bad to me.
You have to share the source code even when the user interacts over the network with the software.
The project that uses that code must also be AGPL.
There are ways to separate it and work around that. For example, using an AGPL auth server shouldn't affect the code where your business logic lives.
I am sure they could have found a way to design their product to be compliant, especially following past drama.
This is assuming the code was indeed copied. We don't know that for sure; it does look very similar, but I am not sure how that is enforced.
> This action cannot be undone
> Freezing is reversible from this page
I assume being irreversible is an essential part of the freezing feature.
> Hey Nico,
> It looks like you didn't vibe code your data room but stole it from Papermark's open source and enterprise-licensed code.
> We demand you take this copyright and license infringing product down immediately.
> It's not moving fast and breaking things, it's fraud.
> It makes the rest of your business questionable and the YC community look terrible.
“Team effort”
“:praying-hands (x2)”
And so on… The audacity and complete shamelessness…
I wonder what narrative they tell themselves.
Surely UI alone isn't enough to prove that source code was plagiarised?
In the event Papermark chooses to sue, how will the defendant defend themselves short of presenting their own (possibly) closed source?
I am curious if/how YC will handle this to get ahead of earning a reputation of being a den of scammers - a few months after the Delve scandal
flock is a YC company, so it's pretty clear that YC does not care about a negative reputation. as long as it makes money, apparently nothing else matters.
For example, only yesterday I got spam from a YC company, Polymath, and I replied asking where they got my details from - no response yet. Once I get something I'll make a GDPR subject access request, then a deletion request. I hope the overhead of that causes them to rethink their spamming campaign.
But I'm not going to complain to YC about it.
Now, IANAL of course, but I would think this sort of mechanism would be quite gameable from both sides: (i) a wealthy competitor legally forcing a promising upstart to reveal its source, or (ii) a copycat working out some kind of arrangement where the code itself is licensed to them via a shell company based overseas.
If someone is trying to dig into their competitor's trade secrets via discovery, the court offers multiple ways to safeguard against that. The defendant can identify information as a trade secret and ask that it be protected in some way - for example, the documents may be restricted to "Attorneys' Eyes Only", so while the plaintiff's attorneys can review the material, the plaintiffs themselves are barred from reviewing it. Or the judge themselves may get involved in an in-camera session.
The meme keeps on memeing.
"This ain't what a C&D looks like. Implies you don't actually have a leg to stand on. Upload a copy of your official legal demand (from a lawyer) or I'll forever see your company as one who attempts to bully the competition in public"
-- https://xcancel.com/jacobhartmannx/status/207012600834729596...
Is this just trolling?!
Besides - who is this guy, and why does he think he's owed sight of any legal paperwork?
The lack of understanding of copyright on HN does astound me, however.
This isn't a case of convergent design (OpenOffice vs. Microsoft Word), this is identical word-for-word with a simple s/room/dataroom:
> When enabled, folders uploaded to Rooms will be mirrored into 'All Documents' with the same structure. When disabled, all documents will be placed in a single folder named after the Room in 'All Documents'.
> This action cannot be undone.
> - All documents and folders will be permanently removed
> - All links and viewer access will be revoked
> - All analytics, audit logs, and Q&A data will be lost
> - Group permissions and branding will be deleted
Those are clear copyright violations.
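To make the "s/room/dataroom" point concrete, here is a rough sketch of how you could check whether two UI strings are the same once the product noun is normalized. It uses Python's standard `difflib`. The helper names (`normalize`, `similarity`) and the sample strings are my own illustrations and are not attributed to either product; this is not how anyone involved actually did the comparison.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, map 'dataroom(s)' to 'room(s)', and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"\bdataroom(s?)\b", r"room\1", text)
    return " ".join(text.split())


def similarity(a: str, b: str) -> float:
    """Word-level similarity ratio (0.0 to 1.0) after normalization."""
    return SequenceMatcher(None, normalize(a).split(), normalize(b).split()).ratio()


# Illustrative strings only (assumption): not taken from either codebase.
text_a = "All documents will be placed in a single folder named after the Room."
text_b = "All documents will be placed in a single folder named after the Dataroom."
text_c = "Files end up together in one folder that shares the workspace name."

print(similarity(text_a, text_b))  # identical once the noun is normalized
print(similarity(text_a, text_c))  # same idea, independently worded
```

Two teams describing the same feature independently would be expected to look like the second pair, not the first. A high score is only a signal, of course, not a legal test.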
Just ban users who comment without reading, I think that would go further to keep the quality of discussion high.
The number of bots/trolls responding to the title without reading the content and missing the point entirely is astounding, honestly, and I don't think any of those posts are contributing to high quality discussion. We could do without those users.
"but but but I can't/won't open twitter links" - then don't flap your yak-hole. Ignoring for a moment that the content has been reproduced in full in this thread, and another user has provided an alternative xcancel link.
An honest title would be “Corgi didn’t vibe code it, they stole Papermark’s AGPL code”.
Sure, people should read links, but when a writer posts ragebait for engagement, there’s plenty of blame to go around.
I was mostly fighting the title character limit
The paraphrase is doing a lot of heavy lifting to convert it to ragebait. Had the OP gone with something like "you didn't vibe code it, you plagiarized Papermark's open source project" (may need some editing to fit under the character limit) it would have at least been more true to the original tweet.
If you come to book club without reading the book, and you derail the conversation into something completely irrelevant, you're not getting invited back.
Naturally LLM technology has moved on since then. I don't remember any recent word for word reproductions of a copyright license.
There are a lot of people lauding the technology though because it occasionally one-shots a wildly impressive example of something which...already exists.
FOSS licenses were obviously written in the spirit of sharing with humans. Some later licenses made the license less amenable for sharing with corporations because some authors didn't feel like they were being treated fairly. Some authors today have similar feelings about their code being used by Gen AI. It is perfectly fine for authors to want to place restrictions on how they want others to use their work.
> Step out of the FOSS swamp, step in to human dignity.
What is that even supposed to mean?
In fact I seem to recall FOSS advocates denouncing licenses that put limits on who could use the software or for what purpose. This “it was always only for humans” take is new to me.
Surely it's always been obvious that the person doing the sharing is the one to decide on the terms of the sharing? Maybe I want to share my cake with you but not with someone I don't like? How is that not my decision to make?
I'm absolutely fine with people having different sharing philosophies. Different licenses with different nuances are a thing. But I don't like this take that everything that was shared is automatically retconned to be included in AI training data. That's not the spirit in which I shared my stuff. Maybe that's the spirit in which you shared yours, and I respect that.
That may be true, but I don't think it's obvious. What don't I know about the history of OSS?
Not humans who are using AI tools?
Software developers should charge a fair price for their products from their users. That's dignified and beneficial for everybody involved. And it doesn't invite "code stealers" or anybody who wants to reap what they didn't sow.
Just like any type of work. Fair compensation is the key. Not working for free for people who don't care about you and then complain that they didn't give you anything.
The 'spirit of free software' is bullshit. It's software authoritarianism disguised as a noble cause.
Or... Be nice and ask. People tell u what to do. Don't be rude here.
I remember a video editor that didn't comply properly with the OSS licence of FFmpeg(?). People told the author what to do. It's always cheap to be kind. Or win dumb prizes.
FOSS != public domain.
Yeah, that's nonsense - licenses exist precisely to solve this problem. Read up on it - do everyone a favor.
0. https://en.wikipedia.org/wiki/Drapetomania
Thing is, everything AI produces is derivative; it cannot make anything truly original. Therefore widespread AI adoption will inevitably lead to scientific and cultural stagnation.
So we'll have our magic box that can perform our every wish. And we'll all be worse off for it.
Then it shouldn't reference AI or Vibe coding.
https://www.gnu.org/software/bison/manual/html_node/Conditio...
The most widely used definitions of “open source” do not allow such a prohibition.
> 6. No Discrimination Against Fields of Endeavor
> The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
https://opensource.org/osd
I did choose the wrong word, though. Comply, not copy.
their comment still says "copy". the comment you are replying to clarifies that they meant to type "comply", not copy.
since the wrong word is still there, 'by definition' they have not edited it.
Though it looks like in this case they didn't do either.
A cursory look reveals they aren't complying. So, as you say, they are stealing. What's the point of this comment?
Competition would be if these people created their own software, possibly innovating and improving it in the process. That would encourage Papermark to improve their own offering, and would create an environment where these businesses are economically incentivized to improve the product or service.
Nobody is incentivized to improve the software in question here. If copyright law doesn't protect anything, then improving your product is helping the competition and potentially hurting your business. Same is true if you're the people who did the infringement.
What do you do for a living? For most of us in the tech industry, information being worth something (because it takes creative and intellectual labor to produce) puts food on our tables.
I have saved up a buffer in funds and bonds because it's going to be over at some point when the company moves from explore to exploit.
This would fall under patents (design patent at the very least), not copyright.
Furthermore, the English verbiage in the two is literally exactly the same. That's a clear copyright violation.
Both products are so incredibly derivative and boring that I find it very, very hard to care about this "case".
Clearly it should be an issue for the investors anyway, as it “looks” like a copy from the tweet alone. It might mean this code will eventually become available for download to comply with the AGPL, which in turn wipes out any moat.