
I am always wondering: if I take the ee directory and garble it through an LLM, is the result still protected by copyright?

Does GitHub Copilot take the license change into account for the ee directory, or does it take the tag from the main directory?



I mean, if you--as a human--read that code yourself, learned how it worked, and then later tried to write something similar--even in a different programming language!--the result is highly likely to be contaminated and considered a derivative work. This is why clean-room design is a thing people have to do: one team reads the original and writes documentation, a lawyer clears that documentation of any expressive intent, and only then is it handed to a second, disjoint team to implement. That GitHub Copilot is somehow afforded some kind of magic assumption that it perfectly removes expressive intent is kind of ridiculous and highly unfortunate.


I find it terrifying that this type of reasoning is actually legitimate law.

Think about all the drive-by licensing felonies everyone has committed by accidentally reading open-core code and then, at a later point, implementing something vaguely related....

EDIT: Typos & punctuation


IANAL, but that's why law has to be interpreted by judges rather than being a book of absolute rules.

Lots of codified laws are bent or modified by prevailing social attitudes, customs or precedent, so you could argue "I looked at it a couple of years ago and then independently decided to implement something similar" is a very different situation to "we read through the code and a week later started on a competitor".

Whether you'd get away with it is another matter, depending upon interpretation and perspective.


The clean room described above is a fairly bulletproof technique, but it has only been employed a handful of times and is not necessary to create an original work. Nearly every work was produced by people who were exposed to other copyrighted works. Unless something is a straightforward copy, it's a high burden to show that it is effectively a copy or translation. A lot of these situations are untested in court, probably because it's so difficult to make such an abstract case.


Good question.

Does running code through an LLM prompt erase its license? Probably not.

Does LLM training erase the license? ClosedAI argues yes.

Legislation around this is missing. There are many sliding scales that blur the line between yes and no: at one extreme it's a definite yes, at the other a definite no.


Hmm, to be honest, I don't know. But I just found an article from Codeium in which they describe some licensing problems with GitHub Copilot: https://codeium.com/blog/copilot-trains-on-gpl-codeium-does-....


Bonus question: suppose I am a malicious actor who wants to obfuscate the fact that I blatantly stole the code. Could I just prompt "Refactor this code"? Would somebody notice, and would it hold up in court?


This would, of course, depend on the code. But you could ask it to rewrite the code in another language and submit that, claiming it was your own work (which it is, in a sense, just produced with an AI tool). This again opens up the debate of AI-generated vs. 'hand-written' code...



