• 4 Posts
  • 7 Comments
Joined 10 months ago
Cake day: July 26th, 2024

  • But every time someone gets on their soapbox in the comments it’s like they don’t even know the first thing about the math behind it. Like just figure out what you’re mad about before you start an argument.

    The math around it is unimportant, frankly. The issue with AI isn’t about GANNs alone; it’s about the licensing of the materials used to train a GANN and whether the companies that used those materials had proper ownership rights. As in the post I made, there’s an easy argument that OpenAI and others never licensed the material they used to train the AI, making the whole model poisoned by copyright theft.

    There are plenty of uses of GANNs that are not problematic: bespoke solutions for predicting the outcomes of certain equations, or data science uses that involve rough predictions on publicly sourced (or privately owned) statistics. The problem is that these are not the same uses that we call “AI” today. We’re actually sleeping on much better uses of neural networks by focusing on the pie-in-the-sky AGI nonsense being pushed by companies that are simply pushing highly malicious, copyright-infringing products to make a quick buck on the stock market.


  • See, I’m troubled by that one because it sounds good on paper, but in practice that means that Google and Meta, who can certainly build licenses into their EULAs trivially, would become the only government-sanctioned entities who can train AI. Established corpos were actively lobbying for similar measures early on.

    As long as people are paying other people, these things will equalize eventually. Ultimately, it is much more likely that the cost of AI production would become so severe that it would no longer be viable as a business, which, frankly, is fine. There will eventually be enough public domain content that AI can reach the quality it has today with public materials alone.


  • What I want from AI companies is really simple.

    We have a thing called intellectual property in the United States of America. If I decided to make a Jellyfin instance that I charged access to, containing material I didn’t own, and somehow advertised this service on the stock market as a publicly traded company, you can bet your ass that I’d have a one-way ticket to the defendant’s seat in court.

    AI companies, on the other hand, operate entirely on data they don’t own and don’t pay licensing for ANY of the materials that are used to train their neural networks. In their eyes, any image, video (TV show/movie) or book that happens to be posted on the Internet is fair game. This isn’t how intellectual property works for individuals, so why exactly would a publicly traded company get an exception to this rule?

    I work a lot in the world of FOSS and have a firm understanding that just because code is there doesn’t make it yours. This is why we have the GPL for licensing. In fact, I’ll take it a step further and say that the entirety of AI is one giant licensing nightmare, especially coding AI that doesn’t attribute license details to the code it’s sampling from. (Sampling code is notably different from, say, learning from it. Learning implies self-agency, not corporate ownership.)

    It feels to me that the AI bubble has largely been about pushing AI so hard and fast that people ended up investing in something with a dubious legal status in the US. Nobody stopped to ask whether the data that Facebook had on its website (for example; they aren’t alone in this) was actually theirs to own, or what the repercussions of these kinds of decisions would be.

    You’ll also note that tech and social media companies are quick to take ownership of data when it benefits them (artists’ works, intellectual property that isn’t theirs, random user posts about topics) and quick to deny ownership when it becomes legally burdensome (CSAM, illicit drug deals, etc.), to a degree that no individual would be granted. Hell, I’m not even sure a “small” tech startup would be granted this level of double-speak and hypocrisy.

    With this in mind, I am simply asking that AI companies pay for the data that they’re using to train AI. Additionally, laws must be in place that allow for the auditing of all materials used to train an AI, with the legal intent of verifying that all parties are paid accordingly. This is how every other business works. If AI were somehow granted an exception, wouldn’t it be braindead easy to run every “service” through an AI layer in order to bypass any and all copyright laws?

    Otherwise, if Facebook and others want to claim that data hosted on their website is theirs to own and train on, well, great, but there should be no exceptions to this, and they should not be allowed to host materials they then claim no ownership over. So pictures of IP they don’t own, or materials they want to claim they have no ownership over, must be removed from the platform. I would much prefer the first of these two options, however.

    edit: I should note that AI for educational purposes (for universities) could be granted an exception under fair use, but it would still be required to cite all sources used to produce the works in question (which is normal for academics in the first place) and would also come with some strict stipulations on using this AI as a “product” (it would basically be moot, much like some research papers). This is basically the furthest I’m willing to go for these companies.