Basically a deer with a human face. Despite probably being some sort of magical nature spirit, his interests are primarily in technology, politics, and science fiction.

Spent many years on Reddit before joining the Threadiverse as well.

  • 0 Posts
  • 228 Comments
Joined 1 year ago
Cake day: March 3rd, 2024




  • So they’re still feeding LLMs their own slop, got it.

    No, you don’t “got it.” You’re clinging hard to an inaccurate understanding of how LLM training works because you really want it to work that way, because you think it means that LLMs are “doomed” somehow.

    It’s not the case. The curation and synthetic data generation steps don’t work the way you appear to think they work. Curation of training data has nothing to do with Yahoo’s directories, and I have no idea why you would think that would be a bad thing even if it were like that, aside from the notion that “Yahoo failed, therefore if LLM trainers are doing something similar to Yahoo then they will also fail.”

    I mean that they’re discontinuing search engines in favour of LLM generated slop.

    No they’re not. Bing is discontinuing an API for its search engine, but Copilot still uses it under the hood. Go ahead and ask Copilot to tell you about something; it’ll have footnotes linking to the websites whose search results it’s summarizing. Similarly with Google: you say it yourself right here that their search results have AI summaries in them.

    No there’s not, that’s not how LLMs work, you have to retrain the whole model to get any new patterns into it.

    The problem with your understanding of this situation is that Google’s search summary doesn’t come solely from the LLM’s trained knowledge. What happens is that Google does the search, finds the relevant pages, then puts the content of those pages into the LLM’s context and asks the LLM to create a summary of that information relevant to the search that was used to find it. So the LLM doesn’t actually need to have that information trained into it; it’s provided as part of the context of the prompt.

    You can experiment a bit with this yourself if you want. Google has a service called NotebookLM, https://notebooklm.google.com/, where you can upload documents and then ask an LLM questions about their contents. Go ahead and upload something that hasn’t been in any LLM training set and ask it some questions. Not only will it give you answers, it’ll include links that point to the sections of the source documents where it got those answers from.
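    To make the retrieve-then-summarize flow concrete, here’s a minimal sketch. The keyword-overlap retriever and the page texts are toy stand-ins for a real search engine, and no actual model is called; the point is just that the facts reach the LLM through the assembled prompt, not through its weights.

```python
# Toy sketch of retrieval-augmented summarization: search first, then feed
# the retrieved page content into the model's context. All URLs and page
# texts here are made-up placeholders.

def retrieve(query: str, pages: dict[str, str], top_k: int = 2) -> list[tuple[str, str]]:
    """Rank pages by naive keyword overlap with the query (stand-in for a search engine)."""
    terms = set(query.lower().split())
    scored = sorted(
        pages.items(),
        key=lambda kv: -len(terms & set(kv[1].lower().split())),
    )
    return scored[:top_k]

def build_prompt(query: str, results: list[tuple[str, str]]) -> str:
    """Put retrieved page content into the context, tagged with sources for footnotes."""
    context = "\n".join(f"[{url}] {text}" for url, text in results)
    return (
        "Summarize the following search results, citing sources:\n"
        f"{context}\n"
        f"Question: {query}"
    )

pages = {
    "https://example.com/a": "nitrogen is an inert gas making up most of the atmosphere",
    "https://example.com/b": "llm training uses curated and synthetic data",
}
prompt = build_prompt("what is nitrogen", retrieve("what is nitrogen", pages))
# The model never needed these facts in its training set; they arrive via the prompt.
```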





  • Betteridge’s law of headlines.

    Modern LLMs are trained using synthetic data, which is explicitly AI-generated. It’s done so that the data’s format and content can be tailored to optimize its value in the training process. Over the past few years it’s become clear that simply dumping raw data from the Internet into LLM training isn’t a very good approach. It sufficed to bootstrap AI development, but we’re kind of past that point now.

    Even if there were a problem with training new AIs, that would just mean they won’t get better until the problem is overcome. It doesn’t mean they’ll perform “increasingly poorly,” because the old models still exist; you can just keep using those.

    But lots of people really don’t like AI and want to hear headlines saying it’s going to get worse or even go away, so this bait will get plenty of clicks and upvotes. Though I’ll give the body of the article credit: if you read more than halfway down, you’ll see it raises these sorts of issues itself.
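    As a toy illustration of the synthetic-data point above: instead of scraping raw text, a training pipeline can emit examples whose format and correctness are controlled. Real pipelines use stronger models to write these; this sketch uses arithmetic templates purely so the example stays self-contained, and every name in it is made up.

```python
# Toy sketch of synthetic training-data generation: each example's format
# and content are tailored by a template rather than scraped from the web.
import random

def make_example(rng: random.Random) -> dict[str, str]:
    """Produce one chat-style training pair with a guaranteed-correct answer."""
    a, b = rng.randint(1, 99), rng.randint(1, 99)
    return {
        "prompt": f"What is {a} + {b}?",
        "response": f"{a} + {b} = {a + b}",
    }

rng = random.Random(0)  # seeded so the generated dataset is reproducible
dataset = [make_example(rng) for _ in range(3)]
```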


  • “Never intended” doesn’t mean it doesn’t work as one.

    The point I’m making here is that if we already have a chunk of plastic, why not bury it? Your own comment that I originally responded to was about how the composting process for these bioplastics is difficult to do, so people rarely do it. Landfills are comparatively quite easy and common; we already have that process well established. So if you’ve got a chunk of carbon-rich plastic right there in your hand and you’re trying to decide what to do with it, which makes more sense: turning it into CO2 to vent into the atmosphere, or sequestering it effectively forever? There are carbon sequestration projects that go to much greater lengths to bury carbon underground than this.



  • It absolutely baffles me how states are able to botch executions like they’re doing. I’ve had many dogs over my lifetime and sadly that means I’ve seen many of them off to the rainbow bridge at the ends of theirs, and there’s never been a botched euthanasia. I guess vets are just more professional and compassionate than these executioners.

    I oppose the death penalty universally. But I’ve long argued that if you absolutely must execute someone, and must avoid the messy (but instant and reliable) option of exploding their brain, then nitrogen gas asphyxiation is probably the best way to go - completely painless and incredibly hard to botch. Just flood the room with nitrogen gas, how hard is that? It’s a common industrial accident. And yet there was a case recently where a state tried nitrogen gas asphyxiation and the monsters somehow managed to botch even that.