mouthporn.net
#openai – @zenosanalytic on Tumblr

Racing Turtles

@zenosanalytic / zenosanalytic.tumblr.com

"Why run, my little Phoenician?"
ralfmaximus

To understand what's going on here, know these things:

  1. OpenAI is the company that makes ChatGPT
  2. A spider is a kind of bot that autonomously crawls the web and sucks up web pages
  3. robots.txt is a standard text file that most web sites use to inform spiders whether or not they have permission to crawl the site; basically a No Trespassing sign for robots
  4. OpenAI's spider is ignoring robots.txt (very rude!)
  5. the web.sp.am site is a research honeypot created to trap ill-behaved spiders, consisting of billions of nonsense garbage pages that look like real content to a dumb robot
  6. OpenAI is training its newest ChatGPT model using this incredibly lame content, having consumed over 3 million pages and counting...
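For anyone curious what "honoring robots.txt" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents and the `example.com` URLs are made up for illustration, and `may_crawl` is a hypothetical helper name; `GPTBot` is the user-agent token OpenAI publishes for its crawler. This shows what a well-behaved spider is expected to check before fetching a page, not how any particular crawler actually works.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt (an assumption, not copied from any real site):
# block OpenAI's crawler everywhere, block everyone else only from /private/.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; a real crawler would download
    # https://<site>/robots.txt first (e.g. via set_url() and read()).
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("GPTBot", "https://example.com/page"),
        ("SomeOtherBot", "https://example.com/page"),
        ("SomeOtherBot", "https://example.com/private/notes"),
    ]:
        print(agent, url, may_crawl(agent, url))
```

The file is purely advisory: nothing stops a spider from skipping this check, which is exactly the rude behavior described in item 4.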

It's absurd and horrifying at the same time.

Avatar

ok I have a bit of a problem with one sentence in this article:

Traditionally in the US, Section 230 shields internet firms from legal liability for information produced by a third party and hosted on their platforms.

I am not an expert on this subject, obviously, but ChatGPT is NOT a "third party" using OpenAI's "platform"; ChatGPT is their PRODUCT which THEY ARE PROVIDING and which is producing misinformation when users request information from it on a subject. This is one of the MANY reasons why presenting these predictive language model chatbots as "Search Engines" was a really fucking stupid idea, and a website presenting itself as providing reliable tech news really ought to be able to get this right. (Also, just for clarity, this article is from June 9, 2023. And also also, as the article explains, this case PROBABLY won't go anywhere, but it's for reasons other than this.)

Avatar

I love it when anons/guests find my works and kudo/leave reviews, but given the new revelation that Elon Musk is using bots to mine AO3 fanfiction for a writing AI without writer's permission, my works are now archive-locked and only available for people with an AO3 account.

what the fuck.

kelssiel

to archive lock multiple works at once:

go to your dashboard then “works”

click the “edit works” button

select all the works you want to lock

click on the “edit” button

scroll down to “visibility” and select “only show to registered users”

then “update all works”

this will hide your works from anyone without an archive of our own account

landwriter

Looking at the notes and OP and there's a lot of telephone game goin' on. Some information missing from above that I want to put all in one place:

  • data scraping to train AI is really common and if you post things in public places you can expect machines to be fed it these days. nobody asks permission of you. even if they should. if it's on the internet it can be downloaded and will be downloaded and often repurposed/redistributed/repackaged/resold. even if it's behind log-in walls and paywalls.
  • this is not the first time for-profit AI has been trained on copyrighted works (x)
  • if it's true that Sudowrite has scraped AO3, archive-locking all your existing works will not remove them from the dataset. archive-locking any subsequent fics may exclude those from future datasets.
  • however it is easy to simulate being logged on using a script (x)
  • this information is only a day old and yet to be confirmed beyond "seems fucking likely"
  • ao3 has been made aware of it and has requested people stop sending in support tickets about it
  • elon musk sure is a guy but he is not personally involved in this. he left the board of openai, the research group that produced the tech used by sudowrite, in 2018 (x)
  • this is a very new field with developing legal precedent and the social and ethical concerns of AI trained on human labours to reproduce similar things for cheaper are not going to go away anytime soon. i hope more people get involved in the discussion because of this!
  • i understand otw has a legal fund so maybe fanfic will become a part of that case law. that would be neato

on a user by user basis, personally, i am sitting tight. i am unsurprised. the notion of stories made for free and out of passion being fed into a sausage maker to train software that can sell stories for cheaper than the rates of the humans who do get paid for it is distressing. nonetheless i am not going to archive-lock my fics at this time because probably 98% of my reading consumption was as a guest and i think if algorithms get our stories, humans without accounts should too. i am also not confident it would change anything at this point. i respect anyone's decision to archive-lock their stories and understand the desire to do so. i am privately extremely relieved to be on the right side of the wall bc two months ago i'd be wondering where all my half-read fics had gone hahaha
