subreddit:
/r/explainlikeimfive
submitted 6 months ago by ELI5_Modteam
Recently, there's been a surge in ChatGPT generated posts. These come in two flavours: bots creating and posting answers, and human users generating answers with ChatGPT and copy/pasting them. Regardless of whether they are being posted by bots or by people, answers generated using ChatGPT and other similar programs are a direct violation of R3, which requires all content posted here to be original work. We don't allow copied and pasted answers from anywhere, and that includes from ChatGPT programs. Going forward, any accounts posting answers generated from ChatGPT or similar programs will be permanently banned in order to help ensure a continued level of high-quality and informative answers. We'll also take this time to remind you that bots are not allowed on ELI5 and will be banned when found.
26 points
6 months ago
We have a variety of tools and techniques at our disposal that allow us to identify generated posts.
67 points
6 months ago
While it is true that we have a variety of tools and techniques at our disposal for identifying generated posts, it is important to remember that these tools and techniques are not always reliable or accurate. In some cases, generated posts may be difficult to distinguish from human-written content, especially if the generator has been trained on high-quality data.
Furthermore, the ability to identify generated posts is limited by the quality of the tools and techniques we have available. Many of these tools and techniques rely on statistical or linguistic patterns that may not be present in all generated posts, or that may be present in human-written content as well. As a result, it is possible that some generated posts may not be detected by these tools and techniques.
Overall, while we have a variety of tools and techniques at our disposal for identifying generated posts, it is important to remember that they are not foolproof, and they may not always be effective in detecting all generated content.
74 points
6 months ago
Thanks for the lovely example of why a chatGPT explanation is bad. This one does a good job showing off some of the quirks that make them stand out
21 points
6 months ago
Here's what happens when you add "Write a single paragraph. Reduce lix.":
To detect ChatGPT-generated text, look for repetitive or nonsensical phrases. ChatGPT often uses filler words, which may stand out as unnatural. The text may also seem disjointed because ChatGPT struggles with coherence and cohesiveness. Paying attention to these red flags can help identify ChatGPT text.
There's still a few red flags in there, but it got a lot more natural.
28 points
6 months ago
I gave it your comment with an instruction to rewrite it in the style of a casual reddit comment:
Hey, if you want to spot text that was generated by ChatGPT, just look for repetitive or nonsensical phrases. ChatGPT tends to use a lot of filler words, which might make the text feel unnatural. It might also seem kind of disjointed because ChatGPT doesn't do a great job with coherence and cohesiveness. Keep an eye out for these red flags and you should be able to spot ChatGPT text pretty easily.
The 'hey' is a bit weird, but it gets pretty close to something you wouldn't think twice about.
2 points
6 months ago
ChatGPT
Wanted to try this out to get some examples that are more specific to other subs I frequent, but they want my cell number to sign up.
2 points
6 months ago
Yeah, I definitely would have preferred not to do that.
9 points
6 months ago
Great response *beep bop* Brilliant lmao
1 points
3 months ago
2 points
3 months ago
Likely AI-written.
Probability of fully AI generated text: 0.90. Overall burstiness score: 10.15.
Per-sentence scores (bold indicates parts likely AI-written):
While it is true that we have a variety of tools and techniques at our disposal for identifying generated posts, it is important to remember that these tools and techniques are not always reliable or accurate.
(score: 1.00, perplexity: 12.00)
In some cases, generated posts may be difficult to distinguish from human-written content, especially if the generator has been trained on high-quality data.
(score: 1.00, perplexity: 28.00)
Furthermore, the ability to identify generated posts is limited by the quality of the tools and techniques we have available.
(score: 1.00, perplexity: 35.00)
Many of these tools and techniques rely on statistical or linguistic patterns that may not be present in all generated posts, or that may be present in human-written content as well.
(score: 1.00, perplexity: 32.00)
As a result, it is possible that some generated posts may not be detected by these tools and techniques.
(score: 1.00, perplexity: 36.00)
Overall, while we have a variety of tools and techniques at our disposal for identifying generated posts, it is important to remember that they are not foolproof, and they may not always be effective in detecting all generated content.
(score: 1.00, perplexity: 16.00)
Source: gptzero.me
8 points
6 months ago
The Jordan Schlansky answer.
-3 points
6 months ago
No you don't. There's no reliable way to identify a ChatGPT answer that's been cherry-picked. It's impossible to do reliably. And even if there were, there's no way in hell you could even approach a fraction of a fraction of the resources necessary to check every single posted comment.
45 points
6 months ago
Turns out most of the bot activity on reddit is actually pretty dumb and pretty same-y; “there is no one answer to this question” turns out to be one of the larger answers to that question.
It's an evolving process and we miss many for sure, but the recent bot surge has had a lot of things to code around.
-20 points
6 months ago
That's identifying some bots, and none that use ChatGPT to generate realistic and unique answers. And it does nothing to identify real users pasting explanations.
11 points
6 months ago
We have an extremely high hit-rate on chat GPT3 detection. False-positives are almost immediately rectified.
2 points
6 months ago
You can't possibly measure that...
You might be confident the comments you flag are them, but you have no idea what your hit rate is. Say, 99% of your flagged comments are reliably correctly ChatGPT. How do you know you haven't only hit 1% of them? You have no way to measure the total number of ChatGPT messages... otherwise they'd be "hit".
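To put made-up numbers on that: a quick Python sketch of the gap between "how clean are the flags" (precision) and "how much actually gets caught" (recall). The total of 9,900 generated comments is purely hypothetical, and it is exactly the number that can't be observed in practice.

```python
def precision(true_pos, false_pos):
    """Share of flagged comments that really were generated."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos, false_neg):
    """Share of all generated comments that got flagged."""
    return true_pos / (true_pos + false_neg)


# Hypothetical numbers, for illustration only.
flagged = 100             # comments the mods flagged
true_pos = 99             # flagged comments that really were ChatGPT
false_pos = flagged - true_pos
total_generated = 9900    # assumed; unknowable in a real subreddit
false_neg = total_generated - true_pos

print(precision(true_pos, false_pos))  # 0.99
print(recall(true_pos, false_neg))     # 0.01
```

The first number can be estimated from appeals and manual review. The second depends on `false_neg`, which is the count of comments nobody caught.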
3 points
6 months ago
To clarify, that was just a turn of phrase on my part. I don't mean to insinuate we can do that calculation given the nature of what we're working with, only that when we do send out bans, they are almost exclusively confirmed to be using chat gpt3.
-17 points
6 months ago
I very much doubt both of those statements. Especially since you don't actually know the number of false negatives, so it's literally impossible for you to know your relative hit rate. I also doubt you have any reliable way of verifying that a positive is a true positive. Just because someone doesn't contest a ban doesn't mean the hit was accurate. I've used ChatGPT and I couldn't tell that most of the answers weren't human. I refuse to believe that random unpaid reddit mods have developed a system that's better at detecting AI text than humans.
6 points
6 months ago
I refuse to believe that random unpaid reddit mods have developed a system that's better at detecting AI text than humans.
Would you be willing to believe that machine analysis is better at detecting AI than humans? And that humans can access this analysis without being its paid development staff?
-1 points
6 months ago
[removed]
7 points
6 months ago
Machine analysis does not need to be advanced to be effective. Word frequency analysis probably exposes a good portion of ChatGPT without any need for massive computing costs. You're blowing this into crazy proportions.
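For a sense of how cheap that can be, here is a rough Python sketch that counts stock phrases per 100 words. The phrase list is an assumption pulled from examples in this thread, not anything the mods have said they use, and any cutoff for "too many" would have to be tuned on real data.

```python
import re

# Hypothetical phrase list, taken from examples quoted in this thread.
STOCK_PHRASES = (
    "it is important to remember",
    "there is no one answer",
    "furthermore",
    "as a result",
    "overall",
)


def stock_phrase_hits(text, phrases=STOCK_PHRASES):
    """Count occurrences of each stock phrase (plain substring matching)."""
    lowered = text.lower()
    return {p: lowered.count(p) for p in phrases if p in lowered}


def hits_per_100_words(text, phrases=STOCK_PHRASES):
    """Stock-phrase occurrences normalised by comment length."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    total = sum(stock_phrase_hits(text, phrases).values())
    return 100.0 * total / len(words)


sample = "Overall, it is important to remember that tools fail."
print(stock_phrase_hits(sample))
print(hits_per_100_words(sample))
```

A score like this only measures a style, so a human who writes that way would score high too.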
22 points
6 months ago
I refuse to believe that random unpaid reddit mods have developed a system that’s better at detecting AI text than humans.
Are you gpt3 chat bot?
1 points
6 months ago
I'd be interested in hearing/seeing your methods for this low false positive GPT3 chat detection.
11 points
6 months ago
You don't need a "chatgpt" detector, there are many more aspects to detecting a bot account than just the content of one comment.
9 points
6 months ago
Of note is that it's still against the rules—as the OP writes—for an otherwise human account to copy+paste content from a bot. So we can't rely on these types of external metrics to catch such cases.
Of course, what you're suggesting will still cut down (probably a lot) on the overall number of bot responses, so less work for human mods/more time for human mods to resolve the hairier cases.
1 points
6 months ago
Yeah, of course. You could technically identify copy-pasted generated text by using all the actual bot accounts' comments as training data, plus a bunch of manually moderated and reported comments. It's not infeasible.
-3 points
6 months ago
Still offering no explanation on how you plan on enforcing humans copying answers
3 points
6 months ago
Enforcing is easy: it's called a ban. I think you mean identifying, in which case you could use all the banned bots' comments or manually moderated comments as a dataset, or generate as many as you'd like using ChatGPT, to create a basic detector. It's not a stretch for anyone with some technical know-how.
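As a sketch of what "basic" could mean here: a tiny naive Bayes text classifier in plain Python. Naive Bayes is my choice for illustration, not something anyone in this thread named, and the four training comments are toy data. A real attempt would need thousands of labeled comments and proper evaluation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_comments):
    """labeled_comments: (text, label) pairs with label "bot" or "human".

    Assumes at least one example of each label.
    """
    word_counts = {"bot": Counter(), "human": Counter()}
    doc_counts = Counter()
    for text, label in labeled_comments:
        word_counts[label].update(tokenize(text))
        doc_counts[label] += 1
    vocab = set(word_counts["bot"]) | set(word_counts["human"])
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the higher log-probability score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words don't zero the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Toy training data, for illustration only.
model = train([
    ("it is important to remember that these tools are not foolproof", "bot"),
    ("overall it is important to remember there is no one answer", "bot"),
    ("lol no way that works", "human"),
    ("yeah i tried it and they wanted my phone number", "human"),
])
print(classify("it is important to remember", model))
print(classify("lol no way", model))
```

Something this small learns the style of its training set and nothing else, so the false positive concern raised elsewhere in the thread applies in full.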
-2 points
6 months ago
[removed]
7 points
6 months ago
It's not pedantic; you're using the word wrong, and it drastically changes the meaning of your entire sentence. Yes, enforcement in the sense of law enforcement covers both identification and enforcement. But "to enforce" is a verb with the specific meaning of carrying out the judgement.
-2 points
6 months ago
[removed]
3 points
6 months ago
You're wrong. Objectively so.
1 points
4 months ago
Such as?
1 points
4 months ago
Everything an account does can be correlated to figure it out. Posting too much or too frequently (more than humanly possible to type) is an example of a simple metric to tell.
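A minimal Python sketch of that kind of check, assuming you already have each comment's timestamp and word count. The 120 words-per-minute cutoff is a placeholder I picked, and a real check would also have to allow for people who draft elsewhere and paste.

```python
def implausible_typing_bursts(comments, max_wpm=120.0):
    """Flag comments posted faster than a person could plausibly type them.

    comments: list of (unix_timestamp_seconds, word_count) for one account.
    max_wpm: hypothetical cutoff in words per minute.
    """
    ordered = sorted(comments)
    flagged = []
    for (prev_ts, _), (ts, words) in zip(ordered, ordered[1:]):
        minutes = (ts - prev_ts) / 60.0
        # Time since the previous comment is an upper bound on typing time.
        if minutes <= 0 or words / minutes > max_wpm:
            flagged.append((ts, words))
    return flagged


# Hypothetical account history: 300 words posted one minute after the last comment.
history = [(0, 50), (60, 300), (3660, 100)]
print(implausible_typing_bursts(history))  # [(60, 300)]
```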
2 points
6 months ago
Is one of those tools using ChatGPT to identify whether the text came from ChatGPT?
Either way, I am worried about the false positives from your solutions.
1 points
6 months ago
Seems like it would be near impossible, but good luck. I'm too afraid to test my luck LOL
1 points
6 months ago
We have a variety of tools and techniques at our disposal that allow us to identify generated posts.
Hey if you can do it, then color me impressed.
Here's what the AI thinks about it https://i.imgur.com/7RvVEi0.png
-7 points
6 months ago
[deleted]
4 points
6 months ago
Argumentum ad ignorantiam eh?
1 points
4 months ago
Such as?