Incubator Q&A


What's the deal with content farms?

+4
−0

In the last 5-10 years, I've noticed that many searches (especially on Google) for technical or semi-technical questions return results clogged with "content farm" pages.

It's hard to precisely define "content farm", but the Wikipedia article gives a good idea. The best I can do is say "if you've seen it, you know it".

I've noticed two types:

  1. Low-quality but more focused and obviously human-written ones like www.geeksforgeeks.org or www.maketecheasier.com. These are clearly a case of paying a large number of unqualified writers very small rates, in a bid for quantity over quality. At the rates the site must be paying, it won't attract any genuine experts, nor will the writers spend any time doing real research, so they simply do a half-hearted web search on each topic and carelessly copy and paste. The result is an article that is not comprehensive and is badly structured, but is at least clearly written by a human brain.
  2. Extremely low-quality, broad-scope and very formulaic sites. I can't think of an example right now, but usually these follow a pattern that seems computer generated. Say the article is titled "How to repair widget X?"; there will be a lot of nonsense filler sentences like:

It is very important to repair widget X. Users of widget X are often frustrated when widget X breaks. The solution to this is to repair widget X. This article will go over the steps of how to do repair widget X. Repairing widget X will allow widget X to return to normal function.

As you can see, this doesn't actually provide any information; it just restates the question over and over. These types of articles then have even more filler under sections like "Breaking down widget X". One example would be: https://exputer.com/guides/palword-fluffy-pal-bed/ (you can find many more by searching for palworld fuzzy pal bed on DuckDuckGo).

I'm not asking about 1) but about 2). Type 2) seems far too mind-numbing and repetitive for any human to write. Humans would write something less formulaic out of sheer boredom. However, I was seeing such sites before the advent of LLMs, so that can't be what they're using.

So what's the deal here? Are these really fully automated articles that rely on some earlier text generation tech, like seq2seq? Or is it still humans doing most of the work, but there's some kind of interface they're made to use which results in such repetitive prose?

