Why Are Search Engine Optimizers Interested In Content Mills?
What is a content mill?
Content mills are sites that publish cheap filler content. The content is either user-contributed, paid, or a mix of the two. The term content mill is obviously my choice; the implication is that the content is published only to pump pages into search engines, and is typically of low value in terms of benefit or quality. More and more, we have auto-generated, software-driven sites, especially blogs, filled with unintelligible ca-ca.
The obvious problem is that some sites that publish cheap content may well provide value; it depends on who is reading it. For example, a forum could be considered a content mill, as it contains cheap, user-generated content of little value to an uninterested visitor, or a forum might be a valuable, regularly updated resource provided by a community of enthusiasts!
Depends who you ask.
Are Content Mills the Future of Online Publishing? What Comes Next?
Content mills are all the rage in 2010. Let’s take a closer look.
This idea is nothing new. It’s actually a white-hat SEO strategy, and has been used for years. Think “Article Marketing”.
Research keywords, preferably long tail
Write content about those keywords
Publish content and attempt to rank that content in search engine results
Repeat
If you can publish a page at a lower cost than your advertising return, then you simply repeat the process over and over, and you’re golden. Think AdSense, affiliate programs, and similar means of monetizing sites and pages. Take a look at Demand Media.
The problem with content mills is that there is already enough auto-generated crap floating around online, clogging up the filing cabinets, i.e. Google, MSN, Bing, Yahoo, etc.
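As a rough illustration of that cost-versus-return arithmetic, here is a minimal Python sketch. The function names and all the figures (page cost, monthly visits, revenue per thousand visits) are hypothetical, chosen only for the example; they are not data from Demand Media or any real site.

```python
def monthly_revenue(monthly_visits, revenue_per_1000_visits):
    """Estimated ad revenue one page earns per month (hypothetical model)."""
    return monthly_visits * revenue_per_1000_visits / 1000


def months_to_break_even(cost_per_page, monthly_visits, revenue_per_1000_visits):
    """Months until revenue covers the production cost; None if the page earns nothing."""
    revenue = monthly_revenue(monthly_visits, revenue_per_1000_visits)
    if revenue <= 0:
        return None
    return cost_per_page / revenue


def is_profitable(cost_per_page, monthly_visits, revenue_per_1000_visits, horizon_months):
    """True if the page earns more than it cost within the given number of months."""
    earned = monthly_revenue(monthly_visits, revenue_per_1000_visits) * horizon_months
    return earned > cost_per_page


if __name__ == "__main__":
    # Assumed figures: a $15 page, 300 visits a month, $5 per 1,000 visits.
    print(months_to_break_even(15.0, 300, 5.0))
    print(is_profitable(15.0, 300, 5.0, horizon_months=12))
```

With these assumed numbers the page pays for itself in ten months. Halve the production cost and it pays for itself in five, which is exactly the pressure that pushes site owners toward ever cheaper content.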
The first one to invent a site shredder, like a paper shredder, will retire happy.
One of the problems with content mills is that in an attempt to drive the production cost of content below the predicted return, some site owners are producing garbage content, usually by facilitating free contributions from users.
At the low end, Q&A sites proliferate, wherein people ask questions and a community of people with opinions, informed or (more usually) otherwise, provide their two cents’ worth. Yahoo Answers? Na!
Sadly, most of the answers are worth somewhat less than two cents, resulting in pages of little or no value to an end reader. I’m sure you’ve seen such pages, as they often rank well in search engines if they are published on a domain with sufficient authority.
Some sites, like Mahalo, not only automate their page creation but also use those automated pages to automatically generate related question pages. The rabbit hole has no bottom! Whoa Alice! Blue pill-Red pill?
At the other end of the spectrum, we have sites that publish higher-cost, well-researched content sourced from paid, reputable writers. A traditional publishing model, in other words. Generally speaking, such pages are of higher value to end users, yet the problem is that the search engines can’t seem to tell the difference between these pages and the junk opinion pages. If the content mill has sufficient authority, then the junk gets promoted.
The problem here is that not every provider of freelance content is providing junk, though most are. As far as I know, there is no current semantic processing that can tell the two apart. It’s tough to see how this could be quickly and effectively reined in, at least by software algorithms. I assume that this kind of empty filler content is not very useful for visitors; it certainly isn’t for me.
The Future of Content Mills
Such sites will surely appear on Google’s radar, because junk, low-value content doesn’t help their end users. And as soon as they have a way to delete the useless content, oops, there go most of the current sites, right down the old …….
It must be a difficult problem to solve, or else Google would have done so by now, yet I think it’s reasonable to expect Google will try to relegate the lowest of the low-value content sites at some point. If you are following a content mill strategy, or considering starting one, it’s reasonable to prepare for such an eventuality. So when your business wants a reliable source of one-off, quality items written about your company, you know who to contact.
http://Online-Publishing-Group.com/
The future, I know, is not to be a content mill, in the real sense of the word. Aim for quality only.
Arbitrary definitions of quality are difficult enough, as we’ve discussed above. Objective measurement is impossible, because what is relevant to one person may be irrelevant to the next. An objective rather than subjective view of quality therefore requires another’s viewpoint. The field of IQ (information quality) may provide us some clues regarding Google’s approach. IQ is a form of research in systems information management that deals specifically with information quality.
Here are some of the metrics they use:
Authority – Authority refers to the expertise or recognized official status of a source, whether real, imagined, or contrived, i.e. artificially manufactured.
Consider the reputation of the author and publisher. When working with legal or government information, consider whether the source is the official provider of the information. Always check the online reputation management sites for sources first.
Scope of coverage – Scope of coverage refers to the extent to which a source explores a topic. Consider time periods, depth, weight, geography or jurisdiction and coverage of related or narrower topics.
Composition and Organization – Composition and organization have to do with the ability of the information source (the writer) to present its particular message in a coherent, logically sequential manner.
Objectivity – Objectivity concerns the bias or opinion expressed when a writer interprets or analyzes facts. Consider the use of persuasive language, the source’s presentation of other viewpoints, and its reason for providing the information, including advertising, i.e. its agenda.
Validity – Validity of information has to do with the degree of obvious truthfulness that the information carries, or doesn’t.
Uniqueness – As intuitive as the ‘uniqueness’ of a given piece of information is, it concerns not only where the information originated but also the manner in which it is presented, and thus the perception it creates. The essence of any piece of information we process consists to a large extent of these two elements.
Timeliness – Timeliness refers to information that is current at the time of publication.
Consider publication, creation and revision dates.
Reproducibility
Any of this sound familiar? It should, as the search landscape is riddled with this terminology. This is not to say Google looks at all these aspects, yet they have used similar concepts, starting with PageRank.
As conventional “Search Engine Opinion” wisdom goes, Google may have tried to solve the relevancy problem partly by focusing on authority, on the assumption that a trusted authority must publish trusted content, so the pages of a domain with a high degree of authority receive a boost over those with lower authority levels. However, this situation may not last, as some trusted sources, in terms of having authority, do at times publish auto-generated garbage content. Google may well start looking at composition metrics, if they aren’t doing so already.
This is all pure speculation, of course, because even Google often doesn’t know what Google is doing.
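To make the authority argument concrete, here is a toy Python sketch. It is emphatically not Google’s algorithm: the scoring formula, the `authority_weight` parameter, and every number in it are assumptions invented for illustration. It only shows how a ranking that leans heavily on domain authority can put junk above a well-researched page.

```python
# Hypothetical pages; the scores are made-up values between 0 and 1.
PAGES = [
    {"name": "junk answer on a high-authority domain",
     "content_quality": 0.3, "domain_authority": 0.9},
    {"name": "well-researched article on a small site",
     "content_quality": 0.9, "domain_authority": 0.2},
]


def toy_score(page, authority_weight):
    """Blend content quality and domain authority (illustrative formula only)."""
    return ((1 - authority_weight) * page["content_quality"]
            + authority_weight * page["domain_authority"])


def rank(pages, authority_weight):
    """Return page names ordered from highest to lowest toy score."""
    ordered = sorted(pages, key=lambda p: toy_score(p, authority_weight), reverse=True)
    return [p["name"] for p in ordered]


if __name__ == "__main__":
    print(rank(PAGES, authority_weight=0.7))  # authority dominates: junk ranks first
    print(rank(PAGES, authority_weight=0.2))  # quality dominates: the article ranks first
```

Shifting weight from authority toward composition-style signals flips the order, which is the kind of change speculated about above.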
I think a good rule of thumb, for the time being, should be “will this page pass human inspection?” If it looks like junk to a human reviewer in terms of organization, and reads like junk in terms of composition, it probably is junk, and Google will likely feed such information back into their algorithms.
James Tyler
Managing Editor
Online Publishing Group.com
Publishing & Media Professionals
1-800-341-3593
