
Thread: Duplicate Content and vBulletin Forums

  1. #1
    vBFAQ Adminstrator Joeychgo has disabled reputation Joeychgo's Avatar
    Join Date
    Jan 2005
    Location
    Chicago, IL
    Age
    43
    Posts
    6,408
    Blog Entries
    2

    Default Duplicate Content and vBulletin Forums

    There has been a lot of discussion regarding duplicate content and how search engines deal with it. Often, people claim there is a duplicate content "penalty" imposed by search engines, specifically Google.

    One blog writer wrote:
    there is a real duplicate content penalty for content that is duplicated with minor or no variation across the pages of a single site. There is also a "mirror" penalty for a site that is more or less substantially duplicating another single site. What I'm talking about here is the reprint of pages of content individually, rather than in a mass, on multiple sites.
    By extension, people have come to believe that, because of how vBulletin is designed, multiple features may cause such a penalty. The forum archive is one of the main features people point to as likely to cause it, since it is effectively a "mirror" or duplicate of the threads on your site.

    I have long believed this to be an incorrect assumption.

    I have never bought into this theory, believing that no penalty was assessed because of the archive. Further, I believed that if Google found both the original thread and the corresponding archive page, Google would simply choose to index one or the other, but not both, and would not assess a penalty.

    It appears I may have been wrong...

    I have found evidence that Google will spider and index BOTH the vBulletin thread page and the corresponding vBulletin archive page with no problem. Thus, not only is there no penalty in this particular instance, but there is no reason to not display both.

    To determine this, I went to several popular vBulletin forums, including vBulletin itself, vB.org, Digital Point, The Tivo Community, and Programming Talk. I looked at the archive pages, and when one was indexed (as many are), I checked the corresponding thread and often found it indexed also.

    I also found that Google will index the Showthread, Showpost, AND printable versions of a thread.

    Here is a single example. (I found many, but I'm not going to list them all. You can do your own research.)

    Here is a thread on Digital Point:

    Original Thread: "My Wish"
    Archive Page: "My Wish" Archive
    Printable Page: "My Wish" Printable Version
    First Post: "My Wish" First Post
    Second Post: "My Wish" Second Post

    So that's an example of how we can have duplicate content in several places from the same thread, all indexed. No apparent penalty.

    Why does this happen? Well, look at those pages. They are all different. They contain SOME of the same elements, but they are different. Don't forget to look at how the pages are constructed. The archive page has little in the way of graphics or coding compared to the original thread, the printable page is different still, and so on.

    In my opinion, search engines don't penalize a page or a site just because SOME of its content is available elsewhere on the internet. If a site is substantially duplicated, then a search engine may impose a penalty on that site, or on pages within it.

    I drew this conclusion because there are many websites out there which reprint much or all of their content. Google News is a good example of this.

    A recent blog entry quoted Matt Cutts, a Google engineer, on the subject:
    "Search engines are not trying to penalize content," Mukherjee said. "We're trying to find the right content to promote. Independent of how large our indexes get, there will always be capacity constraints."

    "Honest site owners often worry about duplicate content when they don't really have to," Google's Cutts said. "There are also people that are a little less conscientious." He also noted that different top level domains, like x.com, x.ca, are not a concern.
    When I read that I started looking deeper into how this affects vBulletin. What I came away with from that statement was that

    How does this affect the vBulletin forum owner? Well, one concern of mine was what happens when someone posts a copy of an article on the forum for others to comment on. Once comments begin to roll in, each new comment changes the page and makes it less of a duplication and more original.

    I also noticed something else while writing this article. Google will index both the main thread and the archive page for a particular search query.



    You will notice that when I did a search for "vBulletin Search" the first 2 results were both from vBulletin's support site. One was a main thread, and one was an archive page. These were different threads. I will also point out that the second (archive page) listing was double indexed, meaning its original thread was indexed too.

    And here is a screenshot of Google listing both the original page AND the corresponding Archive page in the 1st and 2nd places on the page.




    How is this possible?

    Actually, it's pretty simple. The standard forum pages are much different than the archive pages. Most people forget to consider that there is a big difference between the forum and the archive. Look at the page sizes alone, as specified by Google. The main page is 81k and the archive is 19k. That alone is a huge difference. Where does that difference come from? The standard formatting of the forum. All the tables, HTML, images, alt tags, etc. that the forum contains are not present in the archive.

    Also, the vBulletin 3.5x archive is ordered a little differently. The main forum lists threads by the last time they were posted to, newest first. The archive lists threads by when they were first posted, oldest first. So the order of the threads is completely different.
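
    To make the ordering difference concrete, here is a minimal sketch in Python. The thread records and function names are made up for illustration; this is not vBulletin's actual code, only the two sort rules described above.

```python
from datetime import date

# Hypothetical thread records: (title, first_posted, last_posted).
threads = [
    ("Thread A", date(2006, 1, 5), date(2006, 7, 20)),
    ("Thread B", date(2006, 3, 1), date(2006, 3, 2)),
    ("Thread C", date(2006, 6, 10), date(2006, 6, 15)),
]

def forum_order(items):
    """Main forum listing: most recently posted-to thread first."""
    return [t[0] for t in sorted(items, key=lambda t: t[2], reverse=True)]

def archive_order(items):
    """Archive listing: oldest thread first, by original post date."""
    return [t[0] for t in sorted(items, key=lambda t: t[1])]

print(forum_order(threads))
print(archive_order(threads))
```

    The same set of threads comes out in a different sequence on each listing page, which is one more way the two pages differ.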

    In checking these 2 pages I found them to be only 7% similar. I used a tool that compares pages: Similar Page Checker.

    I also used this tool to compare the aforementioned thread from Digital Point. I found the original thread was only 38% similar to the archive page for that thread. The original thread was only 43% similar to the printable thread.
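
    I don't know exactly how Similar Page Checker calculates its percentage, so treat the following as a rough sketch of the general idea only. It assumes a simple approach: strip the tags, split the remaining text into words, and compare the two word sequences with Python's standard difflib. The function names are my own.

```python
import re
from difflib import SequenceMatcher

def visible_words(html):
    """Crudely strip tags and return the remaining text as lowercase words."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.findall(r"[a-z0-9']+", text.lower())

def similarity_percent(html_a, html_b):
    """Percentage similarity of two pages, based on their word sequences."""
    a, b = visible_words(html_a), visible_words(html_b)
    return 100.0 * SequenceMatcher(None, a, b).ratio()

# Made-up page fragments standing in for a thread page and its archive page.
thread_page = "<table><tr><td>My Wish</td></tr></table><p>I wish for more traffic</p>"
archive_page = "<h1>My Wish</h1><p>I wish for more traffic</p><p>View full version</p>"
print(round(similarity_percent(thread_page, archive_page), 1))
```

    A real tool may weigh markup, navigation, and page layout differently, so the numbers from a sketch like this will not match the ones quoted above.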

    So you can see, the duplicate content is not really duplicate content at all, at least as far as the search engines see it.


    Why have several copies of essentially the same page indexed?

    Let me answer the question with a question. Why not? What does it hurt?

    In my experience, Google will index archive pages sooner than the standard forum pages. Why? Easy. They are tremendously smaller to store in the cache. The archive pages are approximately 75% smaller in size, so Google can store 4 times as many archive pages on its servers. Google considers this in its algorithm, believe me. They have storage and bandwidth concerns just like we do, just on a bigger scale. As your PageRank grows, your site's importance and quality also grow (in Google's eyes), and they are more willing to index the bigger pages.

    Point being, you want your pages to be indexed. Why not give Google every opportunity to index them?
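
    For anyone who wants to check the arithmetic, here is a quick sketch using the 81k and 19k page sizes Google reported for the two pages earlier in this post. The function names are just for illustration.

```python
def percent_smaller(full_kb, small_kb):
    """How much smaller the small page is, as a percentage of the full page."""
    return 100.0 * (full_kb - small_kb) / full_kb

def copies_per_full_page(full_kb, small_kb):
    """How many small pages fit in the space of one full page."""
    return full_kb / small_kb

main_kb, archive_kb = 81, 19  # page sizes as reported by Google
print(round(percent_smaller(main_kb, archive_kb), 1))       # 76.5
print(round(copies_per_full_page(main_kb, archive_kb), 2))  # 4.26
```

    So "approximately 75% smaller" and "4 times" are round figures for roughly 76.5% and 4.26.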


    Conclusion:
    It appears that at the present time, there is no reason to concern yourself with duplicate content as far as how vBulletin is constructed and presents information and content.

    My recommendation is to have every option available for Google and other search engines to spider and index. Let Google decide which ones to list. Don't turn off your archive or make it revert to the original thread.



    UPDATE SEPTEMBER 15, 2009

    Google today issued a video discussing duplicate content, penalties, etc.

    Watch the video below. You will find that my assertions above from almost 3 years ago are pretty much on the money.

    YouTube - Duplicate Content & Multiple Site Issues: http://www.youtube.com/watch?v=6hSoXutuj0g



    ~
    Last edited by Joeychgo; 09-17-2009 at 10:13 AM.

  2. #2

    Default

    Usability is certainly weighted, in my opinion. These functions increase usability by focusing on certain content in various ways to enhance it. Focus on a post. Display a printable version. Display a conversation with no images or formatting. Condense content for syndication.

    It seems to me these things increase our site's value.

    Syndication is a great thing too. I'm working on an article about the benefits of syndication. Let me make one quick statement: you are a fool if you do not have people using your syndication (RSS, XML feeds). If you can't see the benefits of a page of links back to you, you have no business in this business. Even if it's your competitor linking to you, these are strong related links. If you are worth your salt, you could win the sales battle. But some folks are missing out on these benefits.

    More to come later on the ins and outs of syndication. Who owns it, who's entitled to it and what you can do with it.

  3. #3

    Default

    Interesting article joeychgo, very enlightening.

  4. #4
    Moody Admin Peggy is on a distinguished road Peggy's Avatar
    Join Date
    Jan 2005
    Location
    NE Ohio
    Age
    50
    Posts
    12,414

    Default

    yes it is... and now I'm on my way to turn my archives back on

  5. #5

    Default

    I won't turn my archives back off until Google gives me a reason to

  6. #6
    vBFAQ Adminstrator Joeychgo has disabled reputation Joeychgo's Avatar

    Default

    Quote Originally Posted by BamaStangGuy
    I won't turn my archives back off until Google gives me a reason to

    That's the best course, in my opinion.

  7. #7
    vBFAQ Adminstrator Joeychgo has disabled reputation Joeychgo's Avatar

    Default

    Updated the article.

  8. #8

    Default

    Quote Originally Posted by Joeychgo
    Updated the article.
    But what changed? I certainly didn't memorize the original. Perhaps an addendum would be better?

  9. #9

    Default

    Quote Originally Posted by noppid
    But what changed? I certainly didn't memorize the original. Perhaps an addendum would be better?
    I'd be interested in knowing too.

  10. #10
    vBFAQ Adminstrator Joeychgo has disabled reputation Joeychgo's Avatar

    Default

    Everything from below the first screenshot and above Conclusion

  11. #11
    Intermediate vBulletin User SaN-DeeP is on a distinguished road SaN-DeeP's Avatar
    Join Date
    Apr 2006
    Location
    techarena.in
    Posts
    37

    Default

    My recommendation, is to have every option available for Google and other Search engines to spider and index. Let Google decide which ones to list. Dont turn off your archive or make it revert to the original thread.
    this is a nice suggestion indeed..

  12. #12

    Default

    I am also a firm believer in what Joe has said, and now I have proof as well

  13. #13
    Experienced vBulletin User protoss is on a distinguished road
    Join Date
    Jul 2006
    Posts
    259

    Smile

    I'd like to confirm that AFAIC, Joeychgo has hit the nail on the head.

    Some proof, based on my forum and Google.

    We ran with phpBB, until April this year. At the end of April we had 55k listings.
    I did all the SEO work myself. Removed page drain from links, sigs, etc. Also installed a phpBB archive. Within 6 months of the initial work our listings grew from 300 to 55k.

    Not knowing enough about vBulletin, I decided to purchase vBSEO. It installed fine and Google visited regularly, sometimes three or four times a day. At this point we did not activate the vB archive. I had a reason not to, but that is moot.
    Anyway, during the period April to July 11th our listings fell from 55k to 900.
    The 900 were all for the new vB installation.

    On 12th July we enabled the vB archive. Our current Google listings stand at 44k.
    Proof that the vBulletin archive works? I think so. 900 to 44k in 11 days does it for me.

  14. #14

    Default

    First - GREAT article Joe. Thanks for posting that. I think that if we all step back a second and think about when we have personally used Google we would agree about the Archives. I end up on Archived Pages of forums A LOT when doing searches. Often with active topics.


    Second - I wonder what protoss used to find how many listings he had on Google. Anyone?


    Third - May I reproduce that article on my forum?

  15. #15
    Experienced vBulletin User protoss is on a distinguished road

    Default

    @ G_Man

    Enter site:yoursite.com into the search box.
    No need to put www or http. Applies to all the major search engines.
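
    If you want to build that query in a script, something like the sketch below would do it. The function names are made up, and the Google search address is only a default; swap in whichever search engine you use.

```python
from urllib.parse import urlencode, urlsplit

def site_query(address):
    """Reduce an address to its bare host and build a site: query."""
    # Prefix "//" so urlsplit treats a bare domain as a host, not a path.
    host = urlsplit(address if "//" in address else "//" + address).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return "site:" + host

def search_url(address, base="https://www.google.com/search"):
    """Full search URL for the site: query (base is an assumed default)."""
    return base + "?" + urlencode({"q": site_query(address)})

print(site_query("http://www.yoursite.com/forum/"))
print(search_url("yoursite.com"))
```

    The result count shown for that query is the "listings" number, and search engines only report it as an estimate.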

  16. #16

    Default

    Quote Originally Posted by protoss View Post
    @ G_Man

    Enter site:yoursite.com into the search box.
    No need to put www or http. Applies to all the major search engines.
    Cool. thanks.

    I'll not advertise my big number though!! LMAO!!

  17. #17
    Moody Admin Peggy is on a distinguished road Peggy's Avatar

    Default

    Quote Originally Posted by G_Man View Post
    Cool. thanks.

    I'll not advertise my big number though!! LMAO!!

    aww c'mon, I wanna see that big number..

  18. #18

    Default

    whoaaaaa. a whopping 41

    It has only just opened though (as of last week)

    What if Google detects that mirrors are linking to me? Someone who likes my forum came along and has started to make blogs about it, and they all look fairly similar. Will I be penalised for that too?

  19. #19

    Default

    Hopefully Google will just discount those, as I'm getting quite a few link backs from high-ranking sites and I can't really stop him from doing this. I do actually appreciate his support quite a lot.

  20. #20

    Default

    Most unique visitors I get arrive through archive pages
