
HTML Help: Stolen Content

After reading your article about copyrights, I wanted to know, how do you find people that have stolen copyrighted content from your web site? Any help you can provide is appreciated.
That's a good question. There are a couple of ways to find out if your material is being stolen. Sometimes you just get lucky: you stumble across it while surfing, which I've done a few times, or someone who knows you or visits your web site often reports a site that is using your material, which has happened to me more than once too.

There are other ways that rely less on luck and more on intelligence, though. I've outlined them below. First we'll look at how to find textual content theft.

  1. You can usually find unique passages in your textual content: several long phrases or complete sentences that appear nowhere but in your writing. By copying and pasting one of these passages into a search engine, you can find other sites that have used it.

    Place the passage inside quotation marks before you search so that the search engine returns pages containing that exact phrase; otherwise it may return pages that contain any of the search words.

    By doing this, any page that is indexed by the search engine and has that exact passage on it will be brought up first in the search results. By visiting those pages you can determine if the page content was stolen from you.

    If you remember when I was writing The Incredible Money Book, I asked AAN subscribers for their own unique money saving tips in exchange for a free copy of the eBook when it was finished. I used this search technique on contributions I suspected may be stolen to see if the contributor simply took content from someone else. I caught two submissions where the content was taken directly from other web sites.

    As a side note, any time you ask your subscribers or web site visitors for text submissions of some kind, be sure to perform exact phrase searches to ensure you're not being given copyrighted content from a person unauthorized to give you the reprint rights.

  2. Someone who steals your words is lazy . . . at least too lazy to write their own content on that subject. A person lazy in one thing is very often lazy in other things. Those too lazy to write their own content are often too lazy to write the HTML to format content they steal, so they copy and paste directly from the source code.

    This tendency of theirs can be used to your advantage. By slipping a made-up word or phrase into the code via a hidden HTML comment, you can search for that made-up word or phrase. You're almost assured that if there are any search results, you've been ripped off.

    A hidden HTML comment looks like this:

    <!-- This is a hidden comment. -->
    If you slip a comment into <!-- furplesnitz --> your HTML code, like I just did there, you can search for your made-up word at a search engine and find the lazy content thieves pretty quickly. Remember, a hidden comment doesn't show up on the web page; I just forced it to show up in that sentence as an example of where you might include it.

    Or, instead of using made-up words, you might hide a "copyrighted by [your name]" notice in the content. The only problem with that is you'll have to sift through your own pages in the search results.

    This trick works best when the HTML comment is hidden in the middle of a long paragraph in the middle of the content. Many who steal your content won't look that closely at the actual code; they just look for the beginning and end to copy and paste, so a comment hidden in the middle can easily be overlooked. The middle of a bunch of actual code, where there are lots of other < and > characters, can also be a good hiding spot.

    The only drawback to this method is that many search engines will not index a hidden comment. Instead of hiding the made-up word in a comment tag, you could hide it in some title attribute text. It's not well known, but you can add a title to many HTML elements such as paragraph tags, table tags, division tags, and many others. Members can log in and go to the Core Attributes chart for further information.

    Or you could add made-up words, or words that are simply not well known, to your page text. Depending on the topic and seriousness of your work, it can be a good gongadoodle! If you don't want to take the time to look up an uncommon word, an alternative is to run two common words together for some crazygood fun.

  3. You can include a transparent GIF image in your code, but you'll need to use the absolute path to the image so that it is still loaded from your server when the content appears on the content thief's site. If the content thief copies and pastes from your source code, you can then use your server's raw log files, if your host makes them available to you, to see what sites are calling that image from your server. Find that, find a thief.

    Keep reading if you don't have access to your site's raw log files; there's another way to use this trick, explained below.

    To do this, give the image a unique name so you can easily find it in your log files, and hide it in your code in a place where it doesn't mess up your page layout.

    A great way to do this is to place it between two unique words; I'll explain why in a moment. Size it the same as a space, and in fact, use it in place of a space between two words and leave the actual space out. This way there is a visible space between the words as your visitors read the page, because they are separated by the transparent image, but in the code there is no space between the words.

    Doing this means a search engine, which will ignore the image when performing a word search, will be able to return valid search results for the run-together words, giving you another way to find the thief with this trick even if you don't have access to your raw log files.

    For example, if you coded a passage like this in your page:

    ...the moment the mailman<img 
    width="10" height="10">unexpectedly dropped his bag...

    you could then search for "mailmanunexpectedly" at a search engine, and if someone stole your content and didn't replace the transparent GIF image, you've caught them.
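If your host does give you the raw log files, the check described in step 3 can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions: it assumes the server writes the common "combined" log format (where the referring page follows the status code and byte count), and the image name spacer-x7q.gif, the host names, and the sample log lines are all made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumption: the server writes the common "combined" log format, in which
# the referrer is the first quoted field after the status code and byte count.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+)[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)"'
)


def find_hotlinkers(log_lines, image_name, own_hosts):
    """Count requests for image_name by referring host, skipping own_hosts."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match or not match.group("path").endswith(image_name):
            continue
        host = urlparse(match.group("referrer")).hostname
        # A "-" referrer means the browser sent none; it parses to no host.
        if host and host not in own_hosts:
            hits[host] += 1
    return hits


# Hypothetical sample lines; the file name and hosts are made up.
sample = [
    '203.0.113.5 - - [10/Oct/2005:13:55:36 -0700] "GET /images/spacer-x7q.gif HTTP/1.1" 200 43 "http://www.example.com/article.html" "Mozilla/4.0"',
    '198.51.100.7 - - [10/Oct/2005:14:02:11 -0700] "GET /images/spacer-x7q.gif HTTP/1.1" 200 43 "http://copycat.example.net/stolen.html" "Mozilla/4.0"',
    '198.51.100.9 - - [10/Oct/2005:14:05:40 -0700] "GET /images/spacer-x7q.gif HTTP/1.1" 200 43 "-" "Mozilla/4.0"',
]

suspects = find_hotlinkers(sample, "spacer-x7q.gif", {"www.example.com"})
for host, count in suspects.most_common():
    print(host, count)
```

To run it against a real log, pass an open file in place of the sample list. Any host it reports that isn't yours is a page worth visiting.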
Finding someone who steals your images is much more difficult. You can give them an unusual name (such as "horseT9") and hope that whoever steals one keeps the default name the browser offers when saving it, because if they do you can search for that unique name at the search engines to find where it's being used.
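Whether the search term is a sentence from your text, a made-up word, or an unusual image name, the search itself is the same exact-phrase query. Here is a minimal Python sketch of building one. The search address is a placeholder, and the "q" parameter name is an assumption, so substitute whatever your preferred search engine actually uses.

```python
from urllib.parse import urlencode

# Assumption: a hypothetical search engine that takes the query in a "q"
# parameter; substitute the address of whichever engine you actually use.
SEARCH_URL = "https://search.example.com/search"


def exact_phrase_url(phrase):
    """Return a search URL that asks for phrase as an exact match."""
    # Collapse line breaks and repeated spaces left over from copy and paste.
    cleaned = " ".join(phrase.split())
    # Drop any quotes inside the phrase so they cannot end the exact match early.
    cleaned = cleaned.replace('"', "")
    return SEARCH_URL + "?" + urlencode({"q": '"' + cleaned + '"'})


# Search terms taken from the examples in this article.
for term in ["furplesnitz", "mailmanunexpectedly", "horseT9"]:
    print(exact_phrase_url(term))
```

Wrapping the term in quotation marks is the important part; the rest is just tidying up the pasted text.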

You can also go to an image search engine and search for images of the same sort. For example, if you had an image of a wild horse you could search for "wild horse" and sift through the results to see if your image comes up on someone else's site, but that's kind of a pain, too.

Depending on how an image is used on a web site, it may be better and easier to try to protect the image from the start, rather than to chase down offenders. I've covered that in previous articles so I won't repeat it here since this is getting long.

Lastly, you can join a service like Copyscape and, for a fee, let them keep an eye on your content for you. Of course, you'll still have to search for the content you want to check on, only you'll do it through their service, which won't require these kinds of tricks.

