HTML Help: Stolen Content
After reading your article about copyrights, I wanted to know, how do you find people that have stolen copyrighted content from your web site? Any help you can provide is appreciated.
That's a good question. There are a couple of ways to find out if your
material is being stolen. Of course, sometimes you just get lucky and
stumble across it while surfing, which I've done a few times, or
someone who knows you or is a frequent visitor to your web site
reports another site that is using your material. I've had that
happen more than once, too.
There are other ways that rely less on luck and more on intelligence,
though. I've outlined them below. First we'll look at how to find
textual content theft.
Finding someone who steals your images is much more difficult. You
can give them an unusual name (such as "horseT9") and hope that
whoever steals one uses the default name the browser offers when
saving it. If they do, you can search for that unique name at
the search engines to find where it's being used.
- You can usually find unique text passages in your textual content.
There will usually be several long phrases or complete sentences that
are unique to your writing. By copying and pasting these unique word
groupings into a search engine, you can find other sites that have
used these unique word groupings.
Place your word grouping inside quotation marks before you search for
it so that the search engine returns pages with that exact phrase;
otherwise it will return pages that contain any of the search words.
By doing this, any page that is registered with the search engine and
has that exact word grouping on it will be brought up first in the
search results. By visiting those pages you can determine if the page
content was stolen from you.
You may remember that when I was writing The Incredible Money Book,
I asked AAN subscribers for their own unique money-saving tips in
exchange for a free copy of the eBook when it was finished. I used this
search technique on contributions I suspected might be stolen to see if
the contributor simply took content from someone else. I caught two
submissions where the content was taken directly from other web sites.
As a side note, any time you ask your subscribers or web site visitors
for text submissions of some kind, be sure to perform exact phrase
searches to ensure you're not being given copyrighted content from a
person unauthorized to give you the reprint rights.
- Someone who steals your words is lazy . . . at least too lazy to write
their own content on that subject. A person lazy in one thing is very
often lazy in other things. Those too lazy to write their own content
are often too lazy to write the HTML to format content they steal, so
they copy and paste directly from the source code.
This tendency of theirs can be used to your advantage. By slipping a
made-up word or phrase into the code via a hidden HTML comment, you
can search for that made-up word or phrase. You're almost assured that
if there are any search results, you've been ripped off.
A hidden HTML comment is like this:
<!-- This is a hidden comment. -->
If you slip a comment into <!-- furplesnitz --> your HTML
code, like I just did there, you can search for your made-up words at
a search engine and find the lazy content thieves pretty quickly. Remember,
a hidden comment doesn't show up on the web page; I just forced it to show
up in that sentence as an example of where you might include it.
Or, instead of using made-up words you might use a "copyrighted by [your
name]" notice hidden in the content. The only problem with that is
you'll have to sift through your own pages in the search results.
This trick works best when the HTML comment is hidden in the middle of
a long paragraph in the middle of the content. Many who steal your content
won't look that closely at the actual code; they just look
for the beginning and end to copy and paste, so a comment hidden in the
middle can easily be overlooked. Hiding it in the middle of a bunch
of actual code, where there are lots of other < and > tags, can also be a good
idea.
The only drawback to this method is that many search engines will not index a hidden
comment. Instead of hiding the made-up word in a comment tag you could hide it in some
title attribute text. It's not well known, but you can add a title to many HTML elements such
as paragraph tags, table tags, division tags, and many others. Members can log in and go to the
Core Attributes chart for further information.
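As a minimal sketch of the title attribute approach, the fragment below reuses the made-up word furplesnitz from the earlier example. The paragraph text is only a placeholder, not taken from a real page. Keep in mind that browsers generally show title text as a tooltip when a visitor hovers over the element, so the word is not completely hidden from readers.

```html
<!-- The made-up word sits in a title attribute instead of a comment -->
<p title="furplesnitz">
  Placeholder text standing in for a long paragraph from the middle
  of your own content.
</p>
```

If a thief copies and pastes this paragraph from your source code, the title attribute travels with it, and a search for the made-up word may turn up the copy.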
Or, you could also add made-up words or words that are simply not well known to your
page text. Depending on the topic and seriousness of your work, it can be a good gongadoodle!
If you don't want to take the time to look up an uncommon word, an alternative is to run
two common words together for some crazygood fun.
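As a rough sketch of two of the placements described in this section, the fragment below hides the made-up word from the earlier example in a comment in the middle of a paragraph and also works a run-together word into the visible text. The paragraph wording is placeholder text, assumed for illustration only.

```html
<p>
  Placeholder opening sentence for a long paragraph in the middle of
  your content. <!-- furplesnitz --> The comment is buried between
  sentences, and the visible text includes the run-together word
  crazygood, which visitors can see and search engines can index.
</p>
```

Searching for either furplesnitz or crazygood gives you two chances to find a page that copied this paragraph straight from your source code.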
- You can include a transparent GIF image in your code, but you'll
need to use the absolute path to the image so that it is still
requested from your server when your code is pasted into the content
thief's site. If the content thief copies and pastes from your source
code you can then use your server's raw log files, if your host makes
them available to you, to see what sites are calling that image from
your server. Find that, find a thief.
Keep reading if you don't have access to your site's raw log files;
there's another way to use this trick explained below.
To do this, give the image a unique name so you can easily find it in
your log files, and hide it in your code in a place where it doesn't
mess up your page layout.
A great way to do this is to place it between two unique words; I'll
explain why in a moment. Size it the same as a space, and in fact, use
it in place of a space between two words and leave the actual space
out. This way there is a visible space between the words as your
visitors read the page because they are separated by the transparent
image, but in the code there is no space between the words.
Doing this means a search engine, which will ignore the image when
performing a word search, will be able to return valid search results
for the run-together words, giving you another way to find the thief
with this trick even if you don't have access to your raw log files.
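For example, if you coded a place like this in your page (the image
address here is only a placeholder for the absolute path to your own
uniquely named transparent GIF):
...the moment the mailman<img src="http://www.example.com/images/spacerX7.gif"
width="10" height="10" alt="">unexpectedly dropped his bag...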
...you could then search for "mailmanunexpectedly" at a search engine
and if someone stole your content and didn't replace the transparent
GIF image, you've caught them.
You can also go to an image search engine and search for images of the
same sort. For example, if you had an image of a wild horse you could
search for "wild horse" and sift through the results to see if your
image comes up on someone else's site, but that's kind of a pain, too.
Depending on how an image is used on a web site, it may be better and
easier to try to protect the image from the start, rather than to
chase down offenders. I've covered that in previous articles so I won't
repeat it here since this is getting long.
Lastly, you can join a service like Copyscape and, for a fee, let them keep an eye on
your content for you. Of course, you'll still have to search for the content you want to
check on, only you'll do it through their service, which won't require these kinds of
tricks.
This concludes the
HTML Help about Stolen Content and Catching Content Thieves.