A common question I hear about sites like weblogs.asp.net and other large communities we run (as well as Community Server in general) is: Why don’t you have support for CAPTCHA?
The short answer is: you don’t need it.
While CAPTCHA isn’t a feature included in Community Server, there are several different CAPTCHA implementations that people have developed as add-ons for it. You can download a CAPTCHA add-on for Community Server from the www.communityserver.org file gallery right now.
CAPTCHA, which stands for (C)ompletely (A)utomated (P)ublic (T)uring test to tell (C)omputers and (H)umans (A)part, works well for small sites, but on larger ‘community’ sites, where there are multiple SPAM targets, CAPTCHA only provides a false sense of security: it can be broken fairly easily, and serious spammers are getting more sophisticated all the time.
Just do a Google search on “Break Captcha” and you’ll find several examples. Below is one of the better write-ups on the topic:
http://www.brains-n-brawn.com/default.aspx?vDir=aicaptcha
However, this isn’t the reason we don’t support CAPTCHA in Community Server. It’s an overly cumbersome tool to solve a problem which can be much more easily solved by auto-detecting SPAM:
1. Dynamic Rules Engine – This is what we use for Community Server. It’s a set of rules and scores that validates content as it comes in and is designed to change/adapt/grow as the spam changes. The popular plug-in that Telligent makes use of is the WordPress Akismet plug-in.
2. Bayesian filters – Really only useful if the comment spam follows patterns. Unfortunately the type of spam seems to constantly be changing so a Bayesian filter isn’t really a great solution.
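To make the "rules and scores" idea concrete, here is a minimal sketch in Python. It is not Community Server's actual engine: the rule names, point values, and threshold below are all hypothetical, chosen only to show how independent rules can each add to a score that is then compared against a cutoff.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    name: str                       # label used for logging/tuning
    points: int                     # score added when the rule matches
    matches: Callable[[str], bool]  # predicate run against the comment text


# Hypothetical rules and weights; a real engine would load these from
# configuration so they can change as the spam changes.
RULES: List[Rule] = [
    Rule("many_links", 5,
         lambda text: len(re.findall(r"https?://", text)) >= 3),
    Rule("bbcode_link", 3,
         lambda text: "[url=" in text.lower()),
    Rule("spam_phrase", 4,
         lambda text: re.search(r"\b(viagra|casino|payday loan)\b",
                                text, re.I) is not None),
]

SPAM_THRESHOLD = 5  # hypothetical cutoff


def score(text: str, rules: List[Rule] = RULES) -> int:
    """Sum the points of every rule that matches the text."""
    return sum(rule.points for rule in rules if rule.matches(text))


def is_spam(text: str, rules: List[Rule] = RULES,
            threshold: int = SPAM_THRESHOLD) -> bool:
    """Treat the text as spam once its score reaches the threshold."""
    return score(text, rules) >= threshold
```

Because each rule is just data plus a predicate, adapting to a new spam pattern means adding or re-weighting a rule rather than changing the scoring code.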
Well I have to disagree. Even if Captcha is not perfect, it’s working well for all the standard spam.
The advantage is that once it’s implemented we can all leave our blogs unmoderated and more ‘alive’.
Like many, I have a life, and regularly checking all the comments is time consuming. I often find myself with comments that are already old before they are published.
If you are considering a vote to ask the bloggers their opinion, I vote yes for Captcha in weblogs.asp.net!
Rob,
Have you considered using the many eyes of the community to indicate which comments are spam?
I too do not like CAPTCHAs due to their intrusive nature. I am currently implementing a ‘flag as spam’ feature for dotnetkicks.com which will allow the community to indicate spam posts or comments. If a threshold level is reached, the post or comment is removed from the site.
I think community features like these are effective and actually improve the community feel to the website.
Cheers,
Gavin
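The ‘flag as spam’ threshold Gavin describes could look roughly like the Python sketch below. This is not the dotnetkicks.com implementation; the class name, method names, and the threshold value are hypothetical. It counts distinct users per item, so one person flagging repeatedly does not push an item over the limit.

```python
from collections import defaultdict

FLAG_THRESHOLD = 3  # hypothetical: distinct users needed to hide an item


class SpamFlags:
    """Tracks which users have flagged which posts or comments."""

    def __init__(self, threshold: int = FLAG_THRESHOLD):
        self.threshold = threshold
        self._flags = defaultdict(set)  # item id -> set of user ids

    def flag(self, item_id: str, user_id: str) -> bool:
        """Record a flag; return True if the item should now be hidden."""
        self._flags[item_id].add(user_id)  # a set ignores repeat flags
        return self.is_hidden(item_id)

    def is_hidden(self, item_id: str) -> bool:
        """An item is hidden once enough distinct users have flagged it."""
        return len(self._flags.get(item_id, ())) >= self.threshold
```

A sketch like this does nothing about coordinated abuse by several accounts; it only shows the counting step.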
Help, again like Rob mentioned the spammers are finding ways around CAPTCHA and it really doesn’t scale well to large sites. It’s getting to the point that CAPTCHA is barely even human readable on some sites, and I personally refuse to visit those sites anymore. The only 100% effective method at this point is moderation, but then that puts the burden on site owners…and for busy sites again that doesn’t scale well.
Gavin, what you mentioned has the opposite effect…works great for large sites but small sites will suffer. Plus what happens when you get a rogue user (or even worse, a group of users) who start flagging everything as spam just to be a nuisance? The more power you give users, the bigger chance of abuse.
I currently use ReverseDOS (google it) and have had virtually no spam since I installed it. The caveat is that it’s .NET only (for now). But it’s a wonderful tool and requires no extra overhead for end users, and minimal administrative overhead to implement and maintain. So far I’ve found it to be the best solution when used in combination with a rules-based engine.
Akismet is great as well, but I don’t know how many people use it so who knows how large/effective the database actually is. Plus it’s bound to increase bandwidth so that may put off folks who host their sites elsewhere.
Always about tradeoffs eh? There is no silver bullet fix in this case, and probably won’t ever be.
So is this spam?
I read an article on how effective spam is, and the conclusion was that a spammer has to send a lot of messages to get sales.
Of course, CPU power is very cheap. I read that another approach is to give the spammer a task he needs to solve; if he wants to automate it, he can. But he will not be profitable if you make the task CPU intensive.
He will have to pay more for the electricity bill than he gets out of his sales.
The article appeared a while ago in Scientific American.
Jayson,
“The more power you give users, the bigger chance of abuse”
I believe the opposite is true as long as the community outweighs the spammers in number terms. You are right in that this technique is only effective for large sites; weblogs.asp.net is certainly big enough to employ this technique.
The community over on dotnetkicks.com is smaller but still manages to filter out many posts that are not interesting, off topic or are spam. The quality of the homepage feed is improving all the time.
Tom, that’s what ReverseDOS does (sort of)…it tricks the would be spammer into thinking that your web server is generating 403 requests until finally their bot thinks your server has gone down and they’ll disconnect. Requires very minimal overhead on the server itself and is quite effective.
Gavin, it only takes a couple of rotten apples to spoil the bunch. I do believe that what you’ve mentioned will be effective, but having a piece of spam on a website for any amount of time whatsoever means the spammer has won. It needs to be blocked at the source.
Captcha does work; just don’t use ones that are simple. Also, why are all captchas of the ‘type the letters you see’ kind? How about typing the name of the object you see? If there’s a photo of an airplane, type ‘airplane’. AFAIK, there isn’t any software out there which can recognize objects. However, this solution is a problem for people who don’t know English.
If the burden is on the user, how about a concept that depends on drag & drop? The user doesn’t have to type anything or squint their eyes. (This requires javascript to be enabled or even Flash).
abdu
Abdu, great idea but would be difficult to implement. For example, a picture of an airplane. Some people would just type in “plane” or if they are in the UK, it’s “Aeroplane” so it would be up to the site admins to account for all these variations. And then there’s localizing it…who knows how many terms for “airplane” there are in some languages. Plus someone might see it and type in “flying” or “flight” or something completely unrelated. There are accessibility issues with that as well (as there are with CAPTCHA).
Drag and drop is a neat-sounding idea, but it’s still intrusive, plus if it relies on code then the spammers will eventually figure out how to get around it. Flash isn’t an option whatsoever; if I saw a site implementing some sort of spam solution that used Flash I’d probably never go back and visit.
Something that doesn’t evolve is a false sense of security. I will not argue with that. However, to say that CAPTCHA is not scalable and cannot be implemented on a large site simply reflects a sense that developers are not keeping their systems up to date.
Large sites such as RapidShare use captcha for their free accounts. By keeping their image generation modules up to date, they can stay one step ahead of the developers/hackers that are trying to bypass their download protocols.
Saying that CAPTCHA is useless is like saying that, because there are hackers out there trying to write better and stronger viruses, there is no reason to use anti-virus software.
There is a very fine solution from http://www.protectwebform.com/
With audio support, customization options, etc.
sorry, some problems with link: http://www.protectwebform.com/