This would be even cooler if you could track which hosts were fed which
fake addresses, so you could see which hosts were really doing the spam
crawling.
For example: when a host like crawler.spammer.com goes to the wpoison
page, feed it a unique email address @yourdomain. Any mail that later
arrives at that address points straight back to the host that harvested
it. Of course this would require that you tie the wpoison script into
your mailer.
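
Here is a rough sketch of the bookkeeping I have in mind, in Python rather than as a patch to wpoison itself. Everything in it is made up for illustration: the names trap_address and host_for, the SECRET value, the "trap-" prefix, and the placeholder domain yourdomain.example. The issued dict stands in for whatever store the page generator and the mailer would actually share.

```python
import hashlib
import hmac

# Hypothetical values: pick your own secret and domain.
SECRET = b"change-me"
DOMAIN = "yourdomain.example"


def trap_address(host, issued):
    """Return a unique trap address for `host` and record who received it."""
    # Derive a stable tag from the host name, so the same crawler always
    # gets the same address and the tag cannot be guessed without SECRET.
    tag = hmac.new(SECRET, host.encode("utf-8"), hashlib.sha256).hexdigest()[:12]
    local_part = "trap-" + tag
    issued[local_part] = host
    return local_part + "@" + DOMAIN


def host_for(recipient, issued):
    """Mailer side: map an incoming recipient back to the crawling host."""
    local_part = recipient.split("@", 1)[0].lower()
    return issued.get(local_part)  # None if we never handed this one out
```

The page generator would call trap_address() with the requesting host, and the mailer would call host_for() on the recipient of each incoming message. A real setup would need persistent storage instead of an in-memory dict.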
Re: polluting the legitimate search engines as well as the bad:
webcrawler.com keeps an (almost) up-to-date list of good webcrawlers (as
well as some bad ones):
http://info.webcrawler.com/mak/projects/robots/robots.html
This list
is only 3 or 4 months out of date, so search engines like Microsoft's
yukon would probably not be in the list. You could use this to decide
which engines get the wpoison page and which do not. Of course this gets back to
the topic of "whitelisting" the internet because of the spammers.
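
A minimal sketch of that kind of check, again in Python and again with invented names: should_poison and the entries in KNOWN_GOOD_ROBOTS are placeholders, not taken from the webcrawler.com list, and I am assuming you would match on the User-Agent header, which a spammer can of course forge.

```python
# Hypothetical allow-list: fill this in by hand from whatever list of
# well-behaved robots you trust. These two names are placeholders.
KNOWN_GOOD_ROBOTS = {"examplebot", "friendlyspider"}


def should_poison(user_agent, known_good=KNOWN_GOOD_ROBOTS):
    """Return True if this visitor should get the wpoison page."""
    ua = user_agent.lower()
    # Anything that does not identify itself as a known-good robot is
    # treated as a possible address harvester.
    return not any(name in ua for name in known_good)
```

Since the list is always a few months stale, a new legitimate engine would get poisoned until somebody adds it, which is exactly the whitelisting problem.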
Oh well, I'll shut up now.