Welcome, poorly behaved Chinese spiders

So some Chinese site called BlogPeople decided to crawl the entire back catalog for Club Troppo, which is a fair whack. And it decided to do it several hundred pages at a time. And it wasn’t very smart about how it generated links.

So here’s a fun discovery. The web server I use is fast and lightweight, and it can dish out buckets of static files under insane loads. But if it gets gummed up even once while talking to PHP, it stops working properly: it starts responding to everything with a ‘500 server error’ code and doesn’t reset.

Say, for the sake of giggles, that somebody requests several hundred pages simultaneously, and you have only 6 PHP processes ready to service incoming requests. Multiply this bug by WordPress’s cack-handed architecture and you have a recipe for bringing an otherwise invincible server to its knees.
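
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 6 workers come from the setup above; the 300 requests and the half second per page are made-up figures for illustration, not measurements from this server, and `worst_case_wait` is just a name I picked.

```python
import math


def worst_case_wait(requests: int, workers: int, seconds_per_page: float) -> float:
    """Seconds until the last of `requests` simultaneous requests is finished.

    Assumes every page takes the same time to render and that each worker
    handles one request at a time (both simplifications).
    """
    if workers <= 0:
        raise ValueError("need at least one worker")
    # Requests are served in batches of `workers`; the last batch finishes last.
    batches = math.ceil(requests / workers)
    return batches * seconds_per_page


# Hypothetical burst: 300 pages at once, 6 PHP workers, 0.5 s per page.
print(worst_case_wait(300, 6, 0.5))  # 25.0
```

Under those assumptions the unlucky request at the back of the queue waits 25 seconds, which is plenty of time for something upstream to give up on PHP.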

Come the holidays I may need to sharpen my C skills and fix this bug in the web server myself.

Update:

Like the Terminator, the BlogPeople spider decided that it had to come back to finish the job. We got hammered again this afternoon, and again it took the server offline.

Our answer is simply to drop any incoming traffic from their IP address. That should solve the issue for the moment.
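
For the curious, the idea boils down to a check like the one sketched below. This is only an illustration in Python: the actual drop can just as well happen at the firewall, the address range is a reserved documentation range standing in for the real one, and `BLOCKED_NETWORKS` and `is_blocked` are hypothetical names.

```python
import ipaddress

# Placeholder range (203.0.113.0/24 is reserved for documentation),
# standing in for the spider's real address.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_blocked(addr: str) -> bool:
    """Return True if `addr` falls inside any blocked network."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BLOCKED_NETWORKS)


print(is_blocked("203.0.113.42"))  # True
print(is_blocked("198.51.100.7"))  # False
```

Anything that matches gets dropped before it ever reaches PHP, so the 6 workers stay free for actual readers.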
