Rehash 15.05.1 - Release Notes
The primary cause of the slowdown was that rehash did large JOIN operations on text columns in MySQL. This is bad practice in general for performance reasons, but it causes a drastic slowdown with MySQL Cluster, because it prevents the query optimizer from doing what's known as a "pushdown", which would allow the query to execute on the NDB nodes. Instead, it had to do multiple pulls from the database and assemble the query data on the frontend, a process that took 4-5 seconds per problematic query. This caused article load to be O(n*m), where n was the number of articles in the database and m was the number of articles with the neverdisplay attribute set. The revised queries now load at O(1). The problem was compounded by the fact that there are a limited number of httpd daemons available at any given moment, and any database pull that hit a problematic query (these were in index.pl and article.pl) would cause resource exhaustion.
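To illustrate why the frontend-assembled join scaled as O(n*m), here is a rough Python sketch. It is not rehash code (rehash is Perl) and does not touch MySQL; the function names and sample data are hypothetical. It only contrasts comparing every article against every neverdisplay entry with a keyed lookup, where each membership check takes roughly constant time.

```python
# Illustrative sketch only: names and data are assumptions, not rehash's schema.

def visible_nested_loop(article_ids, hidden_ids):
    """Filter articles by comparing each one against every hidden id.

    This mimics assembling a join by hand: n articles times m hidden
    entries gives n*m string comparisons.
    """
    comparisons = 0
    visible = []
    for article_id in article_ids:
        is_hidden = False
        for hidden_id in hidden_ids:
            comparisons += 1
            if article_id == hidden_id:
                is_hidden = True
        if not is_hidden:
            visible.append(article_id)
    return visible, comparisons


def visible_keyed_lookup(article_ids, hidden_ids):
    """Filter articles using a hash set, one lookup per article."""
    hidden = set(hidden_ids)  # built once, then each check is ~O(1)
    lookups = 0
    visible = []
    for article_id in article_ids:
        lookups += 1
        if article_id not in hidden:
            visible.append(article_id)
    return visible, lookups


if __name__ == "__main__":
    articles = [str(i) for i in range(100)]   # n = 100
    neverdisplay = ["3", "50", "99"]          # m = 3
    slow, slow_cost = visible_nested_loop(articles, neverdisplay)
    fast, fast_cost = visible_keyed_lookup(articles, neverdisplay)
    print(slow == fast, slow_cost, fast_cost)
```

Both functions return the same list of visible articles; only the amount of work differs, and the gap grows with both n and m.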
Fortunately, our load balancer and varnish cache have a fairly high timeout when waiting for httpd to become available, which kept the site from soyling itself under high load or during an apache restart, and prevented SN from going down. Thank you, everyone, for your patience with this matter :).
~ NCommander
(Score: 2) by iWantToKeepAnon on Tuesday June 02 2015, @10:38PM
"Happy families are all alike; every unhappy family is unhappy in its own way." -- Anna Karenina by Leo Tolstoy