<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: And now for something completely geeky</title>
	<atom:link href="http://clubtroppo.com.au/2009/06/25/and-now-for-something-completely-geeky/feed/" rel="self" type="application/rss+xml" />
	<link>http://clubtroppo.com.au/2009/06/25/and-now-for-something-completely-geeky/</link>
	<description></description>
	<lastBuildDate>Sun, 12 Feb 2012 03:01:38 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
	<item>
		<title>By: Jacques Chester</title>
		<link>http://clubtroppo.com.au/2009/06/25/and-now-for-something-completely-geeky/#comment-358624</link>
		<dc:creator>Jacques Chester</dc:creator>
		<pubDate>Sun, 28 Jun 2009 14:54:03 +0000</pubDate>
		<guid isPermaLink="false">http://clubtroppo.com.au/?p=8771#comment-358624</guid>
		<description>&lt;blockquote&gt;Queuing cannot (even theoretically) speed up your average transaction rate. The best it can do is load-level, to ensure the peak burst transaction rate is increased, but only if there is time to clear up the queue during quiet periods. This is useful if you expect input transactions to come along in a bursty pattern.&lt;/blockquote&gt;

It depends on how we&#039;re measuring transaction rates. In total system terms you&#039;re right; in fact a queue adds overhead. But remember, my top priority is to minimise response time. Pushing data straight to a fast queue means the web server can immediately serve the next request rather than waiting on a database write.

Apart from evening out bursts (as you pointed out), the other reason queues get used a lot is to allow the web and database layers to be scaled independently without much jiggery-pokery.

&lt;blockquote&gt;MySQL is generally faster at simple operations than PostgreSQL.&lt;/blockquote&gt;

Only on a single CPU, and only if you&#039;re using MyISAM. I&#039;d sooner trust my data to used toilet paper.

Postgres scales better, is much more reliable, has better inbuilt crypto, supports proper constraints, has decent VIEWs, allows me to use Lua as the language for stored procedures and is much more extensible in general. I haven&#039;t yet decided whether to bulk load data and then process it, or process it and then do the bulk load. I will need to build prototypes for both and see which works best.

&lt;blockquote&gt;If you are buying virtual machine space, is there any good reason to put the database server on a separate virtual machine? If you buy two VMs, they are just as likely going to end up as two processes on the same hardware anyhow so may as well plonk everything into the same VM (and reduce the communication and context-switch overhead).&lt;/blockquote&gt;

Firstly, it allows me to scale the web front end and the database separately. On the backend there will be a small pool of beefy VMs. On the front end a buzzing cloud of cheap, small VMs.

Secondly, and to me this is more important, it gives me an extra ringfence around the data. In a traditional setup (like my original arch above), you set up the DB servers so that they only respond to traffic from the web servers. In the second and third architectures you can go one better -- they need not even know where the web servers are or vice versa, making it that much harder for an attacker to compromise the data.

&lt;blockquote&gt;Getting it running correctly is more important than getting it running fast, especially when you have no customers&lt;/blockquote&gt;

True; but on the other hand, &quot;measure twice, cut once&quot;. Or should that be &quot;design thrice, code once&quot;?

And also, in my case, speed &lt;em&gt;is&lt;/em&gt; a feature. It partly defines correctness for my requirements.</description>
		<content:encoded><![CDATA[<blockquote><p>Queuing cannot (even theoretically) speed up your average transaction rate. The best it can do is load-level, to ensure the peak burst transaction rate is increased, but only if there is time to clear up the queue during quiet periods. This is useful if you expect input transactions to come along in a bursty pattern.</p></blockquote>
<p>It depends on how we&#8217;re measuring transaction rates. In total system terms you&#8217;re right; in fact a queue adds overhead. But remember, my top priority is to minimise response time. Pushing data straight to a fast queue means the web server can immediately serve the next request rather than waiting on a database write.</p>
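<p>A minimal sketch of that hand-off, not the actual setup: an in-process <code>queue.Queue</code> stands in for the fast queue, a list stands in for the database, and <code>handle_request</code>, <code>slow_db_write</code> and <code>db_worker</code> are hypothetical names. The sleep is an assumed stand-in for write latency.</p>

```python
import queue
import threading
import time

write_queue = queue.Queue()   # stands in for the "fast queue" (assumption)
stored = []                   # stands in for the database table (assumption)

def slow_db_write(record):
    """Hypothetical database write; the sleep simulates disk and network latency."""
    time.sleep(0.01)
    stored.append(record)

def handle_request(record):
    """Web-tier handler: enqueue the record and return without waiting for the write."""
    write_queue.put(record)
    return "202 Accepted"

def db_worker():
    """Database-tier worker: drain the queue at whatever rate the database allows."""
    while True:
        record = write_queue.get()
        if record is None:        # sentinel value: shut the worker down
            break
        slow_db_write(record)

worker = threading.Thread(target=db_worker)
worker.start()

started = time.perf_counter()
responses = [handle_request({"id": i}) for i in range(20)]
handler_seconds = time.perf_counter() - started   # time the web tier was busy

write_queue.put(None)
worker.join()
total_seconds = time.perf_counter() - started     # time until every write had landed

print(f"handlers returned after {handler_seconds:.4f}s, writes finished after {total_seconds:.4f}s")
```

<p>The total work is not reduced (the queue adds a little), but the time each request spends waiting no longer includes the write.</p>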
<p>Apart from evening out bursts (as you pointed out), the other reason queues get used a lot is to allow the web and database layers to be scaled independently without much jiggery-pokery.</p>
<blockquote><p>MySQL is generally faster at simple operations than PostgreSQL.</p></blockquote>
<p>Only on a single CPU, and only if you&#8217;re using MyISAM. I&#8217;d sooner trust my data to used toilet paper.</p>
<p>Postgres scales better, is much more reliable, has better inbuilt crypto, supports proper constraints, has decent VIEWs, allows me to use Lua as the language for stored procedures and is much more extensible in general. I haven&#8217;t yet decided whether to bulk load data and then process it, or process it and then do the bulk load. I will need to build prototypes for both and see which works best.</p>
<blockquote><p>If you are buying virtual machine space, is there any good reason to put the database server on a separate virtual machine? If you buy two VMs, they are just as likely going to end up as two processes on the same hardware anyhow so may as well plonk everything into the same VM (and reduce the communication and context-switch overhead).</p></blockquote>
<p>Firstly, it allows me to scale the web front end and the database separately. On the backend there will be a small pool of beefy VMs. On the front end a buzzing cloud of cheap, small VMs.</p>
<p>Secondly, and to me this is more important, it gives me an extra ringfence around the data. In a traditional setup (like my original arch above), you set up the DB servers so that they only respond to traffic from the web servers. In the second and third architectures you can go one better &#8212; they need not even know where the web servers are or vice versa, making it that much harder for an attacker to compromise the data.</p>
<blockquote><p>Getting it running correctly is more important than getting it running fast, especially when you have no customers</p></blockquote>
<p>True; but on the other hand, &#8220;measure twice, cut once&#8221;. Or should that be &#8220;design thrice, code once&#8221;?</p>
<p>And also, in my case, speed <em>is</em> a feature. It partly defines correctness for my requirements.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tel_</title>
		<link>http://clubtroppo.com.au/2009/06/25/and-now-for-something-completely-geeky/#comment-358620</link>
		<dc:creator>Tel_</dc:creator>
		<pubDate>Sun, 28 Jun 2009 12:52:51 +0000</pubDate>
		<guid isPermaLink="false">http://clubtroppo.com.au/?p=8771#comment-358620</guid>
		<description>A few suggestions...

Getting it running correctly is more important than getting it running fast, especially when you have no customers :-)

Queuing cannot (even theoretically) speed up your average transaction rate. The best it can do is load-level, to ensure the peak burst transaction rate is increased, but only if there is time to clear up the queue during quiet periods. This is useful if you expect input transactions to come along in a bursty pattern.

MySQL is generally faster at simple operations than PostgreSQL. If write performance is the main issue then don&#039;t bother indexing the data (or keep the index as minimal as you can). Also, for single-query-at-a-time database operations, make sure the SQL is not re-parsed each time (use the long-winded prepare-and-execute process, which should be faster for repetitive operations because the prepare only happens once). That probably still won&#039;t be as fast as a bulk insert.

If you are buying virtual machine space, is there any good reason to put the database server on a separate virtual machine? If you buy two VMs, they are just as likely going to end up as two processes on the same hardware anyhow so may as well plonk everything into the same VM (and reduce the communication and context-switch overhead).

There are a few tricks with ext3 that the Squid guys have documented, like disabling updates of access times and similar. Something to try.</description>
		<content:encoded><![CDATA[<p>A few suggestions&#8230;</p>
<p>Getting it running correctly is more important than getting it running fast, especially when you have no customers :-)</p>
<p>Queuing cannot (even theoretically) speed up your average transaction rate. The best it can do is load-level, to ensure the peak burst transaction rate is increased, but only if there is time to clear up the queue during quiet periods. This is useful if you expect input transactions to come along in a bursty pattern.</p>
<p>MySQL is generally faster at simple operations than PostgreSQL. If write performance is the main issue then don&#8217;t bother indexing the data (or keep the index as minimal as you can). Also, for single-query-at-a-time database operations, make sure the SQL is not re-parsed each time (use the long-winded prepare-and-execute process, which should be faster for repetitive operations because the prepare only happens once). That probably still won&#8217;t be as fast as a bulk insert.</p>
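<p>A rough illustration of the two insert styles, using Python&#8217;s built-in <code>sqlite3</code> module purely as a stand-in for MySQL or PostgreSQL (an assumption made so the example is self-contained). The <code>hits</code> table and its columns are hypothetical.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No indexes on the table, so each insert only has to append a row.
conn.execute("CREATE TABLE hits (id INTEGER, path TEXT)")

rows = [(i, "/test_test_test") for i in range(1000)]
insert_sql = "INSERT INTO hits (id, path) VALUES (?, ?)"

# Style 1: one statement per row. The SQL text is identical every time and
# only the bound parameters change, so the sqlite3 module can reuse the
# compiled statement from its statement cache instead of parsing it again.
for row in rows[:500]:
    conn.execute(insert_sql, row)
conn.commit()

# Style 2: hand the driver the whole batch in one call and commit once.
conn.executemany(insert_sql, rows[500:])
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM hits").fetchone()[0]
print(row_count)
```

<p>Which style wins, and by how much, depends on the database and driver, so it is worth timing both against the real server.</p>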
<p>If you are buying virtual machine space, is there any good reason to put the database server on a separate virtual machine? If you buy two VMs, they are just as likely going to end up as two processes on the same hardware anyhow so may as well plonk everything into the same VM (and reduce the communication and context-switch overhead).</p>
<p>There are a few tricks with ext3 that the Squid guys have documented, like disabling updates of access times and similar. Something to try.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jacques Chester</title>
		<link>http://clubtroppo.com.au/2009/06/25/and-now-for-something-completely-geeky/#comment-358594</link>
		<dc:creator>Jacques Chester</dc:creator>
		<pubDate>Fri, 26 Jun 2009 18:14:48 +0000</pubDate>
		<guid isPermaLink="false">http://clubtroppo.com.au/?p=8771#comment-358594</guid>
		<description>Those tests cover the third architecture, comparing the two different file systems.</description>
		<content:encoded><![CDATA[<p>Those tests cover the third architecture, comparing the two different file systems.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jacques Chester</title>
		<link>http://clubtroppo.com.au/2009/06/25/and-now-for-something-completely-geeky/#comment-358593</link>
		<dc:creator>Jacques Chester</dc:creator>
		<pubDate>Fri, 26 Jun 2009 18:13:21 +0000</pubDate>
		<guid isPermaLink="false">http://clubtroppo.com.au/?p=8771#comment-358593</guid>
		<description>JM, for your edification and enjoyment. Stats is, I admit, one of my weaknesses.

&lt;pre&gt;
--- EXT3 RESULTS ---
Server Software:        lighttpd/1.4.19
Server Hostname:        10.1.1.6
Server Port:            80

Document Path:          /test_test_test
Document Length:        107 bytes

Concurrency Level:      253
Time taken for tests:   588.523 seconds
Complete requests:      1000000
Failed requests:        27
   (Connect: 0, Receive: 0, Length: 27, Exceptions: 0)
Write errors:           0
Total transferred:      252993169 bytes
HTML transferred:       106997111 bytes
Requests per second:    1699.17 [#/sec] (mean)
Time per request:       148.896 [ms] (mean)
Time per request:       0.589 [ms] (mean, across all concurrent requests)
Transfer rate:          419.80 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   43 259.6      5   35052
Processing:     1  105 596.4     70   97360
Waiting:        0  103 493.6     69   97360
Total:         26  148 718.9     77  101371

Percentage of the requests served within a certain time (ms)
  50%     77
  66%     83
  75%     89
  80%     93
  90%    114
  95%    269
  98%   1074
  99%   1963
 100%  101371 (longest request)

--- NILFS RESULTS ---
Server Software:        lighttpd/1.4.19
Server Hostname:        10.1.1.6
Server Port:            80

Document Path:          /test_test_test
Document Length:        107 bytes

Concurrency Level:      253
Time taken for tests:   484.517 seconds
Complete requests:      1000000
Failed requests:        36
   (Connect: 0, Receive: 0, Length: 36, Exceptions: 0)
Write errors:           0
Total transferred:      252991145 bytes
HTML transferred:       106996255 bytes
Requests per second:    2063.91 [#/sec] (mean)
Time per request:       122.583 [ms] (mean)
Time per request:       0.485 [ms] (mean, across all concurrent requests)
Transfer rate:          509.91 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   37 224.1      5    7038
Processing:     6   85 554.3     57   66351
Waiting:        0   82 397.8     56   66351
Total:         14  122 655.0     63   71375

Percentage of the requests served within a certain time (ms)
  50%     63
  66%     66
  75%     69
  80%     72
  90%     89
  95%    248
  98%   1047
  99%   1352
 100%  71375 (longest request)
&lt;/pre&gt;</description>
		<content:encoded><![CDATA[<p>JM, for your edification and enjoyment. Stats is, I admit, one of my weaknesses.</p>
<pre>
--- EXT3 RESULTS ---
Server Software:        lighttpd/1.4.19
Server Hostname:        10.1.1.6
Server Port:            80

Document Path:          /test_test_test
Document Length:        107 bytes

Concurrency Level:      253
Time taken for tests:   588.523 seconds
Complete requests:      1000000
Failed requests:        27
   (Connect: 0, Receive: 0, Length: 27, Exceptions: 0)
Write errors:           0
Total transferred:      252993169 bytes
HTML transferred:       106997111 bytes
Requests per second:    1699.17 [#/sec] (mean)
Time per request:       148.896 [ms] (mean)
Time per request:       0.589 [ms] (mean, across all concurrent requests)
Transfer rate:          419.80 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   43 259.6      5   35052
Processing:     1  105 596.4     70   97360
Waiting:        0  103 493.6     69   97360
Total:         26  148 718.9     77  101371

Percentage of the requests served within a certain time (ms)
  50%     77
  66%     83
  75%     89
  80%     93
  90%    114
  95%    269
  98%   1074
  99%   1963
 100%  101371 (longest request)

--- NILFS RESULTS ---
Server Software:        lighttpd/1.4.19
Server Hostname:        10.1.1.6
Server Port:            80

Document Path:          /test_test_test
Document Length:        107 bytes

Concurrency Level:      253
Time taken for tests:   484.517 seconds
Complete requests:      1000000
Failed requests:        36
   (Connect: 0, Receive: 0, Length: 36, Exceptions: 0)
Write errors:           0
Total transferred:      252991145 bytes
HTML transferred:       106996255 bytes
Requests per second:    2063.91 [#/sec] (mean)
Time per request:       122.583 [ms] (mean)
Time per request:       0.485 [ms] (mean, across all concurrent requests)
Transfer rate:          509.91 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   37 224.1      5    7038
Processing:     6   85 554.3     57   66351
Waiting:        0   82 397.8     56   66351
Total:         14  122 655.0     63   71375

Percentage of the requests served within a certain time (ms)
  50%     63
  66%     66
  75%     69
  80%     72
  90%     89
  95%    248
  98%   1047
  99%   1352
 100%  71375 (longest request)
</pre>
]]></content:encoded>
	</item>
	<item>
		<title>By: JM</title>
		<link>http://clubtroppo.com.au/2009/06/25/and-now-for-something-completely-geeky/#comment-358588</link>
		<dc:creator>JM</dc:creator>
		<pubDate>Fri, 26 Jun 2009 13:39:07 +0000</pubDate>
		<guid isPermaLink="false">http://clubtroppo.com.au/?p=8771#comment-358588</guid>
		<description>&quot;... is down to 122ms with a standard deviation of 655ms. 95% of requests are served in less than 250ms.&quot;

Jacques, it strikes me that your distribution is not normal here and that a standard deviation (i.e. a finite variance) may not even be defined for the actual distribution - Cauchy, perhaps?

Other than that, good to see that Lua has some real-world use. I looked at it a couple of years ago but couldn&#039;t figure out if it was useful for anything much.</description>
		<content:encoded><![CDATA[<p>&#8220;&#8230; is down to 122ms with a standard deviation of 655ms. 95% of requests are served in less than 250ms.&#8221;</p>
<p>Jacques, it strikes me that your distribution is not normal here and that a standard deviation (i.e. a finite variance) may not even be defined for the actual distribution &#8211; Cauchy, perhaps?</p>
<p>Other than that, good to see that Lua has some real-world use. I looked at it a couple of years ago but couldn&#8217;t figure out if it was useful for anything much.</p>
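<p>One way to check the normality point against the quoted figures is to ask what a normal curve with the same mean and standard deviation would predict. The sketch below uses only the numbers reported for the NILFS run; the variable names are illustrative.</p>

```python
from statistics import NormalDist

# Figures quoted from the NILFS run in the ab output (all in milliseconds).
mean_ms = 122
sd_ms = 655
median_ms = 63
p90_ms = 89
p95_ms = 248
max_ms = 71375

# The normal distribution that would have the same mean and standard deviation.
normal_fit = NormalDist(mu=mean_ms, sigma=sd_ms)

# What that normal curve would predict, for comparison with the measurements.
normal_p95 = normal_fit.inv_cdf(0.95)        # predicted 95th percentile
share_below_zero = normal_fit.cdf(0)         # predicted share of negative times
max_in_sds = (max_ms - mean_ms) / sd_ms      # how far out the slowest request sits

print(f"mean {mean_ms} ms vs median {median_ms} ms and 90th percentile {p90_ms} ms")
print(f"95th percentile: measured {p95_ms} ms, normal fit {normal_p95:.0f} ms")
print(f"normal fit puts {share_below_zero:.0%} of requests below 0 ms")
print(f"slowest request is {max_in_sds:.0f} standard deviations above the mean")
```

<p>A mean above the 90th percentile, and a fitted curve that assigns a large share of requests a negative time, both point to a long right tail rather than a bell shape. The sample standard deviation of a finite run can always be computed, so it cannot settle whether the underlying distribution has a finite variance; the median and percentiles are the sturdier summary here.</p>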
]]></content:encoded>
	</item>
	<item>
		<title>By: Patrick</title>
		<link>http://clubtroppo.com.au/2009/06/25/and-now-for-something-completely-geeky/#comment-358575</link>
		<dc:creator>Patrick</dc:creator>
		<pubDate>Thu, 25 Jun 2009 13:57:20 +0000</pubDate>
		<guid isPermaLink="false">http://clubtroppo.com.au/?p=8771#comment-358575</guid>
		<description>Well I certainly won&#039;t think of anything better, and I&#039;m not even humble.

But good to read about anyway, thanks!</description>
		<content:encoded><![CDATA[<p>Well I certainly won&#8217;t think of anything better, and I&#8217;m not even humble.</p>
<p>But good to read about anyway, thanks!</p>
]]></content:encoded>
	</item>
</channel>
</rss>

