<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Lambda &#187; Sort</title>
	<atom:link href="http://sroucheray.org/blog/tag/sort/feed/" rel="self" type="application/rss+xml" />
	<link>http://sroucheray.org/blog</link>
	<description>Stephane Roucheray's trivial work and thoughts</description>
	<lastBuildDate>Mon, 23 Aug 2010 08:13:42 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.4</generator>
		<item>
		<title>Array.sort() should not be used to shuffle an array</title>
		<link>http://sroucheray.org/blog/2009/11/array-sort-should-not-be-used-to-shuffle-an-array/</link>
		<comments>http://sroucheray.org/blog/2009/11/array-sort-should-not-be-used-to-shuffle-an-array/#comments</comments>
		<pubDate>Mon, 16 Nov 2009 12:48:43 +0000</pubDate>
		<dc:creator>Stéphane Roucheray</dc:creator>
				<category><![CDATA[ActionScript]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Array]]></category>
		<category><![CDATA[Random]]></category>
		<category><![CDATA[Randomize]]></category>
		<category><![CDATA[Sort]]></category>

		<guid isPermaLink="false">http://sroucheray.org/blog/?p=375</guid>
		<description><![CDATA[When a developer searches the web for an algorithm to shuffle an array in JavaScript or ActionScript, they will surely find a way to do it with the <samp>Array.sort()</samp> method (see references: <a href="https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Objects/Array/sort">JavaScript</a>, <a href="http://help.adobe.com/en_US/AS3LCR/Flash_10.0/Array.html#sort%28%29">ActionScript 3</a>)...]]></description>
			<content:encoded><![CDATA[When a developer searches the web for an algorithm to shuffle an array in JavaScript or ActionScript, they will surely find a way to do it with the <samp>Array.sort()</samp> method (see references: <a href="https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Objects/Array/sort">JavaScript</a>, <a href="http://help.adobe.com/en_US/AS3LCR/Flash_10.0/Array.html#sort%28%29">ActionScript 3</a>). While this method is intended to sort an array, it can also be used to randomize it (!). The following implementation, or something very close to it, turns up again and again:

<div class="codecolorer-container javascript dawn nowrap" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;"><div class="javascript codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #006600; font-style: italic;">/**<br />
&nbsp;* Add a randomize method to the Array object prototype<br />
&nbsp;* Usage : <br />
&nbsp;* &nbsp;var tmpArray = [&quot;a&quot;, &quot;b&quot;, &quot;c&quot;, &quot;d&quot;, &quot;e&quot;];<br />
&nbsp;* &nbsp;tmpArray.randomize();<br />
&nbsp;*/</span><br />
Array.<span style="color: #660066;">prototype</span>.<span style="color: #660066;">randomize</span> <span style="color: #339933;">=</span> <span style="color: #003366; font-weight: bold;">function</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #000066; font-weight: bold;">this</span>.<span style="color: #660066;">sort</span><span style="color: #009900;">&#40;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #003366; font-weight: bold;">function</span><span style="color: #009900;">&#40;</span>a<span style="color: #339933;">,</span>b<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000066; font-weight: bold;">return</span> Math.<span style="color: #660066;">round</span><span style="color: #009900;">&#40;</span> Math.<span style="color: #660066;">random</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> <span style="color: #CC0000;">2</span> <span style="color: #009900;">&#41;</span> <span style="color: #339933;">-</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&#125;</span><br />
&nbsp; &nbsp; <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><br />
<span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span></div></div>

The principle of this implementation is simple. It takes advantage of the comparison <samp>Function</samp> that can be passed to the <samp>Array.sort()</samp> method. This function defines the sort order. It is passed two elements of the array, <samp>a</samp> and <samp>b</samp>, compares them and returns <samp>-1</samp>, <samp>0</samp> or <samp>1</samp> depending on whether <samp>a</samp> is less than, equal to or greater than <samp>b</samp>. The idea, to shuffle an array, is to return <samp>-1</samp>, <samp>0</samp> or <samp>1</samp> at random. This is the purpose of the expression <samp>Math.round( Math.random() * 2 ) - 1</samp>. This idea is bad.
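
It is worth checking what this expression actually returns. The sketch below tallies its output over many calls; the helper name <samp>tallyComparatorValues</samp> is only an illustration and is not part of the original code. Because <samp>Math.round()</samp> maps a wider range of inputs to the middle value, <samp>0</samp> should come out about twice as often as <samp>-1</samp> or <samp>1</samp>.

```javascript
// Hypothetical helper: tally how often the expression from the text
// yields -1, 0 and 1 over a given number of samples.
function tallyComparatorValues(samples) {
    var counts = { "-1": 0, "0": 0, "1": 0 };
    for (var i = 0; i < samples; i++) {
        var value = Math.round(Math.random() * 2) - 1;
        counts[String(value)] += 1;
    }
    return counts;
}

var counts = tallyComparatorValues(100000);
// Expect roughly a quarter, a half and a quarter of the samples.
console.log(counts);
```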

<h2>Shuffling an array with <samp>Array.sort()</samp> results in&#8230; a not so shuffled array</h2>

Implementing a shuffle function on arrays this way is bad. At first glance it seems to produce well-randomized arrays, but the resulting distribution is not uniform. I will illustrate this in a minute. Before that, take a look at the table below. It&#8217;s interactive. When you click start, it runs the following process:
<ol>
	<li>
	Create an array with ten letters in alphabetical order from A to J

<div class="codecolorer-container javascript dawn nowrap" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;"><div class="javascript codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #003366; font-weight: bold;">var</span> simpleArray <span style="color: #339933;">=</span> <span style="color: #009900;">&#91;</span><span style="color: #3366CC;">&quot;A&quot;</span><span style="color: #339933;">,</span><span style="color: #3366CC;">&quot;B&quot;</span><span style="color: #339933;">,</span><span style="color: #3366CC;">&quot;C&quot;</span><span style="color: #339933;">,</span><span style="color: #3366CC;">&quot;D&quot;</span><span style="color: #339933;">,</span><span style="color: #3366CC;">&quot;E&quot;</span><span style="color: #339933;">,</span><span style="color: #3366CC;">&quot;F&quot;</span><span style="color: #339933;">,</span><span style="color: #3366CC;">&quot;G&quot;</span><span style="color: #339933;">,</span><span style="color: #3366CC;">&quot;H&quot;</span><span style="color: #339933;">,</span><span style="color: #3366CC;">&quot;I&quot;</span><span style="color: #339933;">,</span><span style="color: #3366CC;">&quot;J&quot;</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span></div></div>

</li>
	<li>Copy this array into a second array</li>
	<li>Run a shuffle algorithm against the second array</li>
	<li>Look where the letters are in the second array and record their positions in the table columns</li>
	<li>Return to step 2 and repeat the process until the chosen number of iterations has been reached</li>
</ol>

This repetitive process is pretty simple. It is not intended to give a mathematical proof of the problem but to illustrate the consequences of a badly implemented shuffle algorithm. Note that the copy at step 2 is always made from the original array (not from the last shuffled array).

In this first example, to make sure you understand how the table works, step 3 is bypassed. Thus the copied array is always exactly the same as the original one. Clicking the <em>Start</em> button launches the process (the number of iterations can be changed).

<div id="noshuffle"></div>

Normally, at the end of the process, you should see a table full of zeros, with only the diagonal (top-left / bottom-right) filled with the number of iterations. It means that, at every step, the letters in the second array are at the same positions as in the first one. This is what was expected because there is no shuffle.
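
The process described above, including this no-shuffle run, can be sketched in a few lines. This is a minimal sketch and not the code behind the interactive table; the names <samp>tallyPositions</samp> and <samp>noShuffle</samp> are assumptions made for the example. Rows stand for letters and columns for positions.

```javascript
// Hypothetical helper: for each letter (row), count how often it lands
// at each position (column) after shuffleFn is applied to a fresh copy.
function tallyPositions(original, shuffleFn, iterations) {
    var size = original.length, table = [], row, col, n, copy;
    for (row = 0; row < size; row++) {
        table[row] = [];
        for (col = 0; col < size; col++) {
            table[row][col] = 0;
        }
    }
    for (n = 0; n < iterations; n++) {
        copy = original.slice();            // step 2: always copy the original
        shuffleFn(copy);                    // step 3: shuffle the copy in place
        for (col = 0; col < size; col++) {  // step 4: record each letter's position
            row = original.indexOf(copy[col]);
            table[row][col] += 1;
        }
    }
    return table;
}

var simpleArray = ["A","B","C","D","E","F","G","H","I","J"];
var noShuffle = function (array) {};        // step 3 bypassed
var table = tallyPositions(simpleArray, noShuffle, 1000);
console.log(table[0][0], table[0][1]);      // 1000 0
```

Passing another in-place shuffle function instead of <samp>noShuffle</samp> gives the counts for that algorithm.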

Let&#8217;s do the same thing but now step 3 uses the shuffle algorithm we have seen before.

<div id="arraysort"></div>

The result is interesting, isn&#8217;t it? The letters have indeed been moved around over the iterations, and almost every cell in the table is filled with a number. Let&#8217;s take a closer look. The nearer a cell is to the diagonal (top-left / bottom-right), the higher its number. Moreover, there are still zeros in some cells far from this diagonal. It means that, after a shuffle, letters are more likely to be near their original position.

<h2>How to better shuffle an array with <samp>Array.sort()</samp>?</h2>

We said the comparison function randomly returns <samp>-1</samp>, <samp>0</samp> or <samp>1</samp>. What exactly happens in each case? Let&#8217;s quote the Mozilla Developer Center&#8217;s <a href="https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Objects/Array/sort#Description">description of the <samp>Array.sort()</samp> comparison function</a>:

<div style="padding:1em 4em; font-size:1.1em;">
If <samp>compareFunction</samp> is supplied, the array elements are sorted according to the return value of the compare function. If a and b are two elements being compared, then:
<ul>
<li>If <samp>compareFunction(a, b)</samp> is less than 0, sort <samp>a</samp> to a lower index than <samp>b</samp>.</li>
<li>If <samp>compareFunction(a, b)</samp> returns 0, leave <samp>a</samp> and <samp>b</samp> unchanged with respect to each other, but sorted with respect to all different elements. Note: the ECMAscript standard does not guarantee this behaviour, and thus not all browsers (e.g. Mozilla versions dating back to at least 2003) respect this.</li>
<li>If <samp>compareFunction(a, b)</samp> is greater than 0, sort <samp>b</samp> to a lower index than <samp>a</samp>.</li>
</ul>
</div>

The ECMAScript standard does not say which sorting algorithm <samp>Array.sort()</samp> must use, so it varies between engines. To reason about the problem, let&#8217;s assume the simplest one, a <a href="http://en.wikipedia.org/wiki/Bubble_sort">Bubble Sort algorithm</a>. Roughly, it steps through the array elements and compares each one with its next adjacent element. It swaps the elements when the comparison function returns <samp>1</samp>. What does it do when the comparison function returns <samp>0</samp> or <samp>-1</samp>? Actually nothing, and here is the issue! The expression <samp>Math.round( Math.random() * 2 ) - 1</samp> does not even produce the three values equally often: <samp>0</samp> arises about half of the time, while <samp>-1</samp> and <samp>1</samp> each arise about a quarter of the time. As a letter can change its position only when <samp>1</samp> arises, a comparison is three times more likely to leave a pair in place than to swap it!
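
To make this adjacent-swap model concrete, here is a minimal sketch of a bubble sort that accepts a comparison function and reports how many swaps it made. The <samp>bubbleSort</samp> function is hypothetical and written only for illustration; real engines are free to implement <samp>Array.sort()</samp> differently.

```javascript
// Hypothetical bubble sort accepting a comparison function (illustration only).
// Sorts the array in place and returns the number of swaps performed.
function bubbleSort(array, compare) {
    var swaps = 0, end, i, temp;
    for (end = array.length - 1; end > 0; end--) {
        for (i = 0; i < end; i++) {
            // Adjacent elements are exchanged only when the result is positive.
            if (compare(array[i], array[i + 1]) > 0) {
                temp = array[i];
                array[i] = array[i + 1];
                array[i + 1] = temp;
                swaps++;
            }
        }
    }
    return swaps;
}

var letters = ["A","B","C","D","E","F","G","H","I","J"];
var swaps = bubbleSort(letters, function (a, b) {
    return Math.round(Math.random() * 2) - 1;
});
// Ten elements lead to 45 comparisons; only those returning 1 cause a swap.
console.log(swaps + " swaps out of 45 comparisons");
```

With the random comparison function, most comparisons leave the pair untouched, which is why letters tend to stay close to where they started.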

So, to address the issue, the comparison function must not generate three numbers randomly but only two: <samp>0</samp> or <samp>1</samp>. Thus a letter has the same chance of moving as of staying in place. Below is the new implementation (note that it is simpler than the previous one).

<div class="codecolorer-container javascript dawn nowrap" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;"><div class="javascript codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #006600; font-style: italic;">/**<br />
&nbsp;* Add a randomize method to the Array object prototype<br />
&nbsp;* Usage : <br />
&nbsp;* &nbsp;var tmpArray = [&quot;a&quot;, &quot;b&quot;, &quot;c&quot;, &quot;d&quot;, &quot;e&quot;];<br />
&nbsp;* &nbsp;tmpArray.randomize2();<br />
&nbsp;*/</span><br />
Array.<span style="color: #660066;">prototype</span>.<span style="color: #660066;">randomize2</span> <span style="color: #339933;">=</span> <span style="color: #003366; font-weight: bold;">function</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #000066; font-weight: bold;">this</span>.<span style="color: #660066;">sort</span><span style="color: #009900;">&#40;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #003366; font-weight: bold;">function</span><span style="color: #009900;">&#40;</span>a<span style="color: #339933;">,</span>b<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000066; font-weight: bold;">return</span> Math.<span style="color: #660066;">round</span><span style="color: #009900;">&#40;</span> Math.<span style="color: #660066;">random</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;">&#125;</span><br />
&nbsp; &nbsp; <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><br />
<span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span></div></div>

The following table shows the result of this new implementation :

<div id="arraysort2"></div>

This result seems much more random than the previous one. There are far fewer zeros in the table; however, we can still see a kind of diagonal pattern, with the higher numbers concentrated near the diagonal. It is less obvious than before but still there. The reason lies in the sorting algorithm and its implementation. A sort assumes that the comparison function is consistent: if A &lt; B and B &lt; C, there is no need to compare A and C. Thus some comparisons are never made. This optimization makes sorting more efficient, but it is a problem for shuffling, because not every pair of elements gets a chance to be compared.

<em>Update: results seem to differ between browsers. Notably, the method explained here gives a more even distribution in Firefox than in IE. Neither is uniform, though. Not tested in other browsers.</em>
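
To check how a given JavaScript engine behaves, one can count how often each ordering of a three-element array comes out of the sort-based shuffle. The sketch below does that; <samp>countPermutations</samp> and <samp>sortShuffle</samp> are hypothetical names introduced for this example, and the counts depend on the engine&#8217;s sort implementation, so no particular output is claimed here.

```javascript
// Hypothetical helper: count how often each ordering of ["A", "B", "C"]
// is produced by shuffleFn over the given number of iterations.
function countPermutations(shuffleFn, iterations) {
    var counts = {}, n, copy, key;
    for (n = 0; n < iterations; n++) {
        copy = ["A", "B", "C"];
        shuffleFn(copy);
        key = copy.join("");
        counts[key] = (counts[key] || 0) + 1;
    }
    return counts;
}

// The second sort-based shuffle from the text, written as a plain function.
function sortShuffle(array) {
    array.sort(function (a, b) {
        return Math.round(Math.random());
    });
}

// A uniform shuffle would give each of the 6 orderings about 1/6 of the runs.
console.log(countPermutations(sortShuffle, 60000));
```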

<h2>Shuffle arrays the good way</h2>
Fortunately, <samp>Array.sort()</samp> is not required to shuffle an array. Implementing a shuffle function with the <a href="http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle">Fisher-Yates algorithm</a> is better and still easy.

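Below is the code implementing a shuffle function using the Fisher-Yates algorithm. This algorithm produces a uniform distribution: every permutation of the array is equally likely, provided <samp>Math.random()</samp> itself is uniform.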

<div class="codecolorer-container javascript dawn nowrap" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;"><div class="javascript codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #006600; font-style: italic;">/*<br />
&nbsp;* Add a shuffle function to Array object prototype<br />
&nbsp;* Usage : <br />
&nbsp;* &nbsp;var tmpArray = [&quot;a&quot;, &quot;b&quot;, &quot;c&quot;, &quot;d&quot;, &quot;e&quot;];<br />
&nbsp;* &nbsp;tmpArray.shuffle();<br />
&nbsp;*/</span><br />
Array.<span style="color: #660066;">prototype</span>.<span style="color: #660066;">shuffle</span> <span style="color: #339933;">=</span> <span style="color: #003366; font-weight: bold;">function</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #003366; font-weight: bold;">var</span> i <span style="color: #339933;">=</span> <span style="color: #000066; font-weight: bold;">this</span>.<span style="color: #660066;">length</span><span style="color: #339933;">,</span> j<span style="color: #339933;">,</span> temp<span style="color: #339933;">;</span><br />
&nbsp; &nbsp; <span style="color: #000066; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span> i <span style="color: #339933;">==</span> <span style="color: #CC0000;">0</span> <span style="color: #009900;">&#41;</span> <span style="color: #000066; font-weight: bold;">return</span><span style="color: #339933;">;</span><br />
&nbsp; &nbsp; <span style="color: #000066; font-weight: bold;">while</span> <span style="color: #009900;">&#40;</span> <span style="color: #339933;">--</span>i <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; j <span style="color: #339933;">=</span> Math.<span style="color: #660066;">floor</span><span style="color: #009900;">&#40;</span> Math.<span style="color: #660066;">random</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> <span style="color: #009900;">&#40;</span> i <span style="color: #339933;">+</span> <span style="color: #CC0000;">1</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; temp <span style="color: #339933;">=</span> <span style="color: #000066; font-weight: bold;">this</span><span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000066; font-weight: bold;">this</span><span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000066; font-weight: bold;">this</span><span style="color: #009900;">&#91;</span>j<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #000066; font-weight: bold;">this</span><span style="color: #009900;">&#91;</span>j<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> temp<span style="color: #339933;">;</span><br />
&nbsp; &nbsp; <span style="color: #009900;">&#125;</span><br />
<span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span></div></div>

To see the superiority of the Fisher-Yates algorithm over the <samp>Array.sort()</samp> implementations, play with the table below.

<div id="shuffle"></div>

With these results, it is obvious that this algorithm distributes the letters far more uniformly across positions than the <samp>Array.sort()</samp>-style algorithms. The counts are much closer to one another across cells.
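
The same observation can be reproduced without the table. The sketch below uses a stand-alone variant of the Fisher-Yates shuffle (a plain function rather than a prototype method, which is an assumption made for the example) and records where the letter A ends up over many runs. The name <samp>positionsOfFirstLetter</samp> is hypothetical.

```javascript
// Stand-alone variant of the Fisher-Yates shuffle; shuffles in place
// and returns the same array.
function fisherYates(array) {
    var i, j, temp;
    for (i = array.length - 1; i > 0; i--) {
        j = Math.floor(Math.random() * (i + 1)); // 0 <= j <= i
        temp = array[i];
        array[i] = array[j];
        array[j] = temp;
    }
    return array;
}

// Hypothetical helper: record where the letter "A" ends up over many shuffles.
function positionsOfFirstLetter(iterations) {
    var counts = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], n, shuffled;
    for (n = 0; n < iterations; n++) {
        shuffled = fisherYates(["A","B","C","D","E","F","G","H","I","J"]);
        counts[shuffled.indexOf("A")] += 1;
    }
    return counts;
}

// Each of the ten entries should be close to 10000.
console.log(positionsOfFirstLetter(100000));
```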

]]></content:encoded>
			<wfw:commentRss>http://sroucheray.org/blog/2009/11/array-sort-should-not-be-used-to-shuffle-an-array/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
	</channel>
</rss>

