java实现几种常见排序算法

silentoy 8年前
   <p> </p>    <p>本文介绍几种常见排序算法(选择排序,插入排序,希尔排序,归并排序,快速排序,堆排序),对算法的思路、性质、特点、具体步骤、java实现以及trace图解进行了全面的说明。最后对几种排序算法进行了比较和总结。</p>    <h2>写在前面</h2>    <ul>     <li> <p>本文所有图片均截图自coursera上普林斯顿的课程 <a href="/misc/goto?guid=4959672756994950780" rel="nofollow,noindex">《Algorithms, Part I》</a> 中的Slides</p> </li>     <li> <p>相关命题的证明可参考 <a href="/misc/goto?guid=4959672757076757976" rel="nofollow,noindex">《算法(第4版)》</a></p> </li>     <li> <p>源码可在 <a href="/misc/goto?guid=4959549312507266336" rel="nofollow,noindex">官网</a> 下载,也可以在我的github仓库 <a href="/misc/goto?guid=4959672757187310579" rel="nofollow,noindex">algorithms-learning</a> 下载,已经使用maven构建</p> </li>     <li> <p>仓库下载: git clone git@github.com:brianway/algorithms-learning.git</p> </li>    </ul>    <h2>基础介绍</h2>    <p>java: <a href="/misc/goto?guid=4959672757259990527" rel="nofollow,noindex"> Interface Comparable<T> </a></p>    <p>Java中很多类已经实现了 Comparable<T> 接口,用户也可自定义类型实现该接口</p>    <p>total order:</p>    <ul>     <li> <p>Antisymmetry(反对称性): if v ≤ w and w ≤ v, then v = w.</p> </li>     <li> <p>Transitivity(传递性): if v ≤ w and w ≤ x, then v ≤ x.</p> </li>     <li> <p>Totality: either v ≤ w or w ≤ v or both.</p> </li>    </ul>    <p><em>注意: The <= operator for double is not a total order </em> ,violates totality: (Double.NaN <= Double.NaN) is false</p>    <p>通用代码:</p>    <pre>  <code class="language-java">// Less. Is item v less than w ?  private static boolean less(Comparable v, Comparable w) {      return v.compareTo(w) < 0;  }    //Exchange. Swap item in array a[] at index i with the one at index j  private static void exch(Comparable[] a,, int i, int j) {      Comparable swap = a[i];      a[i] = a[j];      a[j] = swap;  }  </code></pre>    <h2>初级排序算法</h2>    <h3>selection sort(选择排序)</h3>    <p>思路:</p>    <ul>     <li> <p>在第i次迭代中,在剩下的(即未排序的)元素中找到最小的元素</p> </li>     <li> <p>将第i个元素与最小的元素交换位置</p> </li>    </ul>    <p>现象:</p>    <ul>     <li> <p>设已排序的和未排序的交界处为 ↑,则每次循环, ↑ 从左往右移动一个位置</p> </li>     <li> <p>↑ 左边的元素(包括↑)固定了,且升序</p> </li>     <li> <p>↑ 右边的任一元素全部比左边的所有元素都大</p> </li>    </ul>    <p><img src="https://simg.open-open.com/show/cc527d9c567c4b0b154b02a14f88a11b.png"></p>    <p>步骤:</p>    <ul>     <li> <p>move the pointer to the right</p> </li>     <li> <p>indentify index of minimun entry on right</p> </li>     <li> <p>exchange into positon</p> </li>    </ul>    <p><img src="https://simg.open-open.com/show/d75520f9efc28335b1ac2212becf70cd.png"></p>    <p>java实现:</p>    <pre>  <code class="language-java">public static void sort(Comparable[] a) {      int N = a.length;      for (int i = 0; i < N; i++) {          int min = i;          for (int j = i+1; j < N; j++) {              if (less(a[j], a[min])) min = j;          }          exch(a, i, min);      }  }</code></pre>    <p>特点:</p>    <ul>     <li> <p>运行时间和输入无关,无论输入是已排序,时间复杂度都是O(n^2)</p> </li>     <li> <p>数据移动最少,交换的次数和数组大小是线性关系</p> </li>    </ul>    <h3>insertion sort(插入排序)</h3>    <p>思路:</p>    <ul>     <li> <p>在第i次迭代中,将第i个元素与每一个它左边且比它大的的元素交换位置</p> </li>    </ul>    <p>现象:</p>    <ul>     <li> <p>设已排序的和未排序的交界处为 ↑,则每次循环, ↑ 从左往右移动一个位置</p> </li>     <li> <p>↑ 左边的元素(包括↑)且升序,但位置不固定(因为后续可能会因插入而移动)</p> </li>     <li> <p>↑ 右边的元素还不可见</p> </li>    </ul>    <p><img src="https://simg.open-open.com/show/07570f35b835701ebe46f645e2f34ec3.png"></p>    <p>步骤:</p>    <ul>     <li> <p>Move the pointer to the right.</p> </li>     <li> <p>Moving from right to left, exchange a[i] with each larger entry to its left.</p> </li>    </ul>    <p><img src="https://simg.open-open.com/show/cda6fe3122f42f22395a981cbf57390a.png"></p>    <p>java实现:</p>    <pre>  <code class="language-java">public static void sort(Comparable[] a) {      int N = a.length;      for (int i = 0; i < N; i++) {          for (int j = i; j > 0 && less(a[j], a[j-1]); j--) {              exch(a, j, j-1);          }      }  }</code></pre>    <p>inversion(倒置):An inversion is a pair of keys that are out of order</p>    <p>部分有序:An array is partially sorted if the number of inversions is ≤ c N.</p>    <p>特点:</p>    <ul>     <li> <p>运行时间和输入有关,当输入已排序时,时间复杂度是O(n);</p> </li>     <li> <p>For partially-sorted arrays, insertion sort runs in linear time.(交换的次数等于输入中倒置(inversion)的个数)</p> </li>     <li> <p>插入排序适合部分有序数组,也适合小规模数组</p> </li>    </ul>    <h3>ShellSort(希尔排序)</h3>    <p>希尔排序是基于插入排序的。</p>    <p>思路:</p>    <ul>     <li> <p>Move entries more than one position at a time by h-sorting the array</p> </li>     <li> <p>按照h的步长进行插入排序</p> </li>    </ul>    <p>现象:</p>    <ul>     <li> <p>数组中任意间隔为h的元素都是有序的</p> </li>     <li> <p>A g-sorted array remains g-sorted after h-sorting it.</p> </li>    </ul>    <p><img src="https://simg.open-open.com/show/62e28909038be58aa319f79bc25d2cc7.png"></p>    <p>性质:</p>    <ul>     <li> <p>递增数列一般采用3x+1:1,4,13,40,121,364.....,使用这种递增数列的希尔排序所需的比较次数不会超过N的若干倍乘以递增数列的长度。</p> </li>     <li> <p>最坏情况下,使用3x+1递增数列的希尔排序的比较次数是O(N^(3/2))</p> </li>    </ul>    <p>java实现:</p>    <pre>  <code class="language-java">public static void sort(Comparable[] a) {      int N = a.length;        // 3x+1 increment sequence:  1, 4, 13, 40, 121, 364, 1093, ...       int h = 1;      while (h < N/3) h = 3*h + 1;         while (h >= 1) {          // h-sort the array          for (int i = h; i < N; i++) {              for (int j = i; j >= h && less(a[j], a[j-h]); j -= h) {                  exch(a, j, j-h);              }          }          h /= 3;      }  }</code></pre>    <h3>shuffing(不是排序算法)</h3>    <p>目标:Rearrange array so that result is a uniformly random permutation</p>    <p>shuffle sort思路</p>    <ul>     <li> <p>为数组的每一个位置生成一个随机实数</p> </li>     <li> <p>排序这个生成的数组</p> </li>    </ul>    <p>Knuth shuffle demo</p>    <ul>     <li> <p>In iteration i, pick integer r between 0 and i uniformly at random.</p> </li>     <li> <p>Swap a[i] and a[r] .</p> </li>    </ul>    <p>correct variant: between i and N – 1</p>    <ul>     <li> <p>Mergesort--Java sort for objects.</p> </li>     <li> <p>Quicksort--Java sort for primitive types.</p> </li>    </ul>    <p>下面看看这两种排序算法</p>    <h2>merge sort(归并排序)</h2>    <p>思路:</p>    <ul>     <li> <p>Divide array into two halves.</p> </li>     <li> <p>Recursivelysort each half.</p> </li>     <li> <p>Merge two halves.</p> </li>    </ul>    <h3>Abstract in-place merge(原地归并的抽象方法)</h3>    <p>Given two sorted subarrays a[lo] to a[mid] and a[mid+1] to a[hi],replace with sorted subarray a[lo] to a[hi]</p>    <p>步骤:</p>    <ul>     <li> <p>先将所有元素复制到 aux[] 中,再归并回 a[] 中。</p> </li>     <li> <p>归并时的四个判断:</p>      <ul>       <li> <p>左半边用尽(取右半边元素)</p> </li>       <li> <p>右半边用尽(取左半边元素)</p> </li>       <li> <p>右半边的当前元素 <strong>小于</strong> 左半边的当前元素(取右半边的元素)</p> </li>       <li> <p>右半边的当前元素 <strong>大于/等于</strong> 左半边的当前元素(取左半边的元素)</p> </li>      </ul> </li>    </ul>    <p>merging java实现:</p>    <pre>  <code class="language-java"> // stably merge a[lo .. mid] with a[mid+1 ..hi] using aux[lo .. hi]  private static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi) {      // precondition: a[lo .. mid] and a[mid+1 .. hi] are sorted subarrays         // copy to aux[]      for (int k = lo; k <= hi; k++) {          aux[k] = a[k];       }        // merge back to a[]      int i = lo, j = mid+1;      for (int k = lo; k <= hi; k++) {          if      (i > mid)              a[k] = aux[j++];          else if (j > hi)               a[k] = aux[i++];          else if (less(aux[j], aux[i])) a[k] = aux[j++];          else                           a[k] = aux[i++];      }  }</code></pre>    <h3>Top-down mergesort(自顶向下的归并排序)</h3>    <p>mergesort java实现:</p>    <pre>  <code class="language-java">// mergesort a[lo..hi] using auxiliary array aux[lo..hi]  private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi) {      if (hi <= lo) return;      int mid = lo + (hi - lo) / 2;      sort(a, aux, lo, mid);  //将左边排序      sort(a, aux, mid + 1, hi);  //将右边排序      merge(a, aux, lo, mid, hi); //归并结果  }</code></pre>    <p>自顶向下的归并排序的轨迹图</p>    <p><img src="https://simg.open-open.com/show/bac1e37b61577cc58286d0995c890fdd.png"></p>    <p>由图可知,原地归并排序的大致趋势是,先局部排序,再扩大规模;先左边排序,再右边排序;每次都是左边一半局部排完且merge了,右边一半才开始从最局部的地方开始排序。</p>    <p>改进</p>    <ul>     <li> <p>对小规模子数组使用插入排序</p> </li>     <li> <p>测试数组是否已经有序(左边最大<右边最小时,直接返回)</p> </li>     <li> <p>不将元素复制到辅助数组(节省时间而非空间)</p> </li>    </ul>    <h3>Bottom-up mergesort(自底向上的归并排序)</h3>    <p>思路:</p>    <ul>     <li> <p>先归并微型数组,从两两归并开始(每个元素理解为大小为1的数组)</p> </li>     <li> <p>重复上述步骤,逐步扩大归并的规模,2,4,8.....</p> </li>    </ul>    <p>java实现:</p>    <pre>  <code class="language-java">public class MergeBU{     private static void merge(...){ /* as before */ }        public static void sort(Comparable[] a){       int N = a.length;       Comparable[] aux = new Comparable[N];       for (int sz = 1; sz < N; sz = sz+sz)       for (int lo = 0; lo < N-sz; lo += sz+sz)       merge(a, aux, lo, lo+sz-1, Math.min(lo+sz+sz-1, N-1));     }  }</code></pre>    <p>自底向上的归并排序的轨迹图</p>    <p><img src="https://simg.open-open.com/show/de88582ee6ddb8e8dc39a382583c0a1e.png"></p>    <p>由图可知,自底向上归并排序的大致趋势是,先局部排序,逐步扩大到全局排序;步调均匀,稳步扩大</p>    <h2>quicksort</h2>    <p>思路:</p>    <ul>     <li> <p>Shufflethe array.</p> </li>     <li> <p>Partition(切分)so that, for some j</p> </li>    </ul>    <ul>     <li> <p>entry a[j] is in place</p> </li>     <li> <p>no larger entry to the left of j</p> </li>     <li> <p>no smaller entry to the right of j</p> </li>    </ul>    <ul>     <li> <p>Sorteach piece recursively.</p> </li>    </ul>    <p>其中很重要的一步就是 <strong>Partition(切分)</strong> ,这个过程使得满足以下三个条件:</p>    <ul>     <li> <p>对于某个j,a[j]已经排定;</p> </li>     <li> <p>a[lo]到a[j-1]中的所有元素都不大于a[j];</p> </li>     <li> <p>a[j+1]到a[hi]中的所有元素都不小于a[j];</p> </li>    </ul>    <p>partition java实现</p>    <pre>  <code class="language-java">// partition the subarray a[lo..hi] so that a[lo..j-1] <= a[j] <= a[j+1..hi]  // and return the index j.  private static int partition(Comparable[] a, int lo, int hi) {      int i = lo;      int j = hi + 1;      Comparable v = a[lo];      while (true) {             // find item on lo to swap          while (less(a[++i], v))              if (i == hi) break;            // find item on hi to swap          while (less(v, a[--j]))              if (j == lo) break;      // redundant since a[lo] acts as sentinel            // check if pointers cross          if (i >= j) break;            exch(a, i, j);      }        // put partitioning item v at a[j]      exch(a, lo, j);        // now, a[lo .. j-1] <= a[j] <= a[j+1 .. hi]      return j;  }</code></pre>    <p>快排java实现:</p>    <pre>  <code class="language-java">public static void sort(Comparable[] a) {      StdRandom.shuffle(a);      sort(a, 0, a.length - 1);  }    // quicksort the subarray from a[lo] to a[hi]  private static void sort(Comparable[] a, int lo, int hi) {       if (hi <= lo) return;      int j = partition(a, lo, hi);      sort(a, lo, j-1);      sort(a, j+1, hi);      assert isSorted(a, lo, hi);  }</code></pre>    <p>快排的轨迹图</p>    <p><img src="https://simg.open-open.com/show/b77d200b8d02293b4fe71bdb960b8355.png"></p>    <p>由图可知,和归并排序不同,快排的大致趋势是,先全局大体有个走势——左边比右边小,逐步细化到局部;也是先左后右;局部完成时全部排序也就完成了。</p>    <p>一些实现的细节:</p>    <ul>     <li> <p>原地切分:不使用辅助数组</p> </li>     <li> <p>别越界:测试条件(j == lo)是冗余的(a[lo]不可能比自己小);</p> </li>     <li> <p>保持随机性:初始时的随机打乱跟重要</p> </li>     <li> <p>终止循环</p> </li>     <li> <p>处理切分元素值有重复的情况:这里可能出问题</p> </li>    </ul>    <p>性质:</p>    <ul>     <li> <p>快排是in-place的</p> </li>     <li> <p>快排不稳定</p> </li>    </ul>    <p>改进</p>    <ul>     <li> <p>对小规模子数组使用插入排序</p> </li>     <li> <p>三取样切分</p> </li>    </ul>    <h3>三向切分的快速排序</h3>    <p>思路:</p>    <ul>     <li> <p>Let v be partitioning item a[lo].</p> </li>     <li> <p>Scan i from left to right.</p> </li>    </ul>    <ul>     <li> <p>(a[i] < v): exchange a[lt] with a[i]; increment both lt and i</p> </li>     <li> <p>(a[i] > v): exchange a[gt] with a[i]; decrement gt</p> </li>     <li> <p>(a[i] == v): increment i</p> </li>    </ul>    <p>主要是通过增加一个指针来实现的。普通的快拍只有lo和high两个指针,故只能记录 大于 (high右边)和 小于 (lo左边)两个区间, 等于 只能并入其中一个;这里增加了使用了lt,i,gt三个指针,从而达到记录 大于 (gt右边)、 小于 (lt左边)和 等于 (lt和i之间)三个区间。</p>    <p>三切分的示意图</p>    <p><img src="https://simg.open-open.com/show/e7cdcc59b0755a8e1491b2f7794a10cf.png"></p>    <p>三向切分的java实现:</p>    <pre>  <code class="language-java">// quicksort the subarray a[lo .. hi] using 3-way partitioning  private static void sort(Comparable[] a, int lo, int hi) {       if (hi <= lo) return;      int lt = lo, gt = hi;      Comparable v = a[lo];      int i = lo;      while (i <= gt) {          int cmp = a[i].compareTo(v);          if      (cmp < 0) exch(a, lt++, i++);          else if (cmp > 0) exch(a, i, gt--);          else              i++;      }        // a[lo..lt-1] < v = a[lt..gt] < a[gt+1..hi].       sort(a, lo, lt-1);      sort(a, gt+1, hi);  }</code></pre>    <h2>Heapsort(堆排序)</h2>    <p>思路:</p>    <ul>     <li> <p>Create max-heap with all N keys.</p> </li>     <li> <p>Repeatedly remove the maximum key.</p> </li>    </ul>    <ul>     <li> <p>swim:由下至上的堆有序化</p> </li>     <li> <p>sink:由上至下的对有序化</p> </li>    </ul>    <p>堆排序主要分为两个阶段:</p>    <ol>     <li> <p>堆的构造</p> </li>     <li> <p>下沉排序</p> </li>    </ol>    <p>java实现如下:</p>    <pre>  <code class="language-java">public static void sort(Comparable[] pq) {      int N = pq.length;      //堆的构造      for (int k = N/2; k >= 1; k--)          sink(pq, k, N);            //下沉排序      while (N > 1) {          exch(pq, 1, N--);          sink(pq, 1, N);      }  }</code></pre>    <p>堆排序的轨迹图</p>    <p><img src="https://simg.open-open.com/show/1892d984fcb98ddab7d77070a18dd3e3.png"></p>    <p>由图看出,堆排序的趋势是,堆构造阶段,大致是降序的走势,到了下沉阶段,从右到左(或者说从后往前)逐步有序</p>    <p>Significance: In-place sorting algorithm with N log N worst-case.</p>    <ul>     <li> <p>Mergesort: no, linear extra space.</p> </li>     <li> <p>Quicksort: no, quadratic time in worst case</p> </li>    </ul>    <p>缺点</p>    <ul>     <li> <p>Inner loop longer than quicksort’s.</p> </li>     <li> <p>Makes poor use of cache memory.</p> </li>     <li> <p>Not stable(不稳定)</p> </li>    </ul>    <h2>总结和比较</h2>    <p>排序算法总结表</p>    <p><img src="https://simg.open-open.com/show/f331e6e8ea7dadecc45fb3d82e172f2e.png"></p>    <p>最好情况和最坏情况:参见上面的表格</p>    <p>关于稳定性:</p>    <ul>     <li> <p>稳定性,插入排序,归并排序</p> </li>     <li> <p>不稳定:选择排序,快排,希尔排序,堆排序</p> </li>     <li> <p>原因: Long-distance exchange</p> </li>    </ul>    <p>关于额外空间:除了归并排序需要线性的额外空间,其他都是in-place的</p>    <h2>命题</h2>    <ul>     <li> <p>对于长度为N的数组,选择排序需要N^2/2次比较和N次交换(pf见P156)</p> </li>     <li> <p>对于随机排列的长度为N的且主键不重复的数组(pf见P157)</p>      <ul>       <li> <p>平均情况下插入排序需要~N^2/4次比较和~N^2/4次交换</p> </li>       <li> <p>最坏情况下需要~N^2/2次比较和~N^2/2次交换,</p> </li>       <li> <p>最好情况下需要N-1次比较和0次交换。</p> </li>      </ul> </li>     <li> <p>Mergesort uses at most N lg N compares and 6 N lg N array accesses to sort any array of size N. (pf见P173)</p> </li>     <li> <p>Mergesort uses extra space proportional to N.(The array aux[] needs to be of size N for the last merge.)</p> </li>     <li> <p>Any compare-based sorting algorithm must use at least lg ( N ! ) ~ N lg N compares in the worst-case.(pf见P177)</p> </li>     <li> <p>长度为N的无重复数组排序,快速排序平均需要~2N ln N 次比较(以及1/6即1/3 N ln N的交换)</p>      <ul>       <li> <p>最多需要约N^2/2次比较</p> </li>       <li> <p>最少需要~N lg N 次比较</p> </li>      </ul> </li>     <li> <p>用下沉操作由N个元素构造堆只需少于2N次比较以及少于N次交换(pf见P206)</p> </li>     <li> <p>将N个元素排序,堆排序只需少于(2NlgN+2N)次比较以及一半次数的交换(pf见P208)</p> </li>    </ul>    <p>来自: <a href="/misc/goto?guid=4959672757345241487" rel="nofollow">https://segmentfault.com/a/1190000005082467</a></p>