Open-Source Software Optimization - Optimizing the Merge Sort Algorithm of Go Programming Language

Aug 25, 2021

The sorting algorithm is used to sort a series of data in a specific sequence. For example, the sorting algorithm is used to sort the exam scores of a school or data reports of a company in descending order. Common sorting algorithms include bubble sort, quick sort, dictionary sort, merge sort, and heap sort. Merge sort is a stable and efficient sorting algorithm, and the time complexity is N*log₂N. This document uses the merge sort algorithm optimization case of Go in the open-source community as an example to describe the common algorithm analysis and optimization methods.

1. Merge Sort Algorithm of Go

The sorting package of Go supports insertion sort, merge sort, quick sort, and heap sort. Merge sort is a stable sorting algorithm implemented by the sorting package. Go implements the merge sort algorithm based on recursion. The code is as follows:

// The code for merge sort before optimization uses recursion to merge two ordered subsequences: data[a:m] and data[m:b].
func symMerge(data Interface, a, m, b int) {
 // Initialize the values of mid, start, and r. mid indicates the median of [a:b], start indicates the start subscript for data exchange in [a:m], and r indicates the rightmost subscript for data exchange in [a:m].
 mid := int(uint(a+b) >> 1)
 n := mid + m
 var start, r int
 if m > mid {
  start = n - b
  r = mid
 } else {
  start = a
  r = m
 }
 p := n - 1

 // Use binary search for symmetrical comparison of data block [a:m] and [m:b], and find the start subscript position where data block [a:m] begins to be greater than data block [m:b].
 for start < r {
  c := int(uint(start+r) >> 1)
  if !data.Less(p-c, c) {
   start = c + 1
  } else {
   r = c
  }
 }

 // Flip the data blocks [start:m] and [m:end] to ensure that the data block [start:end] is in order.
 end := n - start
 if start < m && m < end {
  rotate(data, start, m, end)
 }
 // Invoke the recursive function to sort two groups of data sub-blocks [a:start] and [start:mid], and [mid:end] and [end:b].
 if a < start && start < mid {
  symMerge(data, a, start, mid)
 }
 if mid < end && end < b {
  symMerge(data, mid, end, b)
 }
}

2. Scenario and Problem Analysis

The merge sort algorithm is mainly used for sequential merging of data blocks, for example, merging of sequential slice data [0:20] and [20:40]. The main processing procedure is as follows:

Invoke the recursive function symMerge to initialize slice parameters.
The slice data [0:20] becomes less than [20:40] through binary search, symmetric comparison, and flipping.
Invoke the recursive function symMerge again to merge sort data sub-slices [0:start] and [start:20], and return to step 1.
Invoke the recursive function symMerge again to merge sort data sub-slices [20:end] and [end:40], and return to step 1.
The entire slices data [0:40] will be in order when all data sub-slices are sequenced.

In processing the common application scenario, it is found that the recursive function needs to be continuously invoked to process the subsequence in merge sort, and this is also a main reason for performance loss in the merge sort algorithm. In merge sort data slices, if recursive processing of sub-slices can be avoided or reduced, running performance of the algorithm can be improved. For example, in a scenario in merge sort the sub-slice data [0] whose length is 1 and the sub-slice data [1:10] whose length is 9, insertion sort is used to avoid invoking recursion, so that running time of an algorithm can be reduced, and performance of the algorithm can be optimized.

3. Optimization Solution and Implementation

Based on the code and scenario analysis of the merge sort algorithm in Go, an optimization method is found. That is, in merge sort, for a data block whose length is 1, a binary insertion sort algorithm is used to process data blocks, so that invoking of the recursive function can be reduced, and algorithm performance can be improved. The open-source Go community also uses the optimization method based on binary insertion sort to improve the performance of the merge sort algorithm. The following figure shows the code comparison before and after the optimization:

Optimization code:

// The optimization code adds a judgment on the slice length. When the slice length is 1, the binary insertion sort is used to directly sort the slices and return the slices.
func symMerge(data Interface, a, m, b int) {
    // If data[a:m] contains only one element, use the binary insertion sort to find the subsequential insertion point and insert data[a] to data[m:b].
    // Avoid invoking recursion function to improve the algorithm performance.
    if m-a == 1 {
        i := m
        j := b
        // Use binary search to find a proper sequential insertion point.
        for i < j {
            h := int(uint(i+j) >> 1)
            if data.Less(h, a) {
                i = h + 1
            } else {
                j = h
            }
        }
        // Insert data.
        for k := a; k < i-1; k++ {
            data.Swap(k, k+1)
        }
        return
    }

    // If data[m:b] contains only one element, use binary insertion to find the proper sequential insertion point and insert data[m] to data[a:m].
    // Avoid invoking recursion function to improve the algorithm performance.
    if b-m == 1 {
        i := a
        j := m
        // Use binary search to find a proper sequential insertion point.
        for i < j {
            h := int(uint(i+j) >> 1)
            if !data.Less(m, h) {
                i = h + 1
            } else {
                j = h
            }
        }
        // Insert data.
        for k := m; k > i; k-- {
            data.Swap(k, k-1)
        }
        return
    }

    // Recursive implementation of merge sort.
    ......
}

4. Optimization Result

Use Go benchmark to test the algorithm performance before and after the optimization, use Go benchstat to compare the performance test results before and after the optimization, and collect statistics on the number of times that the recursive function is invoked before and after the optimization. The following table lists the statistics:

Test Item	Test Case	Time per Operation Before Optimization	Time per Operation Time After Optimization	Operation Time Comparison	Total Recursions Before Optimization	Total Recursions After Optimization	Comparison of Total Recursions
BenchmarkStableString1K-8	1 KB string slice	302278 ns/op	288879 ns/op	-2.71%	1110	790	-28.8%
BenchmarkStableInt1K-8	1 KB integer slice	144207 ns/op	139911 ns/op	-0.96%	689	586	-14.9%
BenchmarkStableInt64K-8	64 KB integer slice	12291195 ns/op	12119536 ns/op	-0.75%	48465	41280	-14.8%
BenchmarkStable1e2-8	1e2(100) structure slice	135357 ns/op	124875 ns/op	-7.17%	1079	807	-25.2%
BenchmarkStable1e4-8	1e4(10000) structure slice	43507732 ns/op	40183173 ns/op	-7.64%	449735	350205	-22.1%
BenchmarkStable1e6-8	1e6(1000000) structure slice	9005038733 ns/op	8440007994 ns/op	-7.69%	79347843	61483110	-22.5%

Note: -8 indicates the GOMAXPROCS value when the function is running, and ns/op indicates the average nanosecond duration for running the function each time.

The preceding table lists the performance test results of different data types, such as string slice, integer slice, and structure slice. The comparison analysis is as follows:

Compare the 1 KB string slice data with the integer slice data. After the optimization, the recursion times of the string slice data are reduced by 28.8%, and the recursion times of the integer slice data are reduced by only 14.9%. This indicates that the optimization technology plays a greater role in the string slice dataset and the performance is improved greatly.
The comparison of integer slice data with the length of 64 KB and 1 KB respectively shows that the total number of recursion times decreases by about 14.8% after the optimization, and the performance improvement is 0.5% to 1%.
Compare the structure slice data whose lengths are 1e2, 1e4, and 1e6. After the optimization, the total numbers of recursion times are all reduced by about 22%, and the performance is all improved by about 7%.

The comparison and analysis of the performance test results show that the performance improvement after optimization is closely related to the test dataset. The more the data slice length of 1 occurred in the recursion, the greater effect the optimization will make, and the more the number of recursion times is reduced, the further the performance will be improved.

5. Summary

The optimization case of the merge sort algorithm analyzes the performance problems in a specific scenario, provides an optimization solution based on binary insertion sort, and finally verifies the optimization result. It is an algorithm optimization practice worth learning.

Select a Country or Region

Products

Connectivity

Computing

Cloud

Services

Industry Solutions

Consumer Support

Huawei Cloud Support

Enterprise Support

Carrier Support

Partners

Training & Certification

Developers

About Us

News & Events

Explore More

SEARCH HISTORY

Open-Source Software Optimization - Optimizing the Merge Sort Algorithm of Go Programming Language

1. Merge Sort Algorithm of Go

2. Scenario and Problem Analysis

3. Optimization Solution and Implementation

4. Optimization Result

5. Summary

Qiyu

Related Articles

Select a Country or Region

Connectivity

Computing

Cloud

SEARCH HISTORY

Open-Source Software Optimization - Optimizing the Merge Sort Algorithm of Go Programming Language

1. Merge Sort Algorithm of Go

2. Scenario and Problem Analysis

3. Optimization Solution and Implementation

4. Optimization Result

5. Summary

Qiyu

Related Articles

Online Services

Consumer Products

Huawei Cloud

Enterprise

Carrier Network