CSC300: Lecture 10 (Mergesort and Review (2.2))

 Contents [0/2]

 Homework [1/2]

```+ Study for the final exam.

The exam covers chapters 1-2 of the text (except 2.3).
Mergesort will be on the exam.
UnionFind will be on the exam.
Quicksort will NOT be on the exam.

The exam is closed book/notes.

Do not hand anything in this week.

+ Read Algorithms through the end of 2.4 (skip 2.3)
Skim pages 853-865 (chapter 6: "Event-Driven Simulation").

+ Do these problems on paper, but don't hand them in.
Be sure to do the starred ones.
2.1.1
2.1.2*
2.1.4
2.1.6*
2.1.7*
2.1.8*
2.1.15*
2.1.24
2.1.25 (optional)

+ Here are some mergesort problems for study.

2.2.1
2.2.2
2.2.4
2.2.5
2.2.17
```

```2.1.1
Show, in the style of the example trace with Algorithm 2.1,
how selection sort sorts the array
E A S Y Q U E S T I O N.

2.1.2*
What is the maximum number of exchanges involving any particular item
during selection sort? What is the average number of exchanges
involving an item?

max # of exchanges for selection sort for a particular element: N
An example:
501234
051234
015234
012534
012354
012345

total # of exchanges <= N
total # of elements = N
average <= 1

2.1.1
Show, in the style of the example trace with Algorithm 2.1,
how insertion sort sorts the array
E A S Y Q U E S T I O N.

2.1.6*
Which method runs faster for an array with all keys identical,
selection sort or insertion sort?

faster with identical keys:
insertion=N compares and 0 swaps
selection=N^2 compares and 0 swaps

Three important facts for this problem:
1. if all keys are identical then it's sorted
2. for a sorted array, insert sort is linear (it just checks every
3. for any array, selection sort does a quadratic number of comparisons.

Since the comparison pattern of selection sort is fixed, it still has
worst case performance.

2.1.7*
Which method runs faster for an array in reverse order,
selection sort or insertion sort?

faster for reversed array:
insertion=N^2 compares and N^2 swaps
selection=N^2 compares and N swaps

Experimentally
insertion with reversed array:
4000   15999998   4.0 [0.049 0.721]
8000   63999998   4.0 [0.173 3.531]
16000  255999998   4.0 [0.761 4.399]
32000 1023999998   4.0 [2.911 3.825]
64000 4095999998   4.0 [12.374 4.251]

selection with reversed array:
4000    8005999   4.0 [0.026 0.226]
8000   32011999   4.0 [0.115 4.423]
16000  128023999   4.0 [0.489 4.252]
32000  512047999   4.0 [1.964 4.016]
64000 2048095999   4.0 [8.282 4.217]

2.1.8*
Suppose that we use insertion sort on a randomly ordered array where items
have only one of three values. Is the running time linear, quadratic, or
something in between?

should be quadratic.  Number of values does not matter, it's their distribution.

three values:
4000    5203612   3.8 [0.016 0.364]
8000   21434206   4.1 [0.053 3.313]
16000   84502719   3.9 [0.210 3.962]
32000  342564934   4.1 [0.886 4.219]
64000 1360115026   4.0 [3.488 3.937]
128000 5419985602   4.0 [12.636 3.623]

two values:
4000    4039762   4.1 [0.012 0.279]
8000   16011924   4.0 [0.041 3.417]
16000   64837772   4.0 [0.155 3.780]
32000  254433286   3.9 [0.599 3.865]
64000 1022514350   4.0 [2.537 4.235]
128000 4097713293   4.0 [9.868 3.890]

one value (therefore presorted):
4000       7998   2.0 [0.002 Infinity]
8000      15998   2.0 [0.001 0.500]
16000      31998   2.0 [0.001 1.000]
32000      63998   2.0 [0.000 0.000]
64000     127998   2.0 [0.001 Infinity]
128000     255998   2.0 [0.000 0.000]

2.1.15*
Expensive exchange. A clerk at a shipping company is charged with the task
of rearranging a number of large crates in order of the time they are to be
shipped out. Thus, the cost of compares is very low (just look at the labels)
relative to the cost of ex- changes (move the crates). The warehouse is
nearly full --- there is extra space sufficient to hold any one of the crates,
but not two. What sorting method should the clerk use?

selection sort # exchanges <= N.

2.1.24
2.1.25
See XInsertionX.java

2.1.28
insertion with two values:
4000    4039762   4.1 [0.012 0.279]
8000   16011924   4.0 [0.041 3.417]
16000   64837772   4.0 [0.155 3.780]
32000  254433286   3.9 [0.599 3.865]
64000 1022514350   4.0 [2.537 4.235]
128000 4097713293   4.0 [9.868 3.890]

selection with two values:
4000    8005999   4.0 [0.017 0.586]
8000   32011999   4.0 [0.061 3.588]
16000  128023999   4.0 [0.241 3.951]
32000  512047999   4.0 [0.938 3.892]
64000 2048095999   4.0 [3.806 4.058]
128000 8192191999   4.0 [15.237 4.003]

To get test, initialize as
Integer[] a = new Integer[N];
for (int i = 0; i < a.length; i++)
a[i] = (i*2)/N;
StdRandom.shuffle (a);

2.2.1
Give a trace, in the style of the trace given at the beginning of this section,
showing how the keys A E Q S U Y E I N O S T are merged with the abstract
in-place merge() method.

2.2.2
Give traces, in the style of the trace given with Algorithm 2.4, showing how the
keys E A S Y Q U E S T I O N are sorted with top-down mergesort.

2.2.4
Does the abstract in-place merge produce proper output if and only if the two
input subarrays are in sorted order? Prove your answer, or provide a counterexample.

Yes, the abstract in-place merge only works if the two input subarrays are sorted.
To prove this, it is sufficient to demonstrate that whenever one of the subarrays
is not sorted, then the resulting array is not sorted.  This follows immediately
from the algorithm, since the order elements in the subarrays is preserved in the
merged array.

2.2.5
Give the sequence of subarray sizes in the merges performed by both the top-down
and the bottom-up mergesort algorithms, for N = 39.

For top down, you start with 39, then cut that in half, then in half
again. So the sizes as you merge are:

39
20 19
10 10 10 9
5 5 5 5 5 5 5 4
3 2 3 2 3 2 3 2 3 2 3 2 3 2 2 2
2 2 2 2 2 2 2

You could also write that as:

39
20 + 19
3*10 + 9
7*5 + 4
7*3 + 9*2
7*2

For bottom up, it goes the other way.

19*2 + 1
9*4 + 3
4*8 + 7
2*16 + 7
32 + 7
39

You can draw that out as (more or less)

2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1
4 4 4 4 4 4 4 4 4 3
8 8 8 8 7
16 16 7
32 7
39

2.2.17
Linked-list sort. Implement a natural mergesort for linked lists. (This is the
method of choice for sorting linked lists because it uses no extra space and
is guaranteed to be linearithmic.)
```

Revised: 2008/03/17 13:01