Outline of a Simple Sample Sort

Architecture: A p processor machine with processors Pi, 0 <= i < p.
Input:
Each processors Pi on a  stores Ai[], an array of n/p integers
Output: Each processor Pi stores Ai [] an array of approximately n/p elements drawn from the
              union of Ai[] for 0<i<=p (i.e. the original data set) such that Ai []  is sorted and all elements
              of Ai []  <=  Aj[]  (0<= i < j <= p)

  1. Each processors Pi
  2. All splitter sets Si' are sent (gathered) to processor P0. 
  3. Processor P0
  4. The final splitters set S is broadcast to all processors
  5. Each processor Pi,
  6. Each processor Pi,

Note: you will need to build some support code for your algorithm.  You will need to generate the original data, communicate it to the processors and get it back to processor  P0 at the end.

Note: This algorithm can be implemented using MPI collective operations.  Of particular use, are the MPI operations MPI_Alltoall and MPI_Alltoallv

 

At the heart of your code there will be a method to redistribute the data based on the set of splitters.  It might look something like the following:

/*********************************************************************/
void Redistribute_keys(
         LOCAL_LIST_T* local_keys  /* in/out */) {

    int new_list_size, i, error = 0;
    int* send_counts;
    int* send_displacements;
    int* recv_counts;
    int* recv_displacements;
    KEY_T* new_keys;
    
    /* Allocate space for the counts and displacements */
    send_counts = (int*) malloc(p*sizeof(int));
    send_displacements = (int*) malloc(p*sizeof(int));
    recv_counts = (int*) malloc(p*sizeof(int));
    recv_displacements = (int*) malloc(p*sizeof(int));

    Local_sort(local_keys);
    Find_alltoall_send_params(local_keys, 
        send_counts, send_displacements);

    /* Distribute the counts */
    MPI_Alltoall(send_counts, 1, MPI_INT, recv_counts,
        1, MPI_INT, MPI_COMM_WORLD);

    /* Allocate space for new list */
    new_list_size = recv_counts[0];
    for (i = 1; i < p; i++)
        new_list_size += recv_counts[i];
    new_keys = (KEY_T*) 
        malloc(new_list_size*sizeof(KEY_T));

    Find_recv_displacements(recv_counts, recv_displacements);

    /* Exchange the keys */
    MPI_Alltoallv(List(local_keys), send_counts, 
        send_displacements, key_mpi_t, new_keys, 
        recv_counts, recv_displacements, key_mpi_t,
        MPI_COMM_WORLD);

    /* Replace old list with new list */
    List_free(local_keys);
    List_allocated_size(local_keys) = new_list_size;
    List_size(local_keys) = new_list_size;
    List(local_keys) = new_keys;

    /* Free temporary storage */
    free(send_counts);
    free(send_displacements);
    free(recv_counts);
    free(recv_displacements);

} /* Redistribute_keys */