Communication Patterns
Optimizing communication patterns is crucial for achieving high performance in parallel applications. This guide presents common MPI communication patterns, optimized variants of them, and best practices for using them efficiently.
Common Communication Patterns
1. Master-Worker Pattern
// Master process distributes work to the workers
if (rank == 0) {
    for (int i = 1; i < size; i++) {
        MPI_Send(data, count, MPI_FLOAT, i, tag, comm);
    }
}

// Worker processes receive their portion from the master
if (rank > 0) {
    MPI_Recv(data, count, MPI_FLOAT, 0, tag, comm, &status);
}
2. Ring Pattern
// Ring communication: each rank exchanges data with its neighbours
int left  = (rank + size - 1) % size;
int right = (rank + 1) % size;

// Send to the right neighbour, receive from the left neighbour
MPI_Sendrecv(sendbuf, count, MPI_FLOAT, right, tag,
             recvbuf, count, MPI_FLOAT, left,  tag,
             comm, &status);
3. Pipeline Pattern
// Pipeline communication: data flows through the ranks in stages
if (rank == 0) {
    // First stage: process data and send it to the next rank
    MPI_Send(processed_data, count, MPI_FLOAT, 1, tag, comm);
} else if (rank < size - 1) {
    // Intermediate stages: receive from the previous rank, process, send on
    MPI_Recv(data, count, MPI_FLOAT, rank - 1, tag, comm, &status);
    MPI_Send(processed_data, count, MPI_FLOAT, rank + 1, tag, comm);
} else {
    // Last stage: receive and finalize
    MPI_Recv(data, count, MPI_FLOAT, rank - 1, tag, comm, &status);
}
Optimized Communication Patterns
1. Communication-Avoiding Algorithms
// Example: communication-avoiding matrix multiplication
// Instead of exchanging full matrices every iteration, do mostly local work
// and reduce across ranks only at fixed intervals.
for (int i = 0; i < local_size; i++) {
    // Local computation on this rank's block
    for (int j = 0; j < local_size; j++) {
        for (int k = 0; k < local_size; k++) {
            local_result[i][j] += local_a[i][k] * local_b[k][j];
        }
    }
    // Periodic communication instead of a reduction every iteration
    if ((i + 1) % communication_interval == 0) {
        MPI_Allreduce(local_result, global_result, count, MPI_FLOAT, MPI_SUM, comm);
    }
}
2. Hybrid Communication Patterns
// Example: hybrid MPI+OpenMP communication
// Threads compute and reduce within a process; MPI communicates between processes.
#pragma omp parallel
{
    int thread_id   = omp_get_thread_num();
    int num_threads = omp_get_num_threads();

    // Thread-local computation over this process's portion of the data
    #pragma omp for
    for (int i = 0; i < local_size; i++) {
        // Local computation
    }

    // Thread-local reduction into process-level data
    #pragma omp critical
    {
        // Update shared data
    }
}

// MPI communication between processes
MPI_Allreduce(local_data, global_data, count, MPI_FLOAT, MPI_SUM, comm);
Best Practices
- Minimize Communication
  - Use local computation where possible
  - Combine multiple messages into single transfers
  - Use non-blocking communication to overlap computation with communication (see the first sketch after this list)
- Optimize Collective Operations
  - Choose the appropriate collective operation for the data pattern
  - Use MPI_IN_PLACE when possible
  - Consider non-blocking collectives for large data (see the second sketch after this list)
- Load Balancing
  - Distribute work evenly across processes
  - Use dynamic load balancing for irregular workloads
  - Consider process migration for load balancing
- Communication-Avoiding Algorithms
  - Implement algorithms that minimize global communication
  - Use local reductions before global operations
  - Consider domain decomposition strategies
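A minimal sketch of the non-blocking overlap idea, assuming a halo exchange with a single neighbouring rank; neighbor, sendbuf, recvbuf, compute_interior, and compute_boundary are placeholder names, not part of any library:

// Sketch: overlap computation with communication using non-blocking calls
MPI_Request reqs[2];

MPI_Irecv(recvbuf, count, MPI_FLOAT, neighbor, tag, comm, &reqs[0]);
MPI_Isend(sendbuf, count, MPI_FLOAT, neighbor, tag, comm, &reqs[1]);

// Do work that does not depend on recvbuf while the transfer is in flight
compute_interior(local_data, local_size);

// Complete both transfers before reading recvbuf or reusing sendbuf
MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
compute_boundary(local_data, recvbuf, count);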
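The collective-related points can be combined in one call: MPI_IN_PLACE removes the separate send buffer, and the non-blocking MPI_Iallreduce (MPI-3) lets other work proceed while the reduction runs. A sketch, assuming each rank holds its partial sums in a placeholder buffer local_sums of length n:

// Sketch: in-place, non-blocking reduction
MPI_Request req;
MPI_Iallreduce(MPI_IN_PLACE, local_sums, n, MPI_FLOAT, MPI_SUM, comm, &req);

// Independent work can proceed while the reduction is in progress
do_unrelated_work();

// The reduced result is available in local_sums only after the wait completes
MPI_Wait(&req, MPI_STATUS_IGNORE);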
Performance Considerations
- Network Topology
  - Use topology-aware communication patterns (see the Cartesian-communicator sketch after this list)
  - Consider network distance in process placement
  - Use process affinity for better cache utilization
- Memory Bandwidth
  - Optimize data layout for better cache utilization
  - Use appropriate data types for communication
  - Consider memory pinning for DMA transfers
- Synchronization
  - Minimize synchronization points
  - Use non-blocking operations where possible
  - Consider using one-sided communication (see the second sketch after this list)
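One common way to make a pattern topology-aware is to let MPI reorder ranks on a Cartesian process grid. A sketch for a 2D decomposition; the grid shape, periodicity, and shift direction are illustrative choices:

// Sketch: 2D Cartesian communicator with rank reordering enabled
int dims[2]    = {0, 0};   // let MPI choose the process grid shape
int periods[2] = {1, 1};   // periodic in both dimensions (torus-like)
MPI_Comm cart_comm;

MPI_Dims_create(size, 2, dims);
MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1 /* reorder */, &cart_comm);

// Neighbours for a shift of +1 along dimension 0: receive from src, send to dst.
// These ranks can be used directly in MPI_Sendrecv halo exchanges.
int src, dst;
MPI_Cart_shift(cart_comm, 0, 1, &src, &dst);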
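For the one-sided option, a hedged sketch of exposing a buffer in an RMA window and writing boundary data directly into a neighbour's memory; the buffer names (halo_buf, boundary_buf, neighbor) and the fence-based synchronization are just one possible choice:

// Sketch: one-sided halo update using an RMA window and fence synchronization
MPI_Win win;
MPI_Win_create(halo_buf, count * sizeof(float), sizeof(float),
               MPI_INFO_NULL, comm, &win);

MPI_Win_fence(0, win);
// Write this rank's boundary data into the neighbour's halo buffer
MPI_Put(boundary_buf, count, MPI_FLOAT, neighbor, 0 /* target displacement */,
        count, MPI_FLOAT, win);
MPI_Win_fence(0, win);   // data is visible at the target after the closing fence

MPI_Win_free(&win);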