This class represents an MPI communicator. The communicator holds information about a ‘group’ of processes and allows these processes to communicate with each other.
It’s not possible from within a communicator to talk with processes outside it. Remember that the MPI_COMM_WORLD communicator holds all the started processes.
Communicators should not be created directly, but created through the comm_create, comm_split or comm_dup methods.
The allgather function gathers the data from all participants and returns it in rank order. All the involved processes receive the result.
As an example each process can send its rank:
from mpi import MPI
mpi = MPI()
world = mpi.MPI_COMM_WORLD
rank = world.rank()
size = world.size()
received = world.allgather(rank)
assert received == range(size)
mpi.finalize()
Note
All processes in the communicator must participate in this operation. The operation will block until every process has entered the call.
Note
See also the alltoall() function where each process sends individual data to each other process.
A non blocking version of the collective allgather() operation. This function will return a handle (request object). The handle gives a number of options for testing / waiting on the operation progress.
As an example each process can send its rank:
from mpi import MPI
mpi = MPI()
def calc():
    import time
    time.sleep(5)
world = mpi.MPI_COMM_WORLD
rank = world.rank()
size = world.size()
handle = world.iallgather(rank)
# calculate while we wait for the collective operation to finish.
while not handle.test():
    calc() # do some calculation
received = handle.wait()
assert received == range(size)
mpi.finalize()
Note
See the reference for non blocking collective operations for more information regarding how to use the non blocking operations.
Combines values from all the processes with the op function. You can write your own operations (see the operations module). There are also a number of predefined operations, such as sum, average, min, max and prod. An example of a really bad factorial function would be:
from mpi import MPI
from mpi.collective.operations import MPI_prod
mpi = MPI()
# We start n processes, and try to calculate n!
# Each process contributes rank+1, since a product that includes rank 0 is always 0.
rank = mpi.MPI_COMM_WORLD.rank()
fact = mpi.MPI_COMM_WORLD.allreduce(rank+1, MPI_prod)
print "I'm rank %d and I also got the result %d. So cool" % (rank, fact)
mpi.finalize()
Note
The allreduce function will raise an exception if you pass anything other than a function as the operation.
Note
All processes in the communicator must participate in this operation. The operation will block until every process has entered the call.
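The reduction semantics can be modelled without an MPI runtime. The sketch below is plain Python and does not use the mpi package; `prod` and `simulate_allreduce` are hypothetical helpers written for illustration, and they assume an operation is a function that takes a list of contributions. It also shows why a factorial needs each rank to contribute rank + 1: a product that includes rank 0 is always 0.

```python
from functools import reduce
import operator


def prod(values):
    """Multiply all values in a list (stand-in for a product operation)."""
    return reduce(operator.mul, values, 1)


def simulate_allreduce(contributions, op):
    """Model allreduce: every rank receives the same reduced value."""
    result = op(contributions)
    return [result for _ in contributions]


size = 5
# Contributions 0..4: the zero from rank 0 wipes out the product.
with_zero = simulate_allreduce(list(range(size)), prod)
# Contributions 1..5: the product is 5!.
shifted = simulate_allreduce([rank + 1 for rank in range(size)], prod)

print(with_zero[0])
print(shifted[0])
```

Every entry of the returned list is identical, which mirrors the fact that all participants receive the result.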
A non blocking version of the collective allreduce() operation. This function will return a handle (request object). The handle gives a number of options for testing / waiting on the operation progress:
from mpi import MPI
from mpi.collective.operations import MPI_prod
mpi = MPI()
# We start n processes, and try to calculate n!
# Each process contributes rank+1, since a product that includes rank 0 is always 0.
rank = mpi.MPI_COMM_WORLD.rank()
handle = mpi.MPI_COMM_WORLD.iallreduce(rank+1, MPI_prod)
# ...
fact = handle.wait()
print "I'm rank %d and I also got the result %d. So cool" % (rank, fact)
mpi.finalize()
Note
See the reference for non blocking collective operations for more information regarding how to use the non blocking operations.
This method extends scatter() to the situation where you need all-to-all instead of one-to-all communication.
The input data should be a list with the same number of elements as the size of the communicator. If you supply something else, an exception is raised.
If, for example, each process wants to send a string naming both the sending and the receiving rank, we could use the following code:
from mpi import MPI
mpi = MPI()
world = mpi.MPI_COMM_WORLD
rank = world.rank()
size = world.size()
send_data = ["%d --> %d" % (rank, x) for x in range(size)]
# For size of 4 and rank 2 this looks like
# ['2 --> 0', '2 --> 1', '2 --> 2', '2 --> 3']
recv_data = world.alltoall(send_data)
# This will then look like the following. We're still rank 2
# ['0 --> 2', '1 --> 2', '2 --> 2', '3 --> 2']
print recv_data
mpi.finalize()
Note
All processes in the communicator must participate in this operation. The operation will block until every process has entered the call.
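The exchange pattern amounts to a transpose of the per-rank send lists. The following sketch models this in plain Python without the mpi package; `simulate_alltoall` is a hypothetical helper for illustration only, and the `ValueError` stands in for whatever exception the library actually raises.

```python
def simulate_alltoall(send_lists):
    """Model alltoall.

    send_lists[r][d] is what rank r sends to rank d. The result is
    recv_lists, where recv_lists[d][r] is what rank d got from rank r.
    """
    size = len(send_lists)
    for data in send_lists:
        if len(data) != size:
            raise ValueError("each rank must supply exactly %d elements" % size)
    return [[send_lists[src][dst] for src in range(size)]
            for dst in range(size)]


size = 4
send_lists = [["%d --> %d" % (rank, x) for x in range(size)]
              for rank in range(size)]
recv_lists = simulate_alltoall(send_lists)

# What rank 2 ends up with.
print(recv_lists[2])
```

Element i of the list a rank receives always comes from rank i, which is why the result is in rank order.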
A non-blocking version of the alltoall() function. This makes it possible to hide communication latency behind computation, as the regular irecv() and isend() operations do.
See the reference for non blocking collective operations for more information regarding how to use the non blocking operations.
Blocks all the processes in the communicator until all have reached this call.
Example usage: The following code will iterate 10 loops in sync by calling the barrier at the end of each loop:
from mpi import MPI
mpi = MPI()
for i in range(10):
    # Do some tedious calculation here
    mpi.MPI_COMM_WORLD.barrier()
mpi.finalize()
The performance of the barrier function is the same as that of a bcast() call if the data in the call is small. Use this fact to piggyback data to other processes about status or whatever you need.
Note
All processes in the communicator must participate in this operation. The operation will block until every process has entered the call.
A non-blocking version of the barrier() function. This makes it possible to hide communication latency behind computation, as the regular irecv() and isend() operations do.
See the reference for non blocking collective operations for more information regarding how to use the non blocking operations.
Broadcast a message (data) from the process with rank <root> to all other participants in the communicator.
This example shows how to broadcast from rank 3 to all other processes, which then print the message:
from mpi import MPI
mpi = MPI()
world = mpi.MPI_COMM_WORLD
if world.rank() == 3:
    world.bcast("Test message", 3)
else:
    message = world.bcast(root=3)
    print message
mpi.finalize()
Note
All processes in the communicator must participate in this operation. The operation will block until every process has entered the call.
Note
An MPINoSuchRankException is raised if the provided root is not a member of this communicator.
A non-blocking broadcast operation, taking the same parameters (with the same meaning) as bcast(). Instead of returning the broadcast data, a handle is returned. It is possible to test() or wait() for the operation to complete.
The non-blocking collective request can be used like any other request object. This means that testsome() and waitall() support the collective requests.
MPI_IDENT results if and only if comm1 and comm2 are exactly the same object (identical groups and same contexts). MPI_CONGRUENT results if the underlying groups are identical in constituents and rank order; these communicators differ only by context. MPI_SIMILAR results if the group members of both communicators are the same but the rank order differs. MPI_UNEQUAL results otherwise.
Original MPI 1.1 specification at http://www.mpi-forum.org/docs/mpi-11-html/node101.html#Node101
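The four outcomes can be illustrated with a small decision function. This is a plain-Python sketch, not the library's implementation: `compare_groups` is a hypothetical helper, groups are represented as lists of process ids in rank order, and the results are returned as strings rather than the library's constants.

```python
def compare_groups(ranks1, ranks2, same_object=False):
    """Classify two communicators by their rank-ordered member lists.

    same_object states whether both handles refer to the very same
    communicator (identical group and identical context).
    """
    if same_object:
        return "MPI_IDENT"
    if list(ranks1) == list(ranks2):
        # Same members in the same order; only the context differs.
        return "MPI_CONGRUENT"
    if sorted(ranks1) == sorted(ranks2):
        # Same members, but ranked differently.
        return "MPI_SIMILAR"
    return "MPI_UNEQUAL"


print(compare_groups([0, 1, 2], [0, 1, 2], same_object=True))
print(compare_groups([0, 1, 2], [0, 1, 2]))
print(compare_groups([0, 1, 2], [2, 0, 1]))
print(compare_groups([0, 1, 2], [0, 1]))
```

The checks run from the strictest relation to the weakest, so the first one that matches is the result.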
This function creates a new communicator with the communication group defined by the group parameter and a new context. No cached information propagates from the existing communicator to the new one. The function returns None to processes that are not in group.
The call is erroneous if not all group arguments have the same value, or if group is not a subset of the group associated with comm. Note that the call is to be executed by all processes in comm, even if they do not belong to the new group.
Note
This call is internally implemented either locally, in which case a maximum of 32 new communicators can be derived from a single communicator, or collectively, with no (realistic) limit on the number of created communicators but significantly slower.
Duplicates the existing communicator comm with associated key values. For each key value, the respective copy callback function determines the attribute value associated with this key in the new communicator; one particular action that a copy callback may take is to delete the attribute from the new communicator. Returns in newcomm a new communicator with the same group, any copied cached information, but a new context (see http://www.mpi-forum.org/docs/mpi-11-html/node119.html#Node119).
Original MPI 1.1 specification at http://www.mpi-forum.org/docs/mpi-11-html/node102.html
This operation marks the communicator object as closed.
Note
Deviation: This method deviates from the MPI standard by not being collective, and by not actually deallocating the object itself.
This function partitions the group associated with comm into disjoint subgroups, one for each value of color. Each subgroup contains all processes of the same color. Within each subgroup, the processes are ranked in the order defined by the value of the argument key, with ties broken according to their rank in the old group. A new communicator is created for each subgroup and returned in newcomm. A process may supply the color value MPI_UNDEFINED, in which case newcomm returns MPI_COMM_NULL. This is a collective call, but each process is permitted to provide different values for color and key.
A call to MPI_COMM_CREATE(comm, group, newcomm) is equivalent to a call to MPI_COMM_SPLIT(comm, color, key, newcomm), where all members of group provide color = 0 and key = rank in group, and all processes that are not members of group provide color = MPI_UNDEFINED. The function MPI_COMM_SPLIT allows more general partitioning of a group into one or more subgroups with optional reordering.
This call applies only to intra-communicators.
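The partitioning rule described above can be sketched in plain Python. Since the function itself is not implemented in the library, this is only a model of the specified behaviour: `simulate_split` is a hypothetical helper, and `None` is used here as a stand-in for MPI_UNDEFINED.

```python
def simulate_split(colors, keys, undefined=None):
    """Return {color: [old ranks, listed in new rank order]}.

    colors[r] and keys[r] are the values supplied by old rank r. Ranks
    whose color equals `undefined` join no subgroup.
    """
    groups = {}
    for old_rank, color in enumerate(colors):
        if color == undefined:
            continue
        groups.setdefault(color, []).append(old_rank)
    for members in groups.values():
        # Order by key; ties are broken by rank in the old group.
        members.sort(key=lambda r: (keys[r], r))
    return groups


colors = [0, 1, 0, None, 1, 0]
keys = [5, 0, 5, 0, 0, 1]
groups = simulate_split(colors, keys)
print(groups[0])
print(groups[1])
```

In this run old rank 5 becomes rank 0 of the color 0 subgroup because it supplied the smallest key, old ranks 0 and 2 tie on key and keep their relative order, and old rank 3 joins no subgroup.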
Warning
This function is presently NOT IMPLEMENTED because it does not do anything that cannot otherwise be done with groups (albeit this is simpler), and it requires special handling. Target implementation version: 1.1
Each process (root process included) sends the contents of its send buffer to the root process. The root process receives the messages and stores them in rank order.
As an example each of the processes could send their rank to the root:
from mpi import MPI
mpi = MPI()
world = mpi.MPI_COMM_WORLD
rank = world.rank()
size = world.size()
# Make every process send its rank + 1 to the root.
ROOT = 3
received = world.gather(rank+1, root=ROOT)
if ROOT == rank:
    assert received == range(1, size+1)
else:
    assert received is None
mpi.finalize()
Note
All processes in the communicator must participate in this operation. The operation will block until every process has entered the call.
Note
An MPINoSuchRankException is raised if the provided root is not a member of this communicator.
Note
See also the allgather() and alltoall() functions.
A non-blocking version of the gather() function. This makes it possible to hide communication latency behind computation, as the regular irecv() and isend() operations do.
See the reference for non blocking collective operations for more information regarding how to use the non blocking operations.
Returns the group associated with the communicator.
Basic receive function. Receives a message with the specified tag from the source rank.
This is a blocking operation. Look into irecv if you can start your receive early and do some computing while you wait for the receive result.
This method will not return if the source process never sends data to this process with the specified tag. See the send() documentation for a full working example.
POSSIBLE ERRORS: If you specify a source rank that is out of scope for this communicator.
Note
It’s possible for rank N to receive data from N.
Note
See the Rules for tags page for rules about your custom tags
Starts a non-blocking receive, returning a handle like isend(). The following example shows how to prepare a receive request but perform some larger calculation while the MPI environment completes the receive:
from mpi import MPI
mpi = MPI()
world = mpi.MPI_COMM_WORLD
if world.rank() == 0:
    world.send("My message", 1)
else:
    handle = world.irecv(0)
    # Do some large calculation here
    pass
    # Receive the data from process 0
    data = handle.wait()
mpi.finalize()
Note
It’s possible for rank N to receive data from N.
Basic send function. Send to the destination rank a message with the specified tag.
This is a blocking operation. Look into isend if you can start your send early and do some computing while you wait for the send to finish.
Example: Rank 0 sends “Hello World!” to rank 1. Rank 1 receives the message and prints it:
from mpi import MPI
mpi = MPI()
TAG = 1 # optional. If omitted, MPI_TAG_ANY is assumed.
if mpi.MPI_COMM_WORLD.rank() == 0:
    mpi.MPI_COMM_WORLD.send("Hello World!", 1, TAG)
else:
    message = mpi.MPI_COMM_WORLD.recv(0, TAG)
    print message
mpi.finalize()
Note
The above program will only work if run with -c 2 parameter (see Starting your processes with mpirun.py). If invoked with more processes there will be size-2 processes waiting for a message that will never come.
Note
It’s possible for rank N to send data to N. The data will not be transferred on the network but take a faster path.
POSSIBLE ERRORS: If you specify a destination rank out of scope for this communicator.
Note
See the Rules for tags page for rules about your custom tags
Starts a non-blocking send. The function will return as soon as the data has been copied into an internal buffer, making it safe for the user to alter the data.
The function will return a handle that makes it possible to cancel the request, wait until the sending has completed, or simply test whether the request is complete, as the following example shows:
from mpi import MPI
mpi = MPI()
world = mpi.MPI_COMM_WORLD
if world.rank() == 0:
    handle1 = world.isend("My message to 1", 1)
    handle2 = world.isend("My message to 2", 2)
    # Wait until the message to 1 is sent
    handle1.wait()
    # Check if the second message has completed. Cancel the
    # request otherwise
    if handle2.test():
        handle2.wait() # This will complete right away
                       # due to the test.
    else:
        handle2.cancel()
else:
    # This might not complete if the request gets
    # cancelled on the other end
    message = world.recv(0)
mpi.finalize()
Note
It’s possible for rank N to send data to N (itself). The data will not be transferred on the network but take a faster path.
Synchronized send function. Send to the destination rank a message with the specified tag.
Ssend blocks until a matching receive is posted. That is, when ssend returns you know the receiver has asked for something matching your message and has most likely also gotten the message.
Example: Rank 0 sends “Hello World!” to rank 1. Rank 1 posts a matching receive, and rank 0 can be sure the message has gotten through:
from mpi import MPI
mpi = MPI()
TAG = 1 # optional. If omitted, MPI_TAG_ANY is assumed.
if mpi.MPI_COMM_WORLD.rank() == 0:
    mpi.MPI_COMM_WORLD.ssend("Hello World!", 1, TAG)
    print "Now rank 1 must have asked for the message"
elif mpi.MPI_COMM_WORLD.rank() == 1:
    message = mpi.MPI_COMM_WORLD.recv(0, TAG)
else:
    pass
mpi.finalize()
POSSIBLE ERRORS: If you specify a destination rank out of scope for this communicator.
See also: issend()
Note
See the Rules for tags page for rules about your custom tags
Synchronized non-blocking send function. The function will return as soon as the data has been copied into an internal buffer for subsequent sending.
The function will return a handle to the request, which makes it possible to cancel the request, wait until the sending has completed, or simply test whether the request is complete.
The request is not complete until the receiving party has posted a receive matching the issend. This means that when a wait or a test on the request handle is successful, a matching receive is guaranteed to have been posted on the other side.
Example: Rank 0 posts an issend, but rank 1 delays before posting a matching receive. Meanwhile rank 0 can test to see if the message got through:
import time
from mpi import MPI
mpi = MPI()
rank = mpi.MPI_COMM_WORLD.rank()
size = mpi.MPI_COMM_WORLD.size()
message = "Just a basic message from %d" % (rank)
DUMMY_TAG = 1
if rank == 0: # Send
    neighbour = 1
    handle = mpi.MPI_COMM_WORLD.issend(message, neighbour, DUMMY_TAG)
    # Since the receiver waits 4 seconds before posting a matching receive,
    # the first test should fail
    if not handle.test():
        print "Yawn, still not getting through..."
    # By the time we wake up the receiver should have posted a
    # matching receive
    time.sleep(5)
    if handle.test():
        print "Finally got through."
    handle.wait() # This is not strictly needed but good form
elif rank == 1: # Receive
    neighbour = 0
    time.sleep(4) # Wait a while to build tension
    received = mpi.MPI_COMM_WORLD.recv(neighbour, DUMMY_TAG)
mpi.finalize()
See also: ssend() and test()
Reduces the data given by each process with the “op” operator. As with the allreduce method you can, for example, use this to calculate the factorial of size:
from mpi import MPI
from mpi.collective.operations import MPI_prod
mpi = MPI()
root = 4
# We start n processes, and try to calculate n!
rank = mpi.MPI_COMM_WORLD.rank()
size = mpi.MPI_COMM_WORLD.size()
dist_fact = mpi.MPI_COMM_WORLD.reduce(rank+1, MPI_prod, root=root)
mpi.finalize()
Note
All processes in the communicator must participate in this operation. The operation will block until every process has entered the call.
Note
An MPINoSuchRankException is raised if the provided root is not a member of this communicator.
A non-blocking version of the reduce() function. This makes it possible to hide communication latency behind computation, as the regular irecv() and isend() operations do.
See the reference for non blocking collective operations for more information regarding how to use the non blocking operations.
The scan function can be thought of as a partial reduction involving processes 0 to i, where i is the rank of any given process in this communicator.
To calculate the partial sum of the ranks up to and including the calling process, the following code could be used:
from mpi import MPI
mpi = MPI()
world = mpi.MPI_COMM_WORLD
rank = world.rank()
size = world.size()
partial_sum = world.scan(rank, sum)
print "%d: Got partial sum of %d" % (rank, partial_sum)
assert partial_sum == sum(range(rank+1))
mpi.finalize()
Note
All processes in the communicator must participate in this operation. The operation will block until every process has entered the call.
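The partial reduction can be modelled in plain Python without the mpi package. The sketch assumes, as the use of the built-in sum in the example suggests, that an operation is a function called with a list of values; `simulate_scan` is a hypothetical helper for illustration.

```python
def simulate_scan(values, op):
    """Inclusive scan: entry i is op applied to values[0] .. values[i]."""
    return [op(values[:i + 1]) for i in range(len(values))]


ranks = list(range(5))
partial_sums = simulate_scan(ranks, sum)
print(partial_sums)

# Any operation that accepts a list works the same way, e.g. a running maximum.
running_max = simulate_scan([3, 1, 4, 1, 5], max)
print(running_max)
```

Entry i of the result is what the process with rank i would receive, so the last rank ends up with the same value a full reduction would produce.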
A non-blocking version of the scan() function. This makes it possible to hide communication latency behind computation, as the regular irecv() and isend() operations do.
See the reference for non blocking collective operations for more information regarding how to use the non blocking operations.
Takes a list of size M*N, where N is the number of participants in this communicator, and distributes the elements so that each participant gets M elements. Notice the difference between scattering with M equal to 1 and with M greater than 1 in the following example:
from mpi import MPI
mpi = MPI()
world = mpi.MPI_COMM_WORLD
rank = world.rank()
size = world.size()
# Scatter a list with the same number of elements in the
# list as there are processes in the world communicator
SCATTER_ROOT = 3
if rank == SCATTER_ROOT:
    scatter_data = range(size)
else:
    scatter_data = None
my_data = world.scatter(scatter_data, root=SCATTER_ROOT)
print "Rank %d:" % rank, my_data
# Scatter a list with 10 times the number of elements
# in the list as there are processes in the world
# communicator. This will give each process a list
# with 10 items in it.
if rank == SCATTER_ROOT:
    scatter_data = range(size*10)
else:
    scatter_data = None
my_data = world.scatter(scatter_data, root=SCATTER_ROOT)
print "Rank %d:" % rank, my_data
mpi.finalize()
Running the above example with 5 processes will yield something like:
Rank 4: 4
Rank 2: 2
Rank 1: 1
Rank 3: 3
Rank 0: 0
Rank 2: [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
Rank 0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Rank 1: [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
Rank 4: [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
Rank 3: [30, 31, 32, 33, 34, 35, 36, 37, 38, 39]
Note
All processes in the communicator must participate in this operation. The operation will block until every process has entered the call.
Note
An MPINoSuchRankException is raised if the provided root is not a member of this communicator.
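How the list is divided among the ranks can be modelled in plain Python. This sketch does not use the mpi package; `simulate_scatter` is a hypothetical helper, and raising `ValueError` for a list whose length is not a multiple of the communicator size is an assumption of the sketch, not documented library behaviour.

```python
def simulate_scatter(data, size):
    """Split data (length M*size) into one piece per rank.

    With M == 1 each rank gets the bare element, otherwise a list of M
    elements, mirroring the sample output above.
    """
    if len(data) % size != 0:
        raise ValueError("data length must be a multiple of the communicator size")
    m = len(data) // size
    chunks = [data[rank * m:(rank + 1) * m] for rank in range(size)]
    if m == 1:
        return [chunk[0] for chunk in chunks]
    return chunks


size = 5
single = simulate_scatter(list(range(size)), size)
tenfold = simulate_scatter(list(range(size * 10)), size)
print(single[4])
print(tenfold[2])
```

Rank r receives the slice that starts at index r*M, which is why rank 2 gets the elements 20 to 29 when M is 10.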
A non-blocking version of the scatter() function. This makes it possible to hide communication latency behind computation, as the regular irecv() and isend() operations do.
See the reference for non blocking collective operations for more information regarding how to use the non blocking operations.
The send-receive operation combines in one call the sending of a message to one destination and the receiving of another message from a source. The source and the destination can be the same process.
A send-receive operation is very useful for executing a shift operation across a chain of processes. A message sent by a send-receive operation can be received by a regular receive operation or probed by a probe operation; a send-receive operation can receive a message sent by a regular send operation.
Default values: If no source is given, it defaults to the same rank as dest.
Example usage: The following code will pass a token string between all processes. All ranks receive the token from their lower neighbour and pass it to the upper neighbour:
from mpi import MPI
mpi = MPI()
rank = mpi.MPI_COMM_WORLD.rank()
size = mpi.MPI_COMM_WORLD.size()
content = "conch"
DUMMY_TAG = 1
# Send up in chain, recv from lower
# modulo with size is to wrap around for lowest and highest rank
dest = (rank + 1) % size
source = (rank - 1) % size
recvdata = mpi.MPI_COMM_WORLD.sendrecv(content+" from "+str(rank),
                                       dest,
                                       source,
                                       DUMMY_TAG,
                                       DUMMY_TAG)
print "Rank %i got %s" % (rank, recvdata)
mpi.finalize()
Note
There is no sequential ordering here, as the print output will show. All that is guaranteed is that every process has sent and received, not in any particular order.
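The data movement of such a ring shift can be traced in plain Python without an MPI runtime. `simulate_ring_shift` is a hypothetical helper for illustration; it computes what each rank would receive, not the order in which the processes would print.

```python
def simulate_ring_shift(size, content="conch"):
    """Return what each rank receives after one shift up the ring."""
    # The message each rank hands to its upper neighbour.
    outgoing = ["%s from %d" % (content, rank) for rank in range(size)]
    received = [None] * size
    for rank in range(size):
        dest = (rank + 1) % size  # wrap around from the highest rank to 0
        received[dest] = outgoing[rank]
    return received


shifted = simulate_ring_shift(4)
for rank, data in enumerate(shifted):
    print("Rank %i got %s" % (rank, data))
```

Rank 0 receives from the highest rank because of the modulo wrap-around, and every other rank receives from the rank just below it.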
Test if all the requests in the request list are finished. Returns a boolean indicating this. The following test shows the expected behaviour:
from mpi import MPI
import time
mpi = MPI()
world = mpi.MPI_COMM_WORLD
rank = world.rank()
size = world.size()
handles = []
if rank == 0:
    # Sleep so the sending will be delayed
    time.sleep(3)
for i in range(10):
    if rank == 0:
        world.send(i, 1)
    else:
        handle = world.irecv(0)
        handles.append(handle)
if rank == 1:
    # It will probably not be ready the first time
    ready = world.testall(handles)
    assert not ready
    # Give time for the sending to complete
    time.sleep(4)
    # It should be ready now
    ready = world.testall(handles)
    assert ready
mpi.finalize()
Test if any of the requests in the request list has completed and return the first one it encounters. If none of the requests has completed, the returned request will be None.
This function returns a tuple with a boolean flag to indicate if any of the requests is completed and the request object:
from mpi import MPI
from mpi import constants
mpi = MPI()
world = mpi.MPI_COMM_WORLD
rank = world.rank()
size = world.size()
handles = []
for i in range(100):
    if rank == 0:
        # This rank receives every message sent by the other
        # processes.
        for j in range(size-1):
            handle = world.irecv(constants.MPI_SOURCE_ANY)
            handles.append(handle)
        while handles:
            (found, request) = world.testany(handles)
            if found:
                # Finish the request
                request.wait()
                handles.remove(request)
    else:
        world.send("My data", 0, constants.MPI_TAG_ANY)
mpi.finalize()
Tests if some of the operations have completed. Returns a list of the request objects from the given list that have completed. If none of the operations has completed, the empty list is returned.
To receive a number of messages and print them you would do something like this:
from mpi import MPI
from mpi import constants
mpi = MPI()
world = mpi.MPI_COMM_WORLD
rank = world.rank()
size = world.size()
handles = []
for i in range(100):
    if rank == 0:
        # This rank receives every message sent by the other
        # processes.
        for j in range(size-1):
            handle = world.irecv(constants.MPI_SOURCE_ANY)
            handles.append(handle)
        while handles:
            request_list = world.testsome(handles)
            if request_list:
                # Finish the requests
                data_list = world.waitall(request_list)
                print "\n".join(data_list)
                handles = [r for r in handles if r not in request_list]
    else:
        world.send("My data", 0, constants.MPI_TAG_ANY)
mpi.finalize()
Waits for all the requests in the given list and returns a list with the returned data.
Example: Rank 0 sends 10 messages to rank 1 and then receives 10. We wait for the receive completions with a waitall call:
from mpi import MPI
mpi = MPI()
rank = mpi.MPI_COMM_WORLD.rank()
size = mpi.MPI_COMM_WORLD.size()
request_list = []
if mpi.MPI_COMM_WORLD.rank() == 0:
    for i in range(10):
        mpi.MPI_COMM_WORLD.send("Hello World!", 1)
    for i in range(10):
        handle = mpi.MPI_COMM_WORLD.irecv(1)
        request_list.append(handle)
    messages = mpi.MPI_COMM_WORLD.waitall(request_list)
elif mpi.MPI_COMM_WORLD.rank() == 1:
    for i in range(10):
        handle = mpi.MPI_COMM_WORLD.irecv(0)
        request_list.append(handle)
    for i in range(10):
        mpi.MPI_COMM_WORLD.send("Hello World!", 0)
    messages = mpi.MPI_COMM_WORLD.waitall(request_list)
else:
    pass
mpi.finalize()
Note
See also the waitany() and waitsome() functions.
Wait for one request in the request list and return a tuple with the request and the data from the wait().
This method will raise an MPIException if the supplied request list is empty.
The following example shows rank 0 receiving 10 messages from every other process. Rank 0 waits for one request at a time, but does not specify which one. This allows for smoother progress:
from mpi import MPI
mpi = MPI()
world = mpi.MPI_COMM_WORLD
request_list = []
if world.rank() == 0:
    for i in range(10):
        for rank in range(0, world.size()):
            if rank != 0:
                request = world.irecv(rank)
                request_list.append(request)
    while request_list:
        (request, data) = world.waitany(request_list)
        request_list.remove(request)
else:
    for i in range(10):
        world.send("Message", 0)
mpi.finalize()
Note
See also the waitall() and waitsome() functions.
Waits for some requests in the given request list to complete. Returns a list with (request, data) pairs for the completed requests.
The following example shows rank 0 receiving 10 messages from every other process. Instead of waiting for each one sequentially or waiting for them all at the same time it’s possible to fetch the already completed handles:
from mpi import MPI
mpi = MPI()
world = mpi.MPI_COMM_WORLD
request_list = []
if world.rank() == 0:
    for i in range(10):
        for rank in range(0, world.size()):
            if rank != 0:
                request = world.irecv(rank)
                request_list.append(request)
    while request_list:
        items = world.waitsome(request_list)
        print "Waited for %d handles" % len(items)
        for item in items:
            (request, data) = item
            request_list.remove(request)
else:
    for i in range(10):
        world.send("Message", 0)
mpi.finalize()
One possible output from the above code (with 10 processes) is:
Waited for 1 handles
Waited for 45 handles
Waited for 31 handles
Waited for 1 handles
Waited for 12 handles
Note
This function resembles the Unix select functionality in many respects. You can use it as a simple way to work only on the messages that are actually ready, without coding all the boilerplate yourself.
Note, however, that this function is not guaranteed to include all the requests that are ready. It will, however, include some.
Returns the measured resolution of the Wtime() call. This should not be taken too strictly: the Wtime() result is a Python floating point number, but not all systems actually provide that resolution.
Returns the wall clock time this rank has been active. This time is not synchronized between the ranks.