Module for implementing parallel processing with AWS Batch.
See the class AWSBatchJobManager for the implementation.
Using AWS Services for parallel processing in RIOS¶
This directory holds implementations of per-tile parallel processing using AWS services. Currently only AWS Batch is supported, but it is intended that other services will be added in future.
Refer to jobmanager.py for an overview of how RIOS handles parallel processing.
Creating the infrastructure¶
This implementation comes with a CloudFormation script
to create a separate VPC with all the infrastructure required. It is recommended
to use the script templates/createbatch.py for the creation or modification (via a
command line option) of this CloudFormation stack. There are also options for
overriding some of the input parameters; see the output of createbatch.py --help
for more information. NB: when running in a region that is NOT ap-southeast-2 you will
need to update the availability zones (--az command line option).
When you have completed processing you can run
templates/deletebatch.py to delete
all resources so that you are no longer paying for them. Note that you specify the region and stack
name for this script via the RIOS_AWSBATCH_REGION and RIOS_AWSBATCH_STACK environment variables.
Note that the creation and deletion scripts both have a
--wait option that causes the
script to keep running until creation/deletion is complete.
Creating the Docker image¶
AWS Batch requires you to provide a Docker image with the required software installed. A Dockerfile is provided for this, but it is recommended that you use the Makefile to build the image, as this handles the details of pulling the names out of the CloudFormation stack and creating a tar file of RIOS for copying into the Docker image. To build and push to ECR, run make, being careful to set RIOS_AWSBATCH_REGION to the correct AWS region.
By default this image includes GDAL, boto3 and RIOS.
Normally your script will need extra packages to run. You can specify the names of Ubuntu packages to also install with the environment variable EXTRA_PACKAGES like this:
EXTRA_PACKAGES="python3-sklearn python3-skimage" RIOS_AWSBATCH_REGION=ap-southeast-2 make
You can also use the PIP_PACKAGES environment variable to name any pip packages to install, like this:
PIP_PACKAGES="pydantic python-dateutil" RIOS_AWSBATCH_REGION=ap-southeast-2 make
You can also specify both if needed:
EXTRA_PACKAGES="python3-sklearn python3-skimage" PIP_PACKAGES="pydantic python-dateutil" RIOS_AWSBATCH_REGION=ap-southeast-2 make
Setting up your main script¶
To enable parallel processing using AWS Batch in your RIOS script you must import the batch module:
from rios.parallel.aws import batch
controls.setNumThreads(4)  # or whatever number you want
controls.setJobManagerType('AWSBatch')
Note that the number of AWS Batch jobs started will be (numThreads - 1) as one job is done by the main RIOS script.
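The controls setup and the job count described above can be sketched as a small helper. This is only an illustration, not part of RIOS: the name configure_for_batch is hypothetical, and it assumes that controls is any object exposing the setNumThreads and setJobManagerType methods shown above.

```python
def configure_for_batch(controls, num_threads):
    """Point a RIOS controls object at AWS Batch (hypothetical helper).

    Returns the number of AWS Batch jobs that will be started, which is
    num_threads - 1 because one job is done by the main RIOS script.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    controls.setNumThreads(num_threads)
    controls.setJobManagerType('AWSBatch')
    return num_threads - 1
```

With num_threads set to 4, for example, three AWS Batch jobs would be started and the main script would handle the remaining share of the work.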
It is recommended that you run this main script within a container based on the one above. This reduces the likelihood of problems introduced by different versions of Python or other packages your script needs between the main RIOS script and the AWS Batch workers.
To do this, create a Dockerfile like the one below (replacing myscript.py with the name of your script):
# Created by make command above
FROM rios:latest
COPY myscript.py /usr/local/bin
RUN chmod +x /usr/local/bin/myscript.py
ENTRYPOINT ["/usr/bin/python3", "/usr/local/bin/myscript.py"]
Don’t forget to pass in your AWS credential environment variables (including
AWS_SECRET_ACCESS_KEY) to this
container when it runs (these variables are set automatically if running as an AWS Batch job, but you’ll
need to set them otherwise).
It is also a good idea to pass in your RIOS_AWSBATCH_REGION and RIOS_AWSBATCH_STACK environment variables if the defaults have been overridden, so that RIOS can find the CloudFormation stack.
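A main script could check that the credentials are present before it starts any work. The following is a minimal sketch under stated assumptions: the helper name missing_env_vars is hypothetical, and AWS_ACCESS_KEY_ID is assumed to be the companion variable to AWS_SECRET_ACCESS_KEY, following the usual AWS SDK convention.

```python
import os

# AWS_SECRET_ACCESS_KEY is named in the text above; AWS_ACCESS_KEY_ID is the
# companion variable conventionally read by AWS SDKs (an assumption here)
REQUIRED_VARS = ('AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY')


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names in required that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]
```

Calling missing_env_vars() at the top of the script and exiting with a clear message when the list is non-empty gives an earlier and more readable failure than a credentials error raised part-way through a run.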
To also run your “main” image as a batch job, push it to the “RIOSecrMain” repository created by
templates/batch.yaml. You can then submit jobs to the RIOSJobQueue using the RIOSJobDefinitionMain.
- exception rios.parallel.aws.batch.AWSBatchException¶
- class rios.parallel.aws.batch.AWSBatchJobManager(numSubJobs)¶
Implementation of parallelisation via AWS Batch. This uses 2 SQS queues for communication between the ‘main’ RIOS script and the subprocesses (which run on Batch) and an S3 bucket to hold the pickled data (which the SQS messages refer to).
Stop our AWS Batch jobs by sending a special message to the queue.
Gather all the results. Checks the output SQS queue.
- startOneJob(userFunc, jobInfo)¶
Start one sub job
Wait on all the jobs. Do nothing.
- jobMgrType = 'AWSBatch'¶
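The communication pattern described for AWSBatchJobManager (pickled data held in S3, with SQS messages that refer to it) can be illustrated as follows. This is a sketch of the general technique rather than the code RIOS uses: the function names, the use of the S3 key as the message body, and the bucket and queue arguments are all assumptions. The client arguments are expected to behave like boto3 S3 and SQS clients.

```python
import pickle


def send_pickled(s3_client, sqs_client, bucket, queue_url, key, obj):
    """Store obj in S3 as a pickle and announce its key on an SQS queue."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=pickle.dumps(obj))
    # The message carries only the S3 key, not the pickled data itself
    sqs_client.send_message(QueueUrl=queue_url, MessageBody=key)


def receive_pickled(s3_client, sqs_client, bucket, queue_url):
    """Return the next unpickled object, or None if no message arrived."""
    resp = sqs_client.receive_message(QueueUrl=queue_url,
                                      MaxNumberOfMessages=1,
                                      WaitTimeSeconds=20)
    messages = resp.get('Messages', [])
    if not messages:
        return None
    msg = messages[0]
    key = msg['Body']
    body = s3_client.get_object(Bucket=bucket, Key=key)['Body'].read()
    # Delete the message so that it is not delivered again
    sqs_client.delete_message(QueueUrl=queue_url,
                              ReceiptHandle=msg['ReceiptHandle'])
    return pickle.loads(body)
```

Keeping the payload in S3 and sending only a reference through the queue avoids the SQS message size limit, which pickled blocks of raster data could easily exceed.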
Helper function to query the CloudFormation stack for outputs.
Uses the RIOS_AWSBATCH_STACK and RIOS_AWSBATCH_REGION env vars to determine which stack and region to query.
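A helper of this kind might look like the sketch below. It is an illustration only, not the RIOS implementation: the function names are hypothetical, and both default values (the stack name 'RIOS' and the region 'ap-southeast-2') are assumptions. The boto3 call used is the CloudFormation client's describe_stacks.

```python
import os


def stack_settings(environ=None):
    """Read the stack name and region from the environment."""
    environ = os.environ if environ is None else environ
    # Both default values here are assumptions made for illustration
    stack = environ.get('RIOS_AWSBATCH_STACK', 'RIOS')
    region = environ.get('RIOS_AWSBATCH_REGION', 'ap-southeast-2')
    return stack, region


def outputs_to_dict(response):
    """Flatten a describe_stacks response into {OutputKey: OutputValue}."""
    outputs = response['Stacks'][0].get('Outputs', [])
    return {out['OutputKey']: out['OutputValue'] for out in outputs}


def query_stack_outputs():
    """Query CloudFormation for the outputs of the configured stack."""
    import boto3  # imported here so the helpers above work without boto3
    stack, region = stack_settings()
    client = boto3.client('cloudformation', region_name=region)
    return outputs_to_dict(client.describe_stacks(StackName=stack))
```

Separating the response parsing from the network call means that outputs_to_dict can be exercised with a hand-built dictionary, without AWS credentials.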