Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
222 views
in Technique[技术] by (71.8m points)

How to bootstrap installation of Python modules on Amazon EMR?

I want to do something really basic, simply fire up a Spark cluster through the EMR console and run a Spark script that depends on a Python package (for example, Arrow). What is the most straightforward way of doing this?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

The most straightforward way would be to create a bash script containing your installation commands, copy it to S3, and set a bootstrap action from the console to point to your script.

Here's an example I'm using in production:

s3://mybucket/bootstrap/install_python_modules.sh

#!/bin/bash -xe

# Non-standard and non-Amazon Machine Image Python modules:
sudo pip install -U 
  awscli            
  boto              
  ciso8601          
  ujson             
  workalendar

sudo yum install -y python-psycopg2

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...