Previously, we created an AWS S3 bucket, uploaded files, and then launched an AWS EMR cluster via the AWS CLI. Now we are going to open several ports so that we can view the cluster's web interfaces in a web browser, and then SSH into the cluster.
Open IP Ports
To use RStudio and Hue, you will need to open ports.
To open ports, go to the EC2 console Security Groups page, select the security group used by the cluster's master node, open the Actions dropdown menu and choose Edit inbound rules.
Click Add Rule on the bottom left. Enter 8787 for the Port Range and, underneath Source, set the dropdown menu to Anywhere.
WARNING: SELECTING ANYWHERE HAS SECURITY IMPLICATIONS SINCE ANYONE CAN THEN ACCESS THE CLUSTER!
Repeat this process to open Hue to the outside world, with 8888 for the Port Range.
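The same two rules can also be added from the command line. The following is a minimal sketch rather than the exact procedure used above: the security group ID and the CIDR block are hypothetical placeholders, and the rule is limited to a single address instead of Anywhere to reduce the exposure described in the warning. By default the script only prints the commands; set RUN to an empty value to actually execute them.

```shell
# Sketch only: GROUP_ID and MY_CIDR are hypothetical placeholders.
GROUP_ID="sg-0123456789abcdef0"   # security group used by the master node (assumed)
MY_CIDR="203.0.113.25/32"         # your own IP address as a /32 (assumed)

# Dry run by default: commands are echoed, not executed.
# Run with RUN= (empty) to call the AWS CLI for real.
RUN="${RUN-echo}"

open_port() {
  # $1 is the TCP port to open for MY_CIDR
  $RUN aws ec2 authorize-security-group-ingress \
    --group-id "$GROUP_ID" \
    --protocol tcp \
    --port "$1" \
    --cidr "$MY_CIDR"
}

# 8787 is RStudio, 8888 is Hue
for port in 8787 8888; do
  open_port "$port"
done
```

Using 0.0.0.0/0 as the CIDR would correspond to choosing Anywhere in the console, with the same security implications.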
SSH into the cluster
To SSH into the cluster you need to know its Public DNS. This is available on the EC2 console, on the page for the running instance.
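If you prefer the command line, the master node's Public DNS can also be read from the cluster description. This is a small sketch under assumptions: the cluster ID is a hypothetical placeholder, and the script only prints the command unless RUN is set to an empty value.

```shell
# Sketch only: CLUSTER_ID is a hypothetical placeholder.
CLUSTER_ID="j-XXXXXXXXXXXXX"

# Dry run by default; run with RUN= (empty) to query AWS for real.
RUN="${RUN-echo}"

master_dns() {
  # Ask EMR for the cluster description and keep only the master's public DNS name
  $RUN aws emr describe-cluster \
    --cluster-id "$CLUSTER_ID" \
    --query Cluster.MasterPublicDnsName \
    --output text
}

master_dns
```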
The ssh command is then:
# Use ec2-user for admin rights
ssh -i "<YOUR_KEYPAIR>".pem ec2-user@"<PUBLIC DNS>"
# Use hadoop in order to run hadoop jobs
ssh -i "<YOUR_KEYPAIR>".pem hadoop@"<PUBLIC DNS>"
So, in my example it would be:
# Use hadoop in order to run hadoop jobs
ssh -i jjb_keypair.pem firstname.lastname@example.org
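One common stumbling block is that ssh refuses a private key file that other users can read, so it usually helps to restrict the key's permissions before connecting. The sketch below wraps both steps in a small helper. The key file name and the Public DNS are hypothetical placeholders, and the commands are only printed unless RUN is set to an empty value.

```shell
# Sketch only: KEYPAIR and PUBLIC_DNS are hypothetical placeholders.
KEYPAIR="my_keypair.pem"
PUBLIC_DNS="ec2-203-0-113-25.compute-1.amazonaws.com"

# Dry run by default; run with RUN= (empty) to really connect.
RUN="${RUN-echo}"

connect() {
  # $1 is the login user: ec2-user for admin rights, hadoop to run hadoop jobs
  # Make the key readable by its owner only, as ssh requires
  $RUN chmod 400 "$KEYPAIR"
  $RUN ssh -i "$KEYPAIR" "$1@$PUBLIC_DNS"
}

connect hadoop
```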