Intro

This section of the <STATS@UIUC> Big Data Image help documentation covers the software you need, how to install it, and the basic startup and shutdown procedures for the image.

Getting the Environment Setup

Acquire VirtualBox

The image we are using is built on top of VirtualBox. At the time of this writing, the latest version is 4.3.20.

Please download VirtualBox 4.3.20 and then install it.
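Once VirtualBox is installed, you may want to confirm which version you have. The following is a minimal sketch, assuming a bash-like shell and that VirtualBox's command-line tool, VBoxManage, is on your PATH (on Windows it may not be by default). The function name virtualbox_version is a hypothetical helper, not part of VirtualBox.

```shell
#!/usr/bin/env bash
# Print the installed VirtualBox version, or explain why it cannot be found.
# Assumption: VBoxManage (shipped with VirtualBox) is reachable on the PATH.
virtualbox_version() {
  if ! command -v VBoxManage >/dev/null 2>&1; then
    echo "VBoxManage not found on PATH" >&2
    return 1
  fi
  # Prints the version string of the installed VirtualBox.
  VBoxManage --version
}

virtualbox_version || echo "Install VirtualBox (or add it to your PATH), then re-run this check."
```

If the reported version does not begin with 4.3.20, the screens shown in this guide may look slightly different.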

Loading the image

To load the image, follow these steps:

File > Import Appliance

Select File and then Import Appliance.

Click the folder icon with a green arrow, navigate to where UIUC_STAT490_HDP_image.ova is stored, press Open, and then press Next.

Find hdp image

Next, you can specify settings for the image, such as the location where it is stored, the amount of RAM it can use, and the number of CPU cores available to it.

Import options

Note: If you are planning on using an external drive, here is how you might modify the storage location.

external drive options

See additional information for further ways to configure the image.
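The same import can also be done from the command line. The sketch below wraps VBoxManage import and sets the RAM and CPU count at import time. The helper name import_image and the 4096 MB and 2-core defaults are illustrative assumptions, not recommendations from this guide. Pick values that suit your machine.

```shell
#!/usr/bin/env bash
# Import an appliance (.ova) with explicit memory and CPU settings.
# Assumptions: VBoxManage is on the PATH; the defaults below are
# placeholders chosen for illustration only.
import_image() {
  local ova_path="$1"
  local memory_mb="${2:-4096}"   # RAM for the image, in megabytes
  local cpus="${3:-2}"           # number of CPU cores for the image

  if [ ! -f "$ova_path" ]; then
    echo "Appliance file not found: $ova_path" >&2
    return 1
  fi

  # --vsys 0 selects the first (and usually only) virtual system in the file.
  VBoxManage import "$ova_path" --vsys 0 --memory "$memory_mb" --cpus "$cpus"
}

# Example (run from the folder that holds the .ova file):
# import_image UIUC_STAT490_HDP_image.ova 4096 2
```

Running VBoxManage import with the --dry-run option first should list the settings that can be changed, including the disk image units, without importing anything.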

Launching the Image

To start the image, select it in the left-hand menu and press the Start button.

select image in pull out menu and press start

This will lead to a new window opening.

new open window
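As an alternative to the Start button, the image can be launched from a terminal. This is a sketch under the assumption that VBoxManage is on your PATH. The helper start_image and the VM name shown in the example are hypothetical; use whatever name VBoxManage list vms reports for the imported image.

```shell
#!/usr/bin/env bash
# Start a virtual machine by name in a normal GUI window.
# Assumption: the name passed in matches an entry from `VBoxManage list vms`.
start_image() {
  local vm_name="$1"
  if [ -z "$vm_name" ]; then
    echo "usage: start_image <vm-name>" >&2
    return 2
  fi
  VBoxManage startvm "$vm_name" --type gui
}

# Example:
# VBoxManage list vms                  # find the imported image's name
# start_image "HYPOTHETICAL_VM_NAME"   # replace with the name listed above
```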

Feel free to click the (X) on the overlaid information bubble to close it.

After the image is loaded, there will be a start up script that runs:

start up script running

The script should finish the initialization process in about 3-5 minutes.

Start up

During startup, please make sure that only VirtualBox and Chrome are open. Close all other programs, including ones that live in the task bar, such as Skype or Lync, because they may interfere with ports that the image needs.

Please make sure that during startup the zookeeper section of the script does not display the following error:

Call from sandbox.hortonworks.com/10.0.2.15 to sandbox.hortonworks.com:8020 failed on connection exception: java.net.ConnectionException: Connection refused: For more details see: <http://wiki.apache.org/hadoop/Connectionrefused>

If you receive this error, note that it may cause problems when interacting with Hadoop. You are advised to restart the image and ensure that no applications other than VirtualBox and Chrome are open.

See the known issues section at the end for more information.
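If you are unsure whether the error occurred, one way to check after logging in is to test whether the address from the error message accepts connections. The sketch below assumes it is run in bash from the shell inside the image. It uses bash's built-in /dev/tcp redirection, and the helper name port_accepts_connections is hypothetical.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds, non-zero otherwise.
# Uses bash's /dev/tcp redirection inside a subshell, so the connection
# is closed again as soon as the check finishes.
port_accepts_connections() {
  local host="$1"
  local port="$2"
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Example, using the host and port from the error message above:
# if port_accepts_connections sandbox.hortonworks.com 8020; then
#   echo "Port 8020 is accepting connections."
# else
#   echo "Port 8020 refused the connection; consider restarting the image."
# fi
```

A refused connection here does not by itself identify the cause. It only indicates that restarting the image, as advised above, is likely worthwhile.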

Welcome to Shell

Login on

When the image is done starting up, you should see:

finished view

The image is set up so that you can go directly to the shell by pressing ALT + F5 on Windows and Linux, or Fn (Function key) + ALT + F5 on OS X.

There are two accounts available. One is the root account and the other is the rstudio account.

In practice, the root account will not be available to you, so you should use the rstudio account. To log in with the rstudio account, enter the following at the login prompt:

Username: rstudio

Password: rstudio

Note: The rstudio account is specific to the UIUC Modified Image of HDP.

Shutdown Procedures

VirtualBox offers several ways to turn off the image. The following two are recommended.

Shutting the image down

To turn off the image, please use the following shutdown sequence:

  1. Log in to the shell
  2. Type in:
# Stop R Studio from running
sudo rstudio-server stop

# Turn off the image
sudo poweroff

Save the state

The other option for shutting down the image is to save its state.

On the shell window, press the X button on the top right of the window.

snapshot restore previous state

Select the radio button next to Save the machine state and then press OK.

snapshot restore previous state

The window will close and the machine state will be displayed as:

snapshot restore previous state
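The machine state can also be saved from a terminal on the host rather than through the window's close dialog. This is a minimal sketch, assuming VBoxManage is on your PATH. The helper save_image_state and the VM name in the example are hypothetical.

```shell
#!/usr/bin/env bash
# Save the current state of a running virtual machine and close it.
# Assumption: the name passed in matches an entry from `VBoxManage list runningvms`.
save_image_state() {
  local vm_name="$1"
  if [ -z "$vm_name" ]; then
    echo "usage: save_image_state <vm-name>" >&2
    return 2
  fi
  VBoxManage controlvm "$vm_name" savestate
}

# Example:
# VBoxManage list runningvms                # find the running image's name
# save_image_state "HYPOTHETICAL_VM_NAME"   # replace with the name listed above
```

Starting the image again afterwards should resume it from the saved state.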

These shutdown procedures are meant to prevent the loss of an R session and the need to start RStudio Server manually. For more information, see the known issues section at the end.