Background

Great news, everyone! Next semester, a course that I proposed about a year ago is going to be offered as a Methods of Applied Statistics course (i.e., STAT 430) at the University of Illinois at Urbana-Champaign, taught by Darren Glosemeyer. This offering acts as a trial run of what the course I proposed would look like. The course I proposed was STAT 490 - Big Data Analysis Foundations. The goal behind the course was to help students view large data in a more reasonable scope.

More specifically, the goal is fulfilled by pulling back the mythical curtain on the application of the MapReduce algorithm using Hadoop for historical data and Storm for real-time data. This is further accompanied by the use of Pig, Hive, and HBase. The course also covers different iterative estimation methods and data manipulation techniques. Lastly, the course looks at providing visualizations for big data.
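
For readers who have not seen MapReduce before, here is a minimal single-machine sketch of the idea in plain Python, using word counting as the example. This is not course material and does not use Hadoop; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are my own illustrative choices, and a real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, mimicking the step between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the list of counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big", "data tools"]))
```

The point of splitting the work this way is that each line can be mapped independently and each key can be reduced independently, which is what lets a framework distribute the job.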

Whether all of these subgoals, as stated in the initial proposal, are achieved on the first go-around remains to be seen, given how recent these technologies are. Regardless, I'm very excited that this course is seeing the light of day. One day soon, I hope to be able to teach this course over the summer if my schedule allows.

STATS@UIUC's Big Data Image (Modified HDP 2.2)

Over the last month, the image that will be used by students in STAT 430 - Big Data Analysis Foundations has been completed. Unfortunately, due to issues with the resources available to virtual environments at UIUC, students will not be able to use this image within a virtual environment supported by ATLAS. Initial tests of very basic code took considerably longer than one would like.

As a result, students are encouraged to download the virtual image and adjust it for their machine.

Click here to download the virtual image via Box.

The virtual image is a modified version of Hortonworks' Sandbox with HDP 2.2 that contains preinstalled and configured versions of RStudio Server and Revolution Analytics' rmr2.

The instructions for manually creating this image are available.