Posts


  • Configuring Jekyll

    Intro In the previous post, Setting up Jekyll with Dreamhost, I detailed how I got a Jekyll site working on a shared server at Dreamhost. In this post, the objective is to demonstrate the backend configuration for publishing. Setting up repositories We now need to set up the repositories. To avoid confusion, for this section I’ve opted to use: * server( to indicate tasks that must be done on the server / remote....

    Read more...
  • Setting up Jekyll with Dreamhost

    Intro Huh? Wait? Isn’t TheCoatlessProfessor powered by WordPress? Well, it used to be. I grew weary of hearing complaints about TCP being unavailable or just slow. So, I started looking for alternatives. I’ve been a huge supporter of RMarkdown as of late (see: Reproducible Research). So, part of my needs for the next iteration of TCP was the ease of publishing with RMarkdown. Why? I can’t stand to use WYSIWYG editors after having spent so...

    Read more...
  • Transitioning TCP to a new platform

    Greetings and Salutations All, The site is currently down since I am changing over to a new codebase! The site should be back in action tomorrow, July 8th, 2015. Also, prepare for a lot more in-depth posts on different projects I am working on! In the interim, please note that I am sorry for the downtime of the site. Sincerely, JJB

    Read more...
  • R to Armadillo using RcppArmadillo for Speed and Portability

    Intro A lot of the research that I do at the University of Illinois at Urbana-Champaign (UIUC) is computationally intensive. As a result, I’m always looking into ways to speed up the computations. One of the ways I’ve found to be very helpful is to become knowledgeable about High Performance Computing (HPC). In a nutshell, HPC techniques enable the use of a typical desktop computer to solve computationally intense problems through the use of...

    Read more...
  • STATS@UIUC Big Data Image for VirtualBox

    Intro This guide is meant to provide helpful information on working with the STATS@UIUC Big Data Image. As a result, some of the material covered here will not be available on a different image or in a production environment. However, the concepts will most certainly be relevant. Background Information on Image Software For starters, the STATS@UIUC Big Data Image is a modification of the Hortonworks Data Platform v2.2 VirtualBox image. The modifications that have been...

    Read more...
  • Environment Variables, Compiling a MapReduce job via Java, Known & Resolved Issue within the STATS@UIUC Big Data Image

    Intro In the previous entry, the web components and how to save the STATS@UIUC Big Data Image state were discussed. This post covers the remaining information: the environment variables available, compilation instructions for Hadoop jobs, and known issues with the image. Environmental Variables The image is modified so that there are variables that contain important path information to various Hadoop component locations. These variables are not created by a Hadoop installation....

    Read more...
  • Web Interfaces and Saving within the STATS@UIUC Big Data Image

    Intro In the previous post, the ability to interact within the STATS@UIUC Big Data Image was discussed. Within this post, we detail different websites that are available with different ports and how to create snapshots or saves of the virtual image state. In the Browser Within this section, we will detail various web components of the image. These web components are going to be accessed by using different ports on the localhost domain from your...

    Read more...
  • Working within shell and SSHing into the STATS@UIUC Big Data Image

    Intro In the previous post, we looked at the installation and use requirements for the STATS@UIUC Big Data Image. Within this section, we cover some necessary shell commands, SSH, and how to use copy and paste. Working within Shell Linux, just like Windows, is powered by an underlying command line. In general, we usually will use a graphical user interface that allows us to point our mouse at a button, click the button, and have...

    Read more...
  • Installing and Using the STATS@UIUC Big Data Image

    Intro This section of the STATS@UIUC Big Data Image help documentation covers the necessary software, how to install it, and the basic startup and shutdown procedures for the image. Getting the Environment Setup Acquire VirtualBox The image we are using is built on top of VirtualBox. At the time of this writing, the latest version is 4.3.20. Please download VirtualBox 4.3.20 and then install it. Loading the image To load the image, adhere to the following...

    Read more...
  • Working with AWS to Obtain a Hadoop cluster with RStudio Server, Hue, Pig, and Hive

    Intro The guide below is meant to illustrate the installation process of the STATS@UIUC Big Data Image on Amazon’s EMR platform. By using Amazon’s EMR platform, we are operating on a different configuration of Hadoop than what was distributed on the UIUC Big Data Virtual Image. At a later time, we will make available an installation process that mimics the UIUC Big Data Image on Amazon’s EC2 platform. A foreword… This is new and there...

    Read more...