Posts

  • NumPy Arrays to R Array Objects

    Intro One of the downsides of having multiple programming languages is that each has its own niche in both academia and industry, which results in data not flowing as easily as spice between them. For example, engineers have an affinity for Python while statisticians are in love with R. Thus, processing data in one language and then using an algorithm in another is a headache in itself. Recently, support has begun to...

    Read more...
  • OpenMP in R on OS X

    Intro I’ve spent the past year or so working with parallelization techniques. In particular, I’ve grown accustomed to using OpenMP, which follows a shared memory paradigm that enables parallel computing on a single computer by taking advantage of the multiple cores shipped on modern CPUs (more info in the HPC Parallel Talk). However, a lot of the work that I do within the SMAC Group is done on OS X instead of a Windows or...

    Read more...
  • Working with R on a Cluster

    Intro I often receive inquiries on how to deploy R packages or conduct simulation studies on the Illinois Campus Cluster (ICC). After writing a few responses, I realized that it would probably benefit not only the Illinois R community but also the larger R community if this information were more widely available. The information is primarily a pointed discussion on using R non-interactively (e.g. command line, shell or terminal) that follows from Invoking R...

    Read more...
  • Detecting If R is in RStudio and Changing RStudio's Default Graphing Device

    Intro RStudio is a great integrated development environment (IDE) for R. I’m constantly recommending it to not only students but also faculty. The primary reason: I am infatuated with how feature rich it is… Except for one, itsy bitsy feature… the plot window / graphs. Shudders. Not only is it inconvenient to have to switch back and forth between graphs you wish to compare, it is especially problematic with ggplot2 on OS...

    Read more...
  • Set R's Seed in Rcpp (Sequential Case)

    Intro The goal behind this post is to quickly point out a solution for setting the random seed from within Rcpp. This is particularly helpful to control the reproducibility of the RNG. RNG Seed Control from C++ By default in R, you can set a random seed using: set.seed(seed_here) However, modifying the seed from within C++ is particularly problematic. This is most notably the case if you are trying to perform indirect...

    Read more...
  • Automatically Check if R Package is the Latest Version on Package Load

    Intro Recently, I’ve had to think about a lot of things related to simplifying the R experience. Specifically, how do you ease engineers, who are fluent in MATLAB, into working with R? As part of this brainstorming session, I’ve stumbled upon quite a few important realizations. One of these realizations is that there is a clear lack of indication as to whether or not a loaded package is up-to-date. That is, when a package...

    Read more...
  • R Data Packages in External Data Repositories using the Additional_repositories field

    Intro In the prior series entry on data packages, there was a discussion about how to create an R data package. Within the final entry in the series, the goal is to address the unthinkable: a data package rejected from CRAN. Rejected data packages are particularly problematic as they show up as a missing dependency on the statistical methodology package under R CMD check. Fear not though, one can still use the data package that...

    Read more...
  • Creating an R Data Package

    Intro In the previous entry, there was a discussion regarding CRAN’s R package policy, specifically on the size of R data packages. Within this post, the aim is to address the best way to create a data package that can be distributed via CRAN. To do so, we reflect upon different methods used to construct a data package on CRAN. The next entry deals with constructing an external repository when the size of...

    Read more...
  • Size and Limitations of Packages on CRAN

    Intro This is the first of three entries addressing the nature of data packages within the R ecosystem. Within this post, we’ll talk about R package guidelines, distribution of a package, and the amount of data that can be shipped. In the next entry, the focus is on the best ways to create an R data package. For the third and final entry, the discussion turns to the creation of...

    Read more...
  • R Compiler Tools for Rcpp on OS X

    Intro The objective behind this post is to provide users with information on how to associate a compiler with the OS X version of R. This has been a bit problematic for many R users since OS X Mavericks, which resulted in gfortran binaries being dropped from the R installer. More curiously, the additional demand to have access to a compiler vs. downloading a binary from CRAN became apparent slightly after Rcpp’s 0.10.0 version, when...

    Read more...