This post is part of our Bookshelf series organized by the Data Science R&D department at Civis Analytics. In this series, Civis data scientists share links to interesting software tools, blog posts, scientific articles, and other things that they have read about recently, along with a little commentary about why these things are worth checking out. Are you reading anything interesting? We’d love to hear from you on Twitter.
This is an exciting week for R! In addition to R 3.4.2, RStudio 1.1 was released, with the all-important dark theme to help you feel more badass while you’re hacking on that critical ETL pipeline that got broken by something.
The future package, written by Henrik Bengtsson, is one of my new favorite packages. While futures are common in many other programming languages (including Python), this package is an excellent port of the idea to R. It provides an easy-to-use API for distributed and asynchronous computing. It’s pretty slick.
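Here is a minimal sketch of the basic API, assuming the future package is installed. `future()`, `value()`, `%<-%`, and `plan()` come from the package; the variable names and toy computations are made up for illustration.

```r
library(future)

# Assumption: a sequential plan keeps this sketch runnable on any machine.
plan(sequential)

# Explicit API: create a future, then block until its value is available.
f <- future({
  sum(1:10)
})
total <- value(f)

# Implicit API: %<-% is a future-backed assignment. The expression is
# handed to the current plan, and `squares` resolves when it is first used.
squares %<-% {
  (1:5)^2
}

print(total)
print(squares)
```

The explicit form is handy when you want to hold on to the future object itself; the implicit form reads almost like ordinary assignment.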
Many different backends for distributed computing are available, and you can easily switch between them by specifying different plans, such as sequential, multicore, cluster, or remote.
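Switching backends might look something like the sketch below. `compute_squares()` is a hypothetical helper written for this example, and `workers = 2` is an arbitrary choice. Note that `multicore` relies on forking, so on platforms where forking is unavailable (such as Windows) the package falls back to sequential evaluation.

```r
library(future)

# Hypothetical helper: the code that creates and collects futures
# never mentions which backend is in use.
compute_squares <- function(xs) {
  futures <- lapply(xs, function(x) future(x^2))
  vapply(futures, value, numeric(1))
}

# Run on the sequential backend, remembering the plan that was set before.
old_plan <- plan(sequential)
seq_result <- compute_squares(1:4)

# Same code, different backend: only the plan changes.
plan(multicore, workers = 2)
par_result <- compute_squares(1:4)

# Restore whatever plan was active before this sketch ran.
plan(old_plan)

print(seq_result)
print(par_result)
```

The point is that `compute_squares()` stays untouched; only the call to `plan()` decides where the work runs.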
Because the backends are modular, it’s possible to write custom backends that use the same API. For example, future.batchtools implements the future API for all backends in batchtools, which include Slurm, Sun Grid Engine, OpenLava, and TORQUE/OpenPBS. We’re playing around with a future backend for the Civis Platform in our civis-r API client.
Couldn’t let this opportunity pass without mentioning mathpix, an R package by Jonathan Carroll that wraps the Mathpix API, which converts pictures of equations into LaTeX. Here’s the cropped image I started with:
Well, you might not want to typeset your next paper this way, but it’s fun to play with nonetheless.