This post is part of our Bookshelf series organized by the Data Science R&D department at Civis Analytics. In this series, Civis data scientists share links to interesting software tools, blog posts, scientific articles, and other things that they have read about recently, along with a little commentary about why these things are worth checking out. Are you reading anything interesting? We’d love to hear from you on Twitter.
Neural networks are a surprisingly useful tool, helping us classify images, translate text, and even play Atari games. They provide a very general framework for creating algorithms. Interestingly, the process of training a neural network is itself an algorithm, which means it can be carried out by another neural network. This blog post summarizes two papers that demonstrate the idea. Their evidence suggests that letting one neural network train another might reduce training time compared with traditional methods based on stochastic gradient descent.
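To make the "training is itself an algorithm" idea concrete, here is a minimal sketch in plain Python. It is not the method from either paper: the learned "optimizer" below is a single-weight update rule rather than a neural network, the tasks are toy one-dimensional quadratics, and all of the names (`unrolled_loss`, `meta_train`, `phi`) are made up for illustration. The point is only that the update rule has a parameter of its own, and that parameter is tuned by looking at how well the rule trains other problems.

```python
import random


def unrolled_loss(phi, curvatures, x0=1.0, steps=5):
    """Average final loss after applying the update rule to every task.

    Each task is a toy quadratic f(x) = 0.5 * a * x**2 with gradient a * x.
    The "optimizer" is the rule x <- x - phi * grad, and phi is its only
    learnable weight (an assumption made to keep the sketch tiny).
    """
    total = 0.0
    for a in curvatures:
        x = x0
        for _ in range(steps):
            grad = a * x
            x = x - phi * grad
        total += 0.5 * a * x * x
    return total / len(curvatures)


def meta_train(curvatures, phi, meta_lr=0.02, meta_steps=300, eps=1e-4):
    """Tune phi by gradient descent on the unrolled loss.

    The meta-gradient is estimated with central finite differences, which is
    enough for one parameter; real systems would backpropagate instead.
    """
    for _ in range(meta_steps):
        meta_grad = (
            unrolled_loss(phi + eps, curvatures)
            - unrolled_loss(phi - eps, curvatures)
        ) / (2 * eps)
        phi -= meta_lr * meta_grad
    return phi


rng = random.Random(0)
tasks = [rng.uniform(0.5, 1.5) for _ in range(20)]  # toy task distribution

phi_init = 0.01
phi_learned = meta_train(tasks, phi_init)

loss_before = unrolled_loss(phi_init, tasks)
loss_after = unrolled_loss(phi_learned, tasks)
print(f"phi: {phi_init} -> {phi_learned:.3f}")
print(f"loss after 5 steps: {loss_before:.4f} -> {loss_after:.6f}")
```

Replacing the single weight `phi` with a small network that maps gradients to updates gives the general shape of the idea the papers explore, although their architectures and training procedures are considerably more involved than this sketch.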
Sarcasm is notoriously difficult to identify with the methods of natural language processing. For example, in this paper the authors found that even people have a hard time agreeing whether a tweet was sarcastic or not! With DeepMoji, the authors tackle sarcasm detection (and other NLP tasks such as sentiment analysis) using a fun and indirect approach. First, the authors build a model to predict which emoji a tweet contains. There is a lot of labeled data for this task, since any tweet that contains an emoji supplies its own label, so the model is well-tuned. Next, the authors apply this emoji model to a new task like sarcasm detection. They slightly adjust the emoji model using the small amount of labeled data they have for the specific sarcasm detection task. Surprisingly, this technique works 💯😃👍. It seems that training on emoji produces a model that can (approximately) predict the emotion of a tweet, which is useful for predicting sarcasm.
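The two-stage recipe (pretrain on plentiful emoji labels, then adjust on a small labeled set) can be sketched in a few lines of plain Python. This is a toy illustration under loud assumptions, not DeepMoji itself: a bag-of-words logistic regression stands in for the much larger model the authors describe, the "tweets" are generated from a six-word made-up vocabulary, the downstream task is simple sentiment rather than sarcasm, and "slightly adjust" is approximated by freezing the pretrained weights and fitting only a scale and bias on top.

```python
import math
import random

POSITIVE = ["great", "love", "thanks"]
NEGATIVE = ["awful", "hate", "ugh"]
VOCAB = POSITIVE + NEGATIVE  # hypothetical toy vocabulary


def featurize(text):
    """Bag-of-words counts over the toy vocabulary."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, features):
    return sum(w * f for w, f in zip(weights, features))


def pretrain_emoji_model(tweets, emoji_labels, lr=0.1, epochs=50):
    """Stage 1: logistic regression predicting a happy (1) vs. angry (0) emoji."""
    weights = [0.0] * len(VOCAB)
    for _ in range(epochs):
        for text, label in zip(tweets, emoji_labels):
            x = featurize(text)
            error = sigmoid(score(weights, x)) - label
            weights = [w - lr * error * f for w, f in zip(weights, x)]
    return weights


def fine_tune_head(frozen_weights, texts, labels, lr=0.1, epochs=50):
    """Stage 2: keep the emoji weights frozen and fit only a scale and bias."""
    scale, bias = 1.0, 0.0
    for _ in range(epochs):
        for text, label in zip(texts, labels):
            s = score(frozen_weights, featurize(text))
            error = sigmoid(scale * s + bias) - label
            scale -= lr * error * s
            bias -= lr * error
    return scale, bias


def predict(frozen_weights, head, text):
    scale, bias = head
    return sigmoid(scale * score(frozen_weights, featurize(text)) + bias)


# Plentiful toy "tweets" whose emoji label comes for free.
rng = random.Random(0)
emoji_tweets, emoji_labels = [], []
for _ in range(200):
    label = rng.randint(0, 1)
    words = rng.sample(POSITIVE if label else NEGATIVE, 2)
    emoji_tweets.append(" ".join(words))
    emoji_labels.append(label)

emoji_weights = pretrain_emoji_model(emoji_tweets, emoji_labels)

# A small hand-labeled set for the downstream task (1 = positive sentiment).
small_texts = ["love this", "thanks so much", "hate this", "ugh awful"]
small_labels = [1, 1, 0, 0]
head = fine_tune_head(emoji_weights, small_texts, small_labels)

p_pos = predict(emoji_weights, head, "great thanks")
p_neg = predict(emoji_weights, head, "awful hate")
print(f"P(positive | 'great thanks') = {p_pos:.3f}")
print(f"P(positive | 'awful hate')   = {p_neg:.3f}")
```

Note that "great" never appears in the four hand-labeled examples; the downstream model can still score it because its weight was learned during the emoji stage, which is the whole appeal of the approach.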
On the R&D team at Civis, we write a lot of tools that are used around the company (some of which we open source!). This article has some great tips on developing and releasing software to colleagues. Briefly, the article suggests paying attention to how your colleagues use your software, making it easier for them based on what you see, and being up front about what you won’t or can’t do. I couldn’t agree more with the author on how rewarding it is to create internal tools. Talking to and hearing from users is a great part of writing software.
Machine learning is developing so rapidly that it can be hard to keep up. From SELU to ByteNet to Sparsely-Gated Mixture-of-Experts, researchers are presenting new techniques at a blistering rate, and it’s difficult to tell which are most important. Andrej Karpathy, Tesla’s Director of AI, created a website that tracks the papers mentioned by machine learning experts on Twitter. While I still find that talking to my colleagues is the best way to learn about new results, this aggregator often picks up on the biggest findings very quickly.