Codemods: Path to painless upgrades in Ember

Wed, Jan 13, 2016

Note: This post assumes some knowledge of JS features from ES2015

This blog post is a continuation of my other blog post, How to write a codemod.

Introduction

After I wrote my last blog post, How to write a codemod, I was looking for problems I could use to talk more about codemods, and I remembered a blog post complaining about the Ember 2.0 churn. I felt that codemods could have prevented some of the pain in the upgrade process. So I wanted to write codemods for Ember to show the community that they can really benefit from them. React already provides codemods for some of its breaking changes. In fact, any framework or library can use codemods to reduce some of the churn. All that said, codemods are by no means a silver bullet, but they solve some trivial problems very effectively.

Now the problem is I have absolutely zero knowledge of Ember. But somehow I landed on the Ember deprecations page and I was immediately excited, because it shows the code before and after each deprecation, and that’s exactly what we need to write a codemod. I also breathed a huge sigh of relief because I didn’t have to learn a new framework just to write a few codemods.

In this blog post, we will write codemods for two of the deprecations listed on that page. The first one is extremely simple and the second one is slightly more complex and tricky. Let’s start!

A little recap

In my previous blog post, I didn’t summarize what we learned about codemods. Let’s quickly go over what we have learned so far.

The first step in writing a codemod is to paste both the input and the output code into ASTExplorer. Then look at both ASTs, identify the nodes you want to change, and finally start writing your codemod.

A codemod is essentially a function which takes file and api (the jscodeshift API) as arguments and returns the transformed JavaScript code as output. Inside the codemod function, we first convert the given JavaScript code into an AST. Then we select the nodes we want to modify and apply the corresponding modifications to them. Once we are done, we convert the modified AST back into JavaScript code.

API summary

Note: j simply refers to api.jscodeshift, in case you have forgotten.

These are the jscodeshift APIs that we have learnt so far.

The Pascal case version of a node type (for example j.FunctionExpression) is used to check the type of a node.
The camel case version of a node type (for example j.functionExpression) is used to construct new nodes.
j(...) takes the source of a file and converts it into an AST.
.find(...) finds particular nodes in the AST and returns a list of paths. jscodeshift doesn’t use an array to represent these paths. Instead it uses a custom Collection class, which provides its own methods such as map, filter, replaceWith etc. In fact, the AST that is generated by j(...) is also a Collection. These methods are not documented anywhere, so for now you have to read jscodeshift’s source to find them all.
.replaceWith(...) replaces the node at each path inside the Collection.
.toSource() converts the transformed AST into JavaScript code.

Problem 1

Let’s look at the first deprecation that we will transform.

// Ember < 2.1

import layout from '../templates/some-thing-lol';

export default Ember.Component.extend({
  defaultLayout: layout
});

// Ember 2.1 and later

import layout from '../templates/some-thing-lol';

export default Ember.Component.extend({
  layout: layout
});

Solution

The simplest solution is to replace the defaultLayout identifier with a layout identifier. But that may produce false positives. We can avoid this problem by looking at more context, i.e. defaultLayout is the key of a property in an object literal, and that object is itself an argument to an Ember.Component.extend function call. So we will use this context to avoid false positives.

So, the algorithm will be as follows:

Find all Identifier nodes whose name is defaultLayout. This gives us the paths to all such nodes.
Go to the path of the parent node (p.parent) and check that it’s a Property node and that defaultLayout is the key of this property. In the case we care about, the Property node’s parent is then an ObjectExpression node.
Check that the parent of the ObjectExpression node is a CallExpression and that the CallExpression’s callee is Ember.Component.extend.
If a path satisfies these conditions, replace the defaultLayout identifier with a layout identifier.

export default function (file, api) {
  const j = api.jscodeshift;

  // Step 2: the identifier must be the key (not the value) of a Property.
  const isProperty = p => {
    return (
      p.parent.node.type === 'Property' &&
      p.parent.node.key === p.node &&
      p.parent.node.key.type === 'Identifier' &&
      p.parent.node.key.name === 'defaultLayout'
    );
  };

  // The callee must be exactly the member expression Ember.Component.extend.
  const checkCallee = node => {
    const types = (
      node.type === 'MemberExpression' &&
      node.object.type === 'MemberExpression' &&
      node.object.object.type === 'Identifier' &&
      node.object.property.type === 'Identifier' &&
      node.property.type === 'Identifier'
    );

    // Only read the names once the shape is confirmed. Otherwise a callee
    // such as a plain identifier would throw on node.object.object.
    if (!types) {
      return false;
    }

    return (
      node.object.object.name === 'Ember' &&
      node.object.property.name === 'Component' &&
      node.property.name === 'extend'
    );
  };

  // Step 3: Identifier -> Property -> ObjectExpression -> CallExpression.
  const isArgument = p => {
    const call = p.parent.parent.parent.node;
    if (call.type === 'CallExpression') {
      return checkCallee(call.callee);
    }
    return false;
  };

  // Step 4: rename the identifier in place.
  const replaceDefaultLayout = p => {
    p.node.name = 'layout';
    return p.node;
  };

  return (
    j(file.source)
    .find(j.Identifier, { name: 'defaultLayout' })
    .filter(isProperty)
    .filter(isArgument)
    .replaceWith(replaceDefaultLayout)
    .toSource()
  );
}

Live Version: http://astexplorer.net/#/bq5fA4afqR

There is not much to explain here. The code feels a bit long, but most of it is checking node types and Identifier names. Just look at the return value of our transform; it gives the gist of what the codemod does. Each line follows the steps we described above, with the exception of the first and last lines, which are common to every codemod. The filter method here behaves just like Array.prototype.filter.
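
If the nesting in the callee check is hard to picture, here is a small sketch that runs without jscodeshift. It builds the callee node by hand (the id and member helpers and the isEmberComponentExtend name are mine, made up for illustration) and shows why the node types should be checked before the names are read.

```javascript
// Hand-built ESTree-style nodes. `id` and `member` are illustrative helpers,
// not part of jscodeshift.
const id = name => ({ type: 'Identifier', name });
const member = (object, property) => ({
  type: 'MemberExpression',
  object,
  property,
  computed: false
});

// Member access nests to the left: (Ember.Component).extend
const emberComponentExtend = member(
  member(id('Ember'), id('Component')),
  id('extend')
);

// Same idea as checkCallee: confirm the shape first, then compare the names,
// so that a plain `foo(...)` callee is rejected instead of throwing.
function isEmberComponentExtend(node) {
  const typesMatch =
    node.type === 'MemberExpression' &&
    node.object.type === 'MemberExpression' &&
    node.object.object.type === 'Identifier' &&
    node.object.property.type === 'Identifier' &&
    node.property.type === 'Identifier';

  if (!typesMatch) {
    return false;
  }

  return (
    node.object.object.name === 'Ember' &&
    node.object.property.name === 'Component' &&
    node.property.name === 'extend'
  );
}

console.log(isEmberComponentExtend(emberComponentExtend)); // true
console.log(isEmberComponentExtend(id('foo'))); // false
console.log(isEmberComponentExtend(member(id('Ember'), id('extend')))); // false
```

A real AST from a parser carries more fields (locations, comments and so on), but the type, object, property and name fields used here are the ones the check depends on.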

Note: I am not trying to write the most efficient code but I am trying to write modular code. The emphasis here is on teaching you how to write the codemod you want. Writing the most optimal code is left to you, the reader, as an exercise.

I should also mention, for those who want to try this codemod on their own Ember codebase, that it may not be complete. You may have code that relies on the defaultLayout property in other places. So you may want to extend this codemod, or write another codemod that complements it, if you want to achieve correct results.

Problem 2

Here is the second deprecation that we will transform.

// Ember < 2.1

export function initialize(container, application) {
  application.inject('route', 'service:session');
}

export default {
  name: 'inject-session',
  initialize: initialize
};

// Ember 2.1 and later

export function initialize(application) {
  application.inject('route', 'service:session');
}

export default {
  name: 'inject-session',
  initialize: initialize
}

The solution I had in mind was to find the initialize function declaration, check whether container is used inside it, and if not, just remove that parameter. Here, conveniently, both the key and the value have the same name, but for the sake of learning let’s assume a slightly pathological case where the function is named something other than initialize.

While I was thinking about the problem, I wanted to find an Ember 2.0 codebase to apply my previous transform to. I came across a few repos in which I saw another pattern for Ember initializers.

export default {
  name: 'inject-session',
  initialize: function (container, application) {
    application.inject('route', 'service:session');
  }
};

I wanted to handle this case too in my codemod.

Although it may feel like overkill, this problem helped me understand that I still had a lot to learn about jscodeshift. I read a lot of examples and jscodeshift’s source to understand how I could solve this problem. In the end, I got a few insights on how to write a codemod, which I want to share with you in this blog post.

Solution (Attempt 1)

The outline of my first solution is to find the object which has name and initialize keys and, depending on the type of the initialize key’s value, apply a suitable transform. The case where the value is a FunctionExpression seemed easier to handle. If you don’t know what a FunctionExpression is: a function is a FunctionExpression if it is used as a value, for example to initialize a variable or passed as an argument to another function.
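
In case the distinction is still fuzzy, here is a tiny plain JavaScript illustration (no jscodeshift involved; the return values are only there so the two functions can be told apart).

```javascript
// FunctionDeclaration: a standalone statement that introduces the name
// `initialize`.
function initialize(container, application) {
  return 'declaration';
}

// FunctionExpression: a function used as a value, here as the value of the
// `initialize` property.
const initializer = {
  name: 'inject-session',
  initialize: function (container, application) {
    return 'expression';
  }
};

// The callback passed to map is a FunctionExpression too.
const results = [initialize, initializer.initialize].map(function (fn) {
  return fn();
});

console.log(results); // [ 'declaration', 'expression' ]
```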

This is my first half of the solution:

export default function (file, api) {
  const j = api.jscodeshift;

  const hasKey = (object, key) => {
    const { properties } = object;
    return properties.some(property => property.key.name === key);
  };

  const transformArity = fnNode => {
    if (fnNode.params.length === 2) {
      if (
        j(fnNode.body).find(j.Identifier, { name: 'container' }).size() === 0
      ) {
        fnNode.params = [fnNode.params[1]];
      }
    }
  };

  const changeArity = p => {
    const { node } = p;
    if (hasKey(node, 'name') && hasKey(node, 'initialize')) {
      const initialize = node.properties.find(
        property => property.key.name === 'initialize'
      );

      if (initialize.value.type === 'FunctionExpression') {
        transformArity(initialize.value);
      } else if (initialize.value.type === 'Identifier') {
        // todo
      }
    }

    return p.node;
  };

  return j(file.source)
    .find(j.ObjectExpression)
    .replaceWith(changeArity)
    .toSource();
}

Live Version: http://astexplorer.net/#/K0xJ26FT0Z

Tip: Always read the source of a codemod from bottom to top, so that you don’t have to read through a plethora of implementation details to understand the gist of it.

Note: Arity is the number of arguments a given function accepts.
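
In plain JavaScript you can read the arity of a function from its length property. The two function names below are made up for this illustration.

```javascript
// Initializer signature before the deprecation: two parameters.
function oldInitialize(container, application) {}

// Initializer signature after the deprecation: one parameter.
function newInitialize(application) {}

console.log(oldInitialize.length); // 2
console.log(newInitialize.length); // 1
```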

Let’s try to understand the code. In the return value of the codemod function, we first convert the file into an AST, find all the ObjectExpression nodes, apply the changeArity function to each of them, and finally convert the result back to JavaScript code. Inside the changeArity function, we check whether the ObjectExpression node contains the keys name and initialize using the hasKey function. If the value of the initialize key is a FunctionExpression, we apply the transformArity function, which checks whether the FunctionExpression has two parameters and whether container is used in the function body. If it is used, we leave the function as is; otherwise we remove the container parameter.

Here, I got stuck trying to implement the transformArity function. Given a node, I didn’t know how to count all instances of an Identifier (container in this case). After reading some examples, I realized you can call j with a node (AST) as the argument and it will return a Collection for that node, on which we can then perform find, filter etc. Collection has a size method which returns the number of paths it holds. With this information, we are back on track. So, we need to create a Collection from the function body, find all Identifier nodes with the name container, and check whether that collection’s size is 0.
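
To get a feel for what that find-and-size combination computes, here is a rough dependency-free sketch. It is not how jscodeshift implements find; countIdentifiers is a name I made up, and the walker is only meant for small hand-built nodes like the one below.

```javascript
// Count Identifier nodes with the given name anywhere below `node`.
function countIdentifiers(node, name) {
  if (Array.isArray(node)) {
    return node.reduce((sum, child) => sum + countIdentifiers(child, name), 0);
  }
  if (node === null || typeof node !== 'object') {
    return 0; // strings, numbers, booleans and missing fields
  }
  const self = node.type === 'Identifier' && node.name === name ? 1 : 0;
  return Object.keys(node).reduce(
    (sum, key) => sum + countIdentifiers(node[key], name),
    self
  );
}

// Hand-built body of: { application.inject('route', 'service:session'); }
const body = {
  type: 'BlockStatement',
  body: [
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        callee: {
          type: 'MemberExpression',
          object: { type: 'Identifier', name: 'application' },
          property: { type: 'Identifier', name: 'inject' },
          computed: false
        },
        arguments: [
          { type: 'Literal', value: 'route' },
          { type: 'Literal', value: 'service:session' }
        ]
      }
    }
  ]
};

console.log(countIdentifiers(body, 'container')); // 0
console.log(countIdentifiers(body, 'application')); // 1
```

Note that matching by name alone also counts property names such as the container in foo.container. That errs on the safe side here: the parameter is only removed when no identifier with that name shows up at all.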

Now let’s look at how to handle the other case, where the value of the initialize key is an Identifier. First, we need to look up the name of the Identifier, then find its FunctionDeclaration node, and then apply the transformArity function to that node. That’s it; we already know how to transform the arity of a function, so all we need to do is find where it is.

So, the question is how to find this FunctionDeclaration node. The best way I could think of was to go to the root of the AST and search from there, as you have access to everything from the root. So, you recursively travel to the parent of the current path until you find the root. Although the idea seemed okay, I was looking for a simpler solution. After some struggle, I found one in one of the transforms in the react-codemod repo. I will now present a very simplified version of it.

export default function (file, api) {
  const j = api.jscodeshift;
  const root = j(file.source);

  // Some methods here

  const didTransform1 = root.find(...).replaceWith(...).size();

  const didTransform2 = root.find(...).replaceWith(...).size();

  if (didTransform1 + didTransform2 > 0) {
    return root.toSource();
  }

  return null;
}

There are several ideas presented in this short piece, but we will not delve into all of them right away. Let’s just look at the most important one.

In the third line, j(file.source) is assigned to root. As we have discussed, j, when used as a function, returns a Collection, and this time it’s a collection that contains the entire program.

This solves our problem of finding the FunctionDeclaration node, because you can simply do root.find(j.FunctionDeclaration). There is absolutely no need to write any recursive functions to get to the root. Voila! We can get back to solving our main problem.

This is what the final code looks like:


export default function (file, api) {
  const j = api.jscodeshift;
  const root = j(file.source);

  const hasKey = (object, key) => {
    const { properties } = object;
    return properties.some(property => property.key.name === key);
  };

  const transformArity = fnNode => {
    if (fnNode.params.length === 2) {
      if (
        j(fnNode.body).find(j.Identifier, { name: 'container' }).size() === 0
      ) {
        fnNode.params = [fnNode.params[1]];
      }
    }
  };

  const changeArity = p => {
    const { node } = p;
    if (hasKey(node, 'name') && hasKey(node, 'initialize')) {
      const [initialize] = node.properties.filter(
        property => property.key.name === 'initialize'
      );

      if (initialize.value.type === 'FunctionExpression') {
        transformArity(initialize.value);
      } else if (initialize.value.type === 'Identifier') {
        root
          .find(j.FunctionDeclaration, {
            id: { name: initialize.value.name }
          })
          .replaceWith(p => {
            transformArity(p.node);
            return p.node;
          });
      }
    }

    return p.node;
  };

  return root.find(j.ObjectExpression).replaceWith(changeArity).toSource();
}

Live Version: http://astexplorer.net/#/K0xJ26FT0Z/1

All I did here was replace the todo block with some code and declare root; the rest is essentially the same. We actually didn’t write much code this time. Let’s try to understand it.

We start with a familiar root.find(j.FunctionDeclaration, ...) call; the second argument restricts the search to the function with the name we want. Then we call .replaceWith(...) to change its arity using the transformArity function. Some of you may say “Hey! Wait a second. Are you trying to pull a fast one on me?” Actually no, I struggled quite a lot to write these few lines.

We are already inside a transformation call and we are invoking another transformation call. It is a non-trivial idea, but all we are doing is mutating the AST within the root Collection, and the important point to note is that we are mutating a node that is outside the scope of the current node. When I first realized that I was mutating the root AST, I was a bit unhappy because I prefer immutability. But if we did not mutate, the two transformations would end up creating two different ASTs, whereas we want to change the same AST, so we are kind of forced to mutate it. Anyway, we solved the problem.
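
Here is a minimal sketch of why mutating in place works, using plain objects and arrays instead of jscodeshift Collections (the selectionA and selectionB names are just for illustration). Two separate selections over the same tree hold references to the same node objects, so a change made through one is visible through the other and through the root.

```javascript
// Hand-built declaration: function initialize(container, application) {}
const fnNode = {
  type: 'FunctionDeclaration',
  id: { type: 'Identifier', name: 'initialize' },
  params: [
    { type: 'Identifier', name: 'container' },
    { type: 'Identifier', name: 'application' }
  ]
};
const program = { type: 'Program', body: [fnNode] };

// Two independent selections over the same tree: references, not copies.
const selectionA = program.body.filter(n => n.type === 'FunctionDeclaration');
const selectionB = program.body.filter(n => n.id.name === 'initialize');

// Mutate through one selection...
selectionA[0].params = [selectionA[0].params[1]];

// ...and the change is visible through the other selection and the root.
console.log(selectionB[0].params.length); // 1
console.log(program.body[0].params[0].name); // application
```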

I was not satisfied with the solution because the changeArity function didn’t look clean at all and it was doing too much. I wanted to modularize it. And that’s where I used the ideas in the above snippet again.

Solution (Attempt 2)

I am just re-pasting the above snippet for the sake of convenience. Let’s dive more into it this time.

export default function (file, api) {
  const j = api.jscodeshift;
  const root = j(file.source);

  // Some methods here

  const didTransform1 = root.find(...).replaceWith(...).size();

  const didTransform2 = root.find(...).replaceWith(...).size();

  if (didTransform1 + didTransform2 > 0) {
    return root.toSource();
  }

  return null;
}

Here, I treat root.find(...).replaceWith(...) as a single transform. If you look at the above snippet, we are actually doing two transforms. This pattern is immensely useful when your codemod has to deal with multiple styles of writing the same code, like the case we are currently dealing with. The .size() calls at the end give the number of paths each transform was applied to. So didTransform1 and didTransform2 give a sense of how many transformations occurred, and this can be used to prevent errors.

There are two kinds of errors that can happen while writing a codemod. The first kind is where you select an incorrect node type while finding the code you want to transform, or you find the correct type but don’t filter it enough. The second kind is where you have a bug in your transformation logic. The didTransform1 and didTransform2 counts help to avoid mistakes of the first kind.

Finally, the if conditional at the end says that if their sum is zero, it’s unnecessary to convert an unmodified AST back to JavaScript; we can just leave the file as is. So we return null instead of root.toSource().
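
One thing to keep in mind is that a count of matched paths is not the same as a count of nodes that really changed. If you want the latter, one option is to let the change function report whether it did anything. The sketch below uses plain objects rather than jscodeshift, and dropFirstParam, countChanges and finish are hypothetical helpers I made up for it.

```javascript
// Returns true only when the node was actually modified.
function dropFirstParam(fnNode) {
  if (fnNode.params.length !== 2) {
    return false;
  }
  fnNode.params = [fnNode.params[1]];
  return true;
}

// Apply `change` to every node and count the real changes.
function countChanges(nodes, change) {
  return nodes.reduce((count, node) => count + (change(node) ? 1 : 0), 0);
}

// Mirrors the final `if`: `print` is only called when something changed,
// otherwise null signals "leave the file as is".
function finish(changes, print) {
  return changes > 0 ? print() : null;
}

const param = name => ({ type: 'Identifier', name });
const fns = [
  { type: 'FunctionExpression', params: [param('container'), param('application')] },
  { type: 'FunctionExpression', params: [param('application')] }
];

const changes = countChanges(fns, dropFirstParam);
console.log(changes); // 1 (two nodes were visited, one was changed)
console.log(finish(changes, () => 'new source')); // new source
console.log(finish(0, () => 'never printed')); // null
```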

Before going into the second solution, I will introduce extensions in the jscodeshift API. They allow us to add custom methods to the Collection prototype, so you can use these methods on Collections as if they were built-in methods. Here is a sample.

j.registerMethods({
  customMethod() {
    return this.find(j.ObjectExpression).filter(isIntializer);
  }
})

If you don’t like this, you can simply write a function and it works too. There are reasons why you may want to use registerMethods, but we won’t cover them here.
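
To make the two styles concrete, here is a sketch with a stand-in class. MiniCollection is something I made up for this example; it is not jscodeshift’s Collection and the prototype assignment is not how registerMethods is implemented. It only shows the difference at the call site: a method added to the prototype chains, while a plain function wraps the collection.

```javascript
// Stand-in for a collection type; NOT jscodeshift's Collection.
class MiniCollection {
  constructor(items) {
    this.items = items;
  }
  filter(predicate) {
    return new MiniCollection(this.items.filter(predicate));
  }
  size() {
    return this.items.length;
  }
}

// An object counts as initializer-like if it has name and initialize keys.
const isInitializerLike = obj => 'name' in obj && 'initialize' in obj;

// Style 1: a method on the prototype, usable in a chain.
MiniCollection.prototype.onlyInitializers = function () {
  return this.filter(isInitializerLike);
};

// Style 2: a plain function that takes the collection as an argument.
function onlyInitializers(collection) {
  return collection.filter(isInitializerLike);
}

const objects = new MiniCollection([
  { name: 'inject-session', initialize() {} },
  { name: 'not-an-initializer' }
]);

console.log(objects.onlyInitializers().size()); // 1
console.log(onlyInitializers(objects).size()); // 1
```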

In our problem, we have two different styles of code to transform: one where the value of the initialize key is an Identifier node and the other where it’s a FunctionExpression node. So the idea is to solve them independently using the didTransform pattern (I made this name up) we just saw.

This is what the final code looks like:

export default function (file, api) {
  const j = api.jscodeshift;
  const root = j(file.source);

  const transformArity = node => {
    if (node.params.length === 2) {
      if (j(node.body).find(j.Identifier, { name: 'container' }).size() === 0) {
        node.params = [node.params[1]];
      }
    }
  };

  const hasKey = (object, key) => {
    const { properties } = object;
    return properties.some(property => property.key.name === key);
  };

  const isIntializer = p => {
    return hasKey(p.node, 'name') && hasKey(p.node, 'initialize');
  };

  const findInitialize = p => {
    const { properties } = p.node;
    const [initialize] = properties.filter(
      property => property.key.name === 'initialize'
    );

    return initialize;
  };

  const isIntializeMethod = p => {
    const type = findInitialize(p).value.type;
    return type === 'FunctionExpression';
  };

  const isIntializeIdentifier = p => {
    const type = findInitialize(p).value.type;
    return type === 'Identifier';
  };

  const changeMethod = p => {
    const method = findInitialize(p).value;
    transformArity(method);

    return p.node;
  };

  const changeIdentifierDeclaration = p => {
    const name = findInitialize(p).value.name;
    root.find(j.FunctionDeclaration, { id: { name } }).replaceWith(p => {
      transformArity(p.node);
      return p.node;
    });

    return p.node;
  };

  j.registerMethods({
    findInitializeMethod() {
      return this.find(j.ObjectExpression)
        .filter(isIntializer)
        .filter(isIntializeMethod);
    },
    findInitializeIdentifier() {
      return this.find(j.ObjectExpression)
        .filter(isIntializer)
        .filter(isIntializeIdentifier);
    }
  });

  const didTransform1 = root
    .findInitializeMethod()
    .replaceWith(changeMethod)
    .size();

  const didTransform2 = root
    .findInitializeIdentifier()
    .replaceWith(changeIdentifierDeclaration)
    .size();

  if (didTransform1 + didTransform2 > 0) {
    return root.toSource();
  }

  return null;
}

Live Version: http://astexplorer.net/#/K0xJ26FT0Z/2

This looks a lot longer than the previous solution, but don’t worry: you have already seen most of it, and the difference lies in the organization of the code.

Let’s start reading the code from bottom to top. The first thing to notice is that, using the didTransform pattern, we divided the transform into two sub-transforms which are simpler to reason about (Divide and Conquer FTW!!).

Without reading the implementation details, let’s look at what didTransform1 and didTransform2 are doing. didTransform1 handles the case where initialize is a FunctionExpression node and didTransform2 handles the case where initialize is an Identifier node. Here, findInitializeMethod and findInitializeIdentifier are extensions which return the paths of the ObjectExpressions that have initialize defined as a FunctionExpression and an Identifier respectively. We transform the two collections using the changeMethod and changeIdentifierDeclaration functions respectively. Then we check whether there were any changes and convert the AST to JavaScript only if there were.

findInitializeMethod makes a collection of ObjectExpressions, filters it down to the initializer objects, and checks that the value of the initialize key is actually a FunctionExpression. findInitializeIdentifier is similar, except that in the last step we keep Identifier values instead of FunctionExpression values.

The code of changeMethod and changeIdentifierDeclaration was already presented in Attempt 1. We just repackaged it under different names here, so I won’t explain it again. We are still doing a transformation call inside another transformation call in changeIdentifierDeclaration. I currently don’t know any way around it. But the point of this approach is to learn a way to write a codemod that targets two different styles of writing code at the same time, i.e. it performs two different kinds of transforms in one codemod. I also think it’s a bit more declarative, but that’s a purely subjective opinion.

Summary

Let’s quickly summarize the important things we have learned.

Declaring j(file.source) as root lets you mutate the AST outside the scope of your current node.
The didTransform pattern allows you to write two transforms in a single codemod.

That is all we have learned, but I hoped to illustrate the need for these powerful ideas and how one can exploit them to write a codemod. I tried my best to explain most of the concepts of codemods. Please go through the README of jscodeshift; it should not feel so alien now. It will fill in a few things I have omitted here.

Currently, jscodeshift’s documentation is in a poor state (there is not even an API reference). I hope to improve it as much as possible in the coming weeks (I have started, but I haven’t done much yet). Also, I want to find a way to avoid calling a transformation inside another transformation. I recently heard about Lenses in this talk; they are probably meant to solve this kind of problem, but I don’t know yet, and I have to experiment with them.

Thanks for reading this till the end. I appreciate it. If you have any comments or feedback, tweet at @_vramana

Also big thanks to @cpojer and @kentcdodds for reviewing this post.