Hacker Noon: How Immutable Data Structures (E.g. Immutable.js) are Optimized

  • Tuesday, 13 February 2018 03:45
http://leonov.netRecently I have been learning some Functional Programming using JavaScript. I started to really like Functional Programming due to the elegance of functionally written code.Immutability is one of the building blocks of Functional Programming. Here are some of the advantages of using immutable objects.Immutable objects are simpler to construct, test, and useTruly immutable objects are always thread-safeThey help to avoid temporal couplingTheir usage is side-effect freeIf you want to learn more about why you should use immutable objects, this is a nice article to read.But the first question that comes to the mind when talking about immutability is the performance. For an example, say that we have an array of integers. We need to change one of the integers in the array. Now if we want to stay immutable, instead of changing the array in place, we need to keep the original array intact and return a new array with the changed integer. For this we need to create a new array and copy over the old elements. This is much more expensive than changing the array in place.In this article I am going to talk about a method that is used to optimize immutable data structures, called “Structural Sharing”. To begin let’s learn what a persistent data structure is.Persistent data structuresThis is how Wikipedia introduces a persistent data structure,In computing, a persistent data structure is a data structure that always preserves the previous version of itself when it is modified. Such data structures are effectively immutable, as their operations do not (visibly) update the structure in-place, but instead always yield a new updated structure.Persistent data structures are commonly used in Functional Programming as this enforces immutability. Almost all functional programming languages has implementations of persistent data structures. Immutable.js is a JavaScript library that implements persistent data structures.These implementations are optimized a lot to improve performance. Structural sharing is one of the techniques used for optimization.TriesFirst, if you are not familiar with Trie data structure, we need to first understand what is a Trie. Trie is a special kind of a tree data structure. Take a look at the following image from Wikipedia.wikipediaThis is different from a binary tree since no nodes specifically stores the key associated with that node. The node’s position in the tree defines the key it’s associated with. All nodes under a certain node have a common prefix. Values tend to be only associated with leaves (and some inner nodes if they have a significance). Tries are commonly used to store dictionaries. As we can see from this example we can easily validate words and obtain suggestions for partial words. Please refer here to understand Tries in detail.Using a Trie to represent an arraySince now we know what a Trie is, let’s take a look at how to represent an array using a Trie. Take the following array as an example,[“red”, “green”, “blue”, “yellow”, “pink”, “purple”, “black”, “white”]Let’s see how we can try and represent this using a Trie.Nice huh!So now you would ask me how do you get the element at index 1. If you follow the path “001” (0 means the left node, 1 means the right node. look at the diagram) from the root node you can find element with index 1 (Green). Each leaf node has a unique address. Using those, we can index elements.Now let’s see why we represented the array as a Trie. Say we need to change the last element of this array from “White” to “Brown”. We want this to be done in a persistent manner. Therefore changing the original data structure is not a solution. Let’s take a look at the following diagram.You can see that the old root is still there and you can access the old array using that. And the array after the new element is added, is the structure with the new root. We are creating a new array with the new element by reusing the old structure. If we had a traditional array we had to copy all the elements.Using a Trie to Represent a hash map or an object (JavaScript)We have taken a look at how to represent a numerically indexed data structure as a Trie and use structural sharing to optimize it as a persistent data structure. Next let’s take a look how to represent a object with non numeric keys.It is pretty simple. In a hash map we get a numerical hash for each key and use that to create a data structure. We can use the same idea here.hash('a') = 97Lets take a hash value of a string as the some of it’s character ASCII values. Then we get the hash of ‘a’ as 97. We can use this value along with modular operation to create a Trie to represent a data structure. If you want to know more about how hash maps work please refer here. With the Trie data structure we can handle modifications to the data structure persistently in a more efficient way.Branching factorConsidering branching factor, we used two way branching in the example for simplicity. But for a big array this would mean a very deep tree. Deep trees offer a lot of sharing thus reduces memory usage, but as the tree gets deeper time for modification increases as well. There should be a balance. Clojure (a functional programming language) uses 32 way branching. This provides a good balance. A Clojure array of one billion elements only goes 6 nodes deep. You need 35 billion nodes to hit 8 nodes deep.Time complexities for modificationsTraditionally if you want to create a new array and change an element, it would take O(n) time. But with the representation of tries and using structural sharing it could be taken down to O(log(branching_factor) n). Since branching factor is a constant this would mean O(log n).ConclusionIn this article we have learned how to implement efficient persistent data structures using structural sharing with Tries. For more information please look at the references.References:Immutable.js docs https://facebook.github.io/immutable-js/docs/#/Clojure persistent vectors http://hypirion.com/musings/understanding-persistent-vector-pt-1How Immutable Data Structures (E.g. Immutable.js) are Optimized was originally published in Hacker Noon on Medium, where people are continuing the conversation by highlighting and responding to this story.

Additional Info

Leave a comment

Make sure you enter all the required information, indicated by an asterisk (*). HTML code is not allowed.

Disclaimer: As a news and information platform, also aggregate headlines from other sites, and republish small text snippets and images. We always link to original content on other sites, and thus follow a 'Fair Use' policy. For further content, we take great care to only publish original material, but since part of the content is user generated, we cannot guarantee this 100%. If you believe we violate this policy in any particular case, please contact us and we'll take appropriate action immediately.

Our main goal is to make crypto grow by making news and information more accessible for the masses.