Tales from the Crypto: The stack discovery

Tuesday, 16 May 2017 14:27
A few weeks ago I tried to cover our work on the portfolio, but I did not explain any of the coding behind it. The best subject to start with is the foundation layer of our service: the systems behind our pricing and historical data.

First of all, it makes sense to give a short overview of what our VMs look like and how they all fit into the overall server architecture.

What does our server architecture look like?

• Hosting and VM management: Microsoft Azure
• Exchange integration: 60 x Node.js VMs — shared core, 750MB of RAM, 29GB disk space
• Long-term storage (trades): Azure blob storage
• Historical and block explorer data: 4 x PostgreSQL — single core, 1.5GB of RAM, 29GB disk space
• Snapshot data (what is the price now): 2 x Redis — single core, 1.5GB of RAM, 29GB disk space
• Streaming servers: 12 x Node.js + Socket.IO (load balanced) — single core, 1.5GB of RAM, 29GB disk space
• Streaming broadcast server: 1 x Node.js (broadcasts messages from the exchanges to the streamers, which lets us scale the streamers with ease) — single core, 3.5GB of RAM, 50GB disk space
• Widget servers: 2 x Node.js (load balanced — tested to support over 1 million calls per hour, currently at 50,000 on average) — single core, 3.5GB of RAM, 50GB SSD disk space
• API servers: 3 x Node.js (load balanced — tested to support over 10 million calls per hour, currently at 1.7 million on average) — single core, 3.5GB of RAM, 50GB SSD disk space
• Script servers: 1 x Node.js — single core, 1.5GB of RAM, 29GB disk space
• Data aggregation server: 1 x Node.js (creates our aggregated price from all the trades on all the exchanges we integrate with) — single core, 3.5GB of RAM, 50GB SSD disk space
• CMS/front-end web servers: 2 x Umbraco (C#) + IIS (load balanced — tested to support over 100,000 page views per hour, currently at 12,000) — 4 cores, 7GB of RAM, 120GB SSD disk space
• Image CDN: Azure blob storage
• Forum, reviews, static content and member data: 1 x Azure SQL Server
• Forum and member stats: 1 x Redis — single core, 1.5GB of RAM, 29GB disk space
• Emails: SparkPost (forum notifications and the daily update email) and Mandrill (member management emails: registration, activation, password reset)

How do we get the data?

We have one virtual machine for each exchange we integrate with. Each exchange virtual machine runs the following set of Node.js services:

• the trades service — specific to each exchange, using either a polling or a web sockets API. Its sole purpose is to fetch trade data and save it in our trades format (see the sketch after this list)
• the historical data service — reads the trade file and saves minute and hourly data to the historical database
• the snapshot service — reads the trade file and saves current aggregated data to the Redis server (current price, total volume, last trades, etc.)
• the streaming service — reads the trade file and streams aggregated data to the streaming servers
• the trades archive service — reads the trade files and saves trade data in Azure blob storage
• the order book service — connects to each exchange and streams order book data to the streamers (this will be expanded to store the data as well)
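Since the trades service is the root of the whole pipeline, here is a minimal, hypothetical sketch of the polling variant in vanilla Node.js. The exchange URL, the response shape, the trade file path and the internal trade format are all assumptions for illustration, not our actual code.

```js
// Hypothetical polling trades service, in the spirit of the per-exchange
// VMs described above. All names and endpoints below are placeholders.
const https = require('https');
const fs = require('fs');

const EXCHANGE_URL = 'https://api.example-exchange.com/v1/trades?pair=BTCUSD'; // placeholder
const TRADE_FILE = '/data/trades.log'; // placeholder path
let lastTradeId = 0; // only persist trades we have not seen yet

function poll() {
  https.get(EXCHANGE_URL, (res) => {
    let body = '';
    res.on('data', (chunk) => { body += chunk; });
    res.on('end', () => {
      let trades;
      try { trades = JSON.parse(body); } catch (e) { return; } // skip bad payloads
      trades
        .filter((t) => t.id > lastTradeId) // assumed response shape: [{id, price, amount, timestamp}]
        .forEach((t) => {
          lastTradeId = t.id;
          // Normalise into a single internal format so downstream services
          // can stay exchange-agnostic.
          const record = {
            exchange: 'ExampleExchange',
            pair: 'BTC-USD',
            price: Number(t.price),
            volume: Number(t.amount),
            ts: t.timestamp
          };
          fs.appendFile(TRADE_FILE, JSON.stringify(record) + '\n', () => {});
        });
    });
  }).on('error', () => { /* log and retry on the next tick */ });
}

setInterval(poll, 1000); // poll once per second; real intervals vary per exchange
```

The point of the normalisation step is that every downstream service (historical, snapshot, streaming, archive) reads the same trade file format and never has to know which exchange the data came from.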
How does the data reach our members?

The front-end Umbraco server has Redis and PostgreSQL integration, but most of the data comes from our API servers. We use the same API servers that are publicly available to everyone; consuming the same APIs is a lot less work in the long run (eat your own dog food). It also helps us find and solve scaling issues as both the API usage and our web servers grow.

A lot of the front-end is built using AngularJS, and since the beginning we've had to build our whole product around APIs. It was one of the best decisions we took early on, and it has made our development a lot more efficient. It has also allowed us to share some code between the Node.js back-end servers and the AngularJS front-end.

Most of our price pages load with a Redis snapshot of the prices, then connect to our streaming servers and start streaming the rest of the data (a rough client-side sketch of this pattern closes out this post).

Why this stack?

Back-end — When we started the website (late 2014), Node.js and AngularJS were gaining a lot of popularity and I wanted to try them out. Without AngularJS I doubt we could have built the website in under a year. I previously worked on a similar project where just building the front-end in jQuery took a lot longer (and it was really messy). Before Paul Dobre joined our team the front-end was an afterthought; I worked on it for a bit, but didn't do a very good job as I'm generally bad at design. Most of the early work went into developing the exchange integration and the data flow. On the Node.js side we don't use any frameworks, just vanilla Node, since most of the code simply fetches or serves JSON data. Redis and PostgreSQL are some of the best tools when it comes to snapshot and historical data.

Front-end — We chose Umbraco because I find it the easiest CMS to work with from an editor's point of view. I was also comfortable enough with it to make it work for our project, and I knew I would just use it as an API layer for our AngularJS application.

Streaming — Socket.IO was really easy to use and saved me a lot of time. So far, everything has worked really well in terms of both efficiency and scalability. All our external services are load balanced and scalable, and our internal servers have a full backup available at the click of a mouse.

Next week I'm planning to write a blog post about how our API works.

As always, if you have any questions or suggestions, feel free to comment.
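P.S. To make the "load a snapshot, then stream" pattern from the price pages a little more concrete, here is a rough client-side sketch. It assumes the Socket.IO client script is already on the page (global io), and the endpoint path, event name and payload shape are invented for illustration, not our actual API.

```js
// Hypothetical browser-side sketch of the price-page flow described above.
var API_BASE = 'https://api.example.com';      // placeholder, not our real API
var STREAM_URL = 'https://stream.example.com'; // placeholder, not our real streamer

function render(price) {
  // Stand-in for the AngularJS bindings on the real price pages.
  console.log('BTC-USD', price);
}

// 1. Snapshot: one cheap call against the Redis-backed API gives the page
//    something to render immediately.
fetch(API_BASE + '/price?fsym=BTC&tsym=USD')
  .then(function (res) { return res.json(); })
  .then(function (snapshot) {
    render(snapshot.price);

    // 2. Stream: once the snapshot is on screen, switch to pushed updates.
    var socket = io(STREAM_URL);
    socket.emit('subscribe', { pair: 'BTC-USD' }); // assumed subscription protocol
    socket.on('trade', function (msg) {
      render(msg.price);
    });
  });
```

The snapshot call gives the page something to show straight away, so the socket only has to carry the updates that arrive after the page loads.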
