Thursday, December 10, 2015

Using Scala to implement a DevOps System

1. Background

In our company, the main programming language is Java, so all of our Internet-facing servers have JDK 6 and JDK 7 deployed. Java is easy to master and less error-prone than C/C++, which I’m also familiar with. But sometimes I feel that Java’s syntax is a little weak. For example, before Java 8, transforming a collection meant either writing a class that is referenced only once or iterating through the collection with a loop.
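To make the complaint concrete, here is a minimal sketch of the same transformation written both ways. The class and method names are made up for illustration; only the contrast between the loop and the Java 8 stream matters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; nothing here comes from the real system.
class NameLengths {
    // Pre-Java 8 style: an explicit loop and a mutable result list.
    static List<Integer> withLoop(List<String> names) {
        List<Integer> lengths = new ArrayList<>();
        for (String name : names) {
            lengths.add(name.length());
        }
        return lengths;
    }

    // Java 8 style: the same transformation as a stream pipeline.
    static List<Integer> withStream(List<String> names) {
        return names.stream().map(String::length).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("web", "db", "cache");
        System.out.println(withLoop(names));   // prints [3, 2, 5]
        System.out.println(withStream(names)); // prints [3, 2, 5]
    }
}
```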
Then there was a need to optimize the DevOps system for our game. The system is used to:
  • maintain the machine database
  • maintain service settings (Java, PHP, static web resources, DB, etc.)
  • deploy new versions of services
  • upgrade DB schemas, etc.
For service deployment, there are four main steps:
  • upload the service package to the destination machine, which is very bandwidth-consuming because package sizes range from 5 MB to 500 MB
  • stop services on the machines (each machine can host several types of services, each with multiple instances)
  • install the package and transfer the configuration
  • start services on the machines
It was originally implemented in PHP and was able to manage about 100 machines and 1000 services. Due to its design limitations, the deployment process took too long, say more than 3 hours, once the number of machines and services grew beyond 200 and 2000 respectively.

2. Refactor or Reimplement

After some discussion, we decided to reimplement it in Java or another JVM language. The reasons are as follows:
  • Its code is too messy to refactor. In fact, a lot of the code is duplicated.
  • We want to use a statically typed programming language that eases future refactoring. There might be some tools for PHP (e.g. Facebook HipHop) that we could resort to in order to improve the situation, but as a small team whose main members are Java programmers, we don’t wish to introduce extra complexity into the team.
  • Some functions are easier to implement in Java, for example background tasks.
  • We are confident that we can successfully reimplement it, but much less confident about refactoring it.
  • It’s an internal system, so we can make bold decisions without worrying much about backward compatibility.

3. The architecture

We first encountered Scala while investigating RPC frameworks, during which we found Twitter’s Finagle (implemented in Scala) quite impressive. After reading some articles introducing Scala, we decided we should at least try the language to see if it was useful for us.
We chose the following architecture:
  • Play! framework 2 as the main framework
  • We didn’t use the Twirl template engine but instead chose AngularJS + JSON. The latter allows frontend and backend developers to work concurrently.
  • Slick to access the database
  • H2 and MySQL as the development and production databases respectively
  • Akka to implement background tasks, more precisely service status monitoring and service deployment
  • At first we wrote the user/group management module ourselves, but later adopted Keycloak for SSO since it’s much more flexible.

4. Learning

I had learned a little Scala while reading the source code of Finagle. Then I read the book Programming in Scala and some online manuals. Still, I found there are many syntax tricks that can easily surprise me.
The documentation of the Play! framework is actually quite friendly, so I didn’t meet much difficulty at first. But as soon as I wanted something complex or something not mentioned in the manual, I found I needed to turn to the source code for help. Two examples are:
  • Using the built-in form handling library provided by the Play! framework to handle dynamic forms that contain dynamic fields
  • Using Slick to write some complex SQL queries.
AngularJS is also easy to grasp at the beginning, but you need to read the sources of its libraries to understand its spirit more deeply.
There was another challenge for me: living in China, behind the GFW, caused me a lot of trouble while setting up the build environment for SBT, NPM and Bower.

5. The Outcome

Fortunately, the outcome is quite pleasant: the system has saved us a lot of pain in managing more than 1000 machines and more than 6000 services.
  • By using multiple proxy nodes, we can send installation packages (ranging in size from 5 MB to 500 MB) to destination machines much faster than before.
  • By enabling concurrent deployment, deployment time dropped dramatically.
  • By adding query-friendly logs, it’s much easier to trace history and fix errors.
  • By allowing new services to be added via configuration, all of our services are now under its management, which has greatly reduced configuration errors.
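The concurrent deployment point can be illustrated with a small sketch. The real system uses Akka for this; the version below swaps in a plain JDK thread pool just to show the idea, and every name in it (ConcurrentDeploy, deployTo, deployAll) is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these names come from the real system.
class ConcurrentDeploy {
    // Stand-in for the upload/stop/install/start steps on one machine.
    static String deployTo(String machine) {
        return machine + ": deployed";
    }

    // Deploys to all machines with at most `parallelism` deployments in flight.
    static List<String> deployAll(List<String> machines, int parallelism) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Callable<String>> jobs = new ArrayList<>();
            for (String machine : machines) {
                jobs.add(() -> deployTo(machine));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every job has finished and keeps input order.
            for (Future<String> done : pool.invokeAll(jobs)) {
                results.add(done.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("deployment interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a deployment failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Bounding the pool size is what keeps a large rollout from saturating bandwidth on the upload step; the right value would depend on the proxy nodes available.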
Below is a screenshot of the system:
[Figure: DevOps Tool]

6. Conclusion

After the project, I guess Scala is not suitable for us, or at least not suitable for projects like this.
My major concern is maintainability. One thing that those who argue Scala is better than Java point out is that we can achieve the same functionality with far less source code in Scala. But I also see this as a weakness. In fact, more often than not, our team members complained that the code gave too little context for someone new to read source code written by others.
Another reason is that Scala permits domain-specific languages, which may be attractive at first. But as soon as we want to adopt new libraries that introduce some 'weird' grammar, we may shout WTF. It’s as if, besides consulting the API docs, we also have to learn a new language that is usually not so well defined.
The final complaint is about reactive programming. Play! framework 2 recommends this technique, which also seems fancy at first. But when you need to debug or try to figure out the cause of some error log, you soon become frustrated.
One may argue that all the above drawbacks can be avoided by some programming standard or convention. The truth is that you can stop yourself or your team from making bad decisions, but there is little you can do to stop other people, say library writers. Nowadays, it would be unreasonable not to use open source projects.

Tuesday, December 01, 2015

JavaScript Module Dependency Resolution Techniques

1. JavaScript package dependencies

There are three specifications
The corresponding implementations:
The related syntax:

1.1. AMD

Defining the module in somethingAwesome.js:
define(function () {
  // The module's value is whatever the factory function returns
  return function () {
    console.log('Awesome');
  };
});
Consuming the module. Note that the local name does not necessarily have to match the name of the module.
define(['somethingAwesome'], function (thatAwesomeSomething) {
  thatAwesomeSomething();
});

1.2. CommonJS

Defining the module in somethingAwesome.js:
module.exports = function () {
  console.log('Awesome');
};
Consuming it somewhere else:
// Import our awesome module, you can call it anything you want.
// A relative path is assumed here, since the module lives in a local file.
var thatAwesomeSomething = require('./somethingAwesome');
// Use it
thatAwesomeSomething();

1.3. ES6

//ES6
import Class1 from 'file1';
import Class2 from 'file2';

let obj = new Class1(),
    obj2 = new Class2();

export default obj.foo(obj2);

AngularJS Code Reading

1. application life-cycle

  • configuration: configures and instantiates all providers
  • run: interaction with providers is disallowed and the process of creating services starts
In optool scala:
  • configuration phase (app.js): configures ngRouteProvider, httpProvider and ngDialogProvider
  • run phase: a unified ping, plus other behaviour

Thursday, November 19, 2015

Classes related to Netty Transport - Part2

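1. Related flows

The flows below refer specifically to NioServerSocketChannel/NioSocketChannel.

1.1. The bind flow

  • Create a new Channel
  • Register it with the boss group; registration fires these events
    • fireChannelRegistered
    • fireChannelActive (only on the first registration; for example, registering again after an unregister does not fire it)
  • Call Java NIO to bind to the port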
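As a rough analogue of these three steps using only the JDK, here is a minimal sketch. It is not Netty code: plain NIO has no fireChannelRegistered/fireChannelActive events, and the class and method names are made up. Registering with an empty interest set and enabling OP_ACCEPT only after the bind is my understanding of how Netty orders things, so treat that ordering as an assumption.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical sketch of create -> register -> bind with plain Java NIO.
class NioBindSketch {
    // Returns the local port that ended up being bound.
    static int bindOnLoopback() {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) { // 1. create the channel
            server.configureBlocking(false); // selectors only accept non-blocking channels
            SelectionKey key = server.register(selector, 0); // 2. register, no interest yet
            // 3. bind; port 0 asks the OS for any free port
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            key.interestOps(SelectionKey.OP_ACCEPT); // only now ask for accept events
            return ((InetSocketAddress) server.getLocalAddress()).getPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```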

Classes related to Netty Transport - Part1

1. EventLoop

[Figure: EventLoop]

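1.1. Related classes

  • Names inside boxes are interfaces; names outside boxes are classes
  • Light blue marks the executors from the java.util.concurrent package
  • Green marks EventExecutor from the io.netty.util.concurrent package
  • Yellow marks EventLoop from the io.netty.channel package
The Executor classes are mainly responsible for running Runnable/Callable tasks (immediately or on a schedule). The EventExecutor classes add a method for checking whether the current thread is inside the event loop. The EventLoop classes provide interfaces to register/unregister a Channel with an EventLoop, and also return a ChannelHandlerInvoker that is used to trigger actions or events such as register/active/write in the EventLoop.
Summary:
  • EventLoopGroup: a collection of EventLoops; registers channels; returns an EventExecutor (the next function)
  • Executor: runs a Runnable
  • ExecutorService: runs a Callable; shutdown
  • ScheduledExecutorService: runs Runnables and Callables on a schedule
  • EventExecutor: checks whether the current thread belongs to this executor (inEventLoop)
  • EventExecutorGroup: shutdown; a collection of EventExecutors
  • EventLoop: returns a ChannelHandlerInvoker, used to trigger actions or events such as register, active and write in the EventLoop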
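The inEventLoop idea is easy to show without Netty. The sketch below is a hypothetical miniature (MiniEventExecutor is not a Netty class): a java.util.concurrent.Executor backed by one thread, plus a method that tells the caller whether it is already running on that thread.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical miniature of the EventExecutor idea; not Netty's implementation.
class MiniEventExecutor implements Executor {
    private final BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<>();
    private final Thread thread;

    MiniEventExecutor() {
        thread = new Thread(() -> {
            try {
                while (true) {
                    tasks.take().run(); // single thread runs tasks one by one
                }
            } catch (InterruptedException e) {
                // interrupted by shutdown(): leave the loop
            }
        });
        thread.setDaemon(true);
        thread.start();
    }

    @Override
    public void execute(Runnable task) {
        tasks.add(task);
    }

    // True only when called from the executor's own thread.
    boolean inEventLoop() {
        return Thread.currentThread() == thread;
    }

    void shutdown() {
        thread.interrupt();
    }

    // Returns {inEventLoop seen by the caller, inEventLoop seen inside a task}.
    static boolean[] demo() {
        MiniEventExecutor executor = new MiniEventExecutor();
        boolean[] seen = new boolean[2];
        CountDownLatch done = new CountDownLatch(1);
        seen[0] = executor.inEventLoop();
        executor.execute(() -> {
            seen[1] = executor.inEventLoop();
            done.countDown();
        });
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        executor.shutdown();
        return seen;
    }
}
```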

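1.2. NioEventLoop implementation

The class NioEventLoop implements EventLoop, but the actual implementation is provided by its superclasses: AbstractScheduledEventExecutor, SingleThreadEventExecutor and SingleThreadEventLoop.
AbstractScheduledEventExecutor
  • Implements scheduling of Runnable/Callable tasks
  • Has a scheduledTaskQueue, implemented with a PriorityQueue, whose head is the task with the earliest due time
SingleThreadEventExecutor
Builds on AbstractScheduledEventExecutor and provides addTask and pollTask.
It owns two queues:
  • its own taskQueue, which holds the tasks that need to run right away
  • the scheduledTaskQueue inherited from AbstractScheduledEventExecutor
The function takeTask fetches a runnable that needs to run right away. It is implemented as follows:
  • Look at the head of scheduledTaskQueue. If there is no scheduled task at all, take from taskQueue (waiting indefinitely).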
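This may look like a bug: a task newly added to the scheduled task queue during the indefinite wait would never run. In fact, submitting a ScheduledTask does the following:
<V> ScheduledFuture<V> schedule(final ScheduledFutureTask<V> task) {
    if (inEventLoop()) { // (1) already on the event loop thread: add directly
        scheduledTaskQueue().add(task);
    } else { // (2) called from another thread: hand the add over to the event loop
        execute(new OneTimeTask() {
            @Override
            public void run() {
                scheduledTaskQueue().add(task);
            }
        });
    }

    return task;
}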
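  • In case (1), the next loop iteration checks scheduledTaskQueue first again, so the task will be executed
  • In case (2), the thread is first woken up to run the OneTimeTask, and the next iteration again checks scheduledTaskQueue first
So there is no bug.
  • If there is a scheduled task, check its dueTime.
    • If it is already due, return it.
    • If it is not yet due, wait on taskQueue until dueTime.
      • If dueTime arrives and taskQueue is still empty, return the scheduled task.
      • Otherwise return the item from taskQueue.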
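The takeTask logic described above can be sketched with two JDK queues. This follows the description in this post rather than Netty's actual source; the class TakeTaskSketch and its Scheduled helper are hypothetical, and the scheduled queue is assumed to be touched only by the loop thread.

```java
import java.util.PriorityQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the two-queue takeTask logic; not Netty's code.
class TakeTaskSketch {
    // A runnable plus an absolute deadline; ordered by earliest deadline first.
    static final class Scheduled implements Comparable<Scheduled> {
        final long deadlineNanos;
        final Runnable body;

        Scheduled(long delayNanos, Runnable body) {
            this.deadlineNanos = System.nanoTime() + delayNanos;
            this.body = body;
        }

        @Override
        public int compareTo(Scheduled other) {
            return Long.compare(deadlineNanos, other.deadlineNanos);
        }
    }

    final BlockingQueue<Runnable> taskQueue = new LinkedBlockingQueue<>();
    final PriorityQueue<Scheduled> scheduledTaskQueue = new PriorityQueue<>();

    Runnable takeTask() throws InterruptedException {
        Scheduled head = scheduledTaskQueue.peek();
        if (head == null) {
            return taskQueue.take(); // nothing scheduled: wait indefinitely
        }
        long delayNanos = head.deadlineNanos - System.nanoTime();
        if (delayNanos > 0) {
            // Not due yet: wait on taskQueue, but no longer than the deadline.
            Runnable task = taskQueue.poll(delayNanos, TimeUnit.NANOSECONDS);
            if (task != null) {
                return task;
            }
        }
        scheduledTaskQueue.poll(); // due now (or the wait timed out): remove and return it
        return head.body;
    }

    // Queues one scheduled and one immediate task, then records the run order.
    static String demo() {
        TakeTaskSketch loop = new TakeTaskSketch();
        StringBuilder order = new StringBuilder();
        loop.scheduledTaskQueue.add(new Scheduled(20_000_000L, () -> order.append("scheduled")));
        loop.taskQueue.add(() -> order.append("immediate,"));
        try {
            loop.takeTask().run();
            loop.takeTask().run();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order.toString();
    }
}
```

In the demo the immediate task comes out first because the scheduled one is still 20 ms away from its deadline; the second call then waits out the deadline and returns the scheduled task.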
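boolean runAllTasks(long timeoutNanos) is used by subclasses; it runs tasks until the timeout.
SingleThreadEventLoop
There is not much in this class.
NioEventLoop itself
Relying on Java NIO (e.g. Selector), it provides the functionality for registering channels (which involves calling Channel.register along the way). It also overrides the run function of SingleThreadEventExecutor, thereby defining the thread's main loop.
The main loop logic is as follows:
  • select and wait to be woken up (the logic here is complicated and I could not follow it)
  • Based on the configured I/O-to-task time ratio, run the selected work and the queued tasks in proportion
Selected work mainly means socket reads/writes; queued tasks are mainly the submitted Runnables and Callables.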
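One way to turn such a ratio into a time budget is sketched below. The formula is how I recall Netty computing the timeout it passes to runAllTasks, but treat it as an assumption; the class name is made up, and the boundary values (0 and 100) are deliberately left out because I am not sure how the real loop treats them.

```java
// Hypothetical helper; the formula is an assumption, not verified Netty source.
class IoRatioSketch {
    // Given the time just spent on I/O and the percentage of loop time that
    // I/O should get, returns how long queued tasks may run in this iteration.
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        if (ioRatio < 1 || ioRatio > 99) {
            throw new IllegalArgumentException("ioRatio must be in 1..99: " + ioRatio);
        }
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }
}
```

With a ratio of 50, tasks get as much time as I/O just took; with 75, tasks get a third of the I/O time.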

2. Channel

2.1. Related classes

[Figure: netty channel.png]

2.2. NioServerSocketChannel

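When using ServerBootstrap, you need to supply a Channel class that handles the server socket's read operation (i.e. accept). With NIO + TCP, NioServerSocketChannel is the usual choice.
Oddly, this class extends AbstractNioMessageChannel. The reason is that AbstractNioMessageChannel implements the read function (more precisely, it implements a subclass of AbstractNioUnsafe) and requires subclasses to implement readMessage. NioServerSocketChannel.readMessage in turn returns a NioSocketChannel object by calling accept.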
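The "accept is just a read that yields a channel" idea can be sketched with plain JDK sockets. Everything below is hypothetical (MessageChannel and ServerChannel are stand-ins, not Netty classes), and the server is left in blocking mode to keep the demo deterministic.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the class names are stand-ins, not Netty's.
class AcceptAsReadSketch {
    abstract static class MessageChannel {
        // Subclasses append whatever counts as a "message" for them.
        abstract void readMessage(List<Object> out) throws IOException;

        // The generic read only knows about messages, not about sockets.
        final List<Object> read() throws IOException {
            List<Object> messages = new ArrayList<>();
            readMessage(messages);
            return messages;
        }
    }

    static final class ServerChannel extends MessageChannel {
        private final ServerSocketChannel server;

        ServerChannel(ServerSocketChannel server) {
            this.server = server;
        }

        @Override
        void readMessage(List<Object> out) throws IOException {
            SocketChannel child = server.accept(); // the "message" is a new connection
            if (child != null) {
                out.add(child);
            }
        }
    }

    // Connects one client and returns how many messages a single read() produced.
    static int demo() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                List<Object> messages = new ServerChannel(server).read();
                for (Object message : messages) {
                    ((SocketChannel) message).close();
                }
                return messages.size();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```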

2.3. AbstractChannel

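  • Contains the implementation skeleton for register, write, bind, etc.
  • Subclasses must implement functions such as doRegister; for example, AbstractNioChannel implements doRegister and NioSocketChannel implements doWrite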
[Figure: netty AbstractChannel do.png]

3. Pipeline

[Figure: netty pipeline]
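  • DefaultChannelPipeline contains a doubly linked list whose elements are AbstractChannelHandlerContext objects. The Head and Tail of the list are built in and cannot be changed. When we add a ChannelHandler to the pipeline (e.g. by calling addLast), what actually happens is that an AbstractChannelHandlerContext is created and added to this doubly linked list.
  • Event-type messages traverse downward from the head, e.g. fireChannelRegistered, fireChannelActive, fireChannelRead
  • Action-type messages traverse from the tail, e.g. write, read, connect, close
  • The built-in HeadContext handles calls such as connect and write and invokes the corresponding functions in AbstractChannel. The handling details differ depending on the action and the channel type.
  • The built-in TailContext does nothing special; it only releases ByteBufs that nobody handled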
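The list structure and the two traversal directions can be sketched in a few lines. MiniPipeline below is a hypothetical toy, not Netty's DefaultChannelPipeline: it only records which contexts an event or an action would visit, in order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy pipeline; it only records traversal order.
class MiniPipeline {
    // One node of the doubly linked list.
    static final class Context {
        final String name;
        Context prev;
        Context next;

        Context(String name) {
            this.name = name;
        }
    }

    // Head and tail are built in and never replaced.
    private final Context head = new Context("head");
    private final Context tail = new Context("tail");

    MiniPipeline() {
        head.next = tail;
        tail.prev = head;
    }

    // Wraps the handler name in a new context and links it just before the tail.
    MiniPipeline addLast(String handlerName) {
        Context ctx = new Context(handlerName);
        Context last = tail.prev;
        ctx.prev = last;
        ctx.next = tail;
        last.next = ctx;
        tail.prev = ctx;
        return this;
    }

    // Event-style traversal: head towards tail.
    List<String> fireEvent() {
        List<String> visited = new ArrayList<>();
        for (Context c = head; c != null; c = c.next) {
            visited.add(c.name);
        }
        return visited;
    }

    // Action-style traversal: tail towards head.
    List<String> invokeAction() {
        List<String> visited = new ArrayList<>();
        for (Context c = tail; c != null; c = c.prev) {
            visited.add(c.name);
        }
        return visited;
    }
}
```

For example, after addLast("decoder") and addLast("handler"), an event visits head, decoder, handler, tail, while an action visits them in the reverse order.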

4. AddressResolver

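In Netty, AddressResolver is responsible for translating names into SocketAddress. To improve speed (or something else), Netty defines AddressResolverGroup, with one Resolver associated with each EventExecutor.
Netty has three AddressResolverGroup implementations:
  • DefaultAddressResolverGroup: resolves via Java's built-in InetAddress.getByName
  • DnsAddressResolverGroup: Netty's own DNS resolution implementation, with caching
  • NoopAddressResolverGroup: does nothing
When Bootstrap.connect() is called, AddressResolver.resolve(remoteAddress) is invoked on the current thread, and with the default implementation, DefaultAddressResolverGroup, this operation is blocking!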
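To show why the blocking matters, here is a minimal sketch of moving the same InetAddress.getByName call onto a separate executor so the calling thread is not held up. This is a generic JDK pattern, not something Netty does for you; ResolveOffThread and resolveAsync are hypothetical names.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;

// Hypothetical helper: runs the blocking lookup away from the calling thread.
class ResolveOffThread {
    static CompletableFuture<InetAddress> resolveAsync(String host, Executor executor) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return InetAddress.getByName(host); // blocking call, now on `executor`
            } catch (UnknownHostException e) {
                throw new CompletionException(e);
            }
        }, executor);
    }
}
```

A caller would pass a dedicated executor (not the event loop) and attach the connect step as a callback on the returned future.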
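I have not read the DnsAddressResolverGroup implementation (it seems to be a new feature in 4.1), but it should not be blocking (though it also runs on the Channel's EventGroup). Constructing it requires: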
  • EventLoop
  • name server list