node.js源码研究(启动与模块加载)

4960阅读 0评论2014-03-05 ygfinsight
分类:JavaScript

一、源码组成
1.它有8000行C++代码,2000行javascript代码
2.node.js里内置的javascript包括了主程序src/node.js和模块程序lib/*.js
3.node.js所依赖的主要的库:v8,uv,zlib
二、js2c.py工具
node.js使用了V8附带的js2c.py工具把所有内置的js代码转换成C++里的数组,
生成node_natives.h直接include到程序中,成了C++源码的一部分,
这样做能提高内置js模块的编译效率。
通过js2c.py让每一个js文件都生成一个源码数组,转换后存放在build/src/node_natives.h里,
node_natives.h在node.js编译后才会生成。
大致代码结构如下:
struct _native {
  const char* name;
  const char* source;
  size_t source_len;
};
static const struct _native natives[] = {
  { "node", node_native, sizeof(node_native)-1 },
  { "_debugger", _debugger_native, sizeof(_debugger_native)-1 },
  { "_linklist", _linklist_native, sizeof(_linklist_native)-1 },
  { "assert", assert_native, sizeof(assert_native)-1 },
  { "buffer", buffer_native, sizeof(buffer_native)-1 }
}
三、启动和加载
1.入口:node_main.cc,对命令行参数做解析处理后调用node::Start(argc, argv)启动
2.初始化v8接口,V8::Initialize()
3.在v8中创建并设置process对象:Handle SetupProcessObject(int argc, char *argv[]) {}
4.加载node.js,void Load(Handle process_l)
node.js说明:
这个文件是启动node的核心,由node::Load in src/node.cc调用,考虑到启动启动过程的性能,
所有依赖采用延迟加载
通过MainSource()获取已转化的src/node.js源码,并执行它:源码已经通过转化包含在node_natives.h,
node_native是一个字符串,包含着源码。
Local f_value = ExecuteString(MainSource(),
IMMUTABLE_STRING("node.js"));

node.js代码:
(function(process){
.........
});
由此看出,执行完node.js得到的是一个函数
在node.cc中对函数处理:
assert(f_value->IsFunction());
 
 Local f = Local::Cast(f_value);
创建函数执行环境,调用函数,把process传入
Local global = v8::Context::GetCurrent()->Global();
 
Local args[1] = { Local::New(process_l) };
f->Call(global, 1, args);
执行:
uv_run(uv_default_loop(), UV_RUN_DEFAULT); 


启动入口函数代码参考:
int Start(int argc, char *argv[]) {
  // Hack aroung with the argv pointer. Used for process.title = "blah".
  argv = uv_setup_args(argc, argv);


  // Logic to duplicate argv as Init() modifies arguments
  // that are passed into it.
  char **argv_copy = copy_argv(argc, argv);


  // This needs to run *before* V8::Initialize()
  // Use copy here as to not modify the original argv:
  Init(argc, argv_copy);
  V8::Initialize();
  {
    Locker locker;
    HandleScope handle_scope;


    // Create the one and only Context.
    Persistent context = Context::New();
    Context::Scope context_scope(context);


    // Use original argv, as we're just copying values out of it.
    Handle process_l = SetupProcessObject(argc, argv);
    v8_typed_array::AttachBindings(context->Global());


    // Create all the objects, load modules, do everything.
    // so your next reading stop should be node::Load()!
    Load(process_l);


    // All our arguments are loaded. We've evaluated all of the scripts. We
    // might even have created TCP servers. Now we enter the main eventloop. If
    // there are no watchers on the loop (except for the ones that were
    // uv_unref'd) then this function exits. As long as there are active
    // watchers, it blocks.
    uv_run(uv_default_loop(), UV_RUN_DEFAULT);


    EmitExit(process_l);
    RunAtExit();


#ifndef NDEBUG
    context.Dispose();
#endif
  }


#ifndef NDEBUG
  // Clean up. Not strictly necessary.
  V8::Dispose();
#endif  // NDEBUG


  // Clean up the copy:
  free(argv_copy);


  return 0;
}
四、系统内置c++模块的加载
node.js的模块除了lib/*.js里用js语言编写的以外,还有一些系统模块使用C++编写,
这些模块都通过node.h提供的NODE_MODULE方法存储在变量_module里。
node_extensions.cc提供了get_builtin_module(name)接口从一个哈希表查找这些模块。


static Handle Binding(const Arguments& args) {
  HandleScope scope;


  Local module = args[0]->ToString();
  String::Utf8Value module_v(module);
  node_module_struct* modp;


  if (binding_cache.IsEmpty()) {
    binding_cache = Persistent::New(Object::New());
  }


  Local exports;


  if (binding_cache->Has(module)) {
    exports = binding_cache->Get(module)->ToObject();
    return scope.Close(exports);
  }


  // Append a string to process.moduleLoadList
  char buf[1024];
  snprintf(buf, 1024, "Binding %s", *module_v);
  uint32_t l = module_load_list->Length();
  module_load_list->Set(l, String::New(buf));


  if ((modp = get_builtin_module(*module_v)) != NULL) {
    exports = Object::New();
    // Internal bindings don't have a "module" object,
    // only exports.
    modp->register_func(exports, Undefined());
    binding_cache->Set(module, exports);


  } else if (!strcmp(*module_v, "constants")) {
    exports = Object::New();
    DefineConstants(exports);
    binding_cache->Set(module, exports);


  } else if (!strcmp(*module_v, "natives")) {
    exports = Object::New();
    DefineJavaScript(exports);
    binding_cache->Set(module, exports);


  } else {


    return ThrowException(Exception::Error(String::New("No such module")));
  }


  return scope.Close(exports);
}
从源码中可以看出,加载c++模块时先从缓存里面找,找不到再get_builtin_module查找C++内置模块,
找到的话获取后绑定在exports上,在最后返回exports。
五、c++扩展模块加载
对于以.node为扩展名的模块,采用Dlopen加载
Handle DLOpen(const v8::Arguments& args) {
  HandleScope scope;
  char symbol[1024], *base, *pos;
  uv_lib_t lib;
  int r;


  if (args.Length() < 2) {
    Local exception = Exception::Error(
        String::New("process.dlopen takes exactly 2 arguments."));
    return ThrowException(exception);
  }


  Local module = args[0]->ToObject(); // Cast
  String::Utf8Value filename(args[1]); // Cast


  if (exports_symbol.IsEmpty()) {
    exports_symbol = NODE_PSYMBOL("exports");
  }
  Local exports = module->Get(exports_symbol)->ToObject();


  if (uv_dlopen(*filename, &lib)) {
    Local errmsg = String::New(uv_dlerror(&lib));
#ifdef _WIN32
    // Windows needs to add the filename into the error message
    errmsg = String::Concat(errmsg, args[1]->ToString());
#endif
    return ThrowException(Exception::Error(errmsg));
  }


  String::Utf8Value path(args[1]);
  base = *path;


  /* Find the shared library filename within the full path. */
#ifdef __POSIX__
  pos = strrchr(base, '/');
  if (pos != NULL) {
    base = pos + 1;
  }
#else // Windows
  for (;;) {
    pos = strpbrk(base, "\\/:");
    if (pos == NULL) {
      break;
    }
    base = pos + 1;
  }
#endif


  /* Strip the .node extension. */
  pos = strrchr(base, '.');
  if (pos != NULL) {
    *pos = '\0';
  }


  /* Add the `_module` suffix to the extension name. */
  r = snprintf(symbol, sizeof symbol, "%s_module", base);
  if (r <= 0 || static_cast(r) >= sizeof symbol) {
    Local exception =
        Exception::Error(String::New("Out of memory."));
    return ThrowException(exception);
  }


  /* Replace dashes with underscores. When loading foo-bar.node,
  * look for foo_bar_module, not foo-bar_module.
   */
  for (pos = symbol; *pos != '\0'; ++pos) {
    if (*pos == '-') *pos = '_';
  }


  node_module_struct *mod;
  if (uv_dlsym(&lib, symbol, reinterpret_cast(&mod))) {
    char errmsg[1024];
    snprintf(errmsg, sizeof(errmsg), "Symbol %s not found.", symbol);
    return ThrowError(errmsg);
  }


  if (mod->version != NODE_MODULE_VERSION) {
    char errmsg[1024];
    snprintf(errmsg,
             sizeof(errmsg),
             "Module version mismatch. Expected %d, got %d.",
             NODE_MODULE_VERSION, mod->version);
    return ThrowError(errmsg);
  }


  //  
  mod->register_func(exports, module);


  // Tell coverity that 'handle' should not be freed when we return.
  // coverity[leaked_storage]
  return Undefined();
}


Dlopen打开动态链接库之后,调用libuv的uv_dlsym, 找到动态链接库中通过NODE_MODULE定义方法的地址。挂载到exports对象上。
六、js模块加载
src/node.js上实现了一个NativeModule对象用于管理js模块,它通过调用process.binding(“natives”)把所有内置的js模块放在NativeModule._source上,
并提供require接口供调用。在require里会给代码加一层包装,把一些变量传给这个模块。
NativeModule.wrapper = [
    '(function (exports, require, module, __filename, __dirname) { ',
    '\n});'
  ];
  再用process提供的其中一个js编译接口process.runInThisContext执行代码。
  var Script = process.binding('evals').NodeScript;
  var runInThisContext = Script.runInThisContext;
  var fn = runInThisContext(source, this.filename, true);
    fn(this.exports, NativeModule.require, this, this.filename);