Excellent Reading for Exploring Linux Bootloaders

The point of exploring Linux bootloaders is not multi-booting, nor burying yourself in a pile of tedious command-line work. The point is to:

Understand how a bootloader behaves
Think through the Linux boot process
Write or port a bootloader
Build on the bootloader to create greater value

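The last item may seem abrupt, but in project development and research it is an issue we have to face. Take handheld devices: "anti-hacking" has become a major consideration, yet adopting an open system is also unavoidable, so how do you secure the more advantageous position? Since 2003, many bootloaders have gradually adopted techniques from cryptography, with the whole kernel image and even entire partitions put through specific encryption. Balancing practicality against security thus becomes a major responsibility of the bootloader. As another example, [SoG], a senior schoolmate of mine, developed an RTOS on top of the well-known bootloader []. That is exactly this kind of new value, and it shows most clearly in time-critical systems.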

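An earlier blog post, [Micromonitor 簡介] ("Introduction to Micromonitor"), mentioned the book by [Ed Sutter] of Lucent, which covers firmware techniques in detail and explains how to use and port MicroMonitor. To dig deeper, though, you should read Chapter 7, "Bootloaders", of Christopher Hallinan's "Embedded Linux Primer". Christopher Hallinan is well known and needs little introduction; a great deal of MontaVista Software's work came from his hand. Thanks also go to him for making "Chapter 7: Bootloaders" available for download. See: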

[LinuxDevices.com mirror] (PDF)
[] (PDF)

The chapter covers five topics:

The role of a bootloader
Bootloader design challenges
A universal bootloader implementation: Das U-Boot
Porting U-Boot
Other bootloaders

This reminds me of a question I often ask when interviewing new hires:

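Since a bootloader is a "loader for booting", why go to the trouble of loading the kernel at all? Why not simply write the kernel to the corresponding linear address in hardware? The question is a kind of pattern, because it can be adapted to a specific hardware architecture, such as ARM, MIPS, or PowerPC, depending on the candidate's technical background.

The x86 bootloader is a special case, and the Linux bootloader is arguably the product of one typical design; there is in fact a lot of room to work with. Under a "no-kernel design" or an exokernel mindset, for example, the bootloader's role changes drastically: a bootloader that coexists with the kernel is possible, and is a quite pragmatic choice (think of watchdog-style applications).

So what is the answer? The most important part, of course, is to initialize the hardware "properly" and leave it in a "proper" state. One point that needs attention is Flash memory, where the way the devices are arranged is enough on its own to raise many detailed issues.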

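Many introductory books quietly gloss over how a bootloader works with a series of block diagrams, as if everything that happens during that "good old time" before the kernel is loaded, when one tries to converse with the hardware in machine language, were merely a philosophical elaboration of Heidegger's dictum "Language is the house of Being". No: Hegel tells us that "what exists is rational". Christopher Hallinan gets straight to the point about the challenges of bootloader design: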

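Even a simple "Hello World" program written in C requires significant hardware and software resources. The application developer does not need to know or care much about these details because the C runtime environment transparently provides this infrastructure. A bootloader developer has no such luxury. Every resource that a bootloader requires must be carefully initialized and allocated before it is used. One of the most visible examples of this is Dynamic Random Access Memory (DRAM).

The C programming language was designed first and foremost for writing operating systems. Interestingly, though, even a C program as simple as "Hello World" involves a good deal of interaction between software and hardware when viewed up close. To an application developer, RAM is, as the name says, "random access" "memory" (note: this description is debatable, since most storage devices are "randomly accessible" to some degree, but I will not go into that here), so naturally it can be manipulated through pointers, C's most elegant and powerful feature, which in turn involves subtle changes to the C runtime stack. For a bootloader, however, all of this is a serious challenge, because the bootloader itself must take responsibility for initialization and basic setup. More precisely, in the early phase of boot there is no stack at all, neither as a concept nor as anything concrete.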
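To make "the C runtime environment transparently provides this infrastructure" a little more concrete, here is a minimal sketch in C of two chores that startup code usually performs before ordinary C code runs: copying initialized data from where it is stored to where it must live in RAM, and zeroing uninitialized data. It is a host-side simulation under stated assumptions, not code from the book or from any real bootloader. `struct boot_layout` and `early_c_runtime_init` are hypothetical names; on real hardware the addresses would typically come from linker-script symbols, and this step is often written in assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout descriptor. On a real target these values would
 * normally be provided by linker-script symbols, not filled in by hand. */
struct boot_layout {
    const uint8_t *data_load;  /* initialized data as stored in Flash */
    uint8_t       *data_start; /* where that data must live in RAM    */
    size_t         data_size;  /* number of bytes to copy             */
    uint8_t       *bss_start;  /* zero-initialized variables in RAM   */
    size_t         bss_size;   /* number of bytes to clear            */
};

/* Copy initialized data to RAM and clear the zero-initialized region.
 * Plain byte loops are used on purpose: a bootloader cannot assume
 * that a C library is available this early. */
void early_c_runtime_init(const struct boot_layout *layout)
{
    size_t i;

    assert(layout != NULL);

    for (i = 0; i < layout->data_size; i++)
        layout->data_start[i] = layout->data_load[i];

    for (i = 0; i < layout->bss_size; i++)
        layout->bss_start[i] = 0;
}
```

On a host, two ordinary arrays can stand in for Flash and RAM: fill in a `struct boot_layout` that points at them, call `early_c_runtime_init`, and check that the "RAM" array holds the copied bytes followed by zeros.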

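The problems are only beginning. "7.2.3 Image Complexity" raises a number of classic issues. How the C compiler and linker arrange machine code and object output is something programmers generally do not need to spend much time on, but for bootloader design it is a completely different matter. The nightmare starts with the C runtime stack, and so the author points out: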

The bootloader must create this execution context before any C functions are called. When the bootloader is compiled and linked, the developer must exercise complete control over how the image is constructed and linked. This is especially true if the bootloader is to relocate itself from Flash to RAM. The compiler and linker must be passed a handful of parameters defining the characteristics and layout of the final executable image. Two primary characteristics conspire to add complexity to the final binary executable image.

So establishing the "execution context" is the first task of any operating system or environment built mainly in C (this holds for many systems, although systems such as Forth neatly handle it with their own stacks). Moreover, relocating from Flash memory to RAM is an important step, and such a low-level operation is closely tied to the image layout. The difficulty, then, lies in building startup code that is compatible with the processor's boot sequence and that begins at the address the hardware architecture mandates. The book gives a typical case:

For example, the AMCC PowerPC 405GP processor seeks its first machine instructions from a hard-coded address of 0xFFFF_FFFC. Other processors use similar methods with different details. Some processors are configurable at power-on to seek code from one of several predefined locations, depending on hardware configuration signals.

To deal with this concretely, the author walks us through a number of ld (linker) scripts and then goes on to the details of the "execution context". Recalling the "Hello World" example mentioned earlier, the challenge we face is, as the author puts it:

Indeed, most processors have no DRAM available at startup for temporary storage of variables or, worse, for a stack that is required to use C program calling conventions. If you were forced to write a "Hello World" program with no DRAM and, therefore, no stack, it would be quite different from the traditional "Hello World" example.

For the PowerPC 405GP hardware just mentioned, the actual situation is:

This limitation places significant challenges on the initial body of code designed to initialize the hardware. As a result, one of the first tasks the bootloader performs on startup is to configure enough of the hardware to enable at least some minimal amount of RAM. Some processors designed for embedded use have small amounts of on-chip static RAM available. This is the case with the PPC 405GP we've been discussing. When RAM is available, a stack can be allocated using part of that RAM, and a proper context can be constructed to run higher-level languages such as C. This allows the rest of the processor and platform initialization to be written in something other than assembly language.

We can see that, on the PowerPC architecture, the bootloader has to take care of behavior this low-level just so that the environment that follows can maintain an "execution context". Assembly language is no panacea, though: balancing rapid development against support for many platforms is yet another challenge, and that is how [] came about. I will not give a guided reading here, because Christopher Hallinan's in-depth discussion is too good for the likes of me to add to. In short, it is an excellent piece worth rereading and pondering. Its only drawback is that PowerPC is an unfamiliar architecture to many developers (myself included), so the assembly listings do not resonate as much.
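The address constraints in the quoted passages come down to a little arithmetic that the linker script and the startup code must agree on. The sketch below, in C, is only an illustration under stated assumptions: the reset address 0xFFFF_FFFC is taken from the quoted text, while the function names, the 16-byte stack alignment, the downward-growing stack, and any Flash or SRAM sizes passed in are hypothetical and not taken from the book or from a 405GP data sheet.

```c
#include <assert.h>
#include <stdint.h>

/* Reset address quoted for the PowerPC 405GP in the text above. */
#define RESET_VECTOR 0xFFFFFFFCu

/* Base address at which a boot Flash of flash_size bytes would have to
 * be mapped so that its last byte sits at 0xFFFFFFFF. The subtraction
 * wraps modulo 2^32, which yields 2^32 - flash_size. */
uint32_t flash_base_for_top_of_memory(uint32_t flash_size)
{
    assert(flash_size >= 4u);
    return (uint32_t)(0u - flash_size);
}

/* Offset inside the Flash image at which the first instruction must be
 * placed so that the processor finds it at RESET_VECTOR. */
uint32_t reset_vector_offset(uint32_t flash_size)
{
    return RESET_VECTOR - flash_base_for_top_of_memory(flash_size);
}

/* Initial stack pointer for a stack carved out of a small RAM region.
 * Assumptions: the stack grows downward, and 16-byte alignment is
 * enough; the real requirement depends on the ABI in use. */
uint32_t initial_stack_top(uint32_t sram_base, uint32_t sram_size)
{
    assert(sram_size >= 16u);
    return (sram_base + sram_size) & ~(uint32_t)0xFu;
}
```

With a hypothetical 512 KiB boot Flash (0x80000 bytes), `flash_base_for_top_of_memory` gives 0xFFF80000 and `reset_vector_offset` gives 0x7FFFC, that is, the last 4 bytes of the image. This is why such images end up with a branch instruction placed at the very end rather than at the beginning.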

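In addition, 詹榮開, a writer from mainland China, published an article on IBM developerWorks three years ago, [嵌入式系統 Boot Loader 技術內幕] ("Inside Embedded System Boot Loader Technology"), that is also well worth reading. I suggest reading it alongside the two pieces above, since much of their text and many of their figures complement one another; one discusses [] while the other discusses [], which works out nicely. With that foundation in place, going back to study [], the classic work by ARM-Linux hacker Vincent Sanders, should come easily.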