1	-------------------------------------------------
2	Building EFI Applications Using the GNU Toolchain
3	-------------------------------------------------
4
5		David Mosberger <davidm@hpl.hp.com>
6
7			23 September 1999
8
9
10		Copyright (c) 1999-2007 Hewlett-Packard Co.
11		Copyright (c) 2006-2010 Intel Co.
12
13Last update: 04/09/2007
14
15* Introduction
16
This document has two parts: the first part describes how to develop
EFI applications for IA-64, x86, and x86_64 using the GNU toolchain and
the EFI development environment contained in this directory.  The
second part describes some of the more subtle aspects of how this
development environment works.
22
23
24
25* Part 1: Developing EFI Applications
26
27
28** Prerequisites:
29
30 To develop x86 and x86_64 EFI applications, the following tools are needed:
31
32	- gcc-3.0 or newer (gcc 2.7.2 is NOT sufficient!)
33	  As of gnu-efi-3.0b, the Redhat 8.0 toolchain is known to work,
34	  but the Redhat 9.0 toolchain is not currently supported.
35
36	- A version of "objcopy" that supports EFI applications.  To
37	  check if your version includes EFI support, issue the
38	  command:
39
40		objcopy --help
41
	  Verify that the line "supported targets" contains the strings
	  "efi-app-ia32" and "efi-app-x86_64" and that the "-j" option
	  accepts wildcards.  The binutils-2.24 release supports Intel64
	  EFI and accepts wildcard section names.

	- For debugging purposes, it's useful to have a version of
	  "objdump" that supports EFI applications as well.  This
	  allows you to inspect and disassemble EFI binaries.
50
51 To develop IA-64 EFI applications, the following tools are needed:
52
53	- A version of gcc newer than July 30th 1999 (older versions
54	  had problems with generating position independent code).
55	  As of gnu-efi-3.0b, gcc-3.1 is known to work well.
56
57	- A version of "objcopy" that supports EFI applications.  To
58	  check if your version includes EFI support, issue the
59	  command:
60
61		objcopy --help
62
63	  Verify that the line "supported targets" contains the string
64	  "efi-app-ia64" and that the "-j" option accepts wildcards.
65
	- For debugging purposes, it's useful to have a version of
	  "objdump" that supports EFI applications as well.  This
	  allows you to inspect and disassemble EFI binaries.
69
70
71** Directory Structure
72
73This EFI development environment contains the following
74subdirectories:
75
76 inc:   This directory contains the EFI-related include files.  The
77	files are taken from Intel's EFI source distribution, except
78	that various fixes were applied to make it compile with the
79	GNU toolchain.
80
81 lib:   This directory contains the source code for Intel's EFI library.
82	Again, the files are taken from Intel's EFI source
83	distribution, with changes to make them compile with the GNU
84	toolchain.
85
 gnuefi: This directory contains the glue necessary to convert ELF64
	binaries to EFI binaries.  Various runtime code bits, such as
	a self-relocator, are included as well.  This code has been
	contributed by the Hewlett-Packard Company and is distributed
	under the GNU GPL.
91
92 apps:	This directory contains a few simple EFI test apps.
93
94** Setup
95
It is necessary to edit the Makefile in the directory containing this
README file before EFI applications can be built.  Specifically, you
should verify that macros CC, AS, LD, AR, RANLIB, and OBJCOPY point to
the appropriate compiler, assembler, linker, ar, ranlib, and objcopy
binaries, respectively.
101
102If you're working in a cross-development environment, be sure to set
103macro ARCH to the desired target architecture ("ia32" for x86, "x86_64" for
104x86_64 and "ia64" for IA-64).  For convenience, this can also be done from
105the make command line (e.g., "make ARCH=ia64").
106
107
108** Building
109
To build the sample EFI applications provided in subdirectory "apps",
simply invoke "make" in the toplevel directory (the directory
containing this README file).  This should build lib/libefi.a and
gnuefi/libgnuefi.a first and then all the EFI applications, such as
apps/t6.efi.
115
116
117** Running
118
119Just copy the EFI application (e.g., apps/t6.efi) to the EFI
120filesystem, boot EFI, and then select "Invoke EFI application" to run
121the application you want to test.  Alternatively, you can invoke the
122Intel-provided "nshell" application and then invoke your test binary
123via the command line interface that "nshell" provides.
124
125
126** Writing Your Own EFI Application
127
128Suppose you have your own EFI application in a file called
129"apps/myefiapp.c".  To get this application built by the GNU EFI build
130environment, simply add "myefiapp.efi" to macro TARGETS in
131apps/Makefile.  Once this is done, invoke "make" in the top level
132directory.  This should result in EFI application apps/myefiapp.efi,
133ready for execution.
134
The GNU EFI build environment allows you to write EFI applications as
described in Intel's EFI documentation, except for the following
differences:
137
138 - The EFI application's entry point is always called "efi_main".  The
139   declaration of this routine is:
140
141    EFI_STATUS efi_main (EFI_HANDLE image, EFI_SYSTEM_TABLE *systab);
142
 - UNICODE string literals must be written as W2U(L"Sample String")
   instead of just L"Sample String".  The W2U() macro is defined in
   <efilib.h>.  This header file also declares the function W2UCpy(),
   which converts a wide string into a UNICODE string and stores the
   result in a programmer-supplied buffer.
148
149 - Calls to EFI services should be made via uefi_call_wrapper(). This
150   ensures appropriate parameter passing for the architecture.
151
152
153* Part 2: Inner Workings
154
155WARNING: This part contains all the gory detail of how the GNU EFI
156toolchain works.  Normal users do not have to worry about such
157details.  Reading this part incurs a definite risk of inducing severe
158headaches or other maladies.
159
160The basic idea behind the GNU EFI build environment is to use the GNU
161toolchain to build a normal ELF binary that, at the end, is converted
162to an EFI binary.  EFI binaries are really just PE32+ binaries.  PE
163stands for "Portable Executable" and is the object file format
164Microsoft is using on its Windows platforms.  PE is basically the COFF
165object file format with an MS-DOS2.0 compatible header slapped on in
166front of it.  The "32" in PE32+ stands for 32 bits, meaning that PE32
167is a 32-bit object file format.  The plus in "PE32+" indicates that
168this format has been hacked to allow loading a 4GB binary anywhere in
169a 64-bit address space (unlike ELF64, however, this is not a full
17064-bit object file format because the entire binary cannot span more
171than 4GB of address space).  EFI binaries are plain PE32+ binaries
172except that the "subsystem id" differs from normal Windows binaries.
There are two flavors of EFI binaries, "applications" and "drivers";
each has its own subsystem id, and they are otherwise identical.  At
present, the GNU EFI build environment supports the building of EFI
applications only, though it would be trivial to generate drivers, as
the only difference is the subsystem id.  For more details on PE32+,
see the spec at
179
180	http://msdn.microsoft.com/library/specs/msdn_pecoff.htm.
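
To make the subsystem id concrete, here is a minimal sketch in C that
reads it from a PE image held in memory.  The helper name pe_subsystem
and its error convention are our own and are not part of this build
environment.  The field offsets and the subsystem values (10 for an EFI
application, 11 for an EFI boot service driver, 12 for an EFI runtime
driver) come from the PE/COFF specification, not from this README.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Subsystem ids as listed in the PE/COFF specification. */
#define PE_SUBSYSTEM_EFI_APPLICATION         10
#define PE_SUBSYSTEM_EFI_BOOT_SERVICE_DRIVER 11
#define PE_SUBSYSTEM_EFI_RUNTIME_DRIVER      12

/* Read a little-endian 16-bit or 32-bit value from a byte buffer. */
static uint32_t rd16(const uint8_t *p) { return (uint32_t)p[0] | ((uint32_t)p[1] << 8); }
static uint32_t rd32(const uint8_t *p) { return rd16(p) | (rd16(p + 2) << 16); }

/*
 * Hypothetical helper: return the subsystem id of the PE image in
 * buf[0..len), or -1 if the buffer does not look like a PE image.
 */
int pe_subsystem(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;                       /* no MS-DOS header */

    uint32_t pe = rd32(buf + 0x3c);      /* offset of the "PE\0\0" signature */
    /* signature (4) + COFF header (20) + Subsystem at offset 68 (2 bytes) */
    if (pe > len || len - pe < 4 + 20 + 68 + 2)
        return -1;
    if (memcmp(buf + pe, "PE\0\0", 4) != 0)
        return -1;

    return (int)rd16(buf + pe + 4 + 20 + 68);
}
```

A binary produced by this build environment would be expected to
report PE_SUBSYSTEM_EFI_APPLICATION; turning it into a driver would
amount to changing that one field.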
181
In theory, converting a suitable ELF64 binary to PE32+ is easy and
could be accomplished with the "objcopy" utility by specifying option
--target=efi-app-ia32 (x86) or --target=efi-app-ia64 (IA-64).  But
life never is that easy, so here are some complicating factors:
186
187 (1) COFF sections are very different from ELF sections.
188
189	ELF binaries distinguish between program headers and sections.
190	The program headers describe the memory segments that need to
191	be loaded/initialized, whereas the sections describe what
192	constitutes those segments.  In COFF (and therefore PE32+) no
193	such distinction is made.  Thus, COFF sections need to be page
194	aligned and have a size that is a multiple of the page size
195	(4KB for EFI), whereas ELF allows sections at arbitrary
196	addresses and with arbitrary sizes.
197
198 (2) EFI binaries should be relocatable.
199
	Since EFI binaries are executed in physical mode, EFI cannot
	guarantee that a given binary can be loaded at its preferred
	address.  EFI does _try_ to load a binary at its preferred
	address, but if it can't do so, it will load it at another
	address and then relocate the binary using the contents of the
	.reloc section.
206
207 (3) On IA-64, the EFI entry point needs to point to a function
208     descriptor, not to the code address of the entry point.
209
210 (4) The EFI specification assumes that wide characters use UNICODE
211     encoding.
212
213	ANSI C does not specify the size or encoding that a wide
214	character uses.  These choices are "implementation defined".
215	On most UNIX systems, the GNU toolchain uses a wchar_t that is
216	4 bytes in size.  The encoding used for such characters is
217	(mostly) UCS4.
218
219In the following sections, we address how the GNU EFI build
220environment addresses each of these issues.
221
222
223** (1) Accommodating COFF Sections
224
In order to satisfy the COFF constraint of page-sized and page-aligned
sections, the GNU EFI build environment uses the special linker script
in gnuefi/elf_$(ARCH)_efi.lds where $(ARCH) is the target architecture
("ia32" for x86, "x86_64" for x86_64 and "ia64" for IA-64).
This script is set up to create only eight COFF sections, each page
aligned and page sized.  These eight sections are used to group
together the much greater number of sections that are typically
present in ELF object files.  Specifically:
233
234 .hash (and/or .gnu.hash)
235	Collects the ELF .hash info (this section _must_ be the first
236	section in order to build a shared object file; the section is
237	not actually loaded or used at runtime).
238
	GNU binutils provides a mechanism to generate different hash info
	via the --hash-style=<sysv|gnu|both> option.  Depending on this
	option, the output shared object will contain a .hash section, a
	.gnu.hash section, or both.  In order to generate correct output,
	the linker script preserves both types of hash sections.
244
245 .text
246	Collects all sections containing executable code.
247
248 .data
249	Collects read-only and read-write data, literal string data,
250	global offset tables, the uninitialized data segment (bss) and
251	various other sections containing data.
252
	The reason read-only data is placed here instead of in
	.text is to make it possible to disassemble the .text section
	without getting garbage due to read-only data.  Besides, since
	EFI binaries execute in physical mode, differences in page
	protection do not matter.
258
259	The reason the uninitialized data is placed in this section is
260	that the EFI loader appears to be unable to handle sections
261	that are allocated but not loaded from the binary.
262
263 .dynamic, .dynsym, .rela, .rel, .reloc
	These sections contain the dynamic information necessary to
	self-relocate the binary (see below).
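
The page-size constraint amounts to rounding each section's start
address and size up to the next multiple of 4KB.  A minimal sketch of
that arithmetic in C follows; the names PAGE_4K and align_up are our
own and do not appear in the linker script, which expresses the same
idea with its own alignment directives.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_4K 4096u   /* page size the text quotes for EFI */

/* Round value up to the next multiple of align (a power of two). */
uint64_t align_up(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}
```

For example, a .text section holding 0x1234 bytes of code would occupy
align_up(0x1234, PAGE_4K) = 0x2000 bytes in the EFI binary.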
266
A couple more points worth noting about the linker script:
268
269 o On IA-64, the global pointer symbol (__gp) needs to be placed such
270   that the _entire_ EFI binary can be addressed using the signed
271   22-bit offset that the "addl" instruction affords.  Specifically,
272   this means that __gp should be placed at ImageBase + 0x200000.
273   Strictly speaking, only a couple of symbols need to be addressable
274   in this fashion, so with some care it should be possible to build
275   binaries much larger than 4MB.  To get a list of symbols that need
276   to be addressable in this fashion, grep the assembly files in
277   directory gnuefi for the string "@gprel".
278
279 o The link address (ImageBase) of the binary is (arbitrarily) set to
280   zero.  This could be set to something larger to increase the chance
281   of EFI being able to load the binary without requiring relocation.
282   However, a start address of 0 makes debugging a wee bit easier
283   (great for those of us who can add, but not subtract... ;-).
284
285 o The relocation related sections (.dynamic, .rel, .rela, .reloc)
286   cannot be placed inside .data because some tools in the GNU
287   toolchain rely on the existence of these sections.
288
289 o Some sections in the ELF binary intentionally get dropped when
290   building the EFI binary.  Particularly noteworthy are the dynamic
291   relocation sections for the .plabel and .reloc sections.  It would
292   be _wrong_ to include these sections in the EFI binary because it
293   would result in .reloc and .plabel being relocated twice (once by
294   the EFI loader and once by the self-relocator; see below for a
295   description of the latter).  Specifically, only the sections
296   mentioned with the -j option in the final "objcopy" command are
297   retained in the EFI binary (see Make.rules).
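
The __gp placement rule in the first point above can be checked with a
little arithmetic: a signed 22-bit offset covers -0x200000 through
0x1fffff, so with __gp at ImageBase + 0x200000 the reachable window is
exactly the first 4MB of the image.  A sketch in C, where GP_BIAS and
gp_reachable are names of our own choosing:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GP_BIAS 0x200000   /* __gp = ImageBase + GP_BIAS, as stated in the text */

/* True if addr can be reached from __gp with a signed 22-bit offset. */
bool gp_reachable(uint64_t image_base, uint64_t addr)
{
    assert(image_base <= UINT64_MAX - GP_BIAS);
    int64_t off = (int64_t)(addr - (image_base + GP_BIAS));
    return off >= -(int64_t)0x200000 && off <= (int64_t)0x1fffff;
}
```

With ImageBase at 0, address 0x3fffff is the last one reachable and
0x400000 is the first one that is not.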
298
299
300** (2) Building Relocatable Binaries
301
ELF binaries are normally linked for a fixed load address and are thus
not relocatable.  The only kind of ELF object that is relocatable is
the shared object ("shared library").  However, even those objects are
usually not completely position independent and therefore require
runtime relocation by the dynamic loader.  For example, IA-64 binaries
normally require relocation of the global offset table.
308
309The approach to building relocatable binaries in the GNU EFI build
310environment is to:
311
312 (a) build an ELF shared object
313
314 (b) link it together with a self-relocator that takes care of
315     applying the dynamic relocations that may be present in the
316     ELF shared object
317
318 (c) convert the resulting image to an EFI binary
319
320The self-relocator is of course architecture dependent.  The x86
321version can be found in gnuefi/reloc_ia32.c, the x86_64 version
322can be found in gnuefi/reloc_x86_64.c and the IA-64 version can be
323found in gnuefi/reloc_ia64.S.
324
325The self-relocator operates as follows: the startup code invokes it
326right after EFI has handed off control to the EFI binary at symbol
327"_start".  Upon activation, the self-relocator searches the .dynamic
328section (whose starting address is given by symbol _DYNAMIC) for the
329dynamic relocation information, which can be found in the DT_REL,
330DT_RELSZ, and DT_RELENT entries of the dynamic table (DT_RELA,
331DT_RELASZ, and DT_RELAENT in the case of rela relocations, as is the
332case for IA-64).  The dynamic relocation information points to the ELF
333relocation table.  Once this table is found, the self-relocator walks
334through it, applying each relocation one by one.  Since the EFI
335binaries are fully resolved shared objects, only a subset of all
336possible relocations need to be supported.  Specifically, on x86 only
337the R_386_RELATIVE relocation is needed.  On IA-64, the relocations
338R_IA64_DIR64LSB, R_IA64_REL64LSB, and R_IA64_FPTR64LSB are needed.
339Note that the R_IA64_FPTR64LSB relocation requires access to the
340dynamic symbol table.  This is why the .dynsym section is included in
341the EFI binary.  Another complication is that this relocation requires
342memory to hold the function descriptors (aka "procedure labels" or
343"plabels").  Each function descriptor uses 16 bytes of memory.  The
344IA-64 self-relocator currently reserves a static memory area that can
345hold 100 of these descriptors.  If the self-relocator runs out of
346space, it causes the EFI binary to fail with error code 5
347(EFI_BUFFER_TOO_SMALL).  When this happens, the manifest constant
348MAX_FUNCTION_DESCRIPTORS in gnuefi/reloc_ia64.S should be increased
349and the application recompiled.  An easy way to count the number of
350function descriptors required by an EFI application is to run the
351command:
352
353  objdump --dynamic-reloc example.so | fgrep FPTR64 | wc -l
354
355assuming "example" is the name of the desired EFI application.
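
The walk through the dynamic table described above can be sketched in
C for the simplest case, 32-bit x86 with "rel" relocations.  This is
not the code in gnuefi/reloc_ia32.c; it is an illustration that treats
the image as a byte buffer so that it can be tried on a development
host.  The names relocate_image, dyn32_t, and rel32_t are our own; the
tag and relocation-type values are those of the ELF specification.
Because the image is linked at address zero, the DT_REL value is used
directly as an offset into the image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Tag and relocation-type values from the ELF specification. */
#define DT_NULL         0
#define DT_REL         17
#define DT_RELSZ       18
#define DT_RELENT      19
#define R_386_NONE      0
#define R_386_RELATIVE  8

typedef struct { int32_t d_tag; uint32_t d_val; } dyn32_t;       /* mirrors Elf32_Dyn */
typedef struct { uint32_t r_offset; uint32_t r_info; } rel32_t;  /* mirrors Elf32_Rel */

/*
 * Apply the dynamic relocations of an image that was linked at address 0
 * and now lives at load_base.  "image" is the image's memory and "dyn"
 * its .dynamic table.  Returns 0 on success, -1 on a malformed table or
 * an unsupported relocation type.
 */
int relocate_image(uint8_t *image, size_t image_len, uint32_t load_base,
                   const dyn32_t *dyn)
{
    uint32_t rel = 0, relsz = 0, relent = 0;

    assert(image != NULL && dyn != NULL);
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_REL)    rel    = dyn->d_val;
        if (dyn->d_tag == DT_RELSZ)  relsz  = dyn->d_val;
        if (dyn->d_tag == DT_RELENT) relent = dyn->d_val;
    }
    if (relsz == 0)
        return 0;                                 /* nothing to relocate */
    if (relent < sizeof(rel32_t) || rel > image_len || relsz > image_len - rel)
        return -1;

    for (uint32_t done = 0; done + sizeof(rel32_t) <= relsz; done += relent) {
        rel32_t r;
        uint32_t word;

        memcpy(&r, image + rel + done, sizeof r);
        switch (r.r_info & 0xff) {                /* ELF32_R_TYPE */
        case R_386_NONE:
            break;
        case R_386_RELATIVE:                      /* *target += load_base */
            if (r.r_offset > image_len || image_len - r.r_offset < sizeof word)
                return -1;
            memcpy(&word, image + r.r_offset, sizeof word);
            word += load_base;
            memcpy(image + r.r_offset, &word, sizeof word);
            break;
        default:
            return -1;
        }
    }
    return 0;
}
```

Rejecting every relocation type other than R_386_RELATIVE reflects the
point made above: a fully resolved shared object should not contain
any other kind on x86.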
356
357
358** (3) Creating the Function Descriptor for the IA-64 EFI Binaries
359
As mentioned above, the IA-64 PE32+ format assumes that the entry
point of the binary is a function descriptor.  A function descriptor
consists of two double words: the first one is the code entry point
and the second is the global pointer that should be loaded before
calling the entry point.  Since the ELF toolchain doesn't know how to
generate a function descriptor for the entry point, the startup code
in gnuefi/crt0-efi-ia64.S crafts one manually with the code:

	        .section .plabel, "a"
	_start_plabel:
	        data8   _start
	        data8   __gp

This places the procedure label for entry point _start in a section
called ".plabel".  Now, the only problem is that _start and __gp need
to be relocated _before_ EFI hands control over to the EFI binary.
Fortunately, PE32+ defines a section called ".reloc" that can achieve
this.  Thus, in addition to manually crafting the function descriptor,
the startup code also crafts a ".reloc" section that will cause
the EFI loader to relocate the function descriptor before handing over
control to the EFI binary (again, see the PECOFF spec mentioned above
for details).
382
383A final question may be why .plabel and .reloc need to go in their own
384COFF sections.  The answer is simply: we need to be able to discard
385the relocation entries that are generated for these sections.  By
386placing them in these sections, the relocations end up in sections
387".rela.plabel" and ".rela.reloc" which makes it easy to filter them
388out in the filter script.  Also, the ".reloc" section needs to be in
389its own section so that the objcopy program can recognize it and can
390create the correct directory entries in the PE32+ binary.
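
Viewed from C, a function descriptor is simply a pair of 64-bit words.
The sketch below shows that layout together with a fixed pool of the
kind the IA-64 self-relocator is described as using in section (2).
The real implementation is written in assembly in gnuefi/reloc_ia64.S;
the names fdesc_t, fdesc_alloc, and ERR_BUFFER_TOO_SMALL here are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FUNCTION_DESCRIPTORS 100  /* the limit quoted in the text */
#define ERR_BUFFER_TOO_SMALL       5  /* error code 5, as quoted in the text */

/* One function descriptor ("plabel"): code entry point, then gp. */
typedef struct {
    uint64_t entry;
    uint64_t gp;
} fdesc_t;

_Static_assert(sizeof(fdesc_t) == 16, "a descriptor uses 16 bytes");

static fdesc_t fdesc_pool[MAX_FUNCTION_DESCRIPTORS];
static size_t  fdesc_used;

/*
 * Hand out the next free descriptor from the static pool.  Returns 0
 * on success or ERR_BUFFER_TOO_SMALL once the pool is exhausted.
 */
int fdesc_alloc(uint64_t entry, uint64_t gp, fdesc_t **out)
{
    assert(out != NULL);
    if (fdesc_used == MAX_FUNCTION_DESCRIPTORS)
        return ERR_BUFFER_TOO_SMALL;

    fdesc_t *fd = &fdesc_pool[fdesc_used++];
    fd->entry = entry;
    fd->gp    = gp;
    *out = fd;
    return 0;
}
```

An application needing more than MAX_FUNCTION_DESCRIPTORS descriptors
would fail on the call after the pool fills up, which is the situation
the text says calls for raising the constant and recompiling.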
391
392
393** (4) Convenient and Portable Generation of UNICODE String Literals
394
As of gnu-efi-3.0, we make use of (and somewhat abuse) the gcc option
that forces wide characters (wchar_t) to use short integers (2 bytes)
instead of integers (4 bytes).  This way we match the Unicode character
size.  By abuse, we mean that we rely on the fact that the regular ASCII
characters are encoded the same way between (short) wide characters
and Unicode and basically only use the first byte.  This allows us
to just use them interchangeably.

The gcc option to force short wide characters is: -fshort-wchar
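
Without -fshort-wchar, each wide string would need an explicit
conversion from 4-byte wide characters to 2-byte characters.  The
sketch below shows what such a conversion involves.  The name
wide_to_ucs2 and its signature are our own and are not claimed to
match the W2UCpy() function mentioned in Part 1; replacing characters
that do not fit in 16 bits by U+FFFD is likewise an assumption made
for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/*
 * Copy the wide string src into dst as 16-bit characters, writing at
 * most cap elements including the terminating 0.  Characters that do
 * not fit in 16 bits are replaced by U+FFFD.  Returns the number of
 * characters written, not counting the terminator.
 */
size_t wide_to_ucs2(uint16_t *dst, size_t cap, const wchar_t *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL && cap > 0);
    while (src[n] != L'\0' && n + 1 < cap) {
        unsigned long c = (unsigned long)src[n];
        dst[n] = (c <= 0xffff) ? (uint16_t)c : 0xfffd;
        n++;
    }
    dst[n] = 0;
    return n;
}
```

When the compiler is invoked with -fshort-wchar, wchar_t is already
2 bytes wide and the loop degenerates into a plain copy, which is why
the literals can then be used directly.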
404
405			* * * The End * * *
406