The 'Zapf' Table

General table information

One of the assumptions underlying modern layout engines is that characters and glyphs are different. The mapping of character to glyph and the subsequent processing of glyphs are clearly separated steps. This separation permits tremendous flexibility in how fonts are laid out, and removes us from the old straitjacket of fixed glyph repertoires per script. It means the font's 'cmap' table doesn't need to cover every glyph in the font.

However, because of this separation, as well as the intensive processing which happens to the glyphs themselves during layout, it's difficult at best to deduce which characters were originally associated with the final resulting glyphs in a layout. For instance, while Adobe Acrobat now provides users with the ability to search PDFs for content strings, this capability is difficult to implement for embedded TrueType fonts, since the streams were captured as just bunches of glyphs. The original strings aren't there any more to be searched.

What we need for a given font is a collection of information, per glyph, which can be used to associate semantic information with that glyph. For a given glyph, we'd like to be able to answer these questions:

Collecting all this information into one place, the 'Zapf' table (named with permission after legendary type designer Hermann Zapf), makes it much easier for the user interface parts of applications to get what they need to present sensible choices to their users. This document proposes a format for this table which represents all this information in a relatively compact form.


Table Format

Header

The 'Zapf' table has the following header:

Type Name Description
Fixed32 version Set to fixed 1.0 (0x00010000)
UInt32 extraInfo Offset from start of table to start of extra info space (added to groupOffset and featOffset in GlyphInfo)
UInt32 offsets[] Array of offsets, indexed by glyphcode, from start of table to GlyphInfo structure for a glyph
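
The header layout above can be read with a few lines of code. The following is a minimal sketch, not a reference implementation: it assumes the table bytes are already in memory, that all fields are big-endian (as elsewhere in TrueType), and that the glyph count is supplied by the caller (for example from the 'maxp' table). The function name and the sample values are hypothetical.

```python
import struct

def parse_zapf_header(table: bytes, num_glyphs: int):
    """Return (version, extra_info_offset, glyph_info_offsets) from a 'Zapf' table."""
    # Fixed32 version (signed 16.16) followed by the UInt32 extraInfo offset.
    raw_version, extra_info = struct.unpack_from(">iI", table, 0)
    # One UInt32 offset per glyph, indexed by glyph code, starting at byte 8.
    offsets = list(struct.unpack_from(f">{num_glyphs}I", table, 8))
    return raw_version / 65536.0, extra_info, offsets

# Hypothetical header for a two-glyph font (values invented for illustration).
sample_header = struct.pack(">iIII", 0x00010000, 56, 16, 36)
print(parse_zapf_header(sample_header, 2))  # (1.0, 56, [16, 36])
```

The offsets returned are relative to the start of the table, so they can be used directly as positions in the same byte string.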

The GlyphInfo structure

The header is followed immediately by an array of GlyphInfo structures. Each entry in this array is of variable size; the offset to a particular GlyphInfo is contained in the offsets[] array in the header. A GlyphInfo structure has the following format:

Type Name Description
UInt32 groupOffset Byte offset from start of extraInfo to GroupInfo or GroupInfoGroup for this glyph, or 0xFFFFFFFF if none
UInt32 featOffset Byte offset from start of extraInfo to FeatureInfo for this glyph, or 0xFFFFFFFF if none
UInt16 n16BitUnicodes Number of UInt16 values in the following array; note that this requires the Unicode code points for this character be in the UTF-16 encoding form, but allows for surrogate pairs
UInt16 unicodes[] Unicode code points for this glyph
UInt16 nNames Number of KindNames that follow (0 is permitted)
KindName kindNames[] KindNames for this glyph. This is an optional array of variable-length items!
(UInt8 padding2[1..3] If needed, to pad to 32-bit alignment before next GlyphInfo in the array)
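
As an illustration of the fixed part of a GlyphInfo, the sketch below reads the two offsets, the UTF-16 code units, and the KindName count, and reports where the KindNames begin. It is an assumption-laden example rather than a definitive parser: the function name, the dictionary keys, and the sample bytes are hypothetical, and big-endian byte order is assumed.

```python
import struct

NO_OFFSET = 0xFFFFFFFF  # "none" marker for groupOffset and featOffset

def parse_glyph_info_prefix(table: bytes, pos: int):
    """Parse a GlyphInfo up to (but not including) its KindNames."""
    group_off, feat_off, n_units = struct.unpack_from(">IIH", table, pos)
    pos += 10
    units = struct.unpack_from(f">{n_units}H", table, pos)
    pos += 2 * n_units
    (n_names,) = struct.unpack_from(">H", table, pos)
    pos += 2
    # Re-pack the code units and decode as UTF-16 so surrogate pairs combine.
    text = struct.pack(f">{n_units}H", *units).decode("utf-16-be")
    return {
        "group_offset": None if group_off == NO_OFFSET else group_off,
        "feat_offset": None if feat_off == NO_OFFSET else feat_off,
        "text": text,
        "n_names": n_names,
        "kind_names_pos": pos,  # first KindName starts here
    }

# An 'fi' ligature: no GroupInfo, FeatureInfo at offset 0, two KindNames to follow.
fi_bytes = struct.pack(">IIHHHH", NO_OFFSET, 0, 2, 0x0066, 0x0069, 2)
print(parse_glyph_info_prefix(fi_bytes, 0))
```

Decoding through UTF-16 rather than treating each UInt16 as a code point is what lets a glyph outside the Basic Multilingual Plane round-trip correctly.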

The KindName structure

A KindName identifies a name for this glyph, and what kind of name it is. In order to accommodate the need for both machine-readable names (for font tools) and human-readable names (for constructing font UI), the KindName's kind byte has its top two bits interpreted differently from its bottom six bits. This has the effect of dividing KindNames into three groups, as follows:

kind value Type Description
0 to 63 Pascal string The bytes immediately following the kind byte comprise a Pascal string
64 to 127 2-byte binary The bytes immediately following the kind byte should be treated as the two bytes (big-endian) of a 16-bit quantity. Note that alignment is not guaranteed, so software reading this value should do it bytewise. The interpretation of the value, whether a direct numeric value, a 'name' table index, or something else, is controlled by the actual kind value. See the description below for more details.
128 to 255 reserved These values are not yet defined and must not be used.

The format of a KindName is as follows:

Type Name Description
UInt8 kind (The following indicate Pascal strings)
0=Universal name (that is, a name which complies with all naming conventions)
1=Apple name
2=Adobe name
3=AFII name
4=Unicode name
(The following indicate direct binary values)
64=CID for Japanese
65=CID for Traditional Chinese
66=CID for Simplified Chinese
67=CID for Korean
(The following indicate 'name' table indices)
68=Version history note, which allows the designer to keep track of versions of a glyph
69=Designer's short name, intended to be a short but unique name for the glyph
70=Designer's long name, intended to be a fuller name, if needed
71=Designer's usage notes, intended to guide users under what circumstances the glyph was intended to be used (and not used!)
72=Designer's historical notes, including information on how this glyph arose, and how it fits stylistically in the world of type
(variable) name A Pascal string or binary value, depending on the top two bits of the kind, as listed in the above descriptions.

The Unicode KindName should only be used for glyphs corresponding to a single Unicode character. Ordinarily, such names can be inferred entirely from the Unicode Standard. This name might be useful, however, for a designer creating an experimental font for an unencoded script used in the Unicode Private Use Area.
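
The kind-byte ranges described above can be sketched as a small reader. This is an illustrative example only: the function name is hypothetical, and decoding Pascal strings as ASCII is an assumption (the format description does not name a string encoding).

```python
def parse_kind_name(data: bytes, pos: int):
    """Return (kind, value, next_pos) for the KindName starting at pos."""
    kind = data[pos]
    pos += 1
    if kind <= 63:
        # Pascal string: one length byte, then that many characters.
        length = data[pos]
        value = data[pos + 1:pos + 1 + length].decode("ascii")  # assumed encoding
        return kind, value, pos + 1 + length
    if kind <= 127:
        # 16-bit big-endian value, read bytewise because alignment is not guaranteed.
        value = (data[pos] << 8) | data[pos + 1]
        return kind, value, pos + 2
    raise ValueError(f"kind {kind} is reserved")

# Apple name 'fi', Adobe name 'f_i', then a made-up Japanese CID of 0x1234.
names = bytes([1, 2]) + b"fi" + bytes([2, 3]) + b"f_i" + bytes([64, 0x12, 0x34])
pos = 0
while pos < len(names):
    kind, value, pos = parse_kind_name(names, pos)
    print(kind, value)
```

Because each KindName is variable-length, the reader returns the position of the next entry so that callers can walk the array sequentially.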

The GroupInfo and GroupInfoGroup structures

When presenting an interface to a user for the selection of particular glyphs, an application would often like to know the set of glyphs in a font which are logically related (or perhaps, those which are related in the designer's eye). In order to accommodate this, the GroupInfo structure gathers together information about related glyphs. A GroupInfo is a collection of NamedGroup structures, where each named group has a name (via a 'name' table index), and a collection of glyph indices in whatever order the designer sees fit to impose. So for instance, a designer whose font contains many different swashes for a given letter could create groups for all the glyphs for that letter, where each group has a name (e.g. "long-tailed," "fat-stemmed," etc.).

The NamedGroup structure

NamedGroup structures reside in the extraInfo space, and have the following format:

Type Name Description
UInt16 nameIndex Index in the 'name' table for this group's name; a value of zero indicates no name for this group
UInt16 nGlyphs Number of glyph indices in this named group; this may be zero, in which case no glyphs will follow and this name is the name for the whole group (this convention is only valid for the first name in a group)
UInt16 glyphs[] Glyph indices for this group.

Now that we have described the NamedGroup, we can describe the GroupInfo and GroupInfoGroup proper.

The GroupInfoGroup structure is to be used for glyphs contained in more than one group. It looks like this:

Type Name Description
UInt16 nGroups The low-order 14 bits specify the count of the number of groups being defined. The top bit should be 0 (it is only used by the GroupInfo structure); and bit 14 should be set to indicate that this is a GroupInfoGroup and not a GroupInfo.
UInt16 padding This field is not currently used and should be set to 0. It is present to maintain as far as possible 32-bit alignment within the 'Zapf' table.
UInt32 groupOffsets[] Offsets (relative to extraInfo) to the GroupInfos containing this glyph. The first value in this field indicates the group to be used as the "alternate forms" for the given glyph. For example, a user could control-click on a character to bring up a palette of alternate shapes, derived from this first entry. In the case where no group containing this glyph contains the "alternate forms" for the glyph, the first value in this field should be 0xFFFFFFFF.

The GroupInfo structure looks like this:

Type Name Description
UInt16 nGroups The low-order 14 bits specify the count of the number of groups being defined. If the top bit is 1, each group is preceded by a 16-bit flag word; if the top bit is 0, then what follows is an inline array of NamedGroups. Bit 14 should be 0 to indicate that this is a GroupInfo and not a GroupInfoGroup.
(variable) groups[] The group information. Each group may be preceded by a 16-bit flag (depending on the high bit of the nGroups field); note that this may not be just an array of NamedGroups, but possibly offsets, depending on the value of the optional 16-bit flag

A key part of the GroupInfo is the nGroups field, which counts the number of groups being defined. The count is a 14-bit unsigned value; the top two bits indicate whether flag information is associated with each group or whether a series of 32-bit offsets to groups is found instead of a series of NamedGroups. If these bits are off, then each group is just a NamedGroup inline, seriatim. If the top bit is on, then each group is preceded by a 16-bit flags word, whose flags indicate the following:

Mask Name Description
0x8000 isSubdivided If this bit is on, this NamedGroup is actually a subdivision of a larger single group. If this bit is off, this group is a unique self-contained group. This bit should be set in those parts of a group which are intended to be used to present a user interface.
0x4000 isAligned If this bit is on, this NamedGroup is padded to a 32-bit boundary
0x3FFF (reserved) These bits are not currently defined and must be zero
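
To make the interaction between nGroups and the per-group flag word concrete, here is a minimal sketch of a GroupInfo reader. It handles only inline NamedGroups, all names in it are hypothetical, and it assumes that isAligned padding is measured from the start of the buffer passed in (the description above does not say what the boundary is relative to). The glyph indices in the sample are invented.

```python
import struct

HAS_FLAGS = 0x8000            # nGroups bit 15: a flag word precedes each group
IS_GROUP_INFO_GROUP = 0x4000  # nGroups bit 14: structure is a GroupInfoGroup
COUNT_MASK = 0x3FFF           # nGroups low 14 bits: number of groups
IS_SUBDIVIDED = 0x8000        # flag word: subdivision of a larger group
IS_ALIGNED = 0x4000           # flag word: NamedGroup padded to 32 bits

def parse_group_info(extra: bytes, pos: int):
    """Return a list of NamedGroups found in the GroupInfo at pos."""
    (word,) = struct.unpack_from(">H", extra, pos)
    if word & IS_GROUP_INFO_GROUP:
        raise ValueError("bit 14 is set: this is a GroupInfoGroup")
    pos += 2
    groups = []
    for _ in range(word & COUNT_MASK):
        flags = 0
        if word & HAS_FLAGS:
            (flags,) = struct.unpack_from(">H", extra, pos)
            pos += 2
        name_index, n_glyphs = struct.unpack_from(">HH", extra, pos)
        pos += 4
        glyphs = list(struct.unpack_from(f">{n_glyphs}H", extra, pos))
        pos += 2 * n_glyphs
        if flags & IS_ALIGNED:
            pos = (pos + 3) & ~3  # assumption: boundary relative to buffer start
        groups.append({"name_index": name_index,
                       "subdivided": bool(flags & IS_SUBDIVIDED),
                       "glyphs": glyphs})
    return groups

# Three flagged groups: an overall label (no glyphs) and two subdivisions.
sample = struct.pack(">H", HAS_FLAGS | 3)
sample += struct.pack(">HHH", IS_SUBDIVIDED, 300, 0)
sample += struct.pack(">HHH6H", IS_SUBDIVIDED, 301, 6, *range(20, 26))
sample += struct.pack(">HHH4H", IS_SUBDIVIDED, 302, 4, *range(26, 30))
for group in parse_group_info(sample, 0):
    print(group)
```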

Let's look at a few examples to see how this structure can be used to represent various glyph groupings. First, in the simplest case, the font designer wishes to include ten different swash ampersands in a font, and wishes to group them together so the user can choose one. In this example, each ampersand glyph's GlyphInfo structure will have the same groupOffset value, an offset which will refer to the following GroupInfo:

Name Value Description
nGroups 0x0001 There is only one group for all ampersands; because the top bits are clear there is no flag word before the start of the NamedGroup, and the NamedGroup follows immediately
nameIndex 300 Index into the 'name' table for this group's name (which would be something like "Swash Ampersands")
nGlyphs 10 Number of glyphs which belong to this group
glyphs[] ... The ten glyph indices for the ampersands

An application could use this information to present the user with a simple palette of the ten ampersands to choose from.

In a slightly more complex example, the designer still has ten swash ampersands and wishes the user to see all ten of them, but grouped into labelled groups, with an overall label, so that a menu would look something like this:

All ten ampersands' GlyphInfo structures would still have an offset to the same GroupInfo, which now would look like this:

Name Value Description
nGroups 0x8003 Three groups follow, each with a 16-bit flag word in front of it
flag 0x8000 This group is one subdivision of a larger group (isSubdivided), and the NamedGroup for it follows immediately
nameIndex 300 Name index for 'Swash Ampersands' string
nGlyphs 0 When the nGlyphs is zero, the nameIndex identifies the whole grouping
flag 0x8000 This group is one subdivision of a larger group (isSubdivided), and the NamedGroup for it follows immediately
nameIndex 301 Name index for '"Classic" style' string
nGlyphs 6 There are six ampersands in the Classic style
glyphs[] ... Glyph indices for the six ampersands
flag 0x8000 This group is one subdivision of a larger group (isSubdivided), and the NamedGroup for it follows immediately
nameIndex 302 Name index for '"Nouveau" style' string
nGlyphs 4 There are four ampersands in the Nouveau style
glyphs[] ... Glyph indices for the four ampersands

A yet more complex example deals with a font designer who wishes to have glyphs belong to multiple groups. For example, the ten ampersands of our previous example are to have all the characteristics just described, but they are also to belong to a separate group of all punctuation in the font. Again, each ampersand could have the same groupOffset value, referring to this GroupInfoGroup structure:

Name Value Description
nGroups 0x4002 This is a GroupInfoGroup structure (bit 14 is set) and there are two offsets within it.
padding 0 Padding word
offset[0] Offset (relative to extraInfo) to the GroupInfo structure for alternate ampersands (as above)
offset[1] Offset (relative to extraInfo) to the GroupInfo structure for punctuation glyphs

The period glyph would also be a member of the punctuation group, but (unlike the ampersand), it has no alternate glyphs for the user to select. Its GlyphInfo structure would contain an offset to a GroupInfoGroup like this:

Name Value Description
nGroups 0x4002 This is a GroupInfoGroup structure (bit 14 is set) and there are two offsets within it.
padding 0 Padding word
offset[0] 0xFFFFFFFF There are no alternate glyphs for the period
offset[1] Offset (relative to extraInfo) to the GroupInfo structure for punctuation glyphs

The punctuation GroupInfo would look like this:

Name Value Description
nGroups 0x0001 One NamedGroup follows and is not preceded by a flag word
nameIndex 350 Name index for 'Punctuation' string
nGlyphs 40 There are 40 punctuation glyphs in the following array (this will include our ten ampersands and one period)
glyphs[] ... The glyph indices for all the punctuation (again, including our ten ampersands and one period)
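
The GroupInfoGroup examples above can be read with a sketch like the following. The function name and the sample offsets (32 and 96) are hypothetical; the code simply separates the first offset, which designates the "alternate forms" group, from the remaining offsets.

```python
import struct

NO_OFFSET = 0xFFFFFFFF  # first entry when the glyph has no alternate forms

def parse_group_info_group(extra: bytes, pos: int):
    """Return (alternate_forms_offset_or_None, other_group_offsets)."""
    n_groups_word, _padding = struct.unpack_from(">HH", extra, pos)
    if not (n_groups_word & 0x4000):
        raise ValueError("bit 14 is clear: this is a GroupInfo")
    count = n_groups_word & 0x3FFF
    offsets = struct.unpack_from(f">{count}I", extra, pos + 4)
    alternates = offsets[0] if count and offsets[0] != NO_OFFSET else None
    return alternates, list(offsets[1:])

# Ampersand: alternates at (hypothetical) offset 32, punctuation group at 96.
ampersand = struct.pack(">HHII", 0x4002, 0, 32, 96)
# Period: no alternates, but still a member of the punctuation group.
period = struct.pack(">HHII", 0x4002, 0, NO_OFFSET, 96)
print(parse_group_info_group(ampersand, 0))  # (32, [96])
print(parse_group_info_group(period, 0))     # (None, [96])
```

An application building a glyph palette would follow only the first value; the remaining offsets are relative to extraInfo and lead to ordinary GroupInfo structures.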

The FeatureInfo structure

A FeatureInfo structure identifies the layout engine inputs which force the appearance of this glyph. In many cases, nothing is specified because the given glyph is the default glyph for the specified Unicode(s). An example of this is 'A', where nothing else needs to happen. However, in the case of multiple swashes, line-start or line-end swashes, or contextual forms, more information is needed to let the user know how to get this glyph when a line is being laid out. (Of course, an application could just present the user with a palette of all the glyphs in the font, but that can be overwhelming sometimes!)

One of the interesting problems here is the identification of kinds of context which matter in the choice of certain glyphs. The list below is Apple's first attempt at listing these; no doubt we've missed some. We welcome other suggestions!

The format of a FeatureInfo is as follows:

Type Name Description
UInt16 context Bitfield identifying the contexts in which this glyph appears. Note more than one bit may be on! A value of zero means context is irrelevant.
0x0001 = line-initial
0x0002 = line-medial
0x0004 = line-final
0x0008 = word-initial
0x0010 = word-medial
0x0020 = word-final
0x0040 = auto-fraction numerator
0x0080 = auto-fraction denominator
UInt16 nAATFeatures Number of <type,selector> pairs which follow to select this feature
sfntFontRunFeature features[] The <type,selector> pairs. (This type is defined in SFNTTypes.h, with constants in SFNTLayoutTypes.h)
UInt32 nOTTags Number of 4-byte feature tags which follow to select this feature.
UInt32 tags[] The array of tags.
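
A FeatureInfo can be decoded along the following lines. This is a sketch under stated assumptions: each sfntFontRunFeature is taken to be two big-endian UInt16 values (type, then selector), each tag is taken to be four ASCII characters, and the function name, dictionary keys, and the 'swsh' tag in the sample are illustrative only.

```python
import struct

CONTEXT_BITS = {
    0x0001: "line-initial", 0x0002: "line-medial", 0x0004: "line-final",
    0x0008: "word-initial", 0x0010: "word-medial", 0x0020: "word-final",
    0x0040: "auto-fraction numerator", 0x0080: "auto-fraction denominator",
}

def parse_feature_info(extra: bytes, pos: int):
    """Decode the FeatureInfo at pos within the extra info space."""
    context, n_aat = struct.unpack_from(">HH", extra, pos)
    pos += 4
    # Assumed layout of sfntFontRunFeature: UInt16 type, UInt16 selector.
    pairs = [struct.unpack_from(">HH", extra, pos + 4 * i) for i in range(n_aat)]
    pos += 4 * n_aat
    (n_tags,) = struct.unpack_from(">I", extra, pos)
    pos += 4
    tags = [extra[pos + 4 * i:pos + 4 * i + 4].decode("ascii") for i in range(n_tags)]
    contexts = [name for bit, name in sorted(CONTEXT_BITS.items()) if context & bit]
    return {"contexts": contexts, "aat_features": pairs, "ot_tags": tags}

# Word-final or line-final context, three AAT pairs, one illustrative tag.
sample = struct.pack(">HH6HI4s", 0x0024, 3, 1, 4, 8, 2, 8, 6, 1, b"swsh")
print(parse_feature_info(sample, 0))
```

An empty contexts list corresponds to a context value of zero, meaning that context is irrelevant for the glyph.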

Example

To help clarify how all this information is represented, let's look at a concrete example. Suppose we have a font with this (somewhat odd) repertoire:

The numbers are the glyph indices, and the images show the actual glyphs. Here is an example 'Zapf' file for this font:

Offset Value Description
0 0x00010000 Version 1.0 in fixed notation
4 472 Offset to start of extra info part
The offsets to the GlyphInfo records start here
8 68 Offset to GlyphInfo for glyph 0
12 88 Offset to GlyphInfo for glyph 1
16 108 Offset to GlyphInfo for glyph 2
20 128 Offset to GlyphInfo for glyph 3
24 148 Offset to GlyphInfo for glyph 4
28 168 Offset to GlyphInfo for glyph 5
32 188 Offset to GlyphInfo for glyph 6
36 216 Offset to GlyphInfo for glyph 7
40 244 Offset to GlyphInfo for glyph 8
44 272 Offset to GlyphInfo for glyph 9
48 304 Offset to GlyphInfo for glyph 10
52 336 Offset to GlyphInfo for glyph 11
56 364 Offset to GlyphInfo for glyph 12
60 408 Offset to GlyphInfo for glyph 13
64 436 Offset to GlyphInfo for glyph 14
GlyphInfo for glyph 0 starts here
68 0xFFFFFFFF No GroupInfo for this glyph
72 0xFFFFFFFF No FeatureInfo for this glyph
76 1 Number of 16-bit Unicode values which follow
78 0x0063 Unicode for 'c'
80 1 Number of KindNames which follow
82 0 This is a universal name
83 1 A byte of string length
84 'c' ASCII name 'c'
85 0 Three bytes of padding for long alignment
GlyphInfo for glyph 1 starts here
88 0xFFFFFFFF No GroupInfo for this glyph
92 0xFFFFFFFF No FeatureInfo for this glyph
96 1 Number of 16-bit Unicode values which follow
98 0x0066 Unicode for 'f'
100 1 Number of KindNames which follow
102 0 This is a universal name
103 1 A byte of string length
104 'f' ASCII name 'f'
105 0 Three bytes of padding for long alignment
GlyphInfo for glyph 2 starts here
108 0xFFFFFFFF No GroupInfo for this glyph
112 0xFFFFFFFF No FeatureInfo for this glyph
116 1 Number of 16-bit Unicode values which follow
118 0x0069 Unicode for 'i'
120 1 Number of KindNames which follow
122 0 This is a universal name
123 1 A byte of string length
124 'i' ASCII name 'i'
125 0 Three bytes of padding for long alignment
GlyphInfo for glyph 3 starts here
128 0xFFFFFFFF No GroupInfo for this glyph
132 0xFFFFFFFF No FeatureInfo for this glyph
136 1 Number of 16-bit Unicode values which follow
138 0x006C Unicode for 'l'
140 1 Number of KindNames which follow
142 0 This is a universal name
143 1 A byte of string length
144 'l' ASCII name 'l'
145 0 Three bytes of padding for long alignment
GlyphInfo for glyph 4 starts here
148 0xFFFFFFFF No GroupInfo for this glyph
152 0xFFFFFFFF No FeatureInfo for this glyph
156 1 Number of 16-bit Unicode values which follow
158 0x0073 Unicode for 's'
160 1 Number of KindNames which follow
162 0 This is a universal name
163 1 A byte of string length
164 's' ASCII name 's'
165 0 Three bytes of padding for long alignment
GlyphInfo for glyph 5 starts here
168 0xFFFFFFFF No GroupInfo for this glyph
172 0xFFFFFFFF No FeatureInfo for this glyph
176 1 Number of 16-bit Unicode values which follow
178 0x0074 Unicode for 't'
180 1 Number of KindNames which follow
182 0 This is a universal name
183 1 A byte of string length
184 't' ASCII name 't'
185 0 Three bytes of padding for long alignment
GlyphInfo for glyph 6 starts here
188 0xFFFFFFFF No GroupInfo for this glyph
192 0 Offset in extra info space to FeatureInfo
196 2 Number of 16-bit Unicode values which follow
198 0x0066 Unicode for 'f'
200 0x0069 Unicode for 'i' (note we don't use the composed 0xFB01 value; these Unicodes are always decomposed)
202 2 Number of KindNames which follow
204 1 Apple name follows
205 2 Length of Apple name
206 'fi' ASCII name 'fi'
208 2 Adobe name follows
209 3 Length of Adobe name
210 'f_i' ASCII name 'f_i'
213 0 Three bytes of padding for long alignment
GlyphInfo for glyph 7 starts here
216 0xFFFFFFFF No GroupInfo for this glyph
220 0 Offset in extra info space to FeatureInfo
224 2 Number of 16-bit Unicode values which follow
226 0x0066 Unicode for 'f'
228 0x006C Unicode for 'l' (note we don't use the composed 0xFB02 value; these Unicodes are always decomposed)
230 2 Number of KindNames which follow
232 1 Apple name follows
233 2 Length of Apple name
234 'fl' ASCII name 'fl'
236 2 Adobe name follows
237 3 Length of Adobe name
238 'f_l' ASCII name 'f_l'
241 0 Three bytes of padding for long alignment
GlyphInfo for glyph 8 starts here
244 0xFFFFFFFF No GroupInfo for this glyph
248 0 Offset in extra info space to FeatureInfo
252 2 Number of 16-bit Unicode values which follow
254 0x0066 Unicode for 'f'
256 0x0066 Unicode for 'f' (note we don't use the composed 0xFB00 value; these Unicodes are always decomposed)
258 2 Number of KindNames which follow
260 1 Apple name follows
261 2 Length of Apple name
262 'ff' ASCII name 'ff'
264 2 Adobe name follows
265 3 Length of Adobe name
266 'f_f' ASCII name 'f_f'
269 0 Three bytes of padding for long alignment
GlyphInfo for glyph 9 starts here
272 0xFFFFFFFF No GroupInfo for this glyph
276 0 Offset in extra info space to FeatureInfo
280 3 Number of 16-bit Unicode values which follow
282 0x0066 Unicode for 'f'
284 0x0066 Unicode for 'f'
286 0x0069 Unicode for 'i' (note we don't use the composed 0xFB03 value; these Unicodes are always decomposed)
288 2 Number of KindNames which follow
290 1 Apple name follows
291 3 Length of Apple name
292 'ffi' ASCII name 'ffi'
295 2 Adobe name follows
296 5 Length of Adobe name
297 'f_f_i' ASCII name 'f_f_i'
302 0 Two bytes of padding for long alignment
GlyphInfo for glyph 10 starts here
304 0xFFFFFFFF No GroupInfo for this glyph
308 0 Offset in extra info space to FeatureInfo
312 3 Number of 16-bit Unicode values which follow
314 0x0066 Unicode for 'f'
316 0x0066 Unicode for 'f'
318 0x006C Unicode for 'l' (note we don't use the composed 0xFB04 value; these Unicodes are always decomposed)
320 2 Number of KindNames which follow
322 1 Apple name follows
323 3 Length of Apple name
324 'ffl' ASCII name 'ffl'
327 2 Adobe name follows
328 5 Length of Adobe name
329 'f_f_l' ASCII name 'f_f_l'
334 0 Two bytes of padding for long alignment
GlyphInfo for glyph 11 starts here
336 0xFFFFFFFF No GroupInfo for this glyph
340 12 Offset in extra info space to FeatureInfo
344 2 Number of 16-bit Unicode values which follow
346 0x0063 Unicode for 'c'
348 0x0074 Unicode for 't'
350 2 Number of KindNames which follow
352 1 Apple name follows
353 2 Length of Apple name
354 'ct' ASCII name 'ct'
356 2 Adobe name follows
357 3 Length of Adobe name
358 'c_t' ASCII name 'c_t'
361 0 Three bytes of padding for long alignment
GlyphInfo for glyph 12 starts here
364 60 Offset in extra info space to GroupInfo
368 24 Offset in extra info space to FeatureInfo
372 2 Number of 16-bit Unicode values which follow
374 0x0073 Unicode for 's'
376 0x0074 Unicode for 't' (note we don't use the composed 0xFB05 value; these Unicodes are always decomposed)
378 2 Number of KindNames which follow
380 1 Apple name follows
381 10 Length of Apple name
382 'stoldstyle' ASCII name 'stoldstyle'
392 2 Adobe name follows
393 12 Length of Adobe name
394 's_t.oldstyle' ASCII name 's_t.oldstyle'
406 0 Two bytes of padding for long alignment
GlyphInfo for glyph 13 starts here
408 60 Offset in extra info space to GroupInfo
412 12 Offset in extra info space to FeatureInfo
416 2 Number of 16-bit Unicode values which follow
418 0x0073 Unicode for 's'
420 0x0074 Unicode for 't' (note we don't use the composed 0xFB06 value; these Unicodes are always decomposed)
422 2 Number of KindNames which follow
424 1 Apple name follows
425 2 Length of Apple name
426 'st' ASCII name 'st'
428 2 Adobe name follows
429 3 Length of Adobe name
430 's_t' ASCII name 's_t'
433 0 Three bytes of padding for long alignment
GlyphInfo for glyph 14 starts here
436 60 Offset in extra info space to GroupInfo
440 40 Offset in extra info space to FeatureInfo
444 2 Number of 16-bit Unicode values which follow
446 0x0073 Unicode for 's'
448 0x0074 Unicode for 't'
450 2 Number of KindNames which follow
452 1 Apple name follows
453 7 Length of Apple name
454 'stfinal' ASCII name 'stfinal'
461 2 Adobe name follows
462 9 Length of Adobe name
463 's_t.final' ASCII name 's_t.final'
Extra info space
FeatureInfo for common ligatures starts here (offset 0)
472 0 Context is irrelevant for the common ligatures
474 1 One AAT-style <type,selector> pair follows
476 1 "Ligature" type
478 2 "Common ligatures on" selector
480 0 Number of 4-byte feature tags which follow (none)
FeatureInfo for rare ligatures starts here (offset 12)
484 0 Context is irrelevant for the rare ligatures
486 1 One AAT-style <type,selector> pair follows
488 1 "Ligature" type
490 4 "Rare ligatures on" selector
492 0 Number of 4-byte feature tags which follow (none)
FeatureInfo for rare oldstyle ligatures starts here (offset 24)
496 0x0018 Context is word-initial or word-medial
498 2 Two AAT-style <type,selector> pairs follow
500 1 "Ligature" type
502 4 "Rare ligatures on" selector
504 8 "Smart swashes" type
506 8 "Non-final swashes on" selector
508 0 Number of 4-byte feature tags which follow (none)
FeatureInfo for rare final swash ligatures starts here (offset 40)
512 0x0024 Context is word-final or line-final
514 3 Three AAT-style <type,selector> pairs follow
516 1 "Ligature" type
518 4 "Rare ligatures on" selector
520 8 "Smart swashes" type
522 2 "Word-final swashes on" selector
524 8 "Smart swashes" type
526 6 "Line-final swashes on" selector
528 0 Number of 4-byte feature tags which follow (none)
GroupInfo for "st" variants (offset 60)
532 0x0001 One NamedGroup follows and is not preceded by a flag word
534 600 Name table ID for "'st' variants" name
536 3 Number of glyphs in this group
538 12 First member of group is glyph 12
540 13 Second member of group is glyph 13
542 14 Third member of group is glyph 14
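
The spacing between GlyphInfo records in a listing like this follows from the field sizes and the 32-bit padding rule. The sketch below recomputes a few record sizes; the helper name is hypothetical, and it assumes every KindName is a Pascal string (one kind byte, one length byte, then the characters).

```python
def glyph_info_size(n_code_units: int, pascal_names: list) -> int:
    """Size in bytes of a GlyphInfo whose KindNames are all Pascal strings."""
    size = 4 + 4                  # groupOffset and featOffset
    size += 2 + 2 * n_code_units  # n16BitUnicodes and unicodes[]
    size += 2                     # nNames
    size += sum(2 + len(name) for name in pascal_names)  # kind + length + chars
    return (size + 3) & ~3        # pad up to the next 32-bit boundary

header_size = 8 + 4 * 15  # version, extraInfo, and 15 glyph offsets
print(header_size)                                # 68
print(glyph_info_size(1, ["c"]))                  # 20
print(glyph_info_size(2, ["fi", "f_i"]))          # 28
print(glyph_info_size(3, ["ffi", "f_f_i"]))       # 32
```

Adding each size to the previous record's offset yields the next entry of offsets[], which is a convenient consistency check when building or validating a table.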

Mac OS-specific information

The contents of the 'Zapf' table are not used directly by the system; they are made available to application programmers through the Carbon Font Manager APIs.

Newton-specific information

The 'Zapf' table is not used by the Newton OS.

Dependencies

The 'Zapf' table contains an array with an entry for every glyph; this must be updated whenever the glyph count recorded in the 'maxp' table changes.

Tools

Hex editing of the 'Zapf' table is possible using TrueEdit. Conversion of 'Zapf' table data to and from text files is possible through DumperFuser.


Change Log

14 September 2000
Initial public version.
applefonts@apple.com

Last updated: JHJ