PostScript: Programming Text and Graphics

PostScript Features

PostScript from Adobe Systems is an underappreciated yet superb general-purpose computing language.

Among its many other capabilities, PostScript can read or write virtually any disk file format in just about any language. PostScript is a premier choice any time exotic calculations need to be combined with fancy visual presentations. PostScript also has unique robotics potential.

The Guru's Lair

PostScript is an interpreted, Turing-complete, stack-based, dynamically typed programming language with primitives for printing and drawing on an output device. Being dynamically typed means that a variable can hold data of any type, and the type of the object it holds can even change throughout the running of the program; types are checked at interpretation time as opposed to compilation time. Like other programming languages, PostScript uses variables and functions (procedures), has a sufficient set of control structures, comparison, arithmetic and logical operators, console and file input-output, and so on.

Some remarks on PostScript are:

The PostScript Stack (and its Operators)

PostScript uses postfix notation. This is more than a syntactic feature. Postfix operators operate on operands that have been pushed onto a virtual stack. This modus operandi supports implementations on machines with lean RAM, such as printers.

  • dup: duplicates the top item, that is, pushes a copy of it
  • copy: similar to dup, only it takes another argument, the number of (top) items to be copied. 1 copy does the same as dup.
  • index: it takes a number, an index to an item counted from the top, and pushes a copy of that item. For example, 0 index copies the top item, just as dup does.
  • roll: performs a rotation. The first argument is the number of items, or depth of stack, that rotates, and the second argument is the number of places that this chunk rotates. Thus, 2 1 roll is the same as exch.
  • exch: exchanges or swaps the two top items on the stack. Thus, /first /second exch leaves /second /first on the stack.
  • count: returns how many items are on the stack
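The stack operators above can be tried out line by line in an interactive interpreter. The following is a minimal sketch; the comments show the operand stack (bottom to top) after each line, assuming the stack was empty to begin with.

```postscript
1 2 3        % 1 2 3
dup          % 1 2 3 3
pop          % 1 2 3
2 copy       % 1 2 3 2 3
pop pop      % 1 2 3
1 index      % 1 2 3 2
pop          % 1 2 3
3 1 roll     % 3 1 2
exch         % 3 2 1
count        % 3 2 1 3
pop          % 3 2 1
clear        % (empty)
```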

Some other operators are:

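  • =: pops the top item and prints a plain text representation of it on standard output
  • ==: pops the top item and prints it on standard output in PostScript syntax (strings in parentheses, literal names with a leading slash, arrays with their elements)
  • mark: pushes a mark object on the stack
  • counttomark: returns how many items lie above the topmost mark
  • cleartomark: pops every item down to and including the topmost mark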
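A short sketch may help tell these operators apart. It assumes an interactive interpreter that echoes standard output; the comments show what each line prints or leaves on the stack.

```postscript
(abc) =                 % prints abc
(abc) ==                % prints (abc)
/lead ==                % prints /lead
[1 (two) /three] ==     % prints [1 (two) /three]
mark 10 20 30           % stack: mark 10 20 30
counttomark =           % prints 3
cleartomark             % removes 30, 20, 10 and the mark itself
```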

On Being Dynamic and Interpreted

PostScript is both dynamically-typed and interpreted. Maybe these two go hand in hand...

The Variable-Procedure Duality

In PostScript a name may be associated with a constant value (such as another name, a number, a string, an array) or with a procedure. Throughout the run of a PostScript program a name may change its type between constant and procedure, too.

Consider how the distance between two lines (lead) may be defined subtly differently in terms of fontscale. First as a constant:

/lead fontscale def

Then as a procedure:

/lead {fontscale} def

As long as fontscale doesn't change, the former definition results in faster execution. Perhaps we should do all our font changes through an intelligent procedure (myselectfont) that updates lead as a constant, among other tasks.
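The difference only shows once fontscale changes. Here is a minimal sketch; the names leadconst and leadproc are made up for this illustration so that both definitions can coexist.

```postscript
/fontscale 10 def
/leadconst fontscale def      % constant: fontscale is evaluated now, giving 10
/leadproc {fontscale} def     % procedure: fontscale is looked up on every call
/fontscale 12 def             % the font size changes later on
leadconst =                   % prints 10
leadproc =                    % prints 12
```

The constant is stale after the change, which is why a procedure such as myselectfont would have to redefine it at every font change.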

The bind Mechanism*

Binding

Binding a procedure replaces all executable operator names in it by their values. The overall effect is that all operator names in the procedure, as well as in procedures nested in it to any depth, become tightly bound to their original definitions. When a bound procedure is executed, the interpreter encounters the operators themselves rather than the names.

Thus, the bind operator implements a form of static binding. It trades in dynamic binding for speed.
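The effect of bind can be observed by shadowing an operator name after a procedure has been defined. This is only a sketch: the names unbound, bound, r1 and r2 are hypothetical, and it assumes an interpreter in which bind has not been disabled.

```postscript
/unbound { 1 2 add } def
/bound   { 1 2 add } bind def   % add is replaced by the operator itself
/add { sub } def                % shadow add in the current dictionary
/r1 unbound def                 % name lookup finds the new add: 1 2 sub
/r2 bound def                   % still the built-in add operator
currentdict /add undef          % remove the override (undef is Level 2)
r1 =                            % prints -1
r2 =                            % prints 3
```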

Relationship Between PostScript and PDF

PDF is a page description language, not a full-fledged programming language.

A PDF file is a static representation of the result of executing a PostScript file on a single occasion.

Some Basic Programs in PostScript

Here is a simple program that paints a diagonal:

%!PS-Adobe-2.0

100 100  moveto
350 350 rlineto
stroke
showpage

And here is another program that greets you into PostScript in 16 point Times-Roman font:

%!PS-Adobe-2.0

50 400  moveto
/Times-Roman 16 selectfont
(Welcome to PostScript!) show
showpage

Non-Programmatic or Flat PostScript

It is possible to use PostScript as a page description language and do all the logic in another programming language such as C++ that will produce a PostScript file. You would eschew all comparison and control structures, mathematics, loops, dictionary and array features, etc.

These are the PostScript operators you would need to do everything possible:

  • path construction operators: newpath, moveto, rmoveto, lineto, rlineto, curveto, rcurveto, and closepath;
  • font selection operators: either findfont, scalefont, and setfont, or just selectfont;
  • for showing text: show, and probably also widthshow (to stretch to right margin by widening inter-word spacing) and charpath;
  • to eject a page: showpage
  • to stroke or fill a path: stroke and fill;
  • to select a color or shade of gray: setrgbcolor, setcmykcolor, sethsbcolor, and setgray;
  • for saving and restoring the graphic context: gsave and grestore;
  • for clipping: clip

You would probably like to use, too:

  • further path-building operators: arc, arcn, possibly also arct;
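As an illustration, here is roughly what a generated flat page might look like. It is a sketch only: every coordinate is a literal number, as an external program would emit it, and the shapes and text are invented for the example.

```postscript
%!PS
gsave
  0.8 setgray
  newpath 72 600 moveto 200 0 rlineto 0 100 rlineto -200 0 rlineto closepath
  fill
grestore
newpath 72 600 moveto 200 0 rlineto 0 100 rlineto -200 0 rlineto closepath
1 0 0 setrgbcolor stroke
newpath 172 500 40 0 360 arc
0 0 1 setrgbcolor fill
0 setgray
/Helvetica findfont 14 scalefont setfont
72 560 moveto (Flat PostScript: no loops, no definitions) show
showpage
```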

If you want to make PostScript paragraphs from outside PostScript, you need to know the widths of the characters in your font.

Direct PostScript

David Byram-Wigfield uses the term direct PostScript for work created by writing in PostScript from the start. PostScript is a powerful language that is not difficult to begin to learn, can express sophisticated graphical ideas in few words, and has the final say as to just where every mark appears on the output, giving unsurpassed visual control.

PostScript is a full programming language tailored to graphics, so it is great for getting a computer/printer to do many precise, repetitive, calculated things all on its own. Design, illustration, and typesetting can all involve precise, repetitive, calculated things, and direct PostScript techniques are justly popular with people who would rather tell the computer to do the work than drag it by the mouse every step of the way.

It could be said that pure, vanilla PostScript is hardly ever—except for simple jobs—the language you really write in: you write in a language that you get by sending a prolog of one or more resources that grow PostScript into a language suiting your job, which may retain the basic syntax and flavor of PostScript, or have a whole new look.
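A tiny sketch of the idea: a prolog defines a couple of new words, and the script that follows is written with them. The names inch and box are hypothetical choices for this example.

```postscript
%!PS
% --- prolog: grows the language for this job ---
/inch { 72 mul } bind def
/box {                     % stack: x y w h ; appends a closed rectangle
  4 2 roll moveto          % leaves w h, current point at (x, y)
  1 index 0 rlineto        % right by w
  0 exch rlineto           % up by h
  neg 0 rlineto            % left by w
  closepath
} bind def
% --- script: written in the extended language ---
newpath 1 inch 1 inch 3 inch 2 inch box stroke
showpage
```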

Benefits: PostScript in the small

Frequent, low-volume projects with highly individualized requirements can be natural jobs for direct PostScript. Often it is easy with a WYSIWYG program to get ninety percent of the desired result, and then the last ten percent becomes a battle with the program's built-in style or assumptions. Even if you can find the way to tell the program what you want, it is not guaranteed to be easier than to say it in PostScript—and to say it in PostScript can be the more promising investment, in learning coherent concepts of one established graphical language rather than workarounds and tricks for a hodge-podge of application programs.

For those doing mathematical figures, the point could not be made better than by Bill Casselman in his book, Mathematical Illustrations:

The truth is that the trade-off is unnecessary--once one has made a small initial investment of effort, by far the best thing to do in most situations is to write a program in the graphics programming language PostScript. [...] The apparent complexity involved in producing simple figures by programming in PostScript, as I hope this book will demonstrate, is largely an illusion. And the amount of work involved in producing more complicated figures will usually be neither more nor less than what is necessary.

For those who have taken to heart Edward Tufte's work on presenting information visually, there is no such thing any more as a generic chart or graph. Any information display worth making is worth making with close attention to how all the visual and typographic aspects combine to convey the information without distortion or distraction. A sampling of published papers in some fields might invite the conclusion that many charts and graphs were not worth making! The trouble is, they are typically made with software that tries to make charts and graphs generic, and then offers a cornucopia of ways to dress them up with gee-whiz chartjunk that adds nothing to the presentation.

All the ornamentation and noise in that sort of a graph would be tedious to replicate in PostScript—but a clean, readable graph is not hard, and in PostScript your control over the position and typography of every element, legend, and callout is more direct than in any graphing program.

Don Lancaster's pages (see below) offer examples of many other publishing-in-the-small projects—stationery, cards and numbered tickets, fancy borders (exactly the kind of repetitive work the computer excels at) and so on—where direct PostScript can be the way to go.

PostScript Data Types and Objects

Simple and Composite Objects and Virtual Memory (VM) in PostScript

There are two classes of objects: simple and composite. A simple object's value is contained in the object itself. A composite object's value is stored separately; the object contains a reference to it. The virtual memory (VM) is the storage in which the values of composite objects reside.

For example, the program fragment

234 (Here is a string)

pushes two objects, an integer and a string, on the operand stack. The integer, which is a simple object, contains the value 234 as part of the object itself. The string, which is a composite object, contains a reference to the value (Here is a string), which is a text string that resides in VM. The elements of the text string are characters (actually, integers in the range 0 to 255) that can be individually selected or replaced.

Here is another example:

{234 (Here is a string)}

This pushes a single object, a two-element executable array, on the operand stack. The array is a composite object whose value resides in VM. The value in turn consists of two objects, an integer and a string. Those objects are elements of the array, which can be individually selected or replaced.

Several composite objects can share the same value. For example, in

{234 (Here is a string)} dup

the dup operator pushes a second copy of the array object on the operand stack. The two objects share the same value—that is, the same storage in VM. So, replacing an element of one array will affect the other. Other types of composite objects, including strings and dictionaries, behave similarly.
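The sharing can be made visible with two names that refer to the same array. This is a minimal sketch; the names a, b and c are arbitrary.

```postscript
/a [1 2 3] def
/b a def                 % b shares a's value; nothing is copied
b 0 99 put               % change element 0 through b
a 0 get =                % prints 99
/c a 3 array copy def    % copy the elements into a fresh array in VM
c 0 7 put
a 0 get =                % still prints 99
```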

Creating a new composite object consumes VM storage for its value.

PostScript Local and Global Virtual Memory (VM)

There are two divisions of VM containing the values of composite objects: local and global. Only composite objects occupy VM. An object in VM means a composite object whose value occupies VM; the location of the object (for example, on a stack or stored as an element of some other object) is immaterial.

Local VM is a storage pool that obeys a stack-like discipline. Allocations in local VM and modifications to existing objects in local VM are subject to a feature called save and restore, named after the operators that invoke it. save and restore bracket a section of a PostScript language program whose local VM activity is to be encapsulated. restore deallocates new objects and undoes modifications to existing objects that were made since the matching save.
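A minimal sketch of the bracket, assuming the current dictionary is userdict (which lives in local VM):

```postscript
/x 1 def
save          % snapshot of local VM; leaves a save object on the stack
/x 2 def      % modifies userdict
x =           % prints 2
restore       % consumes the save object and undoes the change
x =           % prints 1
```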

Global VM is a storage pool for objects that don't obey a fixed discipline. Objects in global VM can come into existence and disappear in an arbitrary order during execution of a program. Modifications to existing objects in global VM are not affected by occurrences of save and restore within the program. However, an entire job's VM activity can be encapsulated, enabling separate jobs to be executed independently.

In a hierarchically structured program, such as a page description, local VM is used to hold information whose lifetime conforms to the structure; that is, it persists to the end of a structural division, such as a single page. Global VM may be used to hold information whose lifetime is independent of the structure, such as definitions of fonts and other resources that are loaded dynamically during execution of a program.

Control over allocation of objects in local versus global VM is provided by the setglobal operator (a Level 2 feature). This operator establishes a VM allocation mode, a boolean value that determines where subsequent allocations are to occur (false means local, true means global). It affects objects created implicitly by the scanner and objects created explicitly by operators. The default VM allocation mode is local; a program can switch to global VM allocation mode when it needs to.
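The following sketch switches the allocation mode around two allocations and then asks gcheck where each value ended up. The names savedmode, gstr and lstr are hypothetical; all operators used are Level 2.

```postscript
/savedmode currentglobal def   % remember the mode in effect
true setglobal
/gstr 16 string def            % string value allocated in global VM
false setglobal
/lstr 16 string def            % string value allocated in local VM
savedmode setglobal            % put the mode back
gstr gcheck =                  % prints true
lstr gcheck =                  % prints false
```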

Dictionaries in PostScript

...

Predefined Dictionaries in PostScript

Some of these are read-only (systemdict), some writable.

In Level 1 implementations of the PostScript language, there are two built-in dictionaries permanently on the dictionary stack; they are called systemdict and userdict. In Level 2 implementations, there are three dictionaries: systemdict, globaldict, and userdict.

  • systemdict is a read-only dictionary that associates the names of all the PostScript operators (those defined in the PostScript Language Reference Manual) with their values (the built-in actions that implement them).
  • globaldict (Level 2) is a writable dictionary in global VM.
  • userdict is a writable dictionary in local VM. It is the default modifiable naming environment normally used by PostScript language programs.

userdict is the topmost of the permanent dictionaries on the dictionary stack. The def operator puts definitions there unless the program has pushed some other dictionary on the dictionary stack. Applications can and should create their own dictionaries rather than put things in userdict.
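A sketch of the recommended practice follows. The dictionary name mydict and the key margin are made up, and the last line assumes nothing else has defined margin in userdict.

```postscript
/mydict 10 dict def        % the dictionary itself is named in userdict
mydict begin               % push it on the dictionary stack
  /margin 36 def           % def now writes into mydict
end                        % pop it again
mydict /margin get =       % prints 36
userdict /margin known =   % prints false
```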

PostScript Resources

A resource is a collection of named objects that either reside in VM or can be located and brought into VM on demand.

PostScript Resource Operators

There are five operators that apply to resources: findresource, resourcestatus, resourceforall, defineresource, and undefineresource. These operators and the general concept of named resources are Level 2 features. A more limited facility applicable only to fonts—the findfont and definefont operators—is available in Level 1.

The findresource operator is the key feature of the resource facility. Given a resource category name and an instance name, findresource returns an object. If the requested resource instance does not already exist as an object in VM, findresource gets it from an external source and loads it into VM. A PostScript language program can access named resources without knowing if they are already in VM or how they are obtained from external storage.

Other important features include resourcestatus, which returns information about a resource instance, and resourceforall, which enumerates all available resource instances in a particular category. These operators apply to all resource instances, whether or not they reside in VM; the operators do not cause the resource instances to be brought into VM. resourceforall should be used with care and only when absolutely necessary, since the set of available resource instances is potentially extremely large.

A program can explicitly define a named resource instance in VM. That is, it can create an object in VM, then execute defineresource to associate the object with a name in a particular resource category. This resource instance will be visible in subsequent executions of findresource, resourcestatus, and resourceforall. A program can also execute undefineresource to reverse the effect of a prior defineresource. The findresource operator automatically executes defineresource and undefineresource to manage the VM for resource instances that it obtains from external storage.
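As a sketch of these operators in use (Level 2): the first part queries a Font instance without loading it, the second defines a ProcSet instance and fetches it back by name. MyProcs and inch are hypothetical names chosen for the example.

```postscript
% Is a Font instance available? resourcestatus does not load it.
/Times-Roman /Font resourcestatus
{ pop pop (Times-Roman is available) }
{ (Times-Roman is not available) } ifelse =

% Define a ProcSet instance in VM, then look it up by name.
/MyProcs << /inch { 72 mul } >> /ProcSet defineresource pop
/MyProcs /ProcSet findresource begin
  1 inch =        % prints 72
end
```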

Resource instances can be defined in either local or global VM. The lifetime of the definition depends on the VM allocation mode in effect at the time the definition is made. Normally, both local and global resource instances are visible and available to a program. However, when the current VM allocation mode is global, only global instances are visible; this ensures correct behavior of resource instances that are defined in terms of other resource instances.

When a program executes defineresource to define a resource instance explicitly, it has complete control over whether to use local or global VM. However, when execution of findresource causes a resource instance to be brought into VM automatically, the decision whether to use local or global VM is independent of the VM allocation mode at the time findresource is executed. Usually, resource instances are loaded into global VM; this enables them to be managed independently of the save and restore activity of the executing program. However, certain resource instances do not function correctly when they reside in global VM; they are loaded into local VM instead.

The language does not specify a standard method for installing resources in external storage. Installation typically consists of writing a named file in a file system. However, details of how resource names are mapped to file names and how the files are managed are environment dependent. In some environments, resources may be installed using facilities entirely separate from the PostScript interpreter.

Resource instances are identified by keys that ordinarily are name or string objects; the resource operators treat names and strings equivalently. Use of other types of keys is permitted but not recommended. The defineresource operator can define a resource instance with a key that is not a name or string; the other resource operators can access the instance using that key. However, such a key can never match any resource instance in external storage.

PostScript Resource Categories

Resource categories are identified by name. The standard resource categories include Font, Encoding, Form, Pattern, ProcSet, ColorSpace, Halftone, and ColorRendering. Within a given category, every resource instance that resides in VM is of a particular type and has a particular intended interpretation or use.

Coordinate Systems and Transformations

Paths and shapes are defined in terms of points on the Cartesian plane, each specified as a coordinate pair. A coordinate pair is a pair of real numbers x and y that locate a point within a Cartesian (two-axis) coordinate system superimposed on the current page. The PostScript language defines a default coordinate system that PostScript language programs can use to locate any point on the current page.

Coordinates specified in a PostScript language program refer to locations within a coordinate system that always bears the same relationship to the current page regardless of the output device on which printing or displaying will be done. This coordinate system is called user space.

Initially, the user space origin is located at the lower-left corner of the output page or display window, with the positive x axis extending horizontally to the right and the positive y axis extending vertically upward, as in standard mathematical practice. The length of a unit along both the x and y axes is 1/72 of an inch. This coordinate system is the default user space. In default user space, all points within the current page have positive x and y coordinate values.
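Since a unit is 1/72 inch, small conversion procedures make coordinates easier to read. A minimal sketch, with inch and mm as hypothetical helper names:

```postscript
/inch { 72 mul } def            % inches to default user space units
/mm   { 72 mul 25.4 div } def   % millimetres to default user space units
newpath
1 inch 1 inch moveto            % one inch right of and above the origin
2 inch 0 rlineto                % a horizontal line two inches long
stroke
210 mm =                        % A4 width in units, a little over 595
showpage
```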

Transformations and the Current Transformation Matrix (CTM)

The Page in PostScript

Page Dimensions in PostScript

Before painting anything we need to establish the absolute page dimensions and then the dimensions of the smaller printing area.

Paper Sizes

   Paper Size                      Dimension (in points)
   ------------------              ---------------------
   Comm #10 Envelope               297 x 684
   C5 Envelope                     461 x 648
   DL Envelope                     312 x 624
   Folio                           595 x 935
   Executive                       522 x 756
   Letter                          612 x 792
   Legal                           612 x 1008
   Ledger                          1224 x 792
   Tabloid                         792 x 1224
   A0                              2384 x 3370
   A1                              1684 x 2384
   A2                              1191 x 1684
   A3                              842 x 1191
   A4                              595 x 842
   A5                              420 x 595
   A6                              297 x 420
   A7                              210 x 297
   A8                              148 x 210
   A9                              105 x 148
   B0                              2920 x 4127
   B1                              2064 x 2920
   B2                              1460 x 2064
   B3                              1032 x 1460
   B4                              729 x 1032
   B5                              516 x 729
   B6                              363 x 516
   B7                              258 x 363
   B8                              181 x 258
   B9                              127 x 181
   B10                             91 x 127
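On a Level 2 interpreter a size from this table can be requested from the output device with setpagedevice. This is a sketch only; whether the request is honored, and how the resulting size is reported, depends on the device.

```postscript
<< /PageSize [595 842] >> setpagedevice   % request A4
currentpagedevice /PageSize get ==        % show what the device settled on
```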

Absolute Dimensions in PostScript

First there are the absolute dimensions left, right, bottom and top. These are called abslm, absrm, absbm, and abstm for absolute left margin, absolute right margin, absolute bottom margin, and absolute top margin. Actually, abslm and absbm are 0.

For each paper size we define a procedure to update absolute dimensions. Here is one for A4:

/inch {72 mul} bind def % 72 PostScript units per inch (72.27 is the TeX point)
/setupa4 {
  /absrm  8.27 inch def % A4 paper width
  /abstm 11.69 inch def % A4 paper height
  /abslm 0 def
  /absbm 0 def
} def

Printing Area Dimensions in PostScript

These are the left, right, bottom and top of the area where glyphs and geometric objects may be printed, which is strictly inside the absolute page boundaries. Paragraph procedures such as showpar rely heavily on these, as well as the newline procedure.

A procedure may be defined to update lm, rm, bm, and tm whenever necessary.

/applymargin { % stack: margin
  /lm abslm 2 index add def
  /rm absrm 2 index sub def
  /tm abstm 2 index sub def
  /bm absbm 2 index add def
  pop % drop the margin off the stack
} def

which, given a moderate margin of 33, might be used like this:

33 applymargin

Turning a Page in PostScript

The language offers operator showpage, but we may elaborate a little more on this:

/movetostart {lm tm moveto} def
/turnpage {showpage movetostart} def

Actually, we should define two dummy procedures and redefine turnpage as:

/preturnpage {} def
/postturnpage {} def
/turnpage {preturnpage showpage postturnpage movetostart} def

Moreover, turning a page often involves printing the page number, and sometimes an appropriate header and footer, too. I will not go into how to handle page numbers. The pagenumber, its font name and scale, and the position where it is printed should be kept in an overall-accessible dictionary like commondict.

Text in PostScript

Much of the code for handling text depends on the size of the current font, which is the fontscale variable. fontscale may be understood to be local, so a global absfontscale should then be defined in, say, an overall-accessible dictionary (commondict?).

Printing text on a page consists of the following parts:

  • setting a font by font name through intelligent procedure myselectfont,
  • printing a paragraph, which relies on newline.
  • higher stuff, auxiliaries, etc.

Fonts in PostScript

FontDirectory

FontDirectory pushes a dictionary of defined fonts on the operand stack. FontDirectory is not an operator; it is a name in systemdict associated with the dictionary object.

The FontDirectory dictionary associates font names with font dictionaries. definefont places entries in FontDirectory, and findfont looks there first. The dictionary is read-only; only definefont and undefinefont can change it.

Although FontDirectory contains all fonts that are currently defined in VM, it does not necessarily describe all the fonts available to a PostScript language program. This is because the findfont operator can sometimes obtain fonts from an external source and load them into VM dynamically. Consequently, examining FontDirectory is not a reliable method of inquiring about available fonts. The preferred method is to use the resourcestatus and resourceforall operators, which are Level 2 features, to inquire about the Font resource category. (See section Named Resources.)
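Both ways of inquiring can be sketched in a few lines. The second one may print a very long list, so it is best tried interactively.

```postscript
% Fonts currently defined in VM:
FontDirectory { pop = } forall

% All Font resource instances, loaded or not (Level 2):
(*) { = } 256 string /Font resourceforall
```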

In Level 2, when global VM allocation mode is in effect (see section Local and Global VM), the name FontDirectory is temporarily rebound to the value of GlobalFontDirectory, which contains only those fonts that have been defined in global VM. This ensures the correct behavior of fonts that are defined in terms of other fonts.

The Names of the PostScript Fonts

Sans-Serif fonts:

  • AvantGarde-Book
  • AvantGarde-BookOblique
  • AvantGarde-Demi
  • AvantGarde-DemiOblique
  • Helvetica
  • Helvetica-Oblique
  • Helvetica-Bold
  • Helvetica-BoldOblique
  • Helvetica-Narrow
  • Helvetica-Narrow-Oblique
  • Helvetica-Narrow-Bold
  • Helvetica-Narrow-BoldOblique

Special Fonts:

  • Courier
  • Courier-Oblique
  • Courier-Bold
  • Courier-BoldOblique
  • Symbol
  • ZapfChancery-MediumItalic
  • ZapfDingbats

Serif fonts:

  • Bookman-Light
  • Bookman-LightItalic
  • Bookman-Demi
  • Bookman-DemiItalic
  • NewCenturySchlbk-Roman
  • NewCenturySchlbk-Italic
  • NewCenturySchlbk-Bold
  • NewCenturySchlbk-BoldItalic
  • Palatino-Roman
  • Palatino-Italic
  • Palatino-Bold
  • Palatino-BoldItalic
  • Times-Roman
  • Times-Italic
  • Times-Bold
  • Times-BoldItalic

Intelligent Procedures For Setting Fonts

Procedure myselectfont is polymorphic, hence its complication. I reproduce it in full here:

/leadk 0.88 def
/myselectfont {
% stack: /fontname scale|[a b c d e f]
  /currentfontname 2 index def % update var currentfontname
  dup
  dup type /arraytype eq
  {
   dup 0 get /fontscale exch def % fontscale = horizontal scale factor
   /fonttransformationmatrix
  } {/fontscale} ifelse exch def
  selectfont
  /lead fontscale leadk mul def
  /space_width ( ) stringwidth pop def
  gsave
    setbackgroundcolor
    newpath 0 0 moveto
    (g) true charpath pathbbox /fontbodyheight exch def pop /fontdepth exch def pop
    (t) true charpath pathbbox /fontheight exch def 3 {pop} repeat
  grestore
} def

Here is a non-polymorphic and much simplified version of myselectfont:

/leadk 0.88 def
/myselectfont {
% stack: /fontname scale
  /fontscale exch def
  /currentfontname exch def % update var currentfontname
  currentfontname fontscale selectfont
  /lead fontscale leadk mul def
} def

Show a String Left of, Below, Over, or At a Given Point

When we show a string with built-in operator show the lower left hand corner of the first glyph is at the initial currentpoint. This is showing a string right of the currentpoint.

The names of these procedures bear a short (one-letter) abbreviation for center (c), left, under, and over. More complete procedures should take into account the body height of the font.

This is a simple procedure for showing a string centered at a given point:

/showcat { % stack: (string) x y
  gsave moveto
    dup stringwidth pop 2 div neg  0  rmoveto
    show
  grestore
} def

A procedure for printing a string left of the currentpoint could be defined analogously. Just don't divide by 2:

/showlat { % stack: (string) x y
  gsave moveto
    dup stringwidth pop       neg  0  rmoveto
    show
  grestore
} def

whereas procedures for printing over and under would add the vertical line separation (lead) to, or subtract it from, the top argument on the stack before calling showcat:

lead add showcat

and

lead sub showcat

New Lines in PostScript

Next we shall define a newline procedure that will move the current point to the left margin and one line separation below. If the new line would fall below the bottom margin, a page will be turned through turnpage.

Procedure newline relies on lead, for vertical line separation.

/newline {
  currentpoint
  dup lead sub  bm  gt
  {
    exch pop lm exch
    lead sub
    moveto
  } {pop pop turnpage} ifelse
} def

Printing Paragraphs in PostScript

Printing a paragraph may involve breaking the line and moving to the start of the next line. The following sections deal with different situations.

Breaking a String Into Lines with showpar

We shall begin by printing ragged-right (or right-unjustified), non-word-breaking paragraphs. This is our first attempt:

/showpar { % stack: (string to be printed)
  ( )
  {
    search
    {
      dup stringwidth pop currentpoint pop add rm gt {newline} if
      show ( ) show
    }
    {
      dup stringwidth pop currentpoint pop add rm gt {newline} if
      show
      exit
    }
    ifelse
  } loop
} def

Note that the twin longest lines (dup stringwidth ...) cause a newline event if the word to be printed would move the current point past rm.

Mixed Paragraphs

A mixed paragraph is represented as a heterogeneous array of (strings) and {executables} (showmixed2). Each string is printed through the previously defined showpar, and each procedure is, well, executed! The code is this simple:

/showmixed2 { % stack: [{text|executable}*]
  {dup type /stringtype eq {showpar}{exec} ifelse} forall
} def

Additionally, a paragraph may be represented as a heterogeneous array of (strings), {executables}, and numbers. Each of these numbers in the array precedes an executable and stands for the width that executable will take up. The paragraph composer uses this information to find out whether executing the procedure would move the current point beyond the right margin. The following procedure (showmixed3) just expands [the exec keyword in] the preceding showmixed2:

/showmixed3 { % stack: [{text|executable|number}*]
  {dup type /stringtype eq {showpar}{
    dup xcheck {exec}{currentpoint pop add rm gt {newline} if} ifelse
  } ifelse} forall
} def

Strategies and Schemes for Breaking Words*

Having PostScript Declare an Array of Widths for Each of its Fonts

The intended purpose of this is to let another program written in a more congenial programming language (C, C++, etc.) do the computing and output a dumb PostScript script.

Printing the widths of a font for C, C++, or other

The actual procedure used is slightly involved because it declares an array of 256 widths as well as a two dimensional array of 256 fourfold items representing the bounding box of each character. (The Bounding Box of a picture is the coordinates of its lower left-hand and upper right-hand corners.)

I have named it definecharbb. It takes the name of an existing (possibly reencoded) font, a more convenient name without hyphens, and a string holding the name of the output file. It depends on variable definecharbb_fontscale, which is defined as 10. Its code is:

/definecharbb { % stack: /fontname /fontid (file_name)
  10 dict begin
    256 string cvs (w) file /fo exch def % open output file
    /char 1 string def               % define 1-char string
    /fontid   exch def
    /fontname exch def
    gsave
      % selectfont at given fontscale:
      fontname definecharbb_fontscale selectfont
    % DEFINE BOUNDING BOX ARRAY:
      fo               (float )  writestring
      fo  fontid 256 string cvs  writestring
      fo    (_bb[256][4] = {\n)  writestring
      clear
      % iterate through all characters to write bb array[256][4]:
      0 1 255 {
        /idx exch def
        fo (  {) writestring
        newpath 0 0 moveto % fresh path, so earlier glyphs do not enlarge the box
        idx char exch 0 exch put % load char
        char true charpath pathbbox % push bounding box
        4 -1 2 {1 roll} for % reverse order, so that llx is written first
        4 {definecharbb_fontscale div 4 1 roll} repeat % divide by fontscale used
        3 {
          20 string cvs fo exch writestring
          fo (,) writestring
        } repeat
          20 string cvs fo exch writestring
        fo  idx 255 lt {(},\n)}{(}\n)} ifelse  writestring
      } for
      % terminate:
      fo (};\n) writestring
    % DEFINE WIDTHS ARRAY:
      fo               (float )  writestring
      fo  fontid 256 string cvs  writestring
      fo    (_widths[256] = {\n)  writestring
      clear
      % iterate through all characters to write widths array[256]:
      0 1 255 {
        /idx exch def
        fo (  ) writestring
        0 0 moveto
        idx char exch 0 exch put % load char
        char stringwidth pop     % push char's width
        definecharbb_fontscale div
        20 string cvs  fo exch  writestring
        fo  idx 255 lt {(,\n)}{(\n};\n)} ifelse  writestring
      } for
    grestore
    fo flushfile
    fo closefile
  end
} def

If you don't want to declare a bounding box array, delete all the code between the comments % DEFINE BOUNDING BOX ARRAY: and % DEFINE WIDTHS ARRAY:.
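A call might look like the sketch below. It assumes that definecharbb has already been defined as above; the identifier TimesRoman and the file name timesroman_metrics.h are illustrative choices, not part of the original listing. Some interpreters restrict file writing by default, so the interpreter may need to be told to allow it.

```postscript
% Assumes definecharbb (listed above) is already defined.
/definecharbb_fontscale 10 def   % the scale the procedure depends on

% stack: /fontname /fontid (file_name)
/Times-Roman /TimesRoman (timesroman_metrics.h) definecharbb
```

The output file should then contain two C declarations, float TimesRoman_bb[256][4] and float TimesRoman_widths[256], with values expressed per unit of font size.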

Making or Changing Fonts

The first example, Re-encoding an Entire Font, presents a general procedure (ReEncode) for changing the encoding vector of a font. The second example, Making Small Changes to Encoding Vectors, presents an alternative to replacing the entire encoding vector for situations when the encoding vector only needs to be changed slightly. Most of the built-in fonts contain characters that have not been encoded, such as accented characters. To print such characters, the name of the character must be inserted into the encoding vector. However, we do not want to specify the entire encoding vector to insert a few new characters so the procedure ReEncodeSmall has been defined to handle this insertion.

The third example, Changing the Character Widths of a Font, defines a general procedure, ModifyWidths, for changing some or all of the character widths in a given font. It changes the necessary entries in the font dictionary. In this example the character widths of a font are rounded such that when the characters are printed at a certain point size, the widths will be an integral number of pixels in device space. This is useful for avoiding round-off error in positioning characters with the show operator.

The fourth program, Creating an Analytic Font, demonstrates how to create a new font whose character descriptions are geometric in nature. The program defines all the necessary font dictionary entries as well as some new entries of its own. The font created has 4 characters: bullets of three sizes and an open box shape. Each character is described using the PostScript graphic operators. After the font has been defined it is used in an example that prints the various characters intermixed with one of the built-in fonts.

Copying a Font

Since fonts are kept in dictionaries, you copy them by declaring a long-enough dictionary and copying everything but the FID field.

There are two important steps to remember when copying a font. The first is not to copy the FID field from the original font dictionary to the new dictionary. The FID field will automatically get created when the definefont operator is executed. Attempting to perform a definefont operation on a dictionary that already contains an FID field results in an invalidfont error. The second step is to change the FontName field in the new dictionary. The same name which appears in the FontName field should be provided as an argument to the definefont operator. The FontName should always be a unique name.

In addition, for fonts that have a UniqueID field, it is important to change the UniqueID field when the font is modified. The only case when the UniqueID field should not be changed is when the Encoding field of a font dictionary has been changed.

Re-Encoding a Font

To reencode a font you have to copy everything in it except the FID entry and then define your own Encoding array.

The encoding vector is a mapping of character codes (in the range of 0 to 255) to character names.

A font can hold more than 256 glyphs. An encoding is a mapping from the range of naturals 0-255 to 256 glyphs in the font or fewer. Most of the built-in fonts are encoded according to a standard encoding, which allows access to only the ASCII character set, good enough for English, inadequate for Spanish, French, German etc.

Reencoding entails defining a new font (which is a type in the PostScript programming language).

Below is a general re-encoding procedure. Further down we define a more specific procedure for re-encoding into ISOLatin1.

/ReEncode {
  reencodedict begin
    /newencoding exch def
    /newfontname exch def
    /basefontname exch def
    /basefontdict basefontname findfont def
    /newfont basefontdict maxlength dict def
    basefontdict {
      exch dup dup /FID ne exch /Encoding ne and
      { exch newfont 3 1 roll put }
      { pop pop }
      ifelse
    } forall
    newfont /FontName newfontname put
    newfont /Encoding newencoding put
    newfontname newfont definefont pop
  end
} def
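ReEncode opens a scratch dictionary named reencodedict, which the listing above does not define. Below is a minimal sketch of the missing definition and a call. It assumes that ReEncode is already defined, that a capacity of 5 is enough (one slot per name ReEncode defines), and that the interpreter provides ISOLatin1Encoding (LanguageLevel 2). The new font name Helvetica-Latin1 is illustrative.

```postscript
/reencodedict 5 dict def   % assumed capacity: ReEncode defines 5 names in it

% stack: /basefontname /newfontname encodingarray
/Helvetica /Helvetica-Latin1 ISOLatin1Encoding ReEncode

/Helvetica-Latin1 findfont 12 scalefont setfont
72 720 moveto (Fran\347ais) show   % \347 (octal) is ccedilla in ISOLatin1Encoding
```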

Usually we want to redefine a font to the ISOLatin1 encoding, as its tables are included in the documentation. We shall avail ourselves of the following convenient PostScript procedure lifted from the internet:

/EncodeAsISOLatin1 { % usage: /existingFont /newfont EncodeAsISOLatin1
  /newfontname exch def
  findfont            % look up the existing font dictionary
  dup length dict begin
  { 1 index /FID ne   % copy every entry except FID
    {def} {pop pop} ifelse
  } forall
  /Encoding ISOLatin1Encoding def
  currentdict
  end
  newfontname exch definefont pop
} def

and is used like this:

/Times-Roman /Times-Roman-ISOLatin1 EncodeAsISOLatin1
/Times-Bold /Times-Bold-ISOLatin1 EncodeAsISOLatin1
...

When encoding accented characters it is important to understand that accented characters (also known as composite characters) are actually a composite of the letter and the accent. In order to print accented characters properly, both the letter and the accent of the composite character must be encoded in the encoding vector, as well as the composite character itself. For example, if you wish to encode the composite character Aacute both the A and the acute must be encoded.

Re-encoding an Entire Font*
Making Small Changes to Encoding Vectors*

Modifying Existing Fonts

The basic strategy for modifying an existing font is to create an entirely new font dictionary and to copy all the references to entries in the original font dictionary, except for the FID entry, into the new dictionary. The next step is to modify the appropriate fields. The last step is to perform a definefont operation on the modified font dictionary to make it into a PostScript font.

Changing the Character Widths of a Font
Making an Outline Font*

Creating New Fonts

When creating new fonts, certain font dictionary entries must be present. They are FontMatrix, FontType, FontBBox, Encoding and BuildChar. For a user defined font, the FontType should always have the value 3. In addition, it is useful, although not necessary, to have a UniqueID entry. The UniqueID entry facilitates better caching of characters on disk-based implementations of the PostScript interpreter. (Be forewarned that the UniqueID must truly be a unique 24 bit number and that the creator of the font is responsible for ensuring this.)

The BuildChar procedure is responsible for specifying how a character in the new font is rendered. It should always call either the setcachedevice or setcharwidth operator. The BuildChar procedure can use almost all of the PostScript operators to render a character. However, there are some restrictions when the character is to be cached (i.e., when the setcachedevice operator has been used). In this case, any of the operators related to gray-level and color are invalid (e.g., setgray, setrgbcolor, image, etc).

In the character descriptions for a new font, it is a good idea to create a character description that will be printed for undefined characters. This character is called .notdef in the built-in fonts, and it is defined to print nothing. When users try to print characters that have not been defined in the font, the .notdef character is printed; the .notdef character is a graceful way of avoiding unexpected errors. As well as creating a character description for the undefined character, it is important that the encoding vector have the name of this undefined character in each location that does not have a character defined. The simplest way to do this is to initialize all the entries in the encoding vector to contain the .notdef character and then enter the character names in the desired positions.
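The requirements above can be put together in a small sketch. The font name SketchBoxFont, the character name box, and the CharProcs dictionary holding the character procedures are assumptions made for this illustration. Every encoding slot is first set to .notdef, then code 97 (the letter a) is mapped to the single drawn character.

```postscript
8 dict begin                              % room for the entries below plus FID
  /FontType 3 def                         % user-defined font
  /FontMatrix [0.001 0 0 0.001 0 0] def   % 1000 character units per font unit
  /FontBBox [0 0 1000 1000] def
  /Encoding 256 array def
  0 1 255 { Encoding exch /.notdef put } for   % every slot starts as .notdef
  Encoding 97 /box put                    % code 97 (a) draws the box
  /CharProcs 2 dict def                   % illustrative name for the glyph procedures
  CharProcs begin
    /.notdef { } def                      % prints nothing
    /box {
      100 100 moveto 800 0 rlineto 0 800 rlineto -800 0 rlineto
      closepath 50 setlinewidth stroke
    } def
  end
  /BuildChar { % stack: fontdict charcode
    1000 0 0 0 1000 1000 setcachedevice   % width and bounding box, enables caching
    exch begin                            % open the font dictionary
      Encoding exch get                   % character code -> character name
      CharProcs exch get exec             % run the character procedure
    end
  } def
  currentdict
end
/SketchBoxFont exch definefont pop

/SketchBoxFont findfont 24 scalefont setfont
72 700 moveto (aaa) show
```

Because BuildChar calls setcachedevice, the character procedures stick to path construction and painting and avoid the color operators.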

Creating an Analytic Font

Pdfmarks

pdfmarks are an extension of PostScript® code used to represent PDF features.

PDF features defined by pdfmarks are automatically generated when the PostScript code is converted into a PDF.
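Interpreters that are not producing PDF, such as most printers, do not define pdfmark, so a file containing pdfmarks would stop with an undefined error there. A commonly used guard, shown here as a sketch, defines pdfmark as cleartomark when it is missing, so that each pdfmark and its arguments are silently discarded. The Title value is only a placeholder.

```postscript
% If pdfmark is not known, make it discard everything back to the mark.
/pdfmark where
  { pop }
  { userdict /pdfmark /cleartomark load put }
ifelse

[ /Title (Sample document) /DOCINFO pdfmark
```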

What can you do with pdfmarks?

pdfmarks enable you to:

  • ADD annotations, links, form fields, bookmarks, articles, named destinations, page transitions
  • SET page cropping, to open to a specific page, to open with bookmarks or thumbnails displayed
  • DEFINE page labels, document information fields

In other words many of the enhancements that make PDFs dynamic and interactive can be implemented with pdfmarks.

Understanding pdfmarks

Pdfmarks have three parts:

  • Every pdfmark starts with the mark object - this is the [ (left bracket) character.
  • Arguments: these describe what the pdfmark's features are. These are the ingredients of the pdfmark recipe.
  • A name: this specifies the kind of pdfmark.

For example:

[ /Rect [ 0 0 216 144 ]
  /Open true
  /Title (Lynn)
  /Contents (This is a Red Note)
  /Color [1 0 0 ]
  /Subtype /Text
  /ANN pdfmark

Arguments are expressed as Key-Value pairs. Each type of pdfmark uses a set of required and optional key-value pairs. Some key-value pairs are specific to a type of pdfmark, some are used more globally.

Keys, being PostScript names, start with a slash (/), followed by the name of the key (first character capitalized). Some examples of keys are: /Rect, /Color, /Page.

Values can be expressed in several different formats: string, array, integer, boolean, name.

Specifying page numbers

All pages in a PDF document are numbered sequentially; the first page in a document is page 1. When referring to pages in a pdfmark recipe, all page numbers must be specified using this sequence number, not the page number as it appears on the printed page.

This is an example of an argument specifying a page number:

/Page 2

Defining Colors

Colors are defined by an array of three numbers placed between brackets that represent RGB values and look something like this: [1 0 .65 ].

Each number must lie between 0 and 1 inclusive and gives the intensity of red, green, and blue respectively, with 0 meaning 0% and 1 meaning 100%. There is no required number of digits to the right of the decimal point: 1, .6, .25, and .824 are all valid.

This is an example of an argument that defines color:

/Color [ .3 .6 .734 ]

Rectangles

A rectangle is described by an array of four numbers: the x and y coordinates of its lower-left corner, followed by those of its upper-right corner:

[XLL YLL XUR YUR ]

Setting Views

To define how a page will display on the screen you use the /View key. The /View key can be used when defining links, named destinations, open options, or bookmarks. The following list describes the available values and what they mean:

/Fit

Fit the page to the window

parameters: none

example: /View [ /Fit ]

/FitB

Fit the bounding box of the page contents to the window.

parameters: none

example: /View [ /FitB ]

/FitH

Fit the width of the page to the window. top specifies the distance from the page origin to the top of the window. If top value is -32768 the top value is calculated automatically.

parameters: top

examples: /View [ /FitH -32768 ] and /View [ /FitH 5 ]

/FitBH

Fit the width of the bounding box of the page contents to the window. top specifies the distance from the page origin to the top of the window.

parameters: top

example: /View [ /FitBH 5]

/FitR

Fit the rectangle specified by the parameters to the window.

parameters: x1 y1 x2 y2

example: /View [/FitR 30 648 209 761]

/FitV

Fit the height of the page to the window. left specifies the distance from the page origin to the left edge of the window.

parameters: left

example: /View [ /FitV -5 ]

/FitBV

Fit the height of the bounding box of the page contents to the window. left specifies the distance from the page origin to the left edge of the window.

parameters: left

example: /View [ /FitBV 18 ]

/XYZ

left and top specify the distance from the origin of the page to the top-left corner of the window. zoom specifies the zoom factor, with 1 being 100% magnification. If left, top or zoom is null, the current value of that parameter is retained. For example, specifying a view destination of

/View [/XYZ null null null ]

goes to the specified page and retains the same horizontal and vertical offset and zoom as the current page. A zoom of 0 has the same meaning as a zoom of null.

parameters: left top zoom

example: /View [/XYZ 5 802 1.5 ]

example: /View [/XYZ null null 0 ]

Specifying Actions and Destinations

When a user opens a file, clicks on a link, or clicks on a bookmark, there are several types of information that need to be specified in order to indicate what should happen. Different pdfmark types require one or more of the following:

Actions

Actions specify what type of action should be taken. They are indicated by the Action key in a pdfmark.

Destinations

Destinations specify a particular location in a file, and a zoom factor.

View destinations require a Page key and a View key. Typically they are used along with an Action key; if there is no Action key, the action is the equivalent of GoTo, meaning to jump to the destination in the current file.

Alternatively, named destinations can be used, specified by the Dest key. They specify a destination in the same file or another file, by name.

Files

File specifiers indicate the target of an action when it is not the current file.

pdfmark actions

PDF defines several types of actions that can be specified for bookmarks and annotations. The types defined as of PDF 1.3 are:

  • GoTo: Go to a destination in the current document
  • GoToR: Go to a destination in another document
  • Launch: Launch an application, usually to open a file
  • Thread: Begin reading an article thread
  • URI: Resolve a uniform resource identifier
  • Sound: Play a sound
  • Movie: Play a movie
  • Hide: Set an annotation's Hidden flag
  • Named: Execute an action predefined by the viewer application
  • SubmitForm: Send data to a URL
  • ResetForm: Set fields to their default values
  • ImportData: Import field values from a file
  • JavaScript: Execute a JavaScript script

When using pdfmark, the type of action for the annotation or bookmark is specified by the Action key. It takes one of the following values:

  • A predefined name corresponding to one of the first four items in the foregoing list: GoTo, GoToR, Launch, or Article (which corresponds to the Thread type in PDF).
  • A dictionary specifying one of the other types, or a custom action. This dictionary must contain the key–value pairs that are to be placed into the action dictionary in the PDF file. See Section 8.5 in the PDF Reference for a detailed description of all the actions and their dictionaries. The syntax for this type of Action key is:

    /Action << /Subtype actiontype
    ...other action dictionary key–value pairs... >>

If the Action key is not present, the action is assumed to be the equivalent of GoTo; that is, jumping to a location in the current document. Actions other than GoTo may require a file-specifier key to specify an external document.

Examples of custom link actions (URI links for the Acrobat WebLink plug-in):

% Custom link action
[ /Rect [50 425 295 445]
  /Action << /Subtype /URI /URI (http://www.adobe.com) >>
  /Border [0 0 2]
  /Color [.7 0 0]
  /Subtype /Link
  /ANN pdfmark

% Equivalent link using Launch action
[ /Rect [50 425 295 445]
  /Action /Launch
  /Border [0 0 2]
  /Color [.7 0 0]
  /URI (http://www.adobe.com)
  /Subtype /Link
  /ANN pdfmark

% URI link with a named destination
[ /Rect [50 425 295 445]
  /Action << /Subtype /URI /URI (http://www.adobe.com#YourDestination) >>
  /Border [0 0 2]
  /Color [.7 0 0]
  /Subtype /Link
  /ANN pdfmark

GoTo Actions

GoTo actions jump to a specified page and zoom factor within the current document. They require the Dest key, or both the Page and View keys.
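As an illustration, the sketch below defines a link that jumps to page 3 of the current document, using the Page and View keys rather than Dest. The rectangle, border, color, and view values are arbitrary sample values.

```postscript
[ /Rect [70 550 210 575]
  /Border [0 0 1]
  /Color [0 0 1]
  /Action /GoTo
  /Page 3
  /View [/FitH 700]
  /Subtype /Link
  /ANN pdfmark
```

Since GoTo is the default action, omitting the /Action /GoTo pair should give the same link.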

GoToR Actions

GoToR actions specify a location in another PDF file. They require the Dest key, or both the Page and View keys, plus one or more file-specifier keys. Here is a bookmark whose action consists of opening a file:

[ /Action /GoToR /File (test.pdf) /Page 2 /View [/FitR 30 648 209 761]
/Title (Open test.pdf on page 2) /OUT pdfmark

The following list specifies keys that can be used with the GoToR, Launch, and Article actions to specify the target file:

File

string (Required)

The device-independent pathname of the PDF file.

DOSFile

string (Optional)

The MS-DOS pathname (in the PDF pathname format), of the PDF file. Acrobat viewer applications on Windows and DOS computers ignore the File key if the DOSFile key is present.

MacFile

string (Optional)

The Mac OS filename (in the PDF pathname format) of the PDF file. Acrobat viewer applications on Mac OS computers ignore the File key if the MacFile key is present.

UnixFile

string (Optional)

The UNIX filename (in the PDF pathname format) of the PDF file. Acrobat viewer applications on UNIX computers ignore the File key if the UnixFile key is present.

URI

string (Optional)

The uniform resource identifier (URI) of a file on the internet. It can be an HTML file as well as a PDF file. Acrobat viewer applications ignore the File key if the URI key is present. Named destinations may be appended to URLs, following a # character, as in http://www.adobe.com/test.pdf#name. The Acrobat viewer displays the part of the PDF file specified by the named destination.

ID

array (Optional)

An array of two strings specifying the PDF file ID. This key can be used to ensure the correct version of the destination file is found. If present, the destination PDF file's ID is compared with ID, and the user is warned if they are different.

Launch Actions

Launch actions launch an arbitrary application or document, specified by the File key.

Here is an example of a link that launches another file:

[ /Rect [70 600 210 625]
/Border [16 16 1]
/Color [0 0 1]
/Action /Launch
/File (test.doc)
/Subtype /Link
/ANN pdfmark

Article Actions

Article actions set the Acrobat viewer to article-reading mode, at the beginning of a specified article in the current document or another PDF document.

They require the Dest key, which takes one of the following values:

  • An integer that specifies the article's index in the document (the first article in a document has an index of 0)
  • A string that matches the article's Title.

In addition, article actions require one or more file-specifier keys if the article is in a different PDF file (see list of GoToR actions).

Here is an example:

[ /Action /Article /Dest (Now is the Time)
/Title (Now is the Time)
/OUT pdfmark

pdfmark destinations

There are two ways of specifying a location within a document that is the target of an action:

  • View destinations explicitly specify a page, a location on the page, and a fit type.
  • Named destinations specify the target as a name which has been defined.

View Destinations

View destinations require the following two keys:

Page

integer or name

The destination page.

An integer value represents the sequence number of the page within the PDF file.

The name objects Next and Prev are valid destination page values for links and articles.

If the destination of a link is on the same page, the Page key should be omitted. If the value of the Page key is 0, the bookmark or link has a NULL destination.

View

array

Specifies a link or bookmark's destination on a page, and its fit type. The first array entry is one of the fit type names shown below. The remaining entries, if any, specify the location as either a rectangle, a point, or an x– or y–coordinate, depending on the fit type.

These are the fit type names and parameters:

Fit

No parameters.

Fit the page to the window. This is a shortcut for specifying FitR with the rectangle being the crop box for the page.

FitB

No parameters.

Fit the bounding box of the page contents to the window.

FitH

top

Fit the width of the page to the window. top specifies the distance from the page origin to the top of the window. This is a shortcut for specifying FitR with the rectangle having the width of the page, and both y-coordinates equal to top.

FitBH

top

Fit the width of the bounding box of the page contents to the window. top specifies the distance from the page origin to the top of the window.

FitR

x1 y1 x2 y2

Fit the rectangle specified by the parameters to the window.

FitV

left

Fit the height of the page to the window. left specifies the distance from the page origin to the left edge of the window. This is a shortcut for specifying FitR with the rectangle having the height of the page, and both x-coordinates equal to left.

FitBV

left

Fit the height of the bounding box of the page contents to the window. left specifies the distance from the page origin to the left edge of the window.

XYZ

left top zoom

left and top specify the distance from the origin of the page to the top-left corner of the window. zoom specifies the zoom factor, with 1 being 100% magnification. If left, top or zoom is null, the current value of that parameter is retained. For example, specifying a view destination of

/View [/XYZ null null null]

goes to the specified page and retains the same horizontal and vertical offset and zoom as the current page. A zoom of 0 has the same meaning as a zoom of null.

The zoom factors for the horizontal and vertical directions are identical; there are not separate zoom factors for the two directions. As a result, more of the page may be shown than specified by the destination. For example, when using FitR, portions of the page outside the destination rectangle appear in the window, unless the window happens to have the same aspect ratio (height-to-width ratio) as the destination rectangle.

A common destination is the upper-left corner of the specified page, with a zoom factor of 1. This can be obtained using the XYZ destination form, with a left of -4 and a top equal to the top of the CropBox (or the page size if no CropBox was specified) plus 4. The offset of 4 moves the page corner slightly away from the corner of the window, as a visual cue that the corner of the page is being shown.
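As a worked instance, assume a US Letter page (612 by 792 units) with no CropBox. The top is then 792 + 4 = 796, and a bookmark to the upper-left corner of page 1 could be sketched as follows. The title is a placeholder.

```postscript
[ /Page 1
  /View [/XYZ -4 796 1]        % left = -4, top = 792 + 4, zoom = 100%
  /Title (Start of document)
  /OUT pdfmark
```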

Named Destinations

Locations in PDF files can be specified by name instead of by page number and view. These names can then be used as destinations of bookmarks or links. Using named destinations is particularly advantageous for cross-document links, because if the document containing a link's destination is revised, the link will still work, regardless of whether its location in the file has changed.

A named destination is specified by using the pdfmark operator in conjunction with the name DEST. The syntax for a named destination pdfmark is:

 [ /Dest name
  /Page pagenum
  /View destination
  /DEST pdfmark

These are the named destination attributes:

Dest

name (Required): The destination's name.

Page

integer (Optional): The sequence number of the destination page. If present, the named destination pdfmark may be placed anywhere in the PostScript language file. If omitted, the pdfmark must occur within the PostScript language description for the destination page.

View

array (Optional): The view to display on the destination page. If omitted, defaults to a null destination (lower left corner of the page at a zoom of 100%). See Destinations for information on specifying a view destination.

In addition to the keys listed above, named destinations may also specify arbitrary key–value pairs.

Named destinations may be appended to URLs, following a # character, as in http://www.adobe.com/test.pdf#nameddest=name. The Acrobat viewer displays the part of the PDF file specified in the named destination.

Referencing Named Destinations

Named destinations that have been defined with the DEST pdfmark can be used as the target of a bookmark or link, or by the optional open action in a document's Catalog dictionary. They are specified using the Dest key.

Example of a Named Destination

Definition of a named destination:

[ /Dest /MyNamedDestination
  /Page 1
  /View [/FitH 5]
  /DEST pdfmark

Link to a named destination:

[ /Rect [70 650 210 675]
  /Border [16 16 1 [3 10]]
  /Color [0 .7 1]
  /Dest /MyNamedDestination
  /Subtype /Link
  /ANN pdfmark

Page Cropping (PAGES, PAGE)

Page cropping is used to specify the dimensions of a page or pages in a PDF file that will be displayed or printed (without altering the actual data in the file). Cropping is specified by using the pdfmark operator in conjunction with the names PAGES (for the entire document) or PAGE (for an individual page).

The syntax for specifying the default page cropping for a document is:

[ /CropBox [xll yll xur yur]
  /PAGES pdfmark

The syntax for specifying a non-default page cropping for a particular page in a document is:

[ /CropBox [xll yll xur yur]
  /PAGE pdfmark

The CropBox key is an array representing the location and size of the viewable area of the page. CropBox is an array of four numbers [xll, yll, xur, yur] specifying the lower-left x, lower-left y, upper-right x, and upper-right y coordinates (measured in default user space) of the rectangle defining the cropped page. The minimum allowed page size is .04 x .04 inch (3 x 3 units) and the maximum allowed page size is 200 x 200 inches (14,400 x 14,400 units) in the default user space coordinate system.

The PAGES pdfmark can be placed anywhere in the PostScript language program, but it is recommended that it be placed at the beginning of the file, in the Document Setup section between the document structuring comments %%BeginSetup and %%EndSetup, before any marks are placed on the first page.

The PAGE pdfmark must be placed before the showpage operator for the page it is to affect. It is recommended that it be placed before any marks are made on the page. For example, it affects only the first page of a document if it is placed before any marks are made on the first page.

Examples of Cropping

Crop this page

% ...
[ /CropBox [0 0 288 288] /PAGE pdfmark
/Helvetica findfont 12 scalefont setfont
/DrawBorder {
  10 278 moveto 278 278 lineto 278 10 lineto
  10 10 lineto closepath stroke
} bind def
%%EndSetup
%%Page: 1 1

DrawBorder
75 250 moveto (This is Page 3) show
75 230 moveto (Click here to go to page 1.) show
75 200 moveto (Click here to open test.doc.) show

Crop all pages

% ...
[ /CropBox [54 403 558 720] /PAGES pdfmark
/DrawBorder
{
58 407 moveto 554 407 lineto 554 716 lineto
58 716 lineto closepath stroke
} bind def

/Helvetica findfont 10 scalefont setfont
%%EndSetup

%%Page: 1 1
DrawBorder
75 690 moveto (This is Page 1) show
75 670 moveto (Below is a closed, default note created using pdfmark:) show
75 570 moveto (Below is an open note with a custom color and label:) show
400 670 moveto (Below is a closed note) show
400 655 moveto (containing private data:) show
400 570 moveto (Below is a custom annotation.) show
400 555 moveto (It should appear as an unknown) show
400 540 moveto (annotation icon:) show

Info Dictionary (DOCINFO)

A document's Info dictionary contains key–value pairs that provide various pieces of information about the document. Info dictionary information is specified by using the pdfmark operator in conjunction with the name DOCINFO.

The syntax for specifying Info dictionary entries is:

[ /Author       string
  /CreationDate string
  /Creator      string
  /Producer     string
  /Title        string
  /Subject      string
  /Keywords     string
  /ModDate      string
  /DOCINFO pdfmark

All the listed keys take string values, and they are all optional. In addition to the keys listed above, arbitrary keys (which must also take string values) can be specified.

/Producer is the name of the application that converted the document from its native form to PDF.

The date and time the document was last modified should be of the form:

(D:YYYYMMDDHHmmSSOHH'mm')

D: is an optional prefix. YYYY is the year. All fields after the year are optional. MM is the month (01-12), DD is the day (01-31), HH is the hour (00-23), mm are the minutes (00-59), and SS are the seconds (00-59). The remainder of the string defines the relation of local time to GMT. O is either + for a positive difference (local time is later than GMT) or - (minus) for a negative difference. HH' is the absolute value of the offset from GMT in hours, and mm' is the absolute value of the offset in minutes. If no GMT information is specified, the relation between the specified time and GMT is considered unknown. Regardless of whether or not GMT information is specified, the remainder of the string should specify the local time.
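A filled-in instance may make the date format easier to read. In the sketch below every value is invented for illustration. The dates encode 15 January 1999, 10:30:00 local time, eight hours behind GMT.

```postscript
[ /Title        (Quarterly Report)
  /Author       (A. N. Author)
  /Subject      (Sample DOCINFO entry)
  /Keywords     (pdfmark, docinfo, sample)
  /CreationDate (D:19990115103000-08'00')
  /ModDate      (D:19990115103000-08'00')
  /DOCINFO pdfmark
```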

Document Open Options (DOCVIEW)*

Tips, Common Idioms, and Techniques

Scope

Name scoping avoids name clashes and lets you reuse short, easy, descriptive names inside procedures and other blocks.

Name scope in PostScript is enforced through local dictionaries. Graphic scope is enforced through "gsave"..."grestore" blocks.

Variables Writeable from Anywhere

A global variable can be read from anywhere. But writing to it from inside a local, enclosed dictionary just creates a same-named local variable (think "association") that is lost as soon as the interpreter leaves the local dictionary. Consider the scenario of a global variable "pagenum" and a procedure that attempts to increment it from within a local dictionary:

/pagenum 0 def
3 dict begin
  /pagenum pagenum 1 add def
end

The code inside the local dictionary just creates a local variable "pagenum" with value 1. When the dictionary is closed, all its associations are lost, and the global "pagenum" is still 0.

The workaround is to define a named dictionary at the top level (and therefore accessible everywhere) meant to hold global writeable variables. You may call it "writeabledict", "commondict" or whatever you like. For instance, you would code:

/commondict 20 dict def
commondict /pagenum 0 put     % initial value
commondict begin
  /pagenum pagenum 1 add def  % stored in commondict, so it survives "end"
end

or just:

/commondict 20 dict def
commondict /pagenum 1 put

You could write accessors to read and modify "commondict", possibly availing yourself of operators "put" and "get":

/getpagenum {commondict /pagenum get} def
/resetpagenum {commondict /pagenum 1 put} def
/incpagenum {commondict /pagenum getpagenum 1 add put} def
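The sketch below shows the accessors at work from inside a procedure that opens its own local dictionary. The procedure name nextpage and the local name oldnum are made up for this illustration.

```postscript
/commondict 20 dict def
/getpagenum   {commondict /pagenum get} def
/resetpagenum {commondict /pagenum 1 put} def
/incpagenum   {commondict /pagenum getpagenum 1 add put} def

/nextpage {                  % hypothetical helper
  3 dict begin
    /oldnum getpagenum def   % local name, lost at "end"
    incpagenum               % written into commondict, so it survives
  end
} def

resetpagenum
nextpage nextpage
getpagenum =                 % prints 3
```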

Dictionaries

Dictionaries are scopes where names are mapped to other objects such as numbers, arrays, strings, procedures or possibly names or dictionaries, too. They behave very much like namespaces in C++.

You create a dictionary on the stack by pushing a positive integer followed by operator dict. The positive integer is the dictionary's initial capacity; in LanguageLevel 2 and later the dictionary grows automatically on demand. Afterwards, a dictionary is opened by operator begin, and is closed by operator end. The following code creates an unnamed dictionary, opens it, declares some variables (c1 and c2), uses them to print the square root of c1^2 + c2^2, and closes it without polluting the global namespace.

5 dict begin
  /c1 3 def
  /c2 4 def
  c1 dup mul c2 dup mul add sqrt  =
end

Dictionaries define local scope, so they are used heavily in the body of procedures. Also they are good for prototyping complex behaviour or computations. They resemble classes in OOP.

Advanced Classes

Mixin Classes

A mixin is a class that contains members for use by other classes without having to be the parent class of those other classes. Mixins are sometimes described as being included rather than inherited.

Mixins encourage code reuse and can be used to avoid the inheritance ambiguity that multiple inheritance can cause, or to work around lack of support for multiple inheritance in a language. A mixin can also be viewed as an interface with implemented methods.
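If dictionaries stand in for classes, one possible way to imitate a mixin is to copy the entries of one dictionary into another. The sketch below is only an illustration of that idea. The names Countable, PageStats, hits, bump, and mixin are all invented here.

```postscript
/Countable 2 dict def              % the mixin: a counter and a method
Countable begin
  /hits 0 def
  /bump { /hits hits 1 add def } def
end

/mixin { % stack: targetdict mixindict -> targetdict
  { 2 index 3 1 roll put } forall  % copy each key/value pair into targetdict
} def

/PageStats 10 dict def             % the "class" that includes the mixin
PageStats Countable mixin pop

PageStats begin
  bump bump
  hits =                           % prints 2
end
```

The counter inside Countable itself is left at 0, because def writes into PageStats, the dictionary that is open when bump runs.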

Polymorphism

Polymorphism means a procedure's behaving differently according to the type of the operands. Many standard operators exhibit some degree of polymorphism: "copy", "get", and "put" are applied to arrays, strings, and dictionaries, for instance.

To implement polymorphism use operator "type", which returns a name, one of /stringtype, /arraytype, /nametype, and so on. For instance, to use an input argument which is either a name or a string, you might code:

dup type /nametype eq {...}{...} ifelse

In the first block you would process the argument as a name, in the second as a string.
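
A fuller sketch of the same idea, with a hypothetical procedure tostring that accepts a name, a string, or a number and always leaves a string:

```postscript
/tostring {                         % stack: name|string|number -> string
  dup type /nametype eq
  { dup length string cvs }         % name: copy its text into a new string
  {
    dup type /stringtype eq
    { }                             % string: leave it as it is
    { 20 string cvs }               % anything else: its textual form
    ifelse
  }
  ifelse
} def

/Helvetica tostring =     % prints Helvetica
(plain) tostring =        % prints plain
42 tostring =             % prints 42
```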

Common PostScript Idioms

  • If you are going to use an argument inside a procedure more than once, it may be convenient to define it inside a local unnamed dictionary: "1 dict begin /myvar exch def ... end"
  • You can define several arguments on the stack like this: "[/name_n ... /name2 /name1] {exch def} forall"
  • To check if any element of an array fulfills a condition, which is given in a procedure ({cond}) reading one argument and returning true or false, code like this: "false <array> {{cond} exec or} forall". When "forall" finishes executing, the top of the stack will be either true or false.
  • Move into a local graphic context while keeping the current point like this: "currentpoint gsave moveto ... grestore".
  • To fill and stroke the same path, remember that both "fill" and "stroke" erase the current path. To fill the path in gray, and then stroke its outline, code "... gsave 0.5 setgray fill grestore stroke"
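
Some of the idioms above, written out as a small runnable sketch. The names boxarea and cond and all the numbers are made up for illustration:

```postscript
% Naming the arguments of a procedure in one go (the last argument is on top)
/boxarea {                % stack: w h -> area
  2 dict begin
    [/h /w] {exch def} forall
    w h mul
  end
} def
3 4 boxarea =             % prints 12

% Does any element of the array satisfy the condition?
/cond { 10 gt } def
false [3 8 15 4] { cond or } forall
=                         % prints true

% Fill in gray, then stroke the very same path
newpath 100 100 moveto 100 0 rlineto 0 100 rlineto closepath
gsave 0.5 setgray fill grestore
stroke
```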

Encapsulated PostScript

An encapsulated PostScript file is a PostScript language program describing the appearance of a single page. Typically, the purpose of the EPS file is to be included, or encapsulated, in another PostScript language page description. The EPS file can contain any combination of text, graphics, and images, and it is the same as any other PostScript language page description with only a few restrictions.

At minimum, an EPS file contains a header comment (%!PS-Adobe-3.0 EPSF-3.0) and a BoundingBox DSC comment, describing the rectangle containing the image described by the EPS file. Applications can use this information to lay out the page, even if they are unable to directly render the PostScript inside. An application importing an EPS file must parse the EPS file for DSC comments and extract at least the bounding box and resource dependencies of the EPS file. Here is a typical EPS prolog:

%!PS-Adobe-3.0 EPSF-3.0
%%Creator: dvips(k) 5.95a Copyright 2005 Radical Eye Software
%%Title: texput.dvi
%%Pages: 1
%%PageOrder: Ascend
%%BoundingBox: 0 0 612 792
%%DocumentPaperSizes: Letter
%%EndComments

The four arguments of the bounding box comment correspond to the lower-left (llx, lly) and upper-right (urx, ury) corners of the bounding box. They are expressed in the default PostScript coordinate system. For an EPS file, the bounding box is the smallest rectangle that encloses all the marks painted on the single page of the EPS file.

Graphics state information, such as the current line width and line join parameters, must be considered when calculating the bounding box. The following example shows a minimally conforming EPS file that draws a square with a line width of 10 units.

%!PS-Adobe-3.0 EPSF-3.0
%%BoundingBox: 5 5 105 105
10 setlinewidth
10 10 moveto
0 90 rlineto 90 0 rlineto 0 -90 rlineto closepath
stroke

If the line width were not considered when calculating the bounding box, the box would be too small by five units on each side of the square, and the importing application would place and clip the EPS file incorrectly. The bounding box specified for this example accounts for the line width and is therefore correct.

EPS Restrictions

The EPS program must not use operators that initialize or permanently change the state of the machine in a manner that cannot be undone by the enclosing application's use of save and restore (e.g., the operators starting with init, such as initgraphics). As a special case, the EPS program may use the showpage operator. The importing application is responsible for disabling the normal effects of showpage. The EPS program should make no environment-sensitive decisions (the importing application may be trying to attain some special effect, and the EPS program should not interfere with it), although it can use some device-dependent tricks to improve appearance such as a snap-to-pixel algorithm.

There are some operators that should not be used within an EPS file: banddevice, cleardictstack, copypage, erasepage, exitserver, framedevice, grestoreall, initclip, initgraphics, initmatrix, quit, renderbands, setglobal, setpagedevice, setshared and startjob. These also include operators from statusdict and userdict operators like legal, letter, a4, b5, etc. There are some operators that should be carefully used: nulldevice, setgstate, sethalftone, setmatrix, setscreen, settransfer and undefinefont.
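
On the importing side, the save and restore bracket and the disabling of showpage could be sketched roughly as follows. This is a simplified sketch modelled on the kind of wrapper recommended for importing applications, not a complete implementation; the names beginimport, endimport, importsave, importdictcount and importopcount are made up here, and the drawing between the two calls merely stands in for a real EPS file:

```postscript
/beginimport {
  /importsave save def                      % snapshot to return to afterwards
  /importdictcount countdictstack def       % depth of the dictionary stack
  /importopcount count 1 sub def            % depth of the operand stack
  userdict begin
  /showpage { } def                         % neutralize showpage in the EPS code
  0 setgray 0 setlinecap 1 setlinewidth 0 setlinejoin
  10 setmiterlimit [ ] 0 setdash newpath    % a predictable graphics state
} def

/endimport {
  count importopcount sub { pop } repeat              % drop leftover operands
  countdictstack importdictcount sub { end } repeat   % close leftover dictionaries
  importsave restore                                  % undo everything else
} def

beginimport
  100 200 translate  0.5 dup scale          % place and scale the figure
  % ... the EPS program would be inserted here ...
  newpath 0 0 moveto 50 50 lineto stroke
  showpage                                  % harmless inside the bracket
endimport
```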

Including EPS files in Plain TeX

You can use the simple epsf package which is designed for plain TeX:

\input epsf
%optional \epsfxsize=dimen or \epsfysize=dimen
\epsfbox{filename.eps}

Including EPS files in Plain TeX (pdftex Users)

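For pdftex users the \pdfximage approach might seem appealing. Note that pdftex cannot read EPS directly, so the file must first be converted to PDF (for instance with epstopdf). The \pdfximage command creates an image object. The dimensions can be controlled in a similar way to a rule, i.e.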

\pdfximage width ... height ... depth ... <general text>

where <general text> is the file name. (\pdfximage has many more parameters which can be looked up in the pdftex manual. For advanced things such as adjusting the bounding box also consult the manual.)

Still, \pdfximage only creates an object but does not insert anything yet. Therefore \pdfrefximage has to be used. The command

\pdfrefximage <integer>

places a whatsit in the output which, at shipout, includes the image stored in the object referred to by <integer>. In principle one could output the number corresponding to each image object, write the numbers down, and later use, e.g., \pdfrefximage3. This is tedious and error prone, though. That is why there is \pdflastximage, a read-only integer that always holds the number of the last \pdfximage, so including an image boils down to the following code:

\pdfximage width 3cm {example-image-a.pdf}
\pdfrefximage\pdflastximage
\bye

Converting EPS to JPEG

This is one suggestion for conversion:

gs -sDEVICE=jpeg -dJPEGQ=100 -dNOPAUSE -dBATCH -dSAFER  -dDEVICEWIDTHPOINTS=w -dDEVICEHEIGHTPOINTS=h -r300 -sOutputFile=myfile.jpg myfile.eps

The -r switch selects the resolution in dots per inch. Substitute the actual width and height in points for w and h; if you leave these two switches out, Ghostscript uses its default page size.

The format of the output file is determined by the output device. -sDEVICE may be jpeg, png16m, or some gray-scale version, like jpeggray, pnggray etc.

Drawing in PostScript

If you create an Encapsulated PostScript file, you may convert it to a JPEG file as explained in section Converting EPS to JPEG.

PostScript Paths

A path is a set of open and closed lines. For practical purposes, lines may be stroked in the color that has been set, whereas only closed lines may be filled with color.

Preserving and Erasing the Current Path

The following operators, among others, erase the current path and currentpoint:

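  • newpath: empties the current path and leaves the current point undefined
  • fill: paints the inside of the current path, then clears it as if by newpath
  • stroke: paints a line along the current path, then clears it as if by newpath
  • showpage: outputs the page and reinitializes the graphics state, including the current path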

The following operators, among others, do not erase the current path or currentpoint:

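  • clip: intersects the clipping path with the current path, which stays in place
  • show and such: paint text and move the current point by the width of the text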

Getting and Setting the Current Path

Operator upath takes a boolean and pushes the userpath. If the boolean is true then ucache is included.

Complementary operator uappend takes a userpath argument and appends it to the current path.

Adding Paths from an Enclosed Graphics Scope

In the following template, operator upath pushes the user path onto the stack, while uappend appends it to the current, enclosing path.

gsave
  % your code
  false upath
grestore uappend
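
A concrete instance of this template might look as follows. It is a sketch that assumes a LanguageLevel 2 interpreter, since upath and uappend are not available in LanguageLevel 1, and it starts from an empty enclosing path; the circle is an arbitrary example:

```postscript
newpath                         % the enclosing path starts out empty
gsave
  newpath
  300 300 50 0 360 arc closepath
  false upath                   % leave the user path on the operand stack
grestore                        % the path built inside the block is discarded ...
uappend                         % ... and rebuilt in the enclosing context
stroke
```

Because the user path records coordinates in user space, any change to the coordinate system made inside the gsave block is not carried over when the path is appended outside it.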

Recurrent Pictures

For a picture such as an electronic component that is to be drawn several times you need a template. I suggest that you create a dictionary for each class... These are the steps:

  • define a dictionary to paint your component
  • include procedures initialize, drawat, possibly drawscaledat too, and so on...
  • after that, make it readonly, so that it must be copied in order to be modified

Such a dictionary you might use like this:

  • Copy it into a named or unnamed dictionary.
  • Change or redefine its parameters inside it.
  • Call initialize once, and
  • call drawat wherever the component is to be drawn.

You may also derive another dictionary from it and thereby reuse parameters and procedures. All you have to do is:

  • Copy it.
  • Change or add some parameters.
  • Write an initialize procedure that calls the previous initialize and does some more work as well.
  • Override drawat and similar procedures, perhaps calling the previous drawat first thing.
  • Don't forget to run initialize.
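
The steps above can be sketched as follows. Everything here is illustrative: boxproto, bigbox, graybox and their parameters are hypothetical names, the component is just a square, and copying a read-only dictionary into a writable one in this way assumes LanguageLevel 2 or later:

```postscript
% 1. The template: parameters plus procedures, then frozen
/boxproto 8 dict def
boxproto begin
  /side 20 def                              % a parameter
  /initialize { /half side 2 div def } def  % derived values
  /drawat {                                 % stack: x y
    gsave
      translate
      half neg half neg side side rectstroke
    grestore
  } def
end
boxproto readonly pop

% 2. Use: copy, change a parameter, initialize once, draw
/bigbox boxproto dup maxlength dict copy def
bigbox begin
  /side 40 def
  initialize
  100 100 drawat
  200 100 drawat
end

% 3. Derive: copy, add parameters, extend initialize and drawat
/graybox boxproto dup maxlength 4 add dict copy def
graybox begin
  /gray 0.5 def
  /baseinitialize /initialize load def      % keep the inherited versions
  /basedrawat     /drawat     load def
  /initialize { baseinitialize /inner side 0.5 mul def } def
  /drawat {                                 % stack: x y
    2 copy basedrawat                       % inherited outline first
    gsave
      translate gray setgray
      inner -2 div dup inner inner rectfill
    grestore
  } def
  initialize
  300 100 drawat
end
```

The procedures look up side, half and the like through the dictionary stack, which is why each copy has to be opened with begin before its procedures are called.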

Gradients in PostScript

This is a procedure for simulating a gradient that changes upwards from one RGB color to another:

%!PS-Adobe-3.0

/gradientup { % stack: x y w h begR begG begB endR endG endB absVerticalStep
  15 dict begin
    [/vStep /endB /endG /endR /begB /begG /begR /h /w /y /x] {exch def} forall
    y  vStep  y h add {
      gsave
        dup y sub h div /idx exch def % 'idx' ranges from 0 to 1
        % linear blend: the weights (1 - idx) and idx already add up to 1
        begR 1 idx sub mul  endR idx mul  add
        begG 1 idx sub mul  endG idx mul  add
        begB 1 idx sub mul  endB idx mul  add
        setrgbcolor
        x  exch  w  vStep  rectfill
      grestore
    } for
  end
} def

To be tested with code like:

100 100 200 200  0.9 0.7 0.3  0.3 0.9 0.9  10  gradientup
showpage

And this is a procedure for simulating a gradient that changes rightwards from one RGB color to another:

%!PS-Adobe-3.0

/gradientright { % stack: x y w h begR begG begB endR endG endB absHorizontalStep
  15 dict begin
    [/hStep /endB /endG /endR /begB /begG /begR /h /w /y /x] {exch def} forall
    x  hStep  x w add {
      gsave
        dup x sub w div /idx exch def % 'idx' ranges from 0 to 1
        % linear blend: the weights (1 - idx) and idx already add up to 1
        begR 1 idx sub mul  endR idx mul  add
        begG 1 idx sub mul  endG idx mul  add
        begB 1 idx sub mul  endB idx mul  add
        setrgbcolor
        y  hStep  h  rectfill
      grestore
    } for
  end
} def

A Four Color Gradient

%!PS-Adobe-3.0
/gradientoversizek 1.05 def
/gradientupright { % stack: x y w h llR llG llB lrR lrG lrB ulR ulG ulB urR urG urB absHorizontalStep absVerticalStep
  22 dict begin % 18 arguments plus X, Y, idx and jdx
    [/vStep /hStep  /urB /urG /urR  /ulB /ulG /ulR  /lrB /lrG /lrR  /llB /llG /llR  /h /w /y /x] {exch def} forall
    x  hStep  x w add {
      dup /X exch def
      x sub w div /idx exch def % 'idx' ranges from 0 to 1 (horizontal)
      y  vStep  y h add {
        dup /Y exch def
        y sub h div /jdx exch def % 'jdx' ranges from 0 to 1 (vertical)
        gsave
          % blend along the bottom edge and along the top edge, then between the two
          llR 1 idx sub mul  lrR idx mul  add  1 jdx sub mul
          ulR 1 idx sub mul  urR idx mul  add    jdx     mul  add
          llG 1 idx sub mul  lrG idx mul  add  1 jdx sub mul
          ulG 1 idx sub mul  urG idx mul  add    jdx     mul  add
          llB 1 idx sub mul  lrB idx mul  add  1 jdx sub mul
          ulB 1 idx sub mul  urB idx mul  add    jdx     mul  add
          setrgbcolor
          X  Y  hStep gradientoversizek mul  vStep gradientoversizek mul  rectfill
        grestore
      } for
    } for
  end
} def

To be tested with some driver code like:

100 100 300 300
1 0.1 0.1  1 1 0.1  0.1 1 0.1  0.1 0.1 1
50 50 gradientupright
showpage

Radial Gradients in PostScript

%!PS-Adobe-3.0

/gradientradial { % stack: x y R begR begG begB endR endG endB dr
  15 dict begin
    [/dr /endB /endG /endR /begB /begG /begR /R /y /x] {exch def} forall
    R  dr neg  0 {
      gsave
        dup /r exch def
        R div /rdx exch def % 'rdx' ranges from 0 to 1
        % linear blend: the weights (1 - rdx) and rdx already add up to 1
        begR 1 rdx sub mul  endR rdx mul  add
        begG 1 rdx sub mul  endG rdx mul  add
        begB 1 rdx sub mul  endB rdx mul  add
        setrgbcolor
        x r add  y  moveto
        x y  r  0 360 arc
        closepath % not really necessary
        fill
      grestore
    } for
  end
} def

Using the gradient Procedures in PostScript and EPS

Since the afore-defined procedures fill a rectangle with a gradient, in order to fill a shape with a gradient, a stack of gradients, or a grid of gradients, you would execute the following steps:

  • open a gsave block by writing gsave, and then erase currentpath with newpath
  • create a closed path or set thereof using moveto, lineto, curveto, and so on
  • clip the graphic context to it,
  • paint a large enough gradient rectangle to hold all closed paths, and
  • close the graphic context by typing grestore
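
The steps above can be sketched as follows. To keep the example self-contained, a simple loop of gray strips stands in for a call to one of the gradient procedures; the circle and all the numbers are arbitrary:

```postscript
gsave
  newpath
  200 200 80 0 360 arc closepath      % the shape to be filled
  clip
  newpath                             % clip leaves the path in place, so clear it
  % stand-in gradient: 20 strips from black (bottom) to light gray (top),
  % covering the square from (120, 120) to (280, 280) that encloses the circle
  0 1 19 {
    dup 20 div setgray
    120 exch 8 mul 120 add 160 8 rectfill
  } for
grestore
showpage
```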

Writing a General Purpose Library in PostScript

I have included a listing of my work in this document. The guidelines and scope of this library can be learned from its comments.

cat lib.basic.ps | grep '^[ ]*%' | less

There is even a table of contents, which can be found by searching for Table of contents or, more specifically, % Table of contents:.

Four Pillars for a General Purpose PostScript Library

This library should be understood to rest on four pillars:

Page (or printing area)

The following variables and procedures are defined:

  • page dimensions: where text starts and ends (lm or left margin, rm or right margin, bm or bottom margin, and tm or top margin); these are based on the absolute dimensions of the page (absrm and abstm). The absolute left and bottom margin are 0.
  • a turnpage procedure, which calls ... before turning the page, then turns the page and moves the current point to (lm, tm), and lastly calls .... The middle command relies on knowledge of the page count.
  • variables and procedures for recording and incrementing the page number, probably inside a commondict dictionary accessible everywhere.

Text

Routines to print paragraphs and align text. Some of this is discussed in section Text in PostScript.

  • a newline procedure to move to the beginning of the next line, and possibly turn the page if the currentpoint falls below bm.
  • showing a string left of, below, over, or at a given point, that is, showing relative, which is useful for simple diagrams,
  • printing strings that break into a new line when too long with showpar and such.
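
A minimal sketch of what such a newline procedure might look like. The margin values, the lineheight variable and the simplified turnpage are assumptions made for this example only and are not taken from the library:

```postscript
/lm 72 def   /bm 72 def   /tm 720 def     % margins in points (assumed values)
/lineheight 14 def                        % hypothetical line spacing

/turnpage { showpage lm tm moveto } def   % simplified: no hooks, no page count

/newline {
  currentpoint exch pop lineheight sub    % y coordinate of the next line
  dup bm lt
  { pop turnpage }                        % no room left: start a new page
  { lm exch moveto }
  ifelse
} def

lm tm moveto
newline                                   % now at (lm, tm - lineheight)
```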

Numerics and Geometry

Some of this is discussed in section Drawing in PostScript.

  • trigonometry
  • adding and subtracting points, as well as parameterized interpolation between points
  • etc.
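
As an illustration of the point arithmetic mentioned above, here is a sketch with three hypothetical helpers, ptadd, ptsub and ptlerp, which treat a point as two numbers on the stack:

```postscript
/ptadd {                  % stack: x1 y1 x2 y2 -> x1+x2 y1+y2
  3 -1 roll add           % x1 x2 (y2+y1)
  3 1 roll add exch       % (x1+x2) (y1+y2)
} def

/ptsub {                  % stack: x1 y1 x2 y2 -> x1-x2 y1-y2
  3 -1 roll exch sub      % x1 x2 (y1-y2)
  3 1 roll sub exch       % (x1-x2) (y1-y2)
} def

/ptlerp {                 % stack: x1 y1 x2 y2 t -> point at parameter t
  5 dict begin
    [/t /y2 /x2 /y1 /x1] {exch def} forall
    x1  x2 x1 sub t mul  add
    y1  y2 y1 sub t mul  add
  end
} def

1 2 3 4 ptadd  exch = =         % prints 4 then 6
5 7 1 2 ptsub  exch = =         % prints 4 then 5
0 0 10 20 0.5 ptlerp  exch = =  % prints 5.0 then 10.0
```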

PDF Specifics (Navigation)

This section is of lesser importance.

  • hyperlinks
  • bookmarks

Using Ghostscript

I have already explained how to convert EPS files to JPEG files in section Converting EPS to JPEG.

Converting to PDF

This is a template command line for converting to PDF:

gs -q -dNOPAUSE -dBATCH -sPAPERSIZE=a4 -sDEVICE=pdfwrite -sOutputFile=filename.pdf filename.ps

Additionally, you may use switches -dFirstPage=pagenumber and -dLastPage=pagenumber.

Enabling Writing To and Reading From Other Files

This feature is activated through the switch -dNOSAFER. This also enables executing or including another file with the run operator, which takes the file name as a string, probably with extension .ps or .eps:

(filename.ps) run

Printing One Page Per File

This option is useful when you want images of each page of a multi-page document. You can tell Ghostscript to put each page of output in a series of similarly named files. To do this place a template '%d' in the filename which Ghostscript will replace with the page number.

You can also control the number of digits used in the file name:

-sOutputFile=ABC-%d.png
produces: 'ABC-1.png', ... , 'ABC-10.png', ...
-sOutputFile=ABC-%03d.pgm
produces: 'ABC-001.pgm', ... , 'ABC-010.pgm', ...
-sOutputFile=ABC_p%04d.tiff
produces: 'ABC_p0001.tiff', ... , 'ABC_p0510.tiff', ... , 'ABC_p5238.tiff'

Generally %03d is the best option for normal documents.

Selecting the Paper Size

Ghostscript is distributed configured to use U.S. letter paper as its default page size. There are two ways to select other paper sizes from the command line:

  • If the desired paper size is known to Ghostscript, you can select it as the default paper size for a single invocation of Ghostscript by using the -sPAPERSIZE= switch, for instance:

    -sPAPERSIZE=a4
    -sPAPERSIZE=legal
  • Otherwise you can set the page size using the pair of switches:

    -dDEVICEWIDTHPOINTS=w -dDEVICEHEIGHTPOINTS=h

    where w is the desired paper width and h is the desired paper height in points (units of 1/72 of an inch).

Individual documents can (and often do) specify a paper size, which takes precedence over the default size. To force a specific paper size and ignore the paper size specified in the document, select a paper size as just described, and also include the -dFIXEDMEDIA switch on the command line.
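
In LanguageLevel 2 and later, a document typically requests its paper size through setpagedevice. A minimal sketch, in which the text and font are placeholders:

```postscript
%!PS-Adobe-3.0
<< /PageSize [595 842] >> setpagedevice   % A4: 595 x 842 points
/Helvetica findfont 12 scalefont setfont
72 770 moveto
(This page asks for A4 paper) show
showpage
```

When Ghostscript is run with -dFIXEDMEDIA, a request like this one does not change the page size.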

Paper Sizes Known to Ghostscript

U.S. Standard:

name W (in) H (in) W (mm) H (mm) W (pt) H (pt)
11x17 11.0 17.0 279 432 792 1224
ledger 17.0 11.0 432 279 1224 792
legal 8.5 14.0 216 356 612 1008
letter 8.5 11.0 216 279 612 792
lettersmall 8.5 11.0 216 279 612 792
archE 36.0 48.0 914 1219 2592 3456
archD 24.0 36.0 610 914 1728 2592
archC 18.0 24.0 457 610 1296 1728
archB 12.0 18.0 305 457 864 1296
archA 9.0 12.0 229 305 648 864

ISO Standard:

name W (in) H (in) W (mm) H (mm) W (pt) H (pt)
a0 33.1 46.8 841 1189 2384 3370
a1 23.4 33.1 594 841 1684 2384
a2 16.5 23.4 420 594 1191 1684
a3 11.7 16.5 297 420 842 1191
a4 8.3 11.7 210 297 595 842
a4small 8.3 11.7 210 297 595 842
a5 5.8 8.3 148 210 420 595
a6 4.1 5.8 105 148 297 420
a7 2.9 4.1 74 105 210 297
a8 2.1 2.9 52 74 148 210
a9 1.5 2.1 37 52 105 148
a10 1.0 1.5 26 37 73 105
isob0 39.4 55.7 1000 1414 2835 4008
isob1 27.8 39.4 707 1000 2004 2835
isob2 19.7 27.8 500 707 1417 2004
isob3 13.9 19.7 353 500 1001 1417
isob4 9.8 13.9 250 353 709 1001
isob5 6.9 9.8 176 250 499 709
isob6 4.9 6.9 125 176 354 499
c0 36.1 51.1 917 1297 2599 3677
c1 25.5 36.1 648 917 1837 2599
c2 18.0 25.5 458 648 1298 1837
c3 12.8 18.0 324 458 918 1298
c4 9.0 12.8 229 324 649 918
c5 6.4 9.0 162 229 459 649
c6 4.5 6.4 114 162 323 459

Overriding Definitions With Command-Line Switches

You can pass parameters into your program with a switch like so:

-sPARAM=VALUE

Yet, if the same name is also defined unconditionally inside your file, the value from the command line is overridden. You need to define such parameters conditionally inside your file. Use the where operator, as in the ifnotdefthendefas procedure below:

/ifnotdefthendefas { %stack: /name obj
  exch
  dup where {pop pop pop} {exch def} ifelse
} def

A more compact procedure that likewise defines a variable or procedure only if the name is not yet defined is:

/ifnotdefthendef {1 index where {pop pop pop}{def} ifelse} def
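
A sketch of how a file might pick up command-line parameters with fallback values. TITLE and COPIES are hypothetical parameter names; with Ghostscript, -sTITLE=... defines a string, whereas -dCOPIES=... defines a number or name:

```postscript
% Possible invocation:  gs -sTITLE=Report -dCOPIES=3 file.ps
/ifnotdefthendef { 1 index where { pop pop pop } { def } ifelse } def

/TITLE  (Untitled) ifnotdefthendef   % used only if -sTITLE=... was not given
/COPIES 1          ifnotdefthendef   % used only if -dCOPIES=... was not given

TITLE =
COPIES =
```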

Ghostscript Bounding Box Output

There is a special bbox device that just prints the bounding box of each page. You select it in the usual way:

gs -dNOPAUSE -dBATCH -sDEVICE=bbox

It prints the output in a format like this:

%%BoundingBox: 14 37 570 719
%%HiResBoundingBox: 14.308066 37.547999 569.495061 718.319158

Currently, it always prints the bounding box on stderr; eventually, it should also recognize -sOutputFile=.

Note that this device, like other devices, has a resolution and a (maximum) page size. As for other devices, the product (resolution x page size) is limited to approximately 500K pixels. By default, the resolution is 4000 DPI and the maximum page size is approximately 125 inches, or approximately 9000 default (1/72 inches) user coordinate units. If you need to measure larger pages than this, you must reset both the resolution and the page size in pixels, e.g.,

gs -dNOPAUSE -dBATCH -sDEVICE=bbox -r100 -g500000x500000