3.3.3 Text encoding

LilyPond uses the character repertoire defined by the Unicode consortium and ISO/IEC 10646. This defines a unique name and code point for the character sets used in virtually all modern languages and many others too. Unicode can be implemented using several different encodings. LilyPond uses the UTF-8 encoding (UTF stands for Unicode Transformation Format) which represents all common Latin characters in one byte, and represents other characters using a variable length format of up to four bytes.

The actual appearance of the characters is determined by the glyphs defined in the particular fonts available - a font defines the mapping of a subset of the Unicode code points to glyphs. LilyPond uses the Pango library to layout and render multi-lingual texts.

Lilypond does not perform any input-encoding conversions. This means that any text, be it title, lyric text, or musical instruction containing non-ASCII characters, must be encoded in UTF-8. The easiest way to enter such text is by using a Unicode-aware editor and saving the file with UTF-8 encoding. Most popular modern editors have UTF-8 support, for example, vim, Emacs, jEdit, and GEdit do. All MS Windows systems later than NT use Unicode as their native character encoding, so even Notepad can edit and save a file in UTF-8 format. A more functional alternative for Windows is BabelPad.

If a LilyPond input file containing a non-ASCII character is not saved in UTF-8 format the error message

FT_Get_Glyph_Name () error: invalid argument

will be generated.

Here is an example showing Cyrillic, Hebrew and Portuguese text:

[image of music]

To enter a single character for which the Unicode escape sequence is known but which is not available in the editor being used, use \char ##xhhhh within a \markup block, where hhhh is the hexadecimal code for the character required. For example, \char ##x03BE enters the Unicode U+03BE character, which has the Unicode name “Greek Small Letter Xi”. Any Unicode hexadecimal code may be substituted, and if all special characters are entered in this format it is not necessary to save the input file in UTF-8 format. Of course, a font containing all such encoded characters must be installed and available to LilyPond.

The following example shows UTF-8 coded characters being used in four places – in a rehearsal mark, as articulation text, in lyrics and as stand-alone text below the score:

\score {
  \relative c'' {
    c1 \mark \markup { \char ##x03EE }
    c1_\markup { \tiny { \char ##x03B1 " to " \char ##x03C9 } }
  \addlyrics { O \markup { \concat{ Ph \char ##x0153 be! } } }
\markup { "Copyright 2008--2009" \char ##x00A9 }

[image of music]

To enter the copyright sign in the copyright notice use:

\header {
  copyright = \markup { \char ##x00A9 "2008" }

Other languages: espaƱol, deutsch.

Notation Reference