This document is for an old version of FOP that is no longer supported.

Apache™ FOP: Complex Scripts

Overview

This page describes the complex scripts features of Apache™ FOP.

Disabling complex scripts

Complex script features are enabled by default. If some application of FOP does not require this support, then it can be disabled in three ways:

  1. Command line:

    The command line option -nocs turns off complex script features:

    fop -nocs -fo mydocument.fo -pdf mydocument.pdf
  2. Embedding:

    userAgent.setComplexScriptFeaturesEnabled(false);
  3. Optional setting in fop.xconf file:

    <fop version="1.0">
      <complex-scripts disabled="true"/>
      ...
    </fop>
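
When FOP is launched from another Java program rather than embedded, the command line option can be toggled programmatically. The following is a minimal sketch, not part of FOP: the class name `FopCommand` is hypothetical, it assumes a `fop` launcher is on the PATH, and it only builds the argument list without running anything.

```java
import java.util.ArrayList;
import java.util.List;

/** Builds the argument list for a command-line FOP run (sketch; nothing is executed). */
final class FopCommand {

    /** Passing complexScripts=false adds the -nocs option described above. */
    static List<String> build(String foFile, String pdfFile, boolean complexScripts) {
        List<String> args = new ArrayList<>();
        args.add("fop");            // assumption: the fop launcher is on the PATH
        if (!complexScripts) {
            args.add("-nocs");      // turn off complex script features
        }
        args.add("-fo");
        args.add(foFile);
        args.add("-pdf");
        args.add(pdfFile);
        return args;
    }

    public static void main(String[] argv) {
        // Prints the command; hand the list to ProcessBuilder to actually run it.
        System.out.println(String.join(" ", build("mydocument.fo", "mydocument.pdf", false)));
    }
}
```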

When complex scripts features are enabled, additional information related to bidirectional level resolution, the association between characters and glyphs, and glyph position adjustments is added to the internal, parsed representation of the XSL-FO tree and its corresponding formatted area tree. This additional information will somewhat increase the memory requirements for processing documents that use these features.

A document author need not make explicit use of any complex scripts feature in order for this additional information to be created. For example, if the author makes use of a font that contains OpenType GSUB and/or GPOS tables, then those tables will be automatically used unless complex scripts features are disabled.

Changes to your XSL-FO input files

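In most circumstances, XSL-FO content does not need to change in order to make use of complex scripts features; however, in certain contexts, fully automatic processing is not sufficient. In these cases, an author may make use of the following XSL-FO constructs:

  - The script property
  - The language property
  - The writing-mode property
  - The number conversion properties
  - The fo:bidi-override element
  - Explicit bidi control characters
  - Explicit join control characters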

Authoring Details

The complex scripts related effects of the above enumerated XSL-FO constructs are more fully described in the following sub-sections.

Script Property

In order to apply font specific complex script features, it is necessary to know the script that applies to the text undergoing layout processing. This script is determined using the following algorithm:

  1. If the FO element that governs the text specifies a script property (http://www.w3.org/TR/2006/REC-xsl11-20061205/#script) and its value is not the empty string or "auto", then that script is used.

  2. Otherwise, the dominant script of the text is determined automatically by finding the script whose constituent characters appear most frequently in the text.
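
The two steps above can be sketched with the JDK's own script database. This is an illustration only and not FOP's implementation: the class `ScriptResolver` is hypothetical, it maps just a handful of JDK scripts to the four-letter codes listed on this page, and it assumes that characters outside those scripts (spaces, digits, punctuation) are ignored when counting.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

final class ScriptResolver {

    // Illustrative excerpt: JDK script -> four-letter code from the table on this page.
    private static final Map<UnicodeScript, String> CODES = new EnumMap<>(UnicodeScript.class);
    static {
        CODES.put(UnicodeScript.ARABIC, "arab");
        CODES.put(UnicodeScript.CYRILLIC, "cyrl");
        CODES.put(UnicodeScript.GREEK, "grek");
        CODES.put(UnicodeScript.HEBREW, "hebr");
        CODES.put(UnicodeScript.LATIN, "latn");
    }

    static String resolve(String scriptProperty, String text) {
        // Step 1: an explicit value other than "" or "auto" wins.
        if (scriptProperty != null && !scriptProperty.isEmpty()
                && !scriptProperty.equalsIgnoreCase("auto")) {
            return scriptProperty.toLowerCase(Locale.ROOT);
        }
        // Step 2: count characters per script (assumption: unmapped scripts are skipped).
        Map<UnicodeScript, Integer> counts = new EnumMap<>(UnicodeScript.class);
        text.codePoints()
            .mapToObj(UnicodeScript::of)
            .filter(CODES::containsKey)
            .forEach(s -> counts.merge(s, 1, Integer::sum));
        String best = "zyyy"; // "Undetermined" when no mapped script occurs
        int bestCount = 0;
        for (Map.Entry<UnicodeScript, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) { // ties keep the earlier enum constant
                bestCount = e.getValue();
                best = CODES.get(e.getKey());
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(resolve("auto", "abc \u0628\u0646\u0628\u0646\u0628"));
    }
}
```

How FOP itself breaks ties or treats neutral characters is not stated here, so those choices in the sketch are assumptions.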

In case the automatic algorithm does not produce the desired results, an author may explicitly specify a script property with the desired script. If specified, it must be one of the four-letter script codes specified in the ISO 15924 Code List or in the Extended Script Codes table. Comparison of script codes is performed in a case-insensitive manner, so it does not matter what case is used when specifying these codes in an XSL-FO document.
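
A case-insensitive check of a script code might look like the following sketch. The class `ScriptCodes` is hypothetical, and its set of known codes is only a small excerpt from the standard and extended tables on this page.

```java
import java.util.Locale;
import java.util.Set;

final class ScriptCodes {

    // Excerpt only: a few codes from the standard and extended tables on this page.
    private static final Set<String> KNOWN =
            Set.of("arab", "deva", "hebr", "latn", "dev2", "zyyy");

    /** Lower-cases with Locale.ROOT so the result does not depend on the default locale. */
    static String normalize(String code) {
        return code.toLowerCase(Locale.ROOT);
    }

    static boolean isKnown(String code) {
        return KNOWN.contains(normalize(code));
    }

    public static void main(String[] args) {
        System.out.println(isKnown("ARAB") + " " + isKnown("arabic"));
    }
}
```

Using `Locale.ROOT` avoids locale-specific case mappings (for example, the Turkish dotless i) changing how a code compares.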

Standard Script Codes

The following table enumerates the standard ISO 15924 4-letter codes recognized by FOP.

Code  Script
arab  Arabic
beng  Bengali
bopo  Bopomofo
cyrl  Cyrillic
deva  Devanagari
ethi  Ethiopic
geor  Georgian
grek  Greek
gujr  Gujarati
guru  Gurmukhi
hang  Hangul
hani  Han
hebr  Hebrew
hira  Hiragana
kana  Katakana
knda  Kannada
khmr  Khmer
laoo  Lao
latn  Latin
mlym  Malayalam
mymr  Burmese
mong  Mongolian
orya  Oriya
sinh  Sinhalese
taml  Tamil
telu  Telugu
thai  Thai
tibt  Tibetan
zmth  Math
zsym  Symbol
zyyy  Undetermined
zzzz  Uncoded

Extended Script Codes

The following table enumerates a number of non-standard extended script codes recognized by FOP.

Code  Script      Comments
bng2  Bengali     OpenType Indic Version 2 (May 2008 and following) behavior.
dev2  Devanagari  OpenType Indic Version 2 (May 2008 and following) behavior.
gur2  Gurmukhi    OpenType Indic Version 2 (May 2008 and following) behavior.
gjr2  Gujarati    OpenType Indic Version 2 (May 2008 and following) behavior.
knd2  Kannada     OpenType Indic Version 2 (May 2008 and following) behavior.
mlm2  Malayalam   OpenType Indic Version 2 (May 2008 and following) behavior.
ory2  Oriya       OpenType Indic Version 2 (May 2008 and following) behavior.
tml2  Tamil       OpenType Indic Version 2 (May 2008 and following) behavior.
tel2  Telugu      OpenType Indic Version 2 (May 2008 and following) behavior.

Explicit use of one of the above extended script codes is not portable, and should be limited to use with FOP only.

When performing automatic script determination, FOP selects the OpenType Indic Version 2 script codes by default. If the author requires Version 1 behavior, then an explicit, non-extension script code should be specified in a governing script property.
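
To pick the explicit, non-extension code for a given Indic script, the two tables on this page can be paired by script name. The sketch below does only that pairing; the class `IndicScriptVersions` is hypothetical and is not part of FOP.

```java
import java.util.Locale;
import java.util.Map;

final class IndicScriptVersions {

    // Extended (Version 2) code -> standard ISO 15924 code for the same script,
    // paired by script name from the two tables on this page.
    private static final Map<String, String> V2_TO_V1 = Map.of(
            "bng2", "beng",
            "dev2", "deva",
            "gur2", "guru",
            "gjr2", "gujr",
            "knd2", "knda",
            "mlm2", "mlym",
            "ory2", "orya",
            "tml2", "taml",
            "tel2", "telu");

    /** Returns the standard code for an extended code; other codes pass through lower-cased. */
    static String toVersion1(String code) {
        String key = code.toLowerCase(Locale.ROOT);
        return V2_TO_V1.getOrDefault(key, key);
    }

    public static void main(String[] args) {
        System.out.println(toVersion1("dev2"));
    }
}
```

The returned value is what an author would write in the script property to request Version 1 behavior.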

Language Property

Certain fonts that support complex script features can make use of language information in order for language specific processing rules to be applied. For example, a font designed for the Arabic script may support typographic variations according to whether the written language is Arabic, Farsi (Persian), Sindhi, Urdu, or another language written with the Arabic script. In order to apply these language specific features, the author may explicitly mark the text with a language property (http://www.w3.org/TR/2006/REC-xsl11-20061205/#language).

When specifying the language property, the value of the property must be either an ISO 639-2 3-letter code or an ISO 639-1 2-letter code. Comparison of language codes is performed in a case-insensitive manner, so it does not matter what case is used when specifying these codes in an XSL-FO document.
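
A quick pre-flight check of language property values could be sketched as follows. The class `LanguageCodes` is hypothetical, and it verifies only the shape of a code (two or three letters), not that the code is actually registered in ISO 639.

```java
import java.util.regex.Pattern;

final class LanguageCodes {

    // Shape check only: two letters (ISO 639-1) or three letters (ISO 639-2).
    private static final Pattern SHAPE = Pattern.compile("[A-Za-z]{2,3}");

    static boolean hasValidShape(String code) {
        return SHAPE.matcher(code).matches();
    }

    /** Codes are compared without regard to case, as described above. */
    static boolean sameLanguage(String a, String b) {
        return a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        System.out.println(hasValidShape("urd") + " " + sameLanguage("FA", "fa"));
    }
}
```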

Writing Mode Property

The writing-mode property is used to determine the axes and direction of the inline progression direction, the block progression direction, the column progression direction (in tables and flows), the shift direction, region placement, the resolution of writing-mode relative property values (such as start, end, before, after), and the default block (paragraph) bidirectionality level.

The writing-mode property is inherited, so it can appear on any XSL-FO element type; however, it applies (semantically) only to the following element types:

If it is not specified on one of these element types, but is specified on an ancestor element, then the value specified on that ancestor element (the inherited value) is used; otherwise, the initial value lr-tb is used.
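
The inheritance rule just described amounts to a walk up the ancestor chain. The following sketch uses a hypothetical `FoNode` class as a stand-in for an element in the FO tree; it is not FOP's own tree model.

```java
/** Hypothetical stand-in for an FO element; writingMode is null when unspecified. */
final class FoNode {
    final String name;
    final FoNode parent;
    final String writingMode;

    FoNode(String name, FoNode parent, String writingMode) {
        this.name = name;
        this.parent = parent;
        this.writingMode = writingMode;
    }

    /** Nearest specified value on this element or an ancestor, else the initial value. */
    String effectiveWritingMode() {
        for (FoNode n = this; n != null; n = n.parent) {
            if (n.writingMode != null) {
                return n.writingMode;
            }
        }
        return "lr-tb"; // initial value
    }

    public static void main(String[] args) {
        FoNode pageSequence = new FoNode("fo:page-sequence", null, "rl-tb");
        FoNode container = new FoNode("fo:block-container", pageSequence, null);
        System.out.println(container.effectiveWritingMode());
    }
}
```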

At present, only the following values of the writing-mode property are supported:

Writing modes that employ a vertical inline progression direction are not yet supported.

Number Conversion Properties

Bidi Override Element

The fo:bidi-override element may be used to override default bidirectional processing behavior, including default embedding levels and default character directionality. In the absence of either this element or use of explicit Bidi Control Characters, the default behavior prescribed by the Unicode Bidirectional Algorithm applies.
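
To see what the default algorithm assigns before any override is applied, the JDK's `java.text.Bidi` class can report the resolved embedding level of each character. This sketch uses the JDK implementation of the Unicode Bidirectional Algorithm, not FOP's; the class name `DefaultBidiLevels` is hypothetical.

```java
import java.text.Bidi;

final class DefaultBidiLevels {

    /** Resolved embedding level per char; base direction is taken from the text itself. */
    static int[] levels(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        int[] out = new int[text.length()];
        for (int i = 0; i < out.length; i++) {
            out[i] = bidi.getLevelAt(i);
        }
        return out;
    }

    public static void main(String[] args) {
        // Latin letters, a space, then three Hebrew letters (alef, bet, gimel).
        for (int level : levels("abc \u05D0\u05D1\u05D2")) {
            System.out.print(level + " ");
        }
        System.out.println();
    }
}
```

Even levels denote left-to-right runs and odd levels denote right-to-left runs.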

Bidi Control Characters

In addition to the use of the Bidi Override Element, an author may make use of the following explicit Unicode Bidi Control Characters:

If an embedding or override is not terminated (using U+202C PDF) prior to the end of a delimited text range, then it is automatically terminated by FOP.
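
As an illustration of an explicit override and its terminator, the sketch below wraps text in U+202E RIGHT-TO-LEFT OVERRIDE and U+202C POP DIRECTIONAL FORMATTING, then inspects the result with the JDK's `java.text.Bidi`. The class `BidiControls` is hypothetical, and the levels shown come from the JDK's implementation rather than from FOP.

```java
import java.text.Bidi;

final class BidiControls {
    static final String RLO = "\u202E"; // RIGHT-TO-LEFT OVERRIDE
    static final String PDF = "\u202C"; // POP DIRECTIONAL FORMATTING

    /** Wraps text so that its characters are treated as right-to-left. */
    static String forceRtl(String text) {
        return RLO + text + PDF;
    }

    /** Resolved level of the char at index, with a left-to-right base direction. */
    static int levelAt(String text, int index) {
        return new Bidi(text, Bidi.DIRECTION_LEFT_TO_RIGHT).getLevelAt(index);
    }

    public static void main(String[] args) {
        String wrapped = forceRtl("abc");
        // Index 1 is the letter 'a', directly after the RLO character.
        System.out.println(levelAt("abc", 0) + " -> " + levelAt(wrapped, 1));
    }
}
```

Omitting the trailing PDF leaves the override open until the end of the text, which is the situation the paragraph above says FOP closes automatically.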

Join Control Characters

In order to prevent joining behavior in contexts where joining occurs by default, for example, between U+0628 ARABIC LETTER BEH and U+0646 ARABIC LETTER NOON, an author may use a U+200C ZERO WIDTH NON-JOINER (ZWNJ).

Conversely, in order to force joining behavior in contexts where joining does not occur by default, for example, between U+0628 ARABIC LETTER BEH and U+0020 SPACE, an author may use a U+200D ZERO WIDTH JOINER (ZWJ).
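
When XSL-FO content is generated by a program, the two examples above can be produced by placing the control character between the neighbouring characters. A minimal sketch follows; the class `JoinControls` and its method names are hypothetical.

```java
final class JoinControls {
    static final String ZWNJ = "\u200C"; // ZERO WIDTH NON-JOINER
    static final String ZWJ = "\u200D";  // ZERO WIDTH JOINER
    static final String BEH = "\u0628";  // ARABIC LETTER BEH
    static final String NOON = "\u0646"; // ARABIC LETTER NOON

    /** Places ZWNJ between two strings to suppress default joining. */
    static String withoutJoining(String left, String right) {
        return left + ZWNJ + right;
    }

    /** Places ZWJ between two strings to request joining where it would not occur. */
    static String withJoining(String left, String right) {
        return left + ZWJ + right;
    }

    public static void main(String[] args) {
        // Print code points, since the control characters themselves are invisible.
        withoutJoining(BEH, NOON).codePoints()
                .forEach(cp -> System.out.printf("U+%04X ", cp));
        System.out.println();
    }
}
```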

The behavior of ZWNJ and ZWJ is script specific. See The Unicode Standard, Chapter 8, Middle Eastern Scripts for information on the use of these control characters with the Arabic script. See The Unicode Standard, Chapter 9, South Asian Scripts - I for information on the use of these control characters with common Indic scripts.

Supported Scripts

Support for specific complex scripts is enumerated in the following table. Support for those marked as not being supported is expected to be added in future revisions.

Script      Support  Tested   Comments
Arabic      full     full
Bengali     none     none
Burmese     none     none
Devanagari  partial  partial  join controls (ZWJ, ZWNJ) not yet supported
Khmer       full     full
Gujarati    partial  none     pre-alpha
Gurmukhi    partial  none     pre-alpha
Hebrew      full     partial
Kannada     none     none
Lao         none     none
Malayalam   none     none
Mongolian   none     none
Oriya       none     none
Tamil       none     none
Telugu      none     none
Tibetan     none     none
Thai        none     none

Supported Fonts

Support for specific fonts is enumerated in the following sub-sections. If a given font is not listed, then it has not been tested with these complex scripts features.

Arabic Fonts

Font                Version  Glyphs  Comments
Arial Unicode MS    1.01     50377   limited GPOS support
Lateef              1.0      1147    language features for Kurdish (KUR), Sindhi (SND), Urdu (URD)
Scheherazade        1.0      1197    language features for Kurdish (KUR), Sindhi (SND), Urdu (URD)
Simplified Arabic   1.01             contains invalid, out of order coverage table entries
Simplified Arabic   5.00     414     lacks GPOS support
Simplified Arabic   5.92     473     includes GPOS for advanced position adjustment
Traditional Arabic  1.01     530     lacks GPOS support
Traditional Arabic  5.00     530     lacks GPOS support
Traditional Arabic  5.92     589     includes GPOS for advanced position adjustment

Devanagari Fonts

Font       Version  Glyphs  Comments
Aparajita  1.00     706
Kokila     1.00     706
Mangal     5.01     885     designed for use in user interfaces
Utsaah     1.00     706

Other Limitations

Complex scripts support in Apache FOP is relatively new, so there are certain limitations. Please help us identify and close any gaps.

Related Links

In addition to the XSL-FO specification, a number of external resources provide guidance about authoring documents that employ complex scripts and the features described above:

Copyright © 2025 The Apache Software Foundation, Licensed under the Apache License, Version 2.0.
Apache, Apache XML Graphics, Apache FOP, Apache Batik, the Apache logo, and the Apache XML Graphics logos are trademarks of The Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their respective owners.