Publicly Available Software for SGML/XML/DSSSL
Introduction
Priority is given to "public" SGML/XML software in this document database since the scope of interest is mainly the Internet, where the ethic of public gift is highly esteemed. The wealth of SGML software made freely available for public use is evidence of that ethos. As a supplement to the links and information provided on public SGML software below, readers should consult Steve Pepper's "Whirlwind Guide to SGML Tools and Vendors." See the main bibliographic entry for the Whirlwind Guide for a document abstract and detailed information about its contents. See also the detailed software summary for 207 products extracted from the technical report of Eila Kuikka and Erja Nikunen [updated January 1998]: (a) the full bibliographic entry, or (b) the overview in the "Commercial SGML Software" page. NICE Technologies [November 1996] also has an online database of SGML vendors and products (local archive copy). Primary sections in this document include the following, however infelicitous the taxonomy for software categories. See the Contents listing to link directly to a particular description.
Public SGML Software: Table of Contents
SGML Parsers
SP: James Clark's SGML Parser [CR: 20001011]
James Clark's SP parser toolkit is the successor to his SGMLS parser. Formally, SP is "An SGML System Conforming to International Standard ISO 8879 -- Standard Generalized Markup Language" [and] "A free, object-oriented toolkit for SGML parsing and entity management."
[October 11, 2000] SP development (OpenSP) continues in the OpenJade project; see the "OpenJade Source Control Repository Home Page" and the project summary page. Contact: Matthias Clasen. OpenSP-1.4, cache. See also the OpenSP-1.5 pre-release in CVS.
[March 2000] New Version of OpenSP from the OpenJade Team. Matthias Clasen (Mathematisches Institut, Albert-Ludwigs-Universität Freiburg) has announced the availability of a new version of OpenSP (OpenSP-1.5pre1). OpenSP is a variant of James Clark's SP SGML parser, maintained by the OpenJade team. "The OpenJade team has made a prerelease of OpenSP-1.5 available at ftp://openjade.sourceforge.net/pub/openjade/OpenSP-1.5pre1.tar.gz. Changes in version 1.5 include: (1) More of Annex K supported: Common data attributes can now be specified in external entity declarations. (2) The architecture engine supports #MAPTOKEN. (3) The multibyte version of OpenSP now uses 32bit chars and supports the full UTF-16 range 0x0000-0x10ffff." Bugs in the release should be sent to the development team at jade-bugs@infomansol.com. OpenJade "is a project undertaken by the DSSSL community to maintain and extend Jade. OpenJade is distributed under the same license as Jade. Jade is James Clark's implementation of DSSSL -- Document Style Semantics and Specification Language -- an ISO standard for formatting SGML (and XML) documents."
[March 10, 1998] See the announcement from James Clark for the public availability of SP version 1.3 and Jade version 1.1. "The main change in SP 1.3 is better support for XML based on the Web SGML TC. 
In Jade 1.1 the main changes are the experimental extensions for XSL (documented in dsssl2.htm), and the use of XML for the FOT backend's output." See Clark's Web site for detailed information.
Note to SP and Jade users who depend upon the architectural processing support: the appropriate ArcBase processing instruction is now <?IS10744 ArcBase DSSSL>, and no longer <?ArcBase DSSSL>. SP and Jade now require the former, on penalty of an error message along the lines of "jade:E: specification document does not have the DSSSL architecture as a base architecture. . ." Thanks to Eliot Kimber (ISOGEN International) for clarification on this point. Also: Jade 1.1 and sp 1.3 for OS/2, provided by David J. Birnbaum.
[February 16, 1998] An announcement from James Clark for a new test release of SP (version 1.2.92) and Jade (version 1.0.93). The main changes in Clark's SP package since version 1.2.91 are enhanced support for XML based on the final WebSGML Adaptations Annex (ISO 8879 Annex K) and the inclusion of the SX application (for converting SGML to normalized XML). [SP version 1.2.92 and Jade version 1.0.93, sources, archive copy]; [SP version 1.2.92 and Jade version 1.0.93, Win32 binaries, archive copy]
[October 17, 1997] An announcement from James Clark describes a test release of SP with improved XML support. This test/experimental version is available via FTP as part of a Jade test release: source, or Win 32 binaries. 
In this distribution, SP supports "a number of key features from the WebSGML SGML TC," including: unbundling of SHORTTAG; a feature allowing elements declared EMPTY to have end-tags; allowance for duplicate enumerated attribute tokens; support for multiple ATTLIST declarations for a single element type; relaxation of the rules on use of parameter entity references inside groups; a feature that turns off SGML's traditional record end rules; the NESTC (net-enabling start tag close) delimiter; and support for predefined single-character entities in the SGML declaration (lt, amp, etc.). See the text of the announcement for full details about this SP test release.
[September 03, 1997] As of this time, the most recent version of SP is also available as part of James Clark's Jade package.
[October 28, 1997] Announcement from James Clark for a "very preliminary release of SX, an application built with the SP library for converting SGML to XML." This tool will eventually be included in the standard SP distribution. SX (the provisional name) "parses and validates the SGML document contained in sysid... and writes an equivalent XML document to the standard output. SX will warn about SGML constructs which have no XML equivalent." The distribution includes both source and Win 32 binaries (the sp120u.dll file included in the SP 1.2.1 Win32 Unicode binary distribution is required). Note that the program "does not yet provide enough to handle the situation where you want to migrate your document source from SGML to XML. In particular it doesn't try to preserve entity references; all entities are expanded."
Note: the following description is not up-to-date for SP version 1.2, released in September 1997; see the official documentation, and/or the links in the description of SP version 1.2. . . The current version is SP 1.1.1 (July 30, 1996). SP is a "free, object-oriented toolkit for SGML parsing and entity management." 
SP is written in C++, supports the LINK feature, is reentrant (a single process can use multiple parsers at the same time), is command-line compatible with SGMLS, and includes an application [nsgmls] to generate sgmls-style output format and an application [rast] to generate RAST output format (like SGMLS) conforming to ISO/IEC 13673. Other parser tools include [sgmlnorm], a simple SGML tag normalizer, and [spent], a facility for printing an SGML entity on standard output. SP supports any concrete syntax allowed by ISO 8879, and supports large character sets (it can be compiled to use 16-bit characters internally; supported encodings include UTF-8, Unicode/UCS-2, UJIS/EUC, and Shift-JIS). It is said to be fast for large documents. In addition to the C++ source code, binaries [nsgmls and rast] are available for MS-DOS (SP version 0.2) and several UNIX systems. The MS-DOS binaries use a 32-bit DOS extender (included in the distribution), so that the MS-DOS 640K conventional memory barrier should not be a limiting factor in the use of SP.
In the most recent releases of SP, James Clark has also issued some very useful tools that handle entities and "normalize" SGML documents in various ways, as specified in command line options. For example, SPAM (SP Add Markup) will provide canonical SGML when SHORTTAG and OMITTAG have been used in the SGML source. The output SGML is determined by the user's specification, so SPAM serves as a markup stream editor. See the documentation from the official site for complete details. Version 1.1 also supports Architectural Form Processing [mirror copy], on which, see the following "toy example".
[April 10, 2000] XML Base Architectures in SP. Steve Newcomb writes: "You can now use SP to validate the conformance of XML documents to base architectures (meta-DTDs). TechnoTeacher has created a version of SP with full industrial-strength support for the alternative PI-based "Base Architecture Declaration" syntax. 
The enhancement builds on pioneering work done by Luis Martinez while he was working at TechnoTeacher, and it has recently been brought up to industrial strength by Peter Newcomb. Because of urgent need in certain industrial quarters (mortgage, healthcare, etc.), we've placed binaries of this version of SP at our FTP site: ftp://ftp.techno.com/TechnoTeacher/SPt..." [cache]
[September 1996] Commercial support for SP is provided by TechnoTeacher, Inc. NB: James Clark himself has no commercial connection with TechnoTeacher, Inc. See the support announcement.
[November 25, 1997] See the announcement for a GC-enabled spgrove application, from Vladimir V. Tsychevski.
Other links:
Pointers to the latest released version of the SP parser (version 1.0.1: October 21, 1995) and its description:
parseDTD - DTD parser package for SP [CR: 19980612]
[February 06, 1998] From Peter Newcomb, of TechnoTeacher Inc.: parseDtd. It parses an SGML declaration set in the absence of a document (e.g., it can parse a DTD and report the elements and attributes defined in it). It is based on the SP SGML parser, version 1.2.1, written by James Clark. Peter's description: "I recently put together a small SP-based package that parses declaration sets irrespective of particular documents, returning the result as an SP DTD object." Links:
Graphical Front Ends for SP [CR: 19971028]
Probably there are several such front ends. [Please let me know what's missing in the list below.]
ARC-SGML: Charles Goldfarb's Almaden Research Center SGML Parser
ARC-SGML was one of the first SGML parsers to be made publicly available, and it provided the basis for the development of SGMLS by James Clark.
SGMLS: James Clark's SGMLS parser [CR: 19970909]
SGMLS is probably the most widely used "public domain" parser as of late 1994. It has been incorporated as a validating parser into several commercial products as well. It is now superseded in part by James Clark's "SP" parser (and perhaps by the YASP and YAO parser materials), though for many simple validation tasks SGMLS remains quite useful. SGMLS is also very fast. Its output is intended for a structure-oriented application, and this output is trivially parsable. SGMLS has been ported to many platforms, including OS/2.
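To illustrate why the sgmls-style output is easy to consume, here is a minimal Python sketch of a reader for it. It assumes the line-oriented format as commonly documented for sgmls and nsgmls: each line starts with a one-character command, where `(` opens an element, `)` closes it, `-` carries character data, and `A` declares an attribute of the next element to open. The sample input and the function names are invented for illustration, and only a subset of commands and escapes is handled.

```python
import re

# Only two escapes are decoded here: "\n" (record end) and "\\" (backslash).
# Other escapes (octal codes, SDATA delimiters) are left untouched.
_ESCAPE = re.compile(r"\\(n|\\)")


def unescape(text):
    return _ESCAPE.sub(lambda m: "\n" if m.group(1) == "n" else "\\", text)


def parse_esis(lines):
    """Yield (kind, value, attributes) tuples from sgmls-style output lines."""
    pending = {}  # attributes seen since the last start-element command
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        command, rest = line[0], line[1:]
        if command == "A":
            parts = rest.split(" ", 2)  # name, declared type, optional value
            if len(parts) < 2:
                continue  # malformed attribute line; skipped in this sketch
            value = parts[2] if len(parts) > 2 else None
            pending[parts[0]] = None if parts[1] == "IMPLIED" else value
        elif command == "(":
            yield ("start", rest, pending)
            pending = {}
        elif command == ")":
            yield ("end", rest, {})
        elif command == "-":
            yield ("data", unescape(rest), {})
        # All other commands (for example "C" or "?") are ignored here.


# Hypothetical parser output for a two-element document.
SAMPLE = """\
AID TOKEN intro
(SECTION
(TITLE
-Hello\\nworld
)TITLE
)SECTION
C
"""

events = list(parse_esis(SAMPLE.splitlines()))
for event in events:
    print(event)
```

Because every event fits on one line and attributes always precede the element they belong to, a downstream tool needs no lookahead, which is what makes the format convenient for structure-oriented applications.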
YASP: Pierre Richard's Yorktown Advanced SGML Parser (or: 'Yet Another SGML Parser') [CR: 19970405]
YAO (Yuan-Ze--Almaden--Oslo project) Parser Materials
PSGML, by Lennart Staflin [CR: 20001201]
PSGML is described as "a major mode for editing SGML and XML documents. It works with GNU Emacs 19.34, 20.3 and later or with XEmacs 19.9 and later [perhaps also Lucid Emacs 19.9, OEmacs, NTEmacs]. PSGML contains a simple SGML parser and can work with any DTD. Functions provided includes menus and commands for inserting tags with only the contextually valid tags, identification of structural errors, editing of attribute values in a separate window with information about types and defaults, and structure based editing." David Megginson's personal testimonial: "XEmacs+PSGML is my editor of choice for all of my XML and SGML work. I've used it to create probably close to 10,000 printed pages of documentation over the last few years, and have used XEmacs's regular-expression facilities for adding complex markup to e-texts. It's probably not suitable for naive users (give 'em XMetaL or WordPerfect, or maybe XED), but for the tech-savvy, it's great." [XML-DEV]
[December 06, 2001] "Using Emacs for XML Documents. Install add-ons to the powerful Emacs text editor to build a platform-independent (and free) environment for working with XML." By Brian Gillan (Software engineer, ID Technology and Design Group, IBM). From IBM developerWorks XML Zone. December 2001. ['Emacs, best known as a powerful text editor for UNIX developers, can be an ideal XML editor for MS-DOS, Windows, and MacOS. The author describes how to install the right add-on packages and modify settings to create a powerful XML/SGML editing-and-validation environment in Emacs with extensions such as PSGML and OpenSP. Most of the work involved in setting up this environment ends with downloading and installing Emacs and the individual packages, but you must also configure Emacs properly and enable the DTDs you plan to work with. 
The article includes sample configuration files and XHTML DTDs.'] "Though it's best known as a powerful text editor favored by UNIX developers, Emacs can be used to work with XML in non-UNIX platforms such as Windows, MS-DOS, and MacOS. Emacs works as a full-blown development environment for processing text, writing applications, and, as I'll discuss, creating structured information like XML and SGML. I use it as a general-purpose editor for creating and managing some of my programming projects, and for writing XHTML and playing around with SGML and XML. In fact, I used it to write this article. This article tells how to install Emacs and the extensions PSGML and OpenSP. It also outlines how to customize Emacs to make it function with a variety of DTDs. I present many of the Emacs customizations one piece at a time. However, you can download a zip file with sample DTDs and all of the Emacs customizations. My intent is to get you started using Emacs by providing you with just enough information for you understand what's going on. Then you'll be able to add DTDs and customize Emacs based on your needs and preferences..."
PSGML version 1.2.3 was released on SourceForge November 8, 2001; see the download. [PSGML version 1.2.3, November 8, 2001, cache]
[December 01, 2000] Update notice 2000-10-27. "The future of PSGML: It is currently not in active development. I plan to put out one or two bug fix releases and the move the sources to source forge (possibly after restructuring the code a bit and merging in various patches and additions that has been send to me.) I will then invite others to take an active part in the future development of PSGML. To start this I have created two mailing lists on source forge. A psgml-user for general discussion and questions about PSGML and psgml-devel for discussion about the future development of PSGML. Visit the SourceForge: Mailing Lists for PSGML page for subscription information..."
tdtd - Emacs Macro Package for Editing SGML/XML DTDs [CR: 20011102]
[June 09, 2001] The web site URL for 'tdtd -- Emacs Major Mode for SGML and XML DTDs' is http://www.menteith.com/tdtd/. The latest version is 0.7.1. Features of tdtd revision 0.7.1 include: "(1) Standalone mode for editing DTDs; (2) "Goto" menu for locating declarations within the current buffer; (3) dtd-etags function for creating Emacs TAGS files for easy lookup of any element, parameter entity, or notation's definition using Emacs's built-in tag-lookup functions; (4) dtd-grep function for searching files that shares a file history with dtd-etags for easy searching of the same files with both functions; (5) Specific font lock highlighting of declarations in XML DTDs, SGML DTDs, SGML Declarations, and System Declarations so that the important information stands out; (6) XML-specific behaviour that, at user option, is triggered by automatic detection of the XML Declaration; (7) Functions for writing and editing element, attribute, internal parameter entity and external parameter entity declarations and comments to ease creating and keeping a consistent style; and (8) Elements and parameter entity names referenced in declarations are stored in minibuffer history to minimise retyping in new declarations..." [cache version 0.7.1]
In March 1999, Tony Graham (Mulberry Technologies, Inc.) released an updated version of his tdtd 'Emacs Major Mode for SGML and XML DTDs'. Features in revision 0.7: (1) Standalone mode for editing DTDs; (2)
[August 03, 1998] Update of the tdtd emacs macro package for editing SGML/XML DTDs.
[May 27, 1998] The tdtd Emacs Macro Package for editing SGML/XML DTDs was updated by Tony Graham on May 24, 1998. 
Version 0.5.1 features: "1) dtd-etags function for creating Emacs TAGS files for easy lookup of any element, parameter entity, or notation's definition using Emacs's built-in tag-lookup functions; 2) Font lock highlighting of declarations so that the important information stands out; 3) XML-specific behaviour that, at user option, is triggered by automatic detection of the XML Declaration; 4) Functions for writing and editing declarations and comments to ease both creating and keeping a consistent style."
Previously: Tony Graham (Mulberry Technologies, Inc.) announced the availability of a tdtd Emacs Macro Package for editing DTDs (revision 3, December 14, 1997). The macro package was presented in a poster session at SGML/XML '97. The macros have been developed "intermittently over the last two years." Tony says: "The tdtd macro package for an Emacs major mode for editing DTDs is available at ftp://ftp.mulberrytech.com/pub/tdtd. The package includes font lock keywords for colour highlighting of declarations and reserved words plus a collection of macros that help when writing DTDs. The
Links:
Panorama: SoftQuad's SGML Viewer for WWW [CR: 19980408]
SoftQuad Panorama is a free version of SoftQuad Panorama PRO. It supports browsing (and searching?) of fully compliant SGML documents on the WWW.
HoTMetaL: SoftQuad's HoTMetaL editor for HTML
HoTMetaL is an unsupported version of the commercial product HoTMetaL Pro. It provides an editor/browser for (extended) HTML documents. HoTMetaL is available on a number of platforms (UNIX, MS-Windows, etc.). A tutorial for HoTMetaL Pro teaches HTML basics, supported by an HTML Quick Reference guide. The most recent [March 1995] Windows version of HoTMetaL supports some of the Netscape extensions (e.g., <CENTER>, <BLINK>), displays graphics inline, uses a stylesheet configured to look like a standard HTML browser, and supports a filter for loading plain text files and invalid HTML documents. See the posted public announcement or the fuller description on the SoftQuad server, including FTP location. Try the FTP directory ftp://ftp.ncsa.uiuc.edu/Web/html/hotmetal/Windows, and specifically the binary file ftp://ftp.ncsa.uiuc.edu/Web/html/hotmetal/Windows/hotm1new.exe.
Other mirror FTP sites list for HoTMetaL
Connect to the SoftQuad server for a recent list of FTP sites in the US, Canada, and Europe that host HoTMetaL. The FTP links below are older, but may still be alive:
HyBrick - SGML/XML Browser [CR: 19990304]
[March 04, 1999] Ralph E. Ferris (Fujitsu Software Corporation) has announced a new release of Fujitsu's HyBrick SGML/XML browser, with expanded support for XLink/XPointer. It is available from the Fujitsu Software Corporation's Web site. New features in HyBrick V0.82 related to XLink and XPointer include: "1) XLink/XPointer error/warning info is shown in the error list dialog; 2) A 'Document Group' sub-menu has been added in the 'XLink/XPointer' menu; users can now navigate between inter-linked documents by using Document Groups as well as through individual links; 3) In the 'select link' dialog, link element 'role' values are displayed instead of GIs. This feature, as well as the 'Document Group' display feature, are particularly useful for creating and navigating 'Topic Maps.'; 4) The mouse cursor now changes its shape over links." Also new in HyBrick 0.82 are multiple stylesheet support (if multiple stylesheet PIs are present, users are presented with a dialog box to select the stylesheet they want to use), a 'Reload hubdocument' function, and a 'Close window' function. 'HyBrick' is "an advanced SGML/XML browser developed by Fujitsu Laboratories, the research arm of Fujitsu. 'HyBrick' is based on an architecture that supports advanced linking and formatting capabilities. HyBrick includes a DSSSL renderer and XLink/XPointer engine running on top of James Clark's SP and Jade. It supports both valid and well-formed XML documents, XLink and XPointer (XLink implemented as a subset of the HyTime property set), SGML (ISO 8879), DSSSL (ISO 10179) online specification, printing and print previewing based on DSSSL stylesheets." See more on HyBrick Support for XPointer in a posting of March 4, 1999.
[February 15, 1999] Ralph E. Ferris (Fujitsu Software Corporation) posted an update on the HyBrick V0.80 support for XLink and XPointer. HyBrick is an advanced SGML/XML browser developed by Fujitsu Laboratories, the research arm of Fujitsu. 
HyBrick is based on an architecture that supports advanced linking and formatting capabilities. HyBrick includes a DSSSL renderer and XLink/XPointer engine running on top of James Clark's SP and Jade. It supports "both valid and well-formed XML documents, XLink and XPointer, SGML (ISO 8879), DSSSL (ISO 10179) online specification, printing and print previewing based on DSSSL stylesheets." "To make the point [about HyBrick XLink/XPointer support, Ralph has] put some files with XLink/XPointer declarations in them up on the HyBrick Web site at http://www.fsc.fujitsu.com/hybrick/. These files are intended to be accessed over the Web. If your network access environment allows you to though, you can see XLink and XPointer at work over the Web by downloading HyBrick and pointing it at: http://www.fsc.fujitsu.com/hybrick/hubdoc-1.xml . . ." [see the posting for caveats and full details.] HyBrick Version 0.8 with XLink/XPointer support is now available for download.
[Earlier description:] "HyBrick" is 'an advanced SGML/XML browser developed by Fujitsu Laboratories, the research arm of Fujitsu. "HyBrick" is based on an architecture that supports advanced linking and formatting capabilities. HyBrick includes a DSSSL renderer and XLink/XPointer engine running on top of James Clark's SP and Jade. HyBrick supports: 1) Both valid and well-formed XML documents; 2) XLink/XPointer on the local file system [XPointer is implemented as a subset of the HyTime property set; Link traversal can use either "New" or "Replace" to display a new page]; 3) SGML (ISO 8879); 4) DSSSL (ISO 10179) online specification; 5) Printing and print previewing based on DSSSL stylesheets.'
[November 03, 1998] Ralph E. Ferris of Fujitsu Software Corporation has announced that HyBrick V0.8 with XLink/XPointer is now available for download. Links:
The Wurd [was: WP] Project
"Wurd is an SGML capable Wurd Processor and publishing tool for multiple operating systems/platforms - although at the moment the only operating system supported is Linux." [June 1997] [Work in progress only] WP is "a word processor being built by linux enthusiasts. . . with a native file format based on the SGML model. . .The use of SGML as the file format means that wp has an open interchange format. It will be possible to maintain World-Wide Web pages directly with wp."
GRIF Symposia: "A Collaborative Authoring Tool for the World Wide Web" (HTML and XML) [CR: 19970827]
Links:
HyBrowse HyTime Browser [CR: 19961126]
HyBrowse, a HyMinder application, is a HyTime browser from TechnoTeacher, Inc. "HyBrowse is a true HyTime (ISO/IEC 10744) hyperdocument browser for Windows 95 and Windows NT. It is useful for developing electronic document architectures that employ HyTime's strongly typed location-independent linking mechanisms." HyBrowse is publicly available (free) [as of November 22, 1996] for a trial period of 45 days. In addition to the standard features one would expect, it supports: (1) True HyTime independent hyperlinking; (2) User-defined strong hyperlink typing with [a] icons assignable to anchor roles over entire bounded object set (BOS), [b] rendering styles assignable to anchor roles over entire BOS; (3) HyTime-conforming address elements; (4) Aggregate location and hyperlink traversal handling; (5) Arbitrary BOS awareness allows users to add (import) a document into the current BOS; (6) Re-open browsing sessions without reparsing or reprocessing. Eliot Kimber writes: "NOTE: HyBrowse is intended as a tool for creating prototypes and demos of HyTime features. It is not intended to be a production-quality information delivery system. The formatting features are minimal compared to Panorama or DynaText but sufficient to demonstrate the very interesting things you can do with independent links and anchors thereof. If you've been thinking of ways that HyTime hyperlinking could solve some of your information management problems but never had a way to realize or test those ideas, now you do, for free." Links:
perlSGML - Perl programs and libraries (Earl Hood) [CR: 19970918]
perlSGML is a collection of Perl programs and libraries written by Earl Hood for processing SGML documents. The following software is available in the perlSGML distribution: dtd.pl (a Perl library to parse SGML DTDs), dtd2html (an SGML DTD documentation/navigation tool), dtddiff (a utility to list changes in a DTD), dtdtree (generates content hierarchy trees of SGML elements), dtdview (interactively queries a DTD), sgml.pl (a Perl library to parse SGML instances), and stripsgml (a utility to remove SGML markup). The 'dtd2html' tool is widely used. "What is dtd2html: dtd2html is part of the perlSGML package. dtd2html is a program that generates an HTML document (composed of several files) that documents and allows hypertext navigation of an SGML DTD."
Carthage, dpp, and Bison tools by Michael Sperberg-McQueen [CR: 19970122]
Several SGML grammar tools have been created and made publicly available by TEI editor Michael Sperberg-McQueen.
DPP: "DPP is a parser for SGML document type declarations, intended for use as a front end for filters which modify DTDs (e.g. filters to expand all or some parameter entity references, or to rename elements, etc.). Since DPP uses the same output format as sgmls. . .many existing tools for writing filters for SGML document instance . . . can be used with DPP to make filters for DTDs."
Bison tools: "The subdirectory pub/tei/grammar/bison contains files with Bison grammars and Flex scanners for SGML document type definitions, SGML document instances, and SGML declarations." See ftp://ftp-tei.uic.edu/pub/tei/sgml/grammar for a fuller description of these grammar tools.
Carthage: "Carthage is a yacc/lex-based parser for SGML DTDs which can delete references to undeclared elements. It can also do a few other things, depending on the run-time flags you give it." Some options include: (1) dropping or keeping marked sections; (2) warning if entities are declared twice; (3) dropping or keeping parameter entity declarations; (4) deleting named GIs from content models; (5) listing of specified classes of elements in the DTD [used, unused, default undeclared, declared]; (6) dropping or keeping comments in the output file, etc. The software is "unsupported" but "users who improve it or fix errors are requested to notify the author so he can also fix them." [extracts from the README file, dated June 17, 1996.]
DTDParse, by Norman Walsh [CR: 19980409]
"DTDparse reads an SGML DTD and constructs a simple, easily parsed database of its content. This database can be examined to construct other views of the DTD. The DTDparse distribution contains several scripts which use the database to extract useful information about the DTD: (1) parents lists the parents of a particular element; (2) children lists the children of a particular element; (3) dtd2man produces DocBook RefEntry pages ('man' pages in common UNIX parlance) for the components of the DTD; (4) dtd2html [unrelated to Earl Hood's program of the same name] builds an HTML web of the components of the DTD." The documentation page provides sample output for DTDs such as DocBook 3.0, HTML 3.2, ISO 12083 DTDs, TEI Lite 1.6, and the CALS Table DTD.
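The "parents" and "children" views described above can be approximated in a few lines. The following Python sketch is not DTDParse and does not use its database format; it is a rough, regex-based illustration that assumes a DTD with no parameter entities, no comments inside declarations, and one element name per declaration. All function names and the sample DTD are invented for the example.

```python
import re
from collections import defaultdict

# One element declaration: name, then everything up to the closing ">".
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([A-Za-z][\w.-]*)\s+(.*?)>", re.S | re.I)
# SGML omitted-tag minimization such as "- O" in front of the content model.
MINIMIZATION = re.compile(r"^\s*[-Oo]\s+[-Oo]\s+")
NAME = re.compile(r"#?[A-Za-z][\w.-]*")
KEYWORDS = {"EMPTY", "ANY", "CDATA", "RCDATA", "#PCDATA"}


def children_map(dtd_text):
    """Map each declared element to the set of names in its content model."""
    children = {}
    for name, model in ELEMENT_DECL.findall(dtd_text):
        model = MINIMIZATION.sub("", model)
        children[name] = {
            token for token in NAME.findall(model) if token.upper() not in KEYWORDS
        }
    return children


def parents_map(children):
    """Invert the children map: element name -> set of possible parents."""
    parents = defaultdict(set)
    for parent, kids in children.items():
        for kid in kids:
            parents[kid].add(parent)
    return dict(parents)


# Hypothetical four-element DTD used only to exercise the functions.
SAMPLE_DTD = """
<!ELEMENT doc   - - (title, para+)>
<!ELEMENT title - O (#PCDATA)>
<!ELEMENT para  - O (#PCDATA | emph)*>
<!ELEMENT emph  - - (#PCDATA)>
"""

children = children_map(SAMPLE_DTD)
parents = parents_map(children)
print("children of doc:", sorted(children["doc"]))
print("parents of emph:", sorted(parents["emph"]))
```

A real tool has to expand parameter entities and handle inclusions, exclusions, and name groups, which is why building on a full parser is the sturdier design; the sketch only shows the shape of the derived views.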
Fred - The SGML Grammar Builder [CR: 19980508]
"Fred is an ongoing research project at OCLC Online Computer Library Center, Inc. (OCLC) studying the manipulation of tagged text. As a service to the community, OCLC has decided to make several portions of Fred freely available via a WWW server." These services include (subject to documented limitations): automatic SGML DTD creation from tagged text, grammar reduction (BNF, DTD, and Four-Tuple output formats), and arbitrary transformations. Links:
NORMDTD (by Richard Light) [May 1996]
"NORMDTD is a DOS (yes!) program that reads a valid SGML DTD, even a TEI-like one that uses marked sections and multiple input files, and generates a single file containing a normalized version of that DTD. The element content models in this normalized DTD will not contain any references to elements that are not declared, and so it can be used by highly-strung SGML packages such as RulesBuilder that refuse to process TEI applications (in particular) for this reason. In fact, having a normalized DTD in a single file can be helpful for a number of reasons, to a variety of SGML applications." NORMDTD is written in Borland Pascal and runs only under DOS.
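The condition that NORMDTD removes (content models that mention elements with no declaration of their own) is easy to detect, even if repairing it properly takes a real DTD parser. The Python sketch below only reports such references; it does not rewrite the DTD and is unrelated to NORMDTD's actual implementation. It assumes a single-file DTD with parameter entities already expanded, folds names to lower case (an assumption that suits the SGML reference concrete syntax but not XML), and uses an invented function name and sample.

```python
import re

DECL = re.compile(r"<!ELEMENT\s+(\S+)\s+(.*?)>", re.S | re.I)
TOKEN = re.compile(r"[A-Za-z][\w.-]*")
RESERVED = {"EMPTY", "ANY", "CDATA", "RCDATA", "PCDATA"}


def undeclared_references(dtd_text):
    """Return sorted names used in content models but never declared."""
    declared = set()
    referenced = set()
    for name, rest in DECL.findall(dtd_text):
        declared.add(name.lower())
        start = rest.find("(")  # skip any "- O" minimization before the model
        if start == -1:
            continue  # declared content such as EMPTY or ANY: nothing to scan
        for token in TOKEN.findall(rest[start:]):
            if token.upper() not in RESERVED:
                referenced.add(token.lower())
    return sorted(referenced - declared)


# Hypothetical fragment in which three referenced elements are never declared.
SAMPLE_DTD = """
<!ELEMENT text - - (front?, body, back?)>
<!ELEMENT body - - (p | list)+>
<!ELEMENT p    - O (#PCDATA)>
"""

missing = undeclared_references(SAMPLE_DTD)
print(missing)
```

Running a check like this before handing a modular DTD to a strict tool indicates whether a normalization pass is needed at all.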
Babble - Synoptic Text Browsing/Searching Tool [CR: 19970628]
"Babble, under development by Robert Bingler at the Institute for Advanced Technology in the Humanities (University of Virginia in Charlottesville), is an SGML-capable synoptic text tool that can display multiple texts in parallel windows. It uses Unicode, an ISO 16-bit character set standard, which allows multilingual texts, using mixed character sets, to be displayed simultaneously. Babble also allows users to search for strings in text or in tags, and to link open texts for scrolling and searching. Currently, Babble runs as an application, and not as an applet . . . Babble was originally prototyped in C++ and Motif++ for AIX 3.25 by Pete Yadlowsky. The current version is written in Java." [from the Home Page] Note: Babble has been described to me as nominally but usefully SGML-aware. For example: "The search function allows you to search for strings, either in text or--if the file you're searching is marked up in SGML--within tags. When you click on the search button, a dialogue box appears, offering two choices: search in text or in tags, and a character set for the search. It is assumed that SGML tagging will be done in the Latin alphabet, but Babble will allow you to search for a non-Latin string within tags." [from the online documentation] Links:
IADS: Interactive Authoring and Display System [CR: 20011019]
"Interactive Authoring and Display System (IADS) was developed as a U.S. Army Missile Command initiative to reduce or eliminate paper documentation. IADS utilizes standard generalized markup language (SGML) to manipulate the text and graphics. The author can chose to display graphics within the text and/or in separate windows." [from the Home Page]
SARA (SGML-Aware Retrieval Application)
SARA (SGML-Aware Retrieval Application) is a client/server software tool allowing a central database of texts with SGML mark-up to be queried by remote clients. The system was developed at Oxford University Computing Services, with funding from the British Library Research and Development Department (1993-4) and the British Academy. The original motivation for its development was the need to provide a robust low-cost search-engine for use with the 100 million word British National Corpus, and several features of the system design necessarily reflect this. The SARA system has four key parts:
Links:
Ispell for SGML[CR: 19970225]
Syntext -- the SGML Grammar Grapher[CR: 19960521] "SYNTEXT is an SGML DTD providing elements and attributes to mark up text in English for: (1) syntactic structure, including (a) X-bar based parsing, with Government and Binding-style PRO and t, (b) grammatical relations a la Quirk et al. marked as attributes; (2) cohesion; (3) coreference; (4) conjunctive relations as attributes of sentence specifiers; (5) lexical cohesion as attributes of lexical items; (6) rhetorical figures. Any text marked up for these features and identifying itself as DOCTYPE SYNTEXT is an SGML document and can be browsed in an SGML browser or viewer such as SoftQuad's free Windows browser Panorama or the costwish viewer for X Windows being developed by Peter Murray-Rust. It is an SGML application, the purpose of which is to provide markup for the analysis of syntactic and textual structure; a marked up text can be viewed as a tree and in other modes and can be searched with context sensitive and contingent scans, making it very powerful for stylistic analysis (once a passage is marked up!)." Links:
MtSgmlQL, the SgmlQL interpreter[CR: 19971216] "The SGML query language SgmlQL was developed in the context of the MULTEXT project. It is a functional language based on SQL, which enables complex operations on SGML documents, for instance: (1) extraction of parts of an SGML document that satisfy given criteria; (2) tests, counts, and various other computations on SGML elements in a document; (3) construction of new elements and documents using the result of queries. Because SgmlQL is a functional language, all data and program statements are expressions, or queries, which are recursively evaluated. It allows for manipulation of numbers, strings, (SGML) names, elements, attribute-value sets, documents, and (mixed content) lists. A free alpha version for UN*X of MtSgmlQL, the SgmlQL interpreter, can be downloaded to your system for non-commercial, non-military purposes (see the user agreement). Links:
'sgrep' grep-like searching of structured documents[CR: 19981210] Description: 'sgrep' (structured grep) "is a tool for searching text files and filtering text streams using structural criteria. The data model of sgrep is based on regions, which are nonempty substrings of text. Regions are typically occurrences of constant strings or meaningful text elements, which are recognizable through some delimiting strings. Regions can be arbitrarily long, arbitrarily overlapping, and arbitrarily nested. Sgrep is a convenient tool for making queries to almost any kind of text files with some well known structure. These include programs, mail folders, news folders, HTML, SGML, etc... With relatively simple queries you can display mail messages by their subject or sender, extract titles or links or any regions from HTML files, function prototypes from C or make complex queries to SGML files based on the DTD of the file." Sgrep is distributed under GNU General Public License.
[December 10, 1998] Jani Jaakkola has announced the availability of "sgrep-1.90a - An SGML and XML Search and Indexing Tool." Sgrep is a tool to search and index text, SGML, XML and HTML files using structured patterns. New features in Sgrep version 1.90a include: 1) query operators that support direct containment, so that one may query children and parents of given elements; 2) the sources are available under GPL-license for those interested in compiling sgrep; 3) Sgrep now uses GNU autoconf, so compiling sgrep under Unix-systems should be easy; 4) bug fixes. This version of Sgrep contains the sources, Win32 binaries, and binaries for HP-UX, Linux, OSF1 and Solaris. The Win32 binary also includes the m4 macro processor. For more information on Sgrep, see the README file or the overview.
[August 29, 1998] Jani.Jaakkola@cs.helsinki.fi (Department of Computer Science, University of Helsinki) posted an announcement for the release of sgrep version 1.71a as the first prerelease of sgrep-2. 
Sgrep is a tool to search and index text, SGML, XML and HTML files using structured patterns. Features new in version 1.71a include: "1) Indexing of both structure and content; 2) SGML/XML/HTML scanner; 3) both Win32 and i386-Linux binaries; 4) compatibility with older versions of sgrep; 5) no dependence upon 'sgtool'. Features announced for inclusion in sgrep-2 are: 1) Support for querying notations, element type declarations and attribute list declarations inside SGML/XML document prolog; 2) Parsing of all well-formed XML-documents; 3) Proper documentation. Links:
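The region data model described for sgrep above (regions as nonempty substrings that may overlap or nest, built from constant strings and delimiters) can be illustrated with a short Python sketch. This is not sgrep's implementation or query syntax: the helper names `occurrences`, `between` and `containing` are hypothetical, and `between` uses a simplified pairing rule (each start delimiter is joined to the nearest following end delimiter) that may differ from sgrep's own operators.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # (start, end) character offsets, end exclusive


def occurrences(text: str, needle: str) -> List[Region]:
    """Return every (possibly overlapping) occurrence of a constant string."""
    if not needle:
        raise ValueError("regions must be nonempty")
    found: List[Region] = []
    i = text.find(needle)
    while i != -1:
        found.append((i, i + len(needle)))
        i = text.find(needle, i + 1)
    return found


def between(starts: List[Region], ends: List[Region]) -> List[Region]:
    """Join each start region to the nearest end region that follows it.

    Simplified pairing rule chosen for this sketch (an assumption).
    """
    regions: List[Region] = []
    for s_start, s_end in starts:
        following = [e for e in ends if e[0] >= s_end]
        if following:
            regions.append((s_start, min(following)[1]))
    return regions


def containing(outer: List[Region], inner: List[Region]) -> List[Region]:
    """Keep the outer regions that fully enclose at least one inner region."""
    return [o for o in outer
            if any(o[0] <= i[0] and i[1] <= o[1] for i in inner)]


# Hypothetical sample input: list items, filtered to those mentioning "XML".
TEXT = "<li>SGML tools</li><li>XML tools</li>"
ITEMS = between(occurrences(TEXT, "<li>"), occurrences(TEXT, "</li>"))
HITS = [TEXT[a:b] for a, b in containing(ITEMS, occurrences(TEXT, "XML"))]

if __name__ == "__main__":
    print(HITS)
```

Keeping regions as plain offset pairs means the same filters work on any text with recognizable delimiters, which is the property the description emphasizes for mail folders, program sources and marked-up files alike.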
Inside & Out, from ZGDV[CR: 19970522] Inside & Out is a graphical DTD editor created by Hans Holger Rath and Ulrich von Engelberg, of the Computer Graphics Center (ZGDV) in Darmstadt, Germany. It runs under MS-Windows 3.1 (386 PC) with 4 MB RAM. The editor is designed to build SGML DTDs interactively, providing a graphical presentation of the DTD in the shape of a syntax (or railroad) diagram. Every element and parameter entity definition is shown in a single diagram. All definitions are alphabetically sorted (first all entity, second all element definitions)." Links:
MU: Forms Assisted SGML Markup
"MU is a perl-based program that builds fill-out forms for SGML editing, based on simple templates. It supports lock files (for networked workgroups), and it is distributed with a TEI-lite template. Demonstrations, source code, help files, and an email list for bug reports and developers are available. . . Features: (1) Helps to automate the SGML markup process; (2) Quite general - works on various types of DTD templates; (3) Version 1.1 deals quite nicely with attributes; (4) Allows for multi-user editorial communication through the use of remarks; (5) Supports internet workgroups via lockfiles."
Markus Hoenicka's SGML/DSSSL Setup for Windows NT[CR: 19981014] "These pages describe how to set up a free integrated SGML editing and publishing system running under Windows NT - and, with a few modifications of the installation procedure, also on Windows 95/98 boxes." The documentation provides instructions for the installation of Emacs, Jade, PSGML, Ghostscript, Acrobat, MiKTeX, AucTeX, Jadetex, DocBook, etc. Links:
SGML Data Conversion, Transformation, and Manipulation
At SGML'96, Boston, November 1996, Tony Graham (Mulberry Technologies, Inc.) presented "Free SGML Transformation Tools." "The criteria for selecting an SGML transformation processing tool are discussed, and the details and SGML-processing features of several free SGML transformation tools are listed."
Rainbow
Several companies have collaborated on the design of an SGML interchange language for word-processing formats. Rainbow makers produce SGML from the supported word-processing formats, preserving as much information about document structure as can be deduced reliably. The Rainbow SGML format can then be used as input to other applications. See further explanation on EBT's server or on the mirrors in the file 'rainbow.why'. Rainbow makers are now available (free) for FrameMaker/FrameBuilder MIF, RTF, Interleaf, and (possibly) Ventura. Authoritative files for the Rainbow distribution are located on EBT's FTP server (SGML Rainbow via ftp.ebt.com/pub/nv/dtd/rainbow/). Other sources for Rainbow makers include:
ICA: Integrated Chameleon Architecture
The ICA (Release 1.6, February 1994) is a toolset for generating data translators. In particular, the toolset can be used to generate translators to and from a constrained subset of instances of SGML Document Type Definitions (DTDs). There are several example translators included in the distribution. The first is a book DTD and includes specific translators for the LaTeX book documentstyle and a specific troff macro package. The second is a bibliographic DTD and includes specific translators for BibTeX and refer bibliographic database formats. Please note that the ICA is for developing translators and not providing translators. The ICA runs in the Unix environment, using the X Window System for the basis of the graphical user interfaces. A new user's manual for ICA is also available. Published by Prentice Hall, the book is entitled The Integrated Chameleon Architecture: Translating Documents with Style, by Sandra Mamrak, Conleth S. O'Connell, and Julie Barnes. ISBN 0-13-056418-4. This book contains much new and revised material over the previously available online documentation, including a chapter on the ICA and SGML. See also description in excerpts from the release notes. See further description in the ICA toolkit announcement, and see network addresses for supporting mailing list. The sources for ICA on the Internet are:
STIL - `SGML Transformations in Lisp'
STIL is a stylesheet language developed by Joachim Schrod (Computer Science Department, Technical University of Darmstadt, Germany). "STIL (`SGML Transformations in Lisp') is a style sheet language to create structure-controlled SGML applications. In these applications you have neither access to the DTD nor to the original document source, instead you operate on a tree representation of the document. If you know CoST (the tree mode version) or SGMLSpm, STIL uses the same concept as these style sheet languages. The most obvious difference is the use of Common Lisp instead of Tcl or Perl5. You define classes for elements that appear in a document, instances of these classes are the inner nodes of the tree. Automatic transformation of attributes to data structures more appropriate in your task domain than simple strings is available. Elaborate handling of PCDATA is supported, too. The document tree is traversed, you can specify operations (`callbacks') that are triggered at certain points in that traversal. Within these callbacks, you have access to the full tree." [from the README, 1995/09/09] Links:
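The traversal-with-callbacks idea in the STIL description can be sketched as follows. The sketch is written in Python rather than Common Lisp and does not reproduce STIL's actual interface: the `Node` class, the `walk` and `render` functions, the callback table keyed by `("enter", name)` / `("leave", name)`, and the sample document are all hypothetical. It only shows how a callback fired during traversal can consult the rest of the tree (here, its ancestors).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple, Union


@dataclass(eq=False)
class Node:
    """Inner node of the document tree; leaves are plain strings (PCDATA)."""
    name: str
    children: List[Union["Node", str]] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: Union["Node", str]) -> "Node":
        if isinstance(child, Node):
            child.parent = self
        self.children.append(child)
        return self  # allow chained construction


Callback = Callable[[Node, List[str]], None]


def depth(node: Node) -> int:
    """Number of ancestors; callbacks may inspect any part of the tree."""
    count = 0
    while node.parent is not None:
        node, count = node.parent, count + 1
    return count


def walk(node: Node, callbacks: Dict[Tuple[str, str], Callback],
         out: List[str]) -> None:
    """Depth-first traversal that fires callbacks on entering and leaving."""
    enter = callbacks.get(("enter", node.name))
    if enter:
        enter(node, out)
    for child in node.children:
        if isinstance(child, Node):
            walk(child, callbacks, out)
        else:
            out.append(child)
    leave = callbacks.get(("leave", node.name))
    if leave:
        leave(node, out)


def render(root: Node) -> str:
    callbacks: Dict[Tuple[str, str], Callback] = {
        # Heading level is derived from the node's position in the tree.
        ("enter", "TITLE"): lambda n, out: out.append("#" * depth(n) + " "),
        ("leave", "TITLE"): lambda n, out: out.append("\n"),
        ("leave", "P"): lambda n, out: out.append("\n"),
    }
    out: List[str] = []
    walk(root, callbacks, out)
    return "".join(out)


# Hypothetical sample document.
DOC = Node("DOC").add(
    Node("SEC").add(Node("TITLE").add("Intro")).add(Node("P").add("Text."))
)

if __name__ == "__main__":
    print(render(DOC))
```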
CoST (Copenhagen SGML Tool, UNIX)[CR: 19990628] [June 28, 1999] Joe English has announced the release of Cost version 2.2, which now provides 'preliminary support for XML'. Cost is a free "structure-controlled SGML application programming tool. It is implemented as a Tcl extension, and works in conjunction with James Clark's nsgmls and/or sgmls parsers. Cost provides a flexible set of low-level primitives upon which sophisticated applications can be built. These include: (1) A powerful query language for navigating the document tree and extracting ESIS information; (2) An event-driven programming interface; (3) A specification mechanism which binds properties to nodes based on queries. Cost can be dynamically loaded into a Tcl application with the usual package mechanism, or it can be statically linked into a custom Tcl interpreter. There is also a command-line interface, costsh, which can be used interactively or as part of a command pipeline. A windowing interface, costwish, is also available for building GUI applications with Cost and Tk. New features in Cost version 2.2 include: (1) It should compile and install out-of-the-box on most Unix platforms, with any Tcl release from 7.5 through 8.1.1 - courtesy autoconf; (2) One can load more than one document at a time, and switch between them with the new 'selectDocument' and 'withDocument' commands; (3) It allows comments at certain places in specifications. (4) It provides preliminary support for XML, courtesy expat by James Clark. Note: XML support is largely untested and has a few known deficiencies (and probably several unknown ones!); I'd appreciate any feedback/bug reports. (5) It is released under a Tcl-style license instead of the 'Artistic' license. (6) Cost can now be loaded as an extension into multiple Tcl interpreters without conflicts. (7) Many minor bugfixes, enhancements, and cleanups."
[1997] "What is CoST? CoST (Copenhagen SGML Tool) is a structure-controlled SGML application programming tool. 
It is built on top of a public domain SGML tool: the SGMLS parser made by James Clark. With CoST you can write translation specifications for SGML document instances. CoST is purely structure driven, i.e. it gives you access to the structure of the SGML document instance. It won't, however, let you access the lexical and syntactical details in the SGML entities that represent the document instance in storage. You can write CoST programs that will translate SGML document instances or perform other processing in response to SGML documents. You program CoST using TCL - Tool Command Language." [from the Manual Introduction] [March 1995] CoST was written by Klaus Harbo (Klaus.Harbo@euromath.dk) and is maintained by Joe English (joe@flightlab.com). Links:
costwish - SGML postprocessor and renderer
Costwish is a graphical interface (SGML postprocessor and renderer) for Joe English's CoST-2 tool. From the README: "costwish is a generic graphical interface to Joe English's CoST SGML/ESIS post-processing tool. It is aimed at those who wish to: (1) run sgmls (or other ESIS-based parser) under a graphical interface; (2) browse their documents graphically; (3) customise their postprocessing easily, powerfully and flexibly; (4) construct powerful searches of SGML-based documents; (5) and manage the results interactively; (6) develop interfaces to helper applications (e.g. graphical renderers)." [from the README, April 1996] Links:
SGMLS.pm and sgmlspl: A Simple Post-Processor for SGMLS and NSGMLS[CR: 19980423] SGMLS.pm and sgmlspl were written by David Megginson, and were maintained by him through 1995. The current maintainer [1998] of the SGMLS.pm Perl package is Ingo Macherius (Ingo.Macherius@tu-clausthal.de). David's description: "SGMLSpm is a free perl5 object-oriented postprocessor for James Clark's SGMLS and NSGMLS parsers. The main part of this release is a library, SGMLS.pm, which repackages the ESIS output of (N)SGMLS into perl5 objects. On top of this, I have built a script, sgmls.pl, for formatting or processing SGML documents quickly using event patterns. Like CoST (which is several times slower), and unlike QWERTZ (etc.), SGMLSpm is a general-purpose package which can be used with any DTD. It even includes a script, skel.pl, which will write a skeleton conversion script for your document automatically!" "sgmlspl is a sample application distributed with the SGMLS.pm perl5 class library -- you can use it to convert SGML documents to other formats by providing a specification file detailing exactly how you want to handle each element, external data entity, subdocument entity, CDATA string, record end, SDATA string, and processing instruction. sgmlspl also uses the Output.pm library (included in this distribution) to allow you to redirect or capture output."
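The post-processing model described above (the ESIS output of (N)SGMLS repackaged as events, with a rule supplied per event) can be sketched in Python. This is a minimal sketch and not the SGMLS.pm or sgmlspl interface: the names `esis_events` and `convert`, the rule table, and the sample input are hypothetical. It assumes the line-oriented ESIS output of sgmls/nsgmls, and handles only the `(`, `)`, `A` and `-` records and the `\n` and `\\` data escapes; every other record type is skipped.

```python
import re
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

Attrs = Dict[str, Optional[str]]
Event = Tuple[str, str, Attrs]  # (kind, element name or data, attributes)


def unescape(data: str) -> str:
    """Decode two ESIS escapes: backslash-n (record end) and double backslash."""
    return re.sub(r"\\(n|\\)",
                  lambda m: "\n" if m.group(1) == "n" else "\\",
                  data)


def esis_events(lines: Iterable[str]) -> Iterator[Event]:
    """Turn ESIS lines into start, end and cdata events."""
    attrs: Attrs = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":  # attribute lines precede their start-tag line
            name, _, tail = rest.partition(" ")
            kind, _, value = tail.partition(" ")
            attrs[name] = None if kind == "IMPLIED" else value
        elif code == "(":
            yield ("start", rest, attrs)
            attrs = {}
        elif code == ")":
            yield ("end", rest, {})
        elif code == "-":
            yield ("cdata", unescape(rest), {})
        # Other record types are ignored in this sketch.


def convert(lines: Iterable[str], rules: Dict[Tuple[str, str], str]) -> str:
    """Emit the text bound to each start/end event and pass data through."""
    out: List[str] = []
    for kind, value, _attrs in esis_events(lines):
        out.append(value if kind == "cdata" else rules.get((kind, value), ""))
    return "".join(out)


# Hypothetical ESIS input and rule table.
SAMPLE = ["ATYPE CDATA note", "(PARA", "-Hello\\nworld", ")PARA", "C"]
RULES = {("start", "PARA"): "<p>", ("end", "PARA"): "</p>\n"}

if __name__ == "__main__":
    print(convert(SAMPLE, RULES))
```

Because the rules are keyed only by event kind and element name, the same loop works for any DTD, which is the general-purpose property the description claims for SGMLSpm.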
OmniMark LE[CR: 19970923] [September 23, 1997] Announcement for the OmniMark LE, available "at no charge for a limited time." OmniMark is a flagship industry software product -- a leading SGML based "hypertext programming language for development of on-line, Web, CD-ROM and print-on-demand publishing applications, being used for SGML conversion by a wide range of industry-leaders, including over 700 companies in 34 countries." OmniMark LE is a free product which runs utility-sized OmniMark programs. It is described as useful for: "(a) small-sized utility programs; (b) program development on the road away from your commercial licenses (since OmniMark LE will compile a large program -- it won't just run it); (c) evaluating OmniMark V3's capabilities before licensing V3." OmniMark LE is available on many platforms, including Windows NT/95 and many varieties of UNIX. See the program description for other information, or the main database entry. "OmniMark LE will compile and execute programs that contain 200 or fewer actions in the program source. An action is a statement that OmniMark executes, distinguished from a "rule header" (e.g. an element rule) which describes an event. Within each element rule, one action is not counted towards the 200-action limit. The action count is performed at compile time, not run time; this means that any given action in a 200-action program could execute millions of times." Links:
LT NSL and NSL (Normalised SGML Library)[CR: 19970128] From the Language Technology Group, Human Communication Research Centre, University of Edinburgh: the "Normalised SGML Library (NSL version 2.0) . . . consists of a set of C programs for manipulating SGML files and a C application program interface (API) designed to ease the writing of C programs which manipulate SGML documents." "LT NSL is a development environment for SGML-based corpus and document processing, with support for multiple versions and multiple levels of annotation. 
It consists of a C-based API for accessing and manipulating SGML documents and an integrated set of SGML tools. The LT NSL initial parsing module incorporates v1.1.1 of James Clark's SP software, arguably the best SGML parser available. The basic architecture is one in which an arbitrary SGML document is parsed once, yielding two results: (1) An optimised representation of the information contained in the document's DTD; (2) A normalised version of the document instance, which can be piped through any tools built using our API for augmentation, extraction, etc." Links:
TclYasp SGML toolkit
Extracts from the announcement by David Durand: "TclYasp integrates a conforming SGML parser with the TCL scripting language. . . Unlike CoST 1.1, it uses a simplest-possible procedure call interface, rather than an elaborate object-oriented interface. . . TclYasp does have a few unique features: it's based on YASP, which is more easily portable (it's in ANSI C and not C++) and was designed to be integrated with an application. Since Yasp is fully re-entrant, more than one parser can be active at a time. It is not restricted to the information defined by the ESIS, as the full parser data is available. . . TclYasp/Mac includes a command shell, multiple-pane windows, limited on-screen text formatting, and a variety of interface features as well as the SGML processing stuff." Links:
Python for XML/SGML Processing[CR: 19981103] A few people (at least) believe that Python is well suited for SGML text processing. Sean McGrath wrote that it "beats any other language I know for SGML processing hands down", and Paul Prescod said: "Python is a really easy, incredibly powerful scripting language. . . [it] combines the best features of other scripting languages and borrows many neat features from the Great Languages from history (Simula, SmallTalk, Lisp, Algol)." Links [provisionally]:
I4I S4-Desktop V2.1 SGML middleware[CR: 19970212] Educational Support Program:
SENG: SGML transformation engine[CR: 19960806] SENG = Scheme engine add-in for SP. "SENG is a transformation engine based on SP 1.0. It executes an SGML document as scheme code, [using a 'style semantics' concept]. SENG provides some basic procedures (some DSSSL like) to manipulate and access information from an SGML Instance. SENG was developed as a testbed for DSSSL experiments as well as an interim transformation engine for SGML. Features: (1) Cross document transformations; (2) Access to element context and left-siblings; (3) R4RS Scheme programming environment; (4) Simple syntax for style semantics (style sheets)." "There is a free distribution for SENG for both the WIN32 and Solaris 2.x environments." Links:
SGML-SPGrove[CR: 19980105] "SGML-SPGrove is a Perl module that links with James Clark's SGML Parser (SP) and builds in-memory groves from SGML, XML, and HTML documents. The groves can be accessed using iterator and callback (visitor) interfaces." "SGML::SPGrove takes a system identifier and passes it to SP to parse, as each element is parsed from the document SPGrove builds Perl objects to match. When done parsing, SPGrove returns an SGML::SPGrove object that contains the root element of the parsed document and an array (hopefully empty :-) of parser errors. Elements of the document are SGML::Element objects. Elements have a generic identifier (or name), attributes, and the contents of the element. Attributes are stored as a Perl hash, with the values as an array of scalars and SGML::SData objects. The contents of an element may be more Elements, scalars, SData objects, processing instruction (PI) objects, or Entities. SGML::SData objects are replacements for character entity references within the document. The Text::EntityMap perl module can be used to map SData replacements from common character entity sets to common output formats." Links:
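The object model in the SGML-SPGrove description (elements with a name, attributes and contents; SData objects standing in for character entity references; a visitor that walks the grove and maps SData replacements to an output format) can be sketched in Python. This is an illustration only, not the Perl module's interface: the `Element`, `SData` and `TextVisitor` classes, the one-entry entity map, and the sample element are hypothetical. The replacement text `[eacute]` follows the style of the ISO 8879 Latin 1 entity set.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class SData:
    """Replacement text of an SDATA entity, e.g. "[eacute]"."""
    text: str


@dataclass
class Element:
    """An element: generic identifier, attributes, and mixed contents."""
    name: str
    attributes: Dict[str, List[Union[str, SData]]] = field(default_factory=dict)
    contents: List[Union["Element", SData, str]] = field(default_factory=list)


class TextVisitor:
    """Flatten a grove to plain text, mapping SData through an entity map."""

    def __init__(self, entity_map: Dict[str, str]) -> None:
        self.entity_map = entity_map

    def visit(self, node: Union[Element, SData, str]) -> str:
        if isinstance(node, str):
            return node
        if isinstance(node, SData):
            # Unknown replacements pass through unchanged.
            return self.entity_map.get(node.text, node.text)
        return "".join(self.visit(child) for child in node.contents)


# Hypothetical entity map and sample grove.
LATIN1_TO_UNICODE = {"[eacute]": "\u00e9"}
PARA = Element("P", contents=[
    "caf", SData("[eacute]"), " ",
    Element("EM", contents=["au lait"]),
])

if __name__ == "__main__":
    print(TextVisitor(LATIN1_TO_UNICODE).visit(PARA))
```

Swapping in a different entity map is all that is needed to target another output format, which mirrors the role the description gives to Text::EntityMap.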
SGMLC (-Lite) products for MS-Windows[CR: 19980218] [February 18, 1998] Announcement from Bruce Hunter (SGML Systems Engineering) for an updated version (1.3) of the SGMLC language tools "designed specifically for creating SGML document processing applications in the Microsoft Windows environments." New features in this release include: "1) inbuilt ODBC support (query and update databases during document processing); 2) PERL-type regular expression support; 3) binary file I/O, with associated seek, put, etc. functions; 4) widow/orphan and hyphenation control; etc. The free, unregistered version of the SGMLC Publisher Development Environment includes all the new features listed above, plus full access to all the conversion, browser and publisher facilities." "The 16-bit SGMLC-Lite free compiler for MS-Windows is now available from http://www.dircon.co.uk/sgml. A 32-bit version will be available soon from the same source. Unix and Mac versions are still [April 1996] some way off, I'm afraid. . . SGMLC is a language designed for processing SGML documents. It is based upon the C language, with some elements of C++. It recognises events which occur when processing an SGML document. You then provide the code to tell the application how to process the event. . . SGMLC may be used, for example, for writing SGML transformation applications, for converting SGML documents into some other form; extracting selected bits of information from an SGML document; creating semantic parsers, where you need to check more than just the SGML conformance of a document (for example, that a CDATA attribute conforms to a defined criteria); creating SGML browser and IETM applications, via the SGMLC-View add-on library; formatting and printing SGML documents into various output formats (PostScript, PDF, HPGL, etc.), again via the SGMLC-View library." [from the announcement, April 1996] Links:
SGML Formatting Tools
format: Thomas Gordon's QWERTZ SGML -> LaTeX formatting package[CR: 19980220]
gf: Gary Houston's general formatter program[CR: 19960724] "gf is short for "general formatter", i.e., it can work on documents which use the ISO "general" document type definition (DTD). It can convert SGML documents conforming to a small number of DTDs into various output formats: LaTeX, ASCII, RTF and Texinfo. However not every output format can be generated for every DTD." "Apart from the general DTD, gf supports the HTML DTD used in the WWW project and the Snafu DTD I [Gary Houston] just made up. There are many other DTDs which would be worth supporting. However gf is not intended as a flexible system for hacking up a formatter for a random DTD, but as a usable document production system for a few DTDs." [from the README, version 0.46, July 1996] Links:
Jörg Wittenberger's Typeset Package
Wittenberger's typeset is a formatter for SGML documents. "Overview [March 1995; link to the URL given below for a July 12, 1995 update]: Typeset is an extensible formatter for documents. It transforms documents using SGML markup into various target formats. Typeset comes with a couple of document type definitions (DTD's). The DTD's feature the reuse of text, minimization of markup and readability of the SGML source. They share their elements as much as possible. The formatting differs due to the features possible in the target format and to the rules common for type of the document. This includes the automated rearrangement of text and insertion of standard parts like content sections and bibliography. The latter for instance is composed from the items of a database which are referenced in the document. Future versions will also generate an index. According to the goal and the aim to support many target formats, these DTD's don't attempt to cover each and every case possible. Instead they try to provide all elements necessary for daily use and leave the implementation of special features to extensions. It is fairly easy to coerce typeset to parse documents with other DTD's. But this implies to write rules for formatting in the desired target format(s). The transformation (formatting) is described by files of scheme code related to both, the document type and the target format. Only combinations of common value are supported by default. (For instance for letters only PostScript output is defined.)" "Currently there are these DTD's: document (Simple ,,plain'' documents); report (Technical reports, documentation etc.); bibdata (Bibliography database); manpage (Pages for the Unix(TM) man command); brief (A letter according to DIN). 
Currently the following output formats are supported: PostScript (for English and German text); HTML (Hyper Text Markup Language); Info (to be included into the on-line help of emacs); man suitable for roff -man; ASCII; limited support for RTF. Future output formats will include: RTF, LaTeX." Update on features [July 15, 1995]: * sorted indexes; * notation handling; * Addison-Wesley like entities for math symbols and greek letters; * handling for entities of notations eps, latex, roff (tbl), xfig, lout (as far as there is a chance for a final representation); * inlined code of aforementioned notations possible; * LaTeX 2.09 and LaTeX2e supported as backend; * limited support of RTF; "But it still lacks a native SGML table construct" [communique from Joerg Wittenberger] Contact:
Jörg Wittenberger's SDC Package[CR: 19960806] "SDC is a well featured, free system aiming to make SGML suitable for day to day use. SDC compiles SGML documents into representations as PostScript, LaTeX, HTML, man pages, (emacs) info files and is a little RTF aware. The goal of SDC is to be author friendly, easy to use without the need of special editors and to hide as much of the backend as possible. Hence the required markup is minimized, mixed content type allow you to type text (almost) everywhere and get the desired meaning. There are no `special characters' but those special to SGML (< and &)." Links:
RATFINK SGML <--> RTF Conversion[CR: 19970107] "RATFINK, a library of Tcl utilities for generating RTF files, including a Cost script for converting SGML to RTF, is now available." From Joe English. Links:
SGML-Tools [Was: Linuxdoc-SGML][CR: 19980716] SGML-Tools "is a text-formatting package based on SGML (Standard Generalized Markup Language), which allows you to produce LaTeX, HTML, GNU info, LyX, RTF, and plain ASCII (via groff) from a single source; due to the flexible nature of SGML many other target formats are possible. This system is tailored for writing technical software documentation, an example of which are the Linux HOWTO documents. However, there is nothing Linux-specific about this package; it can be used for many other types of documentation on many other systems. It should be useful for all kinds of printed and online documentation. The package was formerly called Linuxdoc-SGML because it originates from the Linux Documentation Project (LDP). The name has been changed into SGML-Tools to make it clearer that there is no Linux-specific stuff included in this package." Currently [September 1997] maintained by Cees de Groot.
[March 17, 1998] Update on SGML-Tools, from Cees de Groot: "SGML-Tools v2.0 will be an all-new version of SGML-Tools (currently in its 1.0 incarnation) that will base on DocBook and offer users migration software from linuxdoc to DocBook. We hope that this move will give authors more freedom in choosing their software, give anthologists and publishers/Linux distributors more useful raw material, and maybe even move a lot of current LaTeX-based documentation into the existing body of SGML documents. At the moment, the main target is to decide which software to base on: Quilt, by Ken MacLeod, or DSSSL. From there on, we'll attempt to build a distribution that offers functionality comparable to the current version of SGML-Tools (like support for multiple languages), and we hope to release in 98Q3. . . "
[February 27, 1998 - provisional paragraph] Announcement from Ken MacLeod for a new 'Quilt Kit' centered around Quilt, with DTDs and docs for DocBook, LinuxDoc, and TEI Lite. It has been produced in conjunction with SGML-Tools. 
Quilt is a "processing and formatting framework for structured documents. Quilt is intended to support processing of common, rich document elements and is tested against and comes with support for the DocBook, LinuxDoc, and TEI Lite SGML document types (DTDs) and formatting to Ascii and HTML. Quilt-Kit: Quilt bundled with all the tools, except Perl, to format DocBook, LinuxDoc, and TEI Lite documents, including user guides. The kit also includes DSSSL stylesheets for DocBook for use with Jade. The Kit includes: James Clark's Jade, James Clark's Jade, Dean Roehrich's Class-Eroot, Norm Walsh's DSSSL stylesheets for DocBook (db104), The Davenport Group's DocBook v3.0 DTD (docbk30), The Davenport Group's DocBook SGML documentation (dbsset), SGML-Tools' LinuxDoc DTD and docs (sgml-tools-dtd), The Text Encoding Initiative's TEI Lite DTD (teilite), TEI's TEI Lite documentation (teiu5), Class-Visitor, PkgMaker (a utility used by the Kit to install), Quilt, SGML-Grove, SGML-SPGroveBuilder, entity-map, and iso-entities-8879.1986." See the Quilt for Perl README (1997/10/25). Links:
TEItools[CR: 19981216] [December 16, 1998] From Boris Tobotras. "TEItools is loosely coupled set of scripts, written in Tcl, which does various SGML transformations. Source file format TEItools uses is SGML, in TEI Lite incarnation. Currently they include converters: 1) from TEI Lite to HTML, RTF, TeX, DVI, PS, PDF; 2) from HTML to TEI Lite, Linuxdoc, TeX, DVI, PS, PDF; 3) from Linuxdoc to HTML, TEI Lite, DocBook, TeX, DVI, PS, PDF; 4) from DocBook to TEI Lite (very preliminary). TEItools was inspired by idea of SGMLtools (previously known as SGML-tools, previously known as linuxdoc-sgml). TEItools belongs to SGML conversion class of tools. It is part of entire document management system. You should be absolutely clear about that: just the part. Entire DMS includes also document repository, version control, access control, usage policy, documents editing, search and retrieval, to name just the few. Please don't expect TEItools will be all of that. But it can help you to build such a system, as it did for me." References:
MetaMorphosis - SGML/XML Tree Transformer[CR: 19980910] [September 10, 1998] An announcement was made on CTS by OVIDIUS for the release of MetaMorphosis 3.0. "MetaMorphosis is a target-driven SGML/XML tree transformer. Parsers for other input formats may easily be plugged into MetaMorphosis using the freely available tree representation API (MMdb-API). The software runs on MS Windows95/98/NT, Linux, and Solaris. Version 3.0 represents a complete redesign of MetaMorphosis. It has a modular architecture, and a set of APIs and is available as an SDK Version which allows a complete integration of MetaMorphosis into other applications. New features in version 3.0 include an enhanced query and transformation language, full XML support, support of various character encodings (Unicode, Shift-JIS, etc.), full integration of SP, etc." A demo version (Win32) and a free version for Linux (ELF) are available for download.
[Earlier description:] MetaMorphosis (from MID/Information Logistics Group) "is a modular, programmable tree transformer. It is used to convert any valid SGML instance to any other format, including SGML, arbitrary word processor formats, formats for hypertext systems, database tables, etc. MetaMorphosis has three main modules: (1) a workflow system which is highly configurable; (2) the MetaMorphosis kernel which in itself has a modular architecture [source tree generator, binary tree reader, tree transformer, tree annotator], and (3) a set of output processors. MetaMorphosis is quite different from any other SGML conversion tool. The underlying paradigm is that SGML instances are treated as trees rather than character streams. An SGML instance conceived as a character stream basically allows only sequential access to the instance. Furthermore, context information is usually limited to the elements on the root path and their left siblings. The tree model on the other hand allows random access to any node in the tree at any moment. 
The instance is seen as a kind of structured database each part of which can be accessed, copied, moved or deleted." Links:
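The stream-versus-tree contrast described above can be illustrated with a short sketch. It uses Python's standard library (xml.sax and xml.etree.ElementTree), not MetaMorphosis or its MMdb-API; the sample document and the function names are invented for illustration only.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample instance, not taken from any MetaMorphosis material.
DOC = (
    "<book>"
    "<chapter id='c1'><title>One</title></chapter>"
    "<chapter id='c2'><title>Two</title></chapter>"
    "</book>"
)


class PathTracker(xml.sax.ContentHandler):
    """Stream view: at any moment only the open ancestors (root path) are known."""

    def __init__(self):
        super().__init__()
        self.path = []   # currently open elements
        self.seen = []   # root path recorded at each start tag

    def startElement(self, name, attrs):
        self.path.append(name)
        self.seen.append("/".join(self.path))

    def endElement(self, name):
        self.path.pop()


def stream_paths(text):
    """Return the root path observed at every start tag, in document order."""
    handler = PathTracker()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.seen


def move_last_chapter_first(text):
    """Tree view: any node can be reached and moved once the tree is built."""
    root = ET.fromstring(text)
    last = root[-1]          # random access to the last child
    root.remove(last)
    root.insert(0, last)     # move it to the front
    return [chapter.get("id") for chapter in root]
```

The stream handler can only report what lies on the path from the root to the current element, whereas the tree version reorders chapters directly, which is the kind of operation the description attributes to the tree model.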
gmat: an SGML Publishing System[CR: 19971113] "gmat: an SGML Publishing System is being developed by O'Reilly & Associates. It is currently in early alpha testing and suffers most noticeably from a lack of documentation." Version 0.2.2b (gmat-0.2.2b.tar.gz) is dated July 14, 1997. SGML2TeX - SGML-to-TeX converter"Converts a fully-qualified, pre-parsed SGML instance to a file with TeX-style control sequences replacing the SGML tags and entity names, and writes the element, attribute and entity names to a template style file so that their TeX expansions (macro meanings) can be edited in and the file printed using TeX. Written in PCL, requires PCL.COM and PCL.SYS to run (available from Calend Ltd, Twickenham, Middlesex, England)." Generalized Document Objects (GDO)[CR: 19970628] Written byKen MacLeod, "GDO is a framework for loading, integrating, and formatting structured documents. It can be used to load all or part of a document or data structure, contain or be contained by application objects, query and iterate over elements, merge elements or other documents, and format and output a document. Documents can be manipulated in their native format, in a generalized format, or in the output format. "Possible applications include interactive documentation, context-sensitive help, dynamic Web pages, document management, contextual data, catalogs and directories, and browser-agnostic Web servers. . . release 0.4 is written in Perl 5, and supports SGML input and output to HTML 2.0 and plain ASCII. Links:
tei2latex - TEILITE to LaTeX2e[CR: 19971022] TEI2LATEX and TEI2HTML version 0.2. - 'Two Perl5 Programs to Translate TEI Lite Documents into LaTeX2e and HTML documents.' "tei2latex is a Perl5 Program to Translate TEI Lite Documents into LaTeX2e documents. . . The translation process can be configured in several ways for two reasons: (1) to enhance the default translation in case TEI Lite lacks information about the presentation (as in tables for instance); (2) to personalize the presentation of a document or a set of documents." [from the announcement; see below]
DSSSL Software Tools[CR: 19981013][Table of Contents] See the main DSSSL entry for fuller information about DSSSL sample application profiles, (Jade) compatible utilities, DSSSL stylesheets, DSSSL tutorials, DSSSL development tools, etc. Jade - James [Clark]'s DSSSL Engine[CR: 19981013]
Jade MIF Backend[CR: 19980505] As of May 04, 1998, 'Jade MIF Back' is based upon James Clark's Jade version 1.1, and its current version number is 1.0e.
YADE (Yet Another DSSSL Engine)[CR: 19970421] YADE (Yet Another DSSSL Engine) is a DSSSL engine being developed by Norbert H. Mikula [previously of Philips Semiconductors]. YADE "is, as the name suggests, a project whose final outcome should be a Java-based implementation of a tool that is able to process documents conforming to ISO-IEC standard 10179. In order to reduce development complexity DSSSL-Online, a subset of DSSSL that has been specifically designed to allow early software implementers to provide a common accepted minimal conformance to ISO-IEC 10179, has been chosen as the first milestone to achieve." The current version of YADE [April 21, 1997] uses Kawa 1.2. According to Lou Burnard's report on YADE as presented at the Third Annual Conference of the Belgium-Luxembourgian SGML Users' Group: "'Yet Another DSSSL Engine' (or YADE), uses Milowski's Kawa scheme interpreter [sic - for "Milowski's" read 'an older version of Per Bothner's Kawa Scheme implementation'], also written in Java. The context for these tools is the Philips Semiconductors Electronic Databook, an application of PCIS, the dtd Philips have developed within the Pinnacles framework, and forms the basis of Mikula's research at the University of Klagenfurt in Austria. His presentation was impressive, and although only in prototype form, the work he outlined shows great potential." As of March 29, 1997, Mikula considered YADE not yet ready for public release, but thought it might be after the presentation at WWW6 in Boston. "YADE... is a DSSSL engine that has been implemented using Java. Right now it is used in conjunction with my XML parser NXP (Norbert's XML Parser). YADE is using the Scheme engine Kawa, which has been developed by Per Bothner. YADE also follows the concept of having a core DSSSL engine and 'backends' for output to a certain device. As of today, YADE only supports the Java AWT (Abstract Window Toolkit) as a reference implementation of a backend." 
[from a description contributed by NHM] Links:
DSC---DSSSL Syntax Checker[CR: 19970710] "This tool, which embeds a full R4RS Scheme interpreter in James Clark's SP parser, is designed both to provide an online syntax checker for all DSSSL expression, style and transformation language programs, and to serve as a preprocessor for any Scheme-embedded DSSSL implementation." [from version 1.0 announcement] "Version 2.0, providing a much richer implementation framework, including the core query language, is scheduled for 2Q97."
DSSSL Developer's Toolkit[CR: 19970602] The announcement from R. Alexander Milowski (Copernican Solutions Incorporated) describes the DSSSL Developer's Toolkit (DSSSLTK) version 1.0, available as a downloadable distribution. The toolkit "is similar in nature to the applet or servlet architectures developed by Sun Microsystems/JavaSoft. . . a set of abstract interfaces written in Java to allow application developers to work with different Java-based DSSSL environments. . .[it] serves as an interface between different DSSSL components. It represents an architecture for building DSSSL-oriented systems using the Java programming language. . .[it] provides a means for different DSSSL implementations in Java to share components such as parsers, transformation engines and flow object semantics. The toolkit contains three Java packages: dsssl.engine, dsssl.grove, and dsssl.flowobject. . . Developed as part of the Seng DSSSL Environment from Copernican Solutions, the DSSSL Developer's Toolkit contains: (1) Full source code to the interfaces and classes; (2) Javadoc for the API reference; (3) Configuration and makefile utilities for building the distribution; (4) A prebuilt zip file containing all the classes." Links:
Kawa - Java-based Scheme System (SENG)[CR: 19970421] "Kawa is a full Scheme implementation. It implements almost all of R4RS (for exceptions see section Features of R4RS not implemented), plus some extensions. It provides define-syntax from the R4RS appendix, and (from the draft R5RS) eval and multiple values. . . It is completely written in Java. Scheme functions and files are automatically compiled into Java byte-codes, providing reasonable speed. (However, Kawa is not an optimizing compiler, and does not perform major transformations on the code.) . . .Kawa provides the usual read-eval-print loop, as well as batch modes. . . Kawa is written in an object-oriented style. Kawa implements most of the features of the expression language of DSSSL, the Scheme-derived ISO-standard Document Style Semantics and Specification Language for SGML. Of the core expression language, the only features missing are character properties, external-procedure, the time-relationed procedures, and character name escapes in string literals. Also, Kawa is not generally tail-recursive, and literal unescaped symbols are case-insensitive (folded to lower-case). From the full expression language, Kawa additionally is missing format-number, format-number-list, and language objects. Quantities, keyword values, and the expanded lambda form (with optional and keyword parameters) are supported." [from the FAQ for version 1.4, updated March 31, 1997.] Links:
psgml-dsssl[CR: 19971113] "This program generates skeleton DSSSL specifications for DTDs from within PSGML. Emacs and PSGML are required."
panodssl[CR: 19970307] PANODSSL.pl version 0.2. Script for converting Panorama stylesheets to DSSSL specifications.
psgml-jade[CR: 19980423] Matthias Clasen (Institut für Mathematik, Albert-Ludwigs-Universität Freiburg) has contributed psgml-jade. psgml-jade is "an add-on to the psgml package for editing SGML files with Emacs which is intended to make menu-driven processing SGML files with jade and jadetex possible. It requires Gnu Emacs or XEmacs, together with Lennart Staflin's PSGML mode (tested with version 1.0.1) and David Megginson's DSSSL extensions (psgml-dsssl.el). It can also take advantage of David Love's new scheme.el which defines dsssl-mode." Links:
Jadetex Package[CR: 19981020] "Jadetex package, an implementation of the TeX skeleton produced by 'jade -t tex' . . . built on top of LaTeX." From Sebastian Rahtz (s.rahtz@elsevier.co.uk) and David Megginson:
DSSSL editing under emacs (dsssl/scheme mode)[CR: 19970425]
SGML/DSSSL Presentation Development Application[CR: 19980511] Ken Holman (Crane Softwrights Ltd.) announced the public availability of an SGML/DSSSL Presentation Development Application. It is an SGML application for frame-based presentation slide-shows with DSSSL scripts for the rendering of the slides to HTML and RTF final forms. This shareware application may be used with James Clark's JADE DSSSL Engine "to create slide-show presentations and associated paper handouts" from SGML source documents. The tool is "based on an SGML document model (DTD) and uses two DSSSL stylesheet scripts to render the structured presentation in both HTML and RTF." Links: XML/XSL Software Tools[CR: 20001011][Table of Contents] The main XML document in the SGML/XML Web Page contains a section with references to generally-available XML/XSL/XLink software, and a section on XML design and development resources. Some other XML (demo) applications are listed in the section XML: Miscellaneous Unevaluated Uncategorized. Software packages specific to XSL and XLink are listed on the dedicated XSL and XLink pages. For XML software tools, see also: Steve Pepper's Whirlwind Guide to SGML Tools and Vendors, and the Free XML software list from Lars Marius Garshol. Lark, an XML processor[CR: 19980105] Tim Bray of Textuality (and one of the XML editors) is developing Lark, an XML processor. The name 'Lark': "Lauren's Right Knee" [ask Tim]. The Textuality server contains a document "An Introduction to XML Processing with Lark," the abstract of which says, in part: "Lark is a non-validating XML processor implemented in the Java language; it attempts to achieve good trade-offs among compactness, completeness, and performance. . . Lark is available on the Internet for general public use." Note that the Textuality Web server has a number of other resources for XML. "Lark is a processor only; it does not attempt to validate. 
It does read the DTD, with parameter entity processing; it processes attribute list declarations (to find default values) and entity declarations. Lark's internationalization is incomplete; it reads UCS-2, UTF-16, and ASCII (making use of the Byte Order Marks and Encoding Declarations in the appropriate fashion), but not UTF-8. Aside from that, Lark is relatively full-featured; it implements (I think) everything in the XML spec, except conditional DTD sections, and reports violations of well-formedness." [description of October 29 1997]
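The use of Byte Order Marks and Encoding Declarations mentioned above can be sketched as follows. This is a minimal illustration in Python, not Lark's code (Lark is written in Java); the function name is hypothetical, and the UTF-8 fallback follows the XML 1.0 default rather than Lark, which per the description above did not read UTF-8.

```python
import codecs
import re

# Matches an encoding declaration at the very start of the entity.
_DECLARATION = re.compile(
    rb'<\?xml[^>]*encoding=["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_encoding(data: bytes) -> str:
    """Guess an entity's encoding from its first bytes (hypothetical helper).

    Assumption: only UTF-16 byte order marks are checked; UTF-32 and EBCDIC
    detection are left out to keep the sketch short.
    """
    # 1. A UTF-16 byte order mark settles both encoding and byte order.
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    # 2. Otherwise read the encoding declaration, which is ASCII-compatible
    #    when no byte order mark is present.
    match = _DECLARATION.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # 3. XML 1.0 default when neither signal is present.
    return "utf-8"
```

For example, `sniff_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>')` yields `"iso-8859-1"`, while input beginning with the bytes FF FE is reported as little-endian UTF-16.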
DXP - DataChannel XML Parser[CR: 19980504] "DXP is a validating XML Parser in Java. DXP is based on NXP (Norbert Mikula's XML Parser), one of the first XML parsers." The current version of DXP [19980504] is 1.0 beta1c. "DXP is specifically aimed at providing a utility for server-side applications that need to integrate XML capabilities into existing systems and for out-of-the-browser Java-based software. DXP provides the highly sophisticated error-checking mechanisms required for XML-based data interchange. DXP has not been architected for usage in an applet context, downloaded via the Internet. DXP, due to its complexity and feature set, is too large and would cause performance problems if transferred via the Internet. DXP uses JavaCC, a Java compiler-compiler that allows for the automatic generation of a parser framework based on a formal specification of the language (XML) targeted." NXP, Norbert's XML Parser: an XML parser written in Java[CR: 19980504] [The successor to NXP is DXP - DataChannel XML Parser. See immediately above.] NXP is a public domain XML parser written in Java, by Norbert Mikula. The lexical analyzer and the grammar have been defined using the parser generator Jack. In beta development stage. As of March 08, 1997, it supported: Public Identifiers, catalogs (incl. DELEGATE and CATALOG), Parameter Entities, Resolution of Name conflicts, Attribute defaults.
Microsoft XML parser in Java (MSXML)[CR: 20001011] [October 11, 2000] Microsoft XML Parser (MSXML). See (1) Joshua Allen's"Unofficial MSXML XSLT FAQ." (2)"What's New in the September 2000 Microsoft XML Parser Beta Release." (3)"Internet Explorer Tools for Validating XML and Viewing XSLT Output" (March 15, 2000 or later). (4)"Installing Msxml3.dll in Replace Mode" (September 2000 or later). [19971209] "The Microsoft XML Parser is a validating XML parser written in Java. The parser loads XML documents and builds a tree structure of Element objects, starting with the root object of type Document. Each XML tag can either represent a node or a leaf of this tree. You can then browse and edit the tree using the methods of the Element class, and you can save the tree back out in XML format." [December 09, 1997] Version 1.8 of theMicrosoft XML Parser in Java was released on December 04, 1997. Version 1.8 of the parser implements the entire W3C working draft of the XML specification dated November 17, 1997, including support for thestandalone attribute, new End-of-Line Handling, support for thexml:lang attribute on any tag regardless of ATTLIST declaration, [now] lower-casing of some generated GIs and attribute names, etc. The parser "will be revised to reflect future W3C changes to the specifications. The Microsoft XML Parser is a validating XML parser written in Java(r). The parser checks for well-formed documents and optionally permits checking of the documents' validity." [November 01, 1997] New features were announced for version 1.6 of the Microsoft XML Parser in Java. Released October 31, 1997, the package containing the source code for the latest version of the XML Parser supersedes the XML Parser that shipped with Internet Explorer 4.0..."it implements the entire W3C working draft of the XML Specification dated August 7th, 1997, and will be revised to reflect future W3C changes to the specifications. . . 
The Microsoft XML Parser is a validating XML parser written in Java(tm). The parser checks for well-formed documents and optionally permits checking of the documents' validity. Once parsed, the XML document is exposed as a tree through a simple set of Java methods, which [Microsoft is] working with the World Wide Web Consortium (W3C) to standardize." As elaborated in the release notes, changes in the latest version include: (1) Case sensitivity; (2) Conditional sections in the DTD (INCLUDE and IGNORE keywords); (3) Support for namespaces (see XML Namespaces document); (4) Support for the ENCODING attribute on the XML tag; (5) Support for the XML-SPACE attribute in regular XML and in the DTD; (6) Support for the RMD attribute on the XML tag; (7) New Document save options for COMPACT and PRETTY save formats; (8) Support for floating ampersands, e.g., 'This & that'; (9) Support for empty end tags, e.g., <Foo>bar</>. The main XML page from Microsoft now references several online demos for XML, and sample XML files. [June 1997] "The XML Parser in Java (MSXML) from Microsoft Corporation is now [June 07, 1997] available for download. This is the second piece of XML technology from Microsoft, the first being the Channel Definition Format support in Internet Explorer 4.0. The Microsoft XML Parser can be installed on any machine that has the Java Development Kit (JDK 1.0.2 or JDK 1.1.1)." Links:
XP, an XML parser in Java (James Clark)[CR: 19980813] On January 26 1998, James Clark posted an announcement for the public availability of a new XML parser in Java, tentatively called XP, along with an expanded collection of test cases, and a specification of a subset of XML called Canonical XML (for use in testing XML parsers). The XP parser, now in alpha-test version, "is fully conforming: it detects all non well-formed documents. It is currently not a validating XML processor. However it can parse all external entities: external DTD subsets, external parameter entities and external general entities." XP's design goals are documented as follows: "1) Conformance and correctness: XP is designed to be 100% conformant to the XML specification; 2) High performance: XP aims to be the fastest conformant XML parser in Java; 3) Layered structure: In addition to a normal high-level parser API, XP provides a low-level API that supports the construction of different kinds of XML parser (such as incremental parsers)." XP is one of several XML development resources made available by James Clark; see the link to his "XML Resources" below. [August 13, 1998] On August 13, James Clark announced the availability of XP version 0.4 - 'XML Parser in Java'. In XP version 0.4, the main change "apart from bug fixes is that XP now makes available much more information about the markup of the document (non-ESIS information) including information about comments, entity references and the document type." XP supports several encodings: UTF-8, UTF-16, ISO-8859-1, US-ASCII. Links:
expat - XML parser in C[CR: 20001011] [October 11, 2000] See theannouncement for the release of version 1.2 [2000-10-06]. With this release, expat development is handed over to Clark Cooper and others. [July 02, 2000]expat - "12-May-00 01:11 145k" [cache] [May 31, 1999]James Clark announced the release of Expat Version 1.1, which may be used under either the Mozilla Public License Version 1.1 or the GNU General Public License. "Expat (XML Parser Toolkit) is an XML 1.0 parser written in C. It aims to be fully conforming [but] is currently not a validating XML processor. New features of expat version 1.1 relative to 1.0 include: (1) Support for XML namespaces, (2) Ability to report comments, (3) Ability to report CDATA section boundaries, (4) Ability to report which attributes are defaulted, (5) Compile option to reduce object-code size at the expense of speed. Expat has built in support for the following encodings: utf-8, utf-16, iso-8859-1, and us-ascii. Additional encodings can be supported by using [November 23, 1998] James Clark hasannounced the availability of theexpat - XML Parser Toolkit Version 1.0.1, containing bug fixes. Expat "is an XML 1.0 parser written in C which aims to be fully conforming, but is currently not a validating XML processor. . . [the distribution] contains the [August 14, 1998] A new version of James Clark'sExpat is now available. Expat version 1.0 represents the first production release of this XML Parser Toolkit. Changes since the last beta version are a few minor bug fixes. Clark's Expat is an XML 1.0 parser written in C. It 'aims to be fully conforming, but is not currently a validating XML processor. The distribution comes with Win32 executables. It also includes an "xmlwf application, which uses the xmlparse library. The arguments to xmlwf are one or more files which are each to be checked for well-formedness. 
An option [June 21, 1998] James Clark has announced the release of a new version ofexpat, his "high-performance, fully conforming, non-validating XML 1.0 parser toolkit written C." The public distribution comes with source code and Win32 binaries, and is subject to the Mozilla Public License Version 1.0. "The directory xmlwf contains the [May 1998] James Clark'sExpat (XML Parser Toolkit) is distributed under the Mozilla Public License Version 1.0. In its current beta-test version (19980504), "Expat is an XML 1.0 parser written in C. It aims to be fully conforming. It is currently not [currently] a validating XML processor. . . [the distribution contains] thexmlwf application, which uses the xmlparse library. The arguments to xmlwf are one or more files which are each to be checked for well-formedness. An option -d dir can be specified; for each well-formed input file the corresponding canonical XML will be written to dir/f, where f is the filename (without any path) of the input file." The predecessor to expat was called xmltok (see below). Links:
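The well-formedness check performed by xmlwf, and the comment, CDATA-boundary and namespace reporting listed for Expat 1.1, can be tried through Python's xml.parsers.expat module, which wraps a much later Expat release than those announced above. The sketch below is an illustration under that assumption; the function names are invented, and the handler attribute names belong to the Python binding, not to the C API of the 1998-2000 releases.

```python
import xml.parsers.expat


def is_well_formed(data: bytes) -> bool:
    """Return True if Expat parses the document without error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True


def collect_events(data: bytes):
    """Record start tags, comments and CDATA section boundaries in order."""
    events = []
    # With a namespace separator set, element names are reported as
    # "<namespace URI><separator><local name>".
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.CommentHandler = lambda text: events.append(("comment", text))
    parser.StartCdataSectionHandler = lambda: events.append(("cdata-start",))
    parser.EndCdataSectionHandler = lambda: events.append(("cdata-end",))
    parser.Parse(data, True)
    return events
```

Like the parser described above, this checks well-formedness only; nothing here validates a document against its DTD.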
XMLTok - XML parser in C[CR: 19980210] See now the successor (expat), above. From James Clark, XMLTok "is an XML parser in C. This includes 1) a low-level XML tokenizer; 2) a non-validating XML parser built on the tokenizer; this has an API designed for integration into Web browsers; 3) a simple application xmlwf for testing the parser, which can test XML entities for well-formedness and generate canonical XML." Links:
SX - An SP application for SGML to normalized XML[CR: 19980216] James Clark posted an announcement on October 28, 1997 for a "very preliminary release of SX, an application built with the SP library for converting SGML to XML." This tool will eventually be included in the standard SP distribution. SX (the provisional name) "parses and validates the SGML document contained in sysid... and writes an equivalent XML document to the standard output. SX will warn about SGML constructs which have no XML equivalent." The distribution includes both source and Win32 binaries (the sp120u.dll file included in the SP 1.2.1 Win32 Unicode binary distribution is required). Note that the program "does not yet provide enough to handle the situation where you want to migrate your document source from SGML to XML. In particular it doesn't try to preserve entity references; all entities are expanded." As of the February 1998 test release of SP - per an announcement from James Clark for a new test release of SP (version 1.2.92) and Jade (version 1.0.93) - SP includes the SX application. Links:
SAX - the Simple API for XML[CR: 19990226] SAX 1.0 (the Simple API for XML) was released on May 11, 1998. SAX is a common, event-based API for parsing XML documents, developed as a collaborative project of the members of the XML-DEV discussion under the leadership of David Megginson. Relative to the preliminary draft version of SAX released in January 1998, SAX Version 1.0 represents a major reimplementation, adding some important features such as the ability to read documents from byte or character streams. "SAX fills the same role for XML that the JDBC fills for SQL: with SAX, a Java application can work with any XML parser, as long as the parser has a SAX 1.0 driver available. . . The first release of SAX is in Java, but versions in other programming languages may follow. SAX is free for both commercial and non-commercial use." Links:
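The event-based style that SAX defines can be shown with a small sketch. SAX 1.0 as described above is a Java API; the example uses the port of SAX that ships in Python's standard library (xml.sax), so the class and method names are those of the Python binding, and the handler itself is invented for illustration.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Application-side handler: the parser calls back as it meets markup."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per start tag, in document order.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(data: bytes) -> dict:
    """Parse the document and return how often each element name occurs."""
    handler = ElementCounter()
    # parseString uses whichever SAX driver is installed; the handler does
    # not depend on the underlying parser.
    xml.sax.parseString(data, handler)
    return handler.counts
```

Because the application only implements callbacks, the parser behind the driver can be swapped without touching the handler, which is the role the JDBC comparison above describes.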
Docuverse DOM SDK. Previously 'FREE-DOM - W3C DOM API using SAX' and SAXDOM[CR: 19980907] Docuverse DOM SDK is an implementation of W3C Document Object Model (DOM) API in Java. As of Preview Release 2, it includes W3C DOM HTML API support. Has: "Support for W3C Proposed Recommendation for DOM (Core) Level 1, Support for SAX 1.0 compatible XML Parsers, Support for custom node implementations, Full JavaDoc documentation for the DOM API." [Previous description, partially out-of-date, follows:] The FREE-DOM package is from Don Park. A free DOM implementation, formerly called SAXDOM, ". . .supports but is not limited to SAX. Look for AElfred and MSXML support in the near future with expanded DOM spec support (meaning XML and HTML portion of the spec). It can be used with MSXML if you combine two drivers: one for bridging FREE-DOM to SAX and another for bridging SAX to MSXML. A direct driver from FREE-DOM to MSXML as well as other popular parsers is planned." FREE-DOM [SAXDOM], under development by Don Park, is an implementation of W3C Document Object Model (DOM) API using Simple API for XML (SAX). In Java. SAXDOM is in the public domain and can be used for any commercial or non-commercial purpose. "The DOM isn't finished, so any implementation is necessarily tentative. With that warning, however, you can look at http://www.quake.net/~donpark/saxdom.html. The nice thing about Don's work is that SAXDOM will run with any SAX-conformant Java XML parser, so you can use NXP, Lark, MSXML, AElfred, and/or XP, as you wish. Don also includes some information about integrating the DOM with the new, standard Java Swing widgets." [comment from David Megginson, author of SAX, posted to XML-DEV on February 7, 1998] Links:
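The idea of building a DOM on top of SAX events, so that any SAX-conformant parser can feed the tree, can be sketched briefly. This is not FREE-DOM or SAXDOM code (both are Java); it is an illustration using Python's xml.sax and xml.dom.minidom, and the DomBuilder class and build_dom function are hypothetical names.

```python
import xml.sax
from xml.dom import minidom


class DomBuilder(xml.sax.ContentHandler):
    """Turn SAX callbacks into a minidom tree (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.document = minidom.Document()
        self._stack = [self.document]  # open nodes, innermost last

    def startElement(self, name, attrs):
        element = self.document.createElement(name)
        for key, value in attrs.items():
            element.setAttribute(key, value)
        self._stack[-1].appendChild(element)
        self._stack.append(element)

    def endElement(self, name):
        self._stack.pop()

    def characters(self, content):
        # Text outside the root element cannot be attached to a Document.
        if len(self._stack) > 1:
            text = self.document.createTextNode(content)
            self._stack[-1].appendChild(text)


def build_dom(data: bytes):
    """Parse with the installed SAX driver and return the resulting Document."""
    builder = DomBuilder()
    xml.sax.parseString(data, builder)
    return builder.document
```

The builder sees only SAX events, so it is indifferent to which parser produced them; a parser may deliver one run of text in several `characters` calls, so callers should join adjacent text nodes rather than assume a single one.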
Saxon: An Open-Source XSLT Processor[CR: 20020411] Update 2002-04 On 20-December-2001, to coincide with the publication of the first working drafts of XPath 2.0 and XSLT 2.0,Michael Kay [now of Software AG] released version 7.0 of Saxon, as an initial implementation of these drafts. Saxon 6.5 continues to be available as an XSLT 1.0 processor with some extensions based on the abortive XSLT 1.1 working draft. Since its initial release, Saxon has concentrated increasingly on its role as a highly-conformant XSLT processor, known for its good processing speed and for its library of extension functions. Meanwhile its original role as a Java class library has declined, although it can still be used this way. As well as XSLT 1.0 and XPath 1.0 support, Saxon supports the JAXP 1.1 API, and has interfaces with DOM, SAX2, and JDOM. It also integrates with the Apache FOP processor for XSL Formatting Objects." [text supplied by MKay] References:
Earlier references Retained for historical purposes. [2000-01-19] "The SAXON package is a collection of tools for processing XML documents. The main components are: (1) An XSL processor, which implements the Version 1.0 XSLT and XPath Recommendations from the World Wide Web Consortium, found at http://www.w3.org/TR/1999/REC-xslt-19991116 and http://www.w3.org/TR/1999/REC-xpath-19991116 with a number of powerful extensions; (2) A Java library, which supports a similar processing model to XSL, but allows full programming capability, which you need if you want to perform complex processing of the data or to access external services such as a relational database. So you can use SAXON by writing XSL stylesheets, by writing Java applications, or by any combination of the two. If you are only interested in running the XSL interpreter, on a Windows platform, try Instant SAXON. At 250 Kb, this is a much smaller download; it excludes source code and API documentation. SAXON provides a set of services that are particularly useful when converting XML data into other formats. The output format may be XML, or HTML, or some other format such as comma separated values, EDI messages, or data in a relational database. SAXON implements the XSLT recommendation, including XPath, in its entirety. SAXON also does things that are beyond the scope of the XSL standard: for example: (1) It allows XSL processing and Java processing to be freely mixed, so you can always escape into procedural code to do something non-standard (such as accessing a database); (2) It allows multiple output files. SAXON is particularly useful for splitting a large document into page-sized chunks. You can do this without writing any Java code; (3) It allows multi-pass processing, by means of an extension function that converts a result tree fragment to a nodeset, or by chaining stylesheets together; (4) It allows variables to be updated..." 
[January 19, 2000] A posting fromMichael Kay to the XSL-Listannounces the release of SAXON version 5.0, which supports the W3C XSLT and XPath Recommendations published on November 16, 1999. Kay says that theSAXON 5.0 package is now "a complete implementation of XSLT 1.0 and XPath 1.0 - If there are any parts of the spec it doesn't implement, then that's an oversight and will be treated as a bug. Apart from full conformance, the new things in this release include: (1) a number of new extension functions [intersection(), difference() and has-same-nodes() to compare node-sets; line-number() and system-id() of the current node in the source document; if(condition, then, else)]; (2) stylesheet chaining: [specify <saxon:output next-in-chain="phase2.xsl"> to send the output of this stylesheet to be the input to another stylesheet]; (3) user-definable numbering and collating sequences; (4) internal improvements to node-set handling and sorting, which should result in better performance when handling large node-sets, and should certainly reduce the load on the garbage collector." The SAXON package is "a collection of tools for processing XML documents. The main components are: (1) An XSL processor, which implements the Version 1.0 XSLT and XPath Recommendations from the World Wide Web Consortium [...] with a number of powerful extensions; (2) A Java library, which supports a similar processing model to XSL, but allows full programming capability, which you need if you want to perform complex processing of the data or to access external services such as a relational database. So you can use SAXON by writing XSL stylesheets, by writing Java applications, or by any combination of the two. If you are only interested in running the XSL interpreter, on a Windows platform, try Instant SAXON. At 241 Kb, this is a much smaller download; it excludes source code and API documentation. 
SAXON provides a set of services that are particularly useful when converting XML data into other formats. The output format may be XML, or HTML, or some other format such as comma separated values, EDI messages, or data in a relational database. SAXON implements the XSLT recommendation, including XPath, in its entirety. SAXON also does things that are beyond the scope of the XSL standard: for example: (1) It allows XSL processing and Java processing to be freely mixed, so you can always escape into procedural code to do something non-standard (such as accessing a database); (2) It allows multiple output files. SAXON is particularly useful for splitting a large document into page-sized chunks. You can do this without writing any Java code; (3) It allows multi-pass processing, by means of an extension function that converts a result tree fragment to a nodeset, or by chaining stylesheets together; (4) It allows variables to be updated." For related software, see "XSL/XSLT Software Support." [February 16, 1999] Michael H. Kay (ICL) has announced the release of SAXON Version 4.0. "SAXON is a Java library for processing XML documents: it provides a number of services above the SAX and DOM level to make applications easier to write and more modular. The services are particularly useful for applications performing XML-to-XML or XML-to-HTML transformations. SAXON is available as a free download with source code included." Among the 'substantial changes' in the version 4 release of SAXON: 1) 'Improved support for processing using the DOM, in a way that is forward compatible with serial (SAX-based) applications: you can use the same element handlers in both modes; the processing model (selecting an element handler based on a pattern match) is identical to that for XSL; 2) Support for Stylesheets: you can now invoke many of SAXON's capabilities without writing any Java code. 
SAXON Stylesheets support a useful subset of XSL and provide two important additional features: the ability to create multiple output files, and the ability to freely mix XSL and Java code: XSL can be used to process some elements, and Java for others, or you can preprocess the element in Java before rendering it in XSL. Very useful if you are doing more than simple rendering,e.g., if you are loading a relational database.' Links:
XAF - an XML Architectural Forms Processor[CR: 19980610] On June 10, 1998, David Megginson (Megginson Technologies Ltd.) posted an announcement for the beta release ofXAF, an XML Architectural Forms Processor. Accompanying the software package is detailed, tutorial-oriented documentation about XAF and architectural forms (Using the XAF package for Java), appropriate for both XML document designers and XML software designers. According to the announcement, XAF is "a Java-based XML architectural forms processor that acts as both a SAX application and a SAX parser. XAF uses any SAX 1.0-conformant parser to parse an XML document, then masquerades as a SAX parser itself: the client application sees the (virtual) architectural document instead of the actual XML document. Architectural forms are a very powerful markup facility that simplifies embedding multiple structures in a single XML document. They are especially useful for working with XML-related standards like RDF and MathML. You even can use XAF together with Don Park's FREE-DOM to create a DOM of a virtual architectural document." Links: XML Testbed[CR: 19980901] On September 01, 1998, Steve Withall announced the release of an XML application environment written in Java. At the earlier XML Developers' Conference in Montréal,Withall gave a presentation"XXX - eXpandable XML eXploitation" which described a number of design ideas for flexible, expandable applications that manipulate and otherwise exploit XML documents. Thedetails of the Java 'XML Testbed' application used to demonstrate these ideas are nowdocumented online, and thesoftware is available from the W3C web server.Slides from the Montréal presentation are also available. "The software uses an XML configuration file to define the (Swing-based) user interface. 
It includes its own non-validating XML parser (though it can use any SAX parser instead), a nascent XSL engine (to the old/submission standard - just in time to be out of date), and a few other odds and ends. The key feature of the infrastructure is that it is intended to be easily expandable, to allow application-specific functionality to be slotted in dynamically. This is achieved by registering the classes to be instantiated for given named elements, and invoking special behaviour in a generic way by invoking a method called verify() on each element as soon as it has been parsed. The software is freely available for non-commercial use and can be downloaded, with all source code. XML Testbed application is "written in Java, with its own supporting XML infrastructure, including an XML parser and grove. A key feature of the infrastructure is a 'node type registry', which allows dynamic control over which classes are used for particular types of elements - the element class to represent them, the parser class to parse them, the customizer class to edit them and the view class to display them (using a Swing text editor kit). The XML Testbed provides means to edit and then parse an XML source - currently going so far as to highlight the portion of the source at which any error occurs. It also allows the parsed document to be viewed in the form of a tree. The Testbed user interface is implemented using Swing. The software has been designed to be as modular as possible, to be divided into a suite of relatively small packages, each with a clear role. Each usage of XML (using the word 'usage' rather than 'application' to avoid a dual meaning of the latter) is placed in its own additional package. Three such usages are included in this release, demonstrating how to build on the basic infrastructure, and also providing some (limited!) usable functionality. 
These three usages are a nascent XSL engine, XML-based user interface configuration, and a database analyser for generating an XML file of the schema of a database. To run the XML Testbed requires JDK1.1 or above and Swing 1.0.2 or above. These are the only essentials. To parse using SAX requires the installation of the desired parser(s) and their drivers. By default, parsing is performed using the parser in the xe package, which is included in this release." Links:
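The 'node type registry' and the generic verify() call described for the XML Testbed can be illustrated with a minimal sketch. It is written in Python rather than Java, and every name in it (Element, ButtonElement, REGISTRY, register, build) is hypothetical; none is taken from the Testbed source. It only shows the general pattern of registering a class per element name and invoking one generic method as each element finishes parsing.

```python
# Hypothetical sketch of a node type registry; not the XML Testbed API.
import io
import xml.etree.ElementTree as ET


class Element:
    """Default wrapper for element types that have no registered class."""

    def __init__(self, node):
        self.node = node
        self.problems = []

    def verify(self):
        # Generic hook, called once the element has been fully parsed.
        return True


class ButtonElement(Element):
    """Application-specific behaviour for <button> elements (assumed example)."""

    def verify(self):
        if "label" not in self.node.attrib:
            self.problems.append("button is missing a label attribute")
        return not self.problems


REGISTRY = {}  # element name -> class to instantiate


def register(name, cls):
    """Register the class to be instantiated for a named element."""
    REGISTRY[name] = cls


def build(xml_text):
    """Parse xml_text, wrap each element as it ends, and call verify() on it."""
    wrapped = []
    stream = io.BytesIO(xml_text.encode("utf-8"))
    for _event, node in ET.iterparse(stream, events=("end",)):
        cls = REGISTRY.get(node.tag, Element)
        item = cls(node)
        item.verify()
        wrapped.append(item)
    return wrapped


register("button", ButtonElement)
```

Calling build('<ui><button label="OK"/><button/></ui>') wraps both button elements in ButtonElement and the root in the default Element; the second button records one problem. New behaviour is slotted in by registering another class, without touching the parsing loop.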
DAE SDK and DAE Server SDK (Copernican Solutions)[CR: 19980228] The DAE SDK is an "SGML, XML, and DSSSL technology for a Java application environment. . . DAE SDK is an implementation of the DSSSL Developer's Toolkit. Its principal features are support for XML Parsing and Groves, SDQL from Java, DSSSL Formatting, and Scheme scripting. It provides a framework for processing SGML and SGML-related documents with DSSSL and non-DSSSL constructs. The DAE Server is an integration of the DAE SDK into a web server completely written in Java. This integration provides means for the server to manage automatic access to groves and different processors." As of the January 1998 release, the DAE SDK supports: "1) The full DSSSL expression language; 2) A majority of the SDQL procedures; 3) DSSSL style language support; 4) XML Processor for building groves from XML documents; 5) A full API in Java for processing, loading groves, and applying style; 6) A full API in Scheme for processing, loading groves, and writing transformations." Note: earlier XML development tools from Copernican Solutions were released as part of an 'XML Toolkit.' This toolkit (XDK) provided a developer with both light weight and vigorous parsers and APIs for validating, loading, and accessing XML documents" and featured: (1) A light weight Well-Formed XML document parser; (2) A uniform document API based on the DSSSL ISO 10179 standard; (3) Interfaces for loading and accessing XML documents in arbitrary data stores; (4) A validating XML parser for syntax checking an XML document. Programming languages supported: Java and C++." Links:
IBM XML for Java - validating XML processor in Java[CR: 19981009] On February 10, 1998, XML-DEV received an announcement from Kent Tamura (Tokyo Research Laboratory, IBM Japan) for the release of `IBM XML for Java' - a validating XML processor written in Java. 
The processor is said to provide two main functions: 1) Parsing an XML document and construction of a Java object tree, and 2) Generation of an XML document from a Java object tree. The package requires Java 1.1, and may be downloaded from IBM alphaWorks: http://www.alphaworks.ibm.com/formula/xml. According to the developers (apparently: Kent Tamura and Hiroshi Maruyama): "XML for Java is a validating XML parser written in 100% pure Java. The package ( [May 14, 1998] An announcement was posted by TAMURA Kent (Tokyo Research Laboratory, IBM Japan) for an updated version of 'XML for Java'. Among the enhancements: 1) update to support the W3C DOM specification of April 16, 1998; 2) support for SAX Version 1.0; 3) support for UTF-16 encoding; 4) new factories. According to documentation packaged in this release: `XML for Java' is an XML processor written in Java, a library for parsing XML documents and generating XML documents. XML for Java runs on Java 1.1 and Java 1.2 Beta, not Java 1.0. The distribution includes some sample applications: 1) trlx, an XML syntax checker; 2) SiteOutliner, a Java application that scans a Web site and reports its profile in CDF format; 3) CDF Editor, a Java application to edit CDF files; 4) CDF Viewer, an applet that parses CDF files and visualizes their structures by using a tree; 5) Validating Generation sample, which generates a valid element tree according to the specified DTD; 6) XML TreeView. From the June 12, 1998 version README: "XPointer package [June 25, 1998] As of June 23, 1998, IBM's XML for Java, version 1.0.0, has been released with a free commercial license. Previously distributed under a 90-day trial license for commercial purposes, the Java Edition XML parser now allows developers to "use XML, create derivative works, and sell [their] products with IBM's XML parser inside." The IBM parser toolkit is still under development, under the supervision of Kent Tamura and Hiroshi Maruyama (IBM Tokyo Research Laboratory). 
[September 03, 1998] IBM's XML for Java has been updated. It runs on Java 1.1.x, and some samples require Swing 1.0.x. The revision of September 2, 1998 provides support for the W3C DOM specification of 1998-08-18 (Document Object Model (DOM) Level 1 Specification, Version 1.0). It also includes an experimental implementation of the attribute-based namespace working draft (Namespaces in XML, WD-xml-names-19980802); the PI-based namespace support has been removed. [October 09, 1998] The Alphaworks IBM Laboratory has released version 1.1.4 of the IBM XML Parser in Java (October 7, 1998). The new version of XML4J provides support for the REC-DOM-Level-1-19981001 W3C DOM Specification Version 1.0 (1-October-1998), and includes additional support for 18 different EBCDIC encodings; performance has been significantly improved (it runs 'twice as fast' as version 1.0.9), and numerous bugs have been fixed. From Kent Tamura and Hiroshi Maruyama, XML4J "is a validating XML parser written in 100% pure Java. The package ( Links:
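The two functions attributed to the processor, building an object tree from a document and generating a document from an object tree, can be sketched as follows. This is not XML for Java: it uses Python's standard xml.dom.minidom as a stand-in DOM implementation (non-validating, unlike the IBM parser), and the function names and the channel/item vocabulary are assumptions made for illustration only.

```python
# Illustrative stand-in using Python's minidom; not the XML for Java API.
from xml.dom import minidom


def parse_to_tree(xml_text):
    """Function 1: parse an XML document into a DOM object tree."""
    return minidom.parseString(xml_text)


def tree_to_xml(titles):
    """Function 2: build an object tree in memory and serialize it as XML."""
    doc = minidom.Document()
    root = doc.createElement("channel")  # hypothetical element names
    doc.appendChild(root)
    for title in titles:
        item = doc.createElement("item")
        item.appendChild(doc.createTextNode(title))
        root.appendChild(item)
    # Serialize the root element only, so no XML declaration is emitted.
    return root.toxml()
```

For example, parse_to_tree('<channel><item>News</item></channel>').documentElement.tagName gives the root element name, and tree_to_xml(["News", "Sports"]) returns a single-line channel element with two item children.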
JUMBO - XML browser/editor[CR: 19980904] JUMBO (Java Universal Markup Browser for Objects) "is a Java-based browser for XML documents, being developed by Peter Murray-Rust. JUMBO is a set of Java classes for viewing CML (and other XML) applications. It can be used in standalone mode (application), or as applets downloaded from a server to a traditional Java-enabled browser, or locally, within a Java-enabled browser, with the classes under the document tree." [September 04, 1998]Peter Murray-Rust posted an announcement describing the release of the latest snapshot of JUMBO2 (alpha2, version 2A2) and the associated Web site,xml-cml.org. XML-CML atxml-cml.org is the home page of the nascent Chemical Markup Forum, metamorphosing from the Open Molecular Foundation.JUMBO2 is an element-oriented XML-browser, in Java/Swing. It is an application for the demonstration of XML and CML. Its source is freely available with the normal sort of copyright. The architecture tries to follow the specs and anticipate the possible XML-related APIs. JUMBO2 is now offered to the community as a catalyst to spawn the creation of high-quality client-side tools ('browsers'). Ideally we converge towards a set of core APIs and all that remains of my code will be the elephant-specific stuff. I have already started to get some offers of help." [May 28, 1998] Anannouncement was posted byPeter Murray-Rust for the release of JUMBO 2.0 (alpha). "JUMBO 2.0 is a Java-based freeware SAX-compliant XML browser/editor prototyping tool which tracks the emerging XML specs. It is a complete rewrite of JUMBO1 and has new functionality, especially for editing and exploration. JUMBO 2.0 uses the SwingSet (JFC) 1.0.1, with SAX, and your parser(s) of choice. [It] is offered as a collaborative core for Java-XML based projects. . . XML namespaces, XSL, XML-DTDs, XML-LINK, Xpointer etc. will be implemented as soon as the current [W3C] drafts firm up." 
[January 29, 1998]Announcement from Peter Murray-Rust for an alpha "snapshot" (i.e. release) of his Java-based JUMBO tool. David Megginson has added JUMBO to the list of clients supportingSAX: "In Java, there are nowfive XML parsers with SAX support available and four publically-announced SAX clients (that makes twenty possible client-parser combinations, according to my arithmetic)." The documentation from Murray-Rust describes JUMBO as "an element-oriented system for processing XML documents. It can read and parse (with/without additional parsers, with/without the SAX interface). It creates a tree or elements and attributes with various types of content. It also supports processing instructions (PIs) in a generic manner. There is support for namespaces and XSL stylesheets, though JUMBO does not have sophisticated rendering. It has a browsing model based on a tree/TOC model, event streams or customised element display. It supports (SIMPLE) XLL navigation including NEW and REPLACE and most Xpointer syntax. It extends the latter to provide sophisticated search and navigation tools for the document. JUMBO also provides authoring and editing facilities, driven by DTD information where possible. These can be customised to provide novel types of data input other than text. JUMBO is designed to be extended, especially through subclassing or elements, and I hope that a collaborative community (cf. tcl/tk, LaTeX, Linux) will develop for its future support. . . [Among the principal features]: 1) JUMBO is 100% pure Java (1.02) and runs as an applet or application; 2) JUMBO does not knowingly deviate from the X*L specs, apart from known limitations; 3) JUMBO has an elementary XML parser, sufficient for its own configuration files; 4) JUMBO has been developed to be used with the SAX API so that any SAX-J-compliant parser [1998-01-28: AElfred, Lark, MSXML, NXP, (XP not yet done)] can be used at runtime." Seehttp://www.vsms.nottingham.ac.uk/vsms/java/jumbo/jan9801. 
[May 24, 1997] Jumbo is "a prototype XML engine primarily aimed at: (1) Providing a prototyping tool for XML developers; (2) Exploring non-textual uses of XML; (3) Specifically, but not exclusively, supporting Molecular Science; (4) Resolving semantics through hyperlinking to documents or Java methods." "JUMBO is built from components and is not limited in what applications it can be configured for. At present it consists of these parts [described here in abbreviated format; seethe full documentation for updated information] [November 10, 1997]Announcement from Peter Murray-Rust (Virtual School of Molecular Sciences) for updates to JUMBO and CML1.2 (Chemical Markup Language). Links:
LT XML - XML toolset[CR: 19980626] LT XML is issued by The Language Technology Group (Human Communication Research Centre, University of Edinburgh). "LT XML is an integrated set of XML tools and a developer's tool-kit, including a C-based API. It contains everything required to process a very wide range of conformant XML documents. The tools are intended to process all documents which are well formed according to [the XML specification]. Updated June 24, 1998 or later. LT XML is a cut down version of the LT NSL package. LT XML only processes XML files, rather than arbitrary valid SGML files. However, LT XML contains its own XML parser, thus does not require the SP SGML parser." A derived parser under development [in February 1998] is RXP (a non-validating XML parser in C); see below. [June 26, 1998] An announcement was posted by Henry S. Thompson (HCRC Language Technology Group, University of Edinburgh) for the release of LT XML version 1.0. LT XML now meets the requirements for a fully conformant XML processor (per the XML 1.0 specification) and includes support for a wider range of character encodings for input and output (UTF-8, ISO-646, ISO-8859-n, UTF-16 and UCS-2). LT XML is both a set of command-line/console XML applications and a C language library supporting a powerful API for new application development. The new release comes in two versions: 1) a source version for UN*X platforms, with straightforward compilation and installation procedures, and 2) a source plus DLLs and executables version for WIN32 platforms. LT XML is available free for evaluation and non-commercial use. The package includes extensive documentation of the tools and the API, together with detailed examples of how to build your own application using the API. Online documentation in HTML (built using DocBook 3.0) is also available in "The XML Library LT XML version 1.0. User Documentation and Reference Guide." 
The LT XML API allows applications to choose, or even switch, between an event-oriented and a tree-oriented view of XML documents. The functionality of the tools in this release includes 1) Text extraction; 2) Powerful markup-aware 'grep' (search); 3) Down-translation; 4) Tokenisation; 5) Sorting; 6) Transclusion using a subset of XML-link. [adapted from the posting and Web site information] The new release [version 0.9.5, September 01, 1997] of LT XML represents ... a " high-performance publicly available XML toolset written in C. The LT XML tool-kit includes stand-alone tools for a wide range of processing of well-formed XML documents, including searching and extracting, down-translation (e.g., report generation, formatting), tokenising and sorting... [the release] includes executable images for a range of platforms, including Windows 95 and Windows NT, FreeBSD, Linux and Solaris. A preliminary partial Macintosh version is also available. This release is restricted to 8-bit character input/output, and does NOT do validation, although it does process and make use of DTDs in documents which include them... [Tools in the new 0.9.5 release]: (1) sggrep -- extract sub-parts of XML documents, using patterns over element structure and text content; (2) textonly -- extract text content only; (3) sgsort -- reorder sub-elements within specified elements; (4) sgmltrans -- pattern+action downtranslation tool; (5) sgrpg -- Structure-based transformation tool; (6) simple, simpleq -- event- and fragment-based examples of API use." Links:
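The difference between an event-oriented and a tree-oriented view of the same document, which the LT XML API is said to offer, can be sketched with a text-extraction task similar in spirit to the 'textonly' tool. The sketch uses Python's standard xml.sax and xml.etree.ElementTree modules as stand-ins; it does not use or reproduce the LT XML C API, and the function names are hypothetical.

```python
# Two views of one document: event stream versus in-memory tree.
# Stand-in example with Python's standard library; not the LT XML API.
import xml.sax
import xml.etree.ElementTree as ET


class TextCollector(xml.sax.ContentHandler):
    """Event-oriented view: text arrives as callbacks while parsing."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def text_by_events(xml_text):
    handler = TextCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return "".join(handler.chunks)


def text_by_tree(xml_text):
    """Tree-oriented view: parse everything first, then walk the tree."""
    root = ET.fromstring(xml_text)
    return "".join(root.itertext())
```

Both functions return the same text for a given document. The event version never holds the whole document in memory, while the tree version allows arbitrary navigation after parsing, which is the usual trade-off between the two styles.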
Project addresses:
RXP XML parser program[CR: 20000807] [August 07, 2000] See the 'XML well-formedness checker and validator' based upon RXP - interactive and online. "Use this form to check an XML document for well-formedness and (optionally) validity. External entity references are included, even when not validating. If the document is well-formed, the parser outputs the corresponding canonical XML." Validation and namespace processing can be toggled off/on. "XML namespaces don't mesh well with DTD-based validity, so you quite likely won't want to select both validation and namespace processing. Only HTTP URLs are allowed. My HTTP code doesn't understand redirects, so be sure to put a slash on the end of URLs that refer to directories. RXP is licensed under the GNU Public Licence. It may be made available under other licensing terms; contact M.Moens@ed.ac.uk for details." The author provides Win32 binaries as well as source code [cache]. [February 17, 1999] Richard Tobin has announced the release of RXP Version 1.0. RXP is a validating XML parser in C, developed by the Language Technology Group, Human Communication Research Centre, University of Edinburgh. A simple application (called rxp) is provided that parses and writes XML data, optionally expanding entities, defaulting attributes, and translating to a different output encoding. Some command-line options include: insertion of declared default values for omitted attributes; expansion of entity references; printing output as "bits"; XML well-formedness checking mode (vs. validation mode); treating the input as normalized SGML rather than as XML; producing output in the specified character encoding (ISO-8859-1, UTF-8, ISO-10646-UCS, UTF-16); specifying big- or little-endian byte order for 16-bit encoding names. There is an RXP web page at http://www.cogsci.ed.ac.uk/~richard/rxp.html. Bug reports should be sent to richard@cogsci.ed.ac.uk. 
RXP 'is used by the LT XML toolkit, and in the Festival speech synthesis system'; it also supports an online XML checking tool. "Whereas previous versions were available only for individual, research and educational use, this version is licensed under the GNU Public Licence (GPL)." Other XML parsers: see "XML Parsers and Parsing Toolkits." [Earlier description] RXP (version .8, beta-test release, May 26, 1998) is a non-validating XML parser in C. It is maintained by Richard Tobin (Centre for Cognitive Science and the Human Communication Research Centre, Edinburgh). RXP is based on the W3C recommendation of 10th February 1998, and is free for individual, research and educational use and for evaluation. RXP comes with a sample application (parser program, 'rxp') which "reads and parses XML from the (or standard input if none is provided) and writes it to standard output, optionally expanding entities, defaulting attributes, and translating to a different output encoding. [. . .] It can be compiled in 8- or 16-bit character mode. In 8-bit mode, the internal encoding is a superset of ASCII, in which all characters above 0xa0 are treated as name characters. Characters are not translated on input or output. This means that well-formed documents in ASCII and ISO-8859-N should work. In 16-bit mode, the internal encoding is UTF-16 and the supported input encodings are ISO-8859-N (1 <= N <= 9), UTF-16 and UTF-8." Links:
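The two jobs described for the rxp program, checking well-formedness and rewriting a document in a different output encoding, can be sketched as below. The sketch relies on Python's bundled expat and ElementTree modules, which are assumptions standing in for RXP; it does not call RXP, reproduce its command-line options, or perform validation.

```python
# Well-formedness check and re-encoding, sketched with Python's standard
# library (expat and ElementTree) as stand-ins; this is not RXP.
from xml.parsers import expat
import xml.etree.ElementTree as ET


def check_well_formed(data):
    """Return (ok, message) for a document supplied as bytes."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except expat.ExpatError as err:
        reason = expat.ErrorString(err.code)
        return False, f"line {err.lineno}, column {err.offset}: {reason}"
    return True, "well-formed"


def reencode(data, encoding):
    """Parse the document and write it back out in another encoding."""
    root = ET.fromstring(data)
    return ET.tostring(root, encoding=encoding)
```

A mismatched end tag such as b"<a><b></a>" is reported as not well-formed with its line and column, while reencode(b"<a>\xc3\xa9</a>", "iso-8859-1") emits the accented character as a single ISO-8859-1 byte.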
XED - An XML document instance editor[CR: 19980715] [July 15, 1998] Anannouncement from Henry S. Thompson reported on the availability of a new beta-release of the XED "XML document instance editor" from the HCRC Language Technology Group, University of Edinburgh. This new beta-level release of XED has additional features, improved installation packaging for WIN32 platforms, and bug fixes. Upgrades include: 1) refilling of text content and indenting of element content upon request; 2) accented character support [ISO-8859-1]; 3) an experimental file processing facility: processing may be invoked on the file, "and XED will then step you through any validation or application errors which are logged" (e.g., nsgmls and jade). [March 18, 1998] Henry S. Thompson (Language Technology Group, University of Edinburgh) posted an announcement for the availability of an alpha release of 'XED: A smart XML instance editor'. As a WYSIWYG XML instance editor, "XED uses the LT XML toolset integrated with a Python-Tk user interface, to provide a free, cross-platform, well-formedness preserving editor for XML document instances. . . as a text editor for XML document instances, it is designed to support hand-authoring of small-to-medium size XML documents, and is optimised for keyboard input. It works very hard to ensure that you cannot produce a non-well-formed document. Although it neither parses DTDs in detail nor validates, it does keep track of your document structure, and provides context-based accelerators to make element and attribute entry fast and easy. XED keeps track of all the changes you make in your document, so that you can undo changes, as many as you need to, if you make a mistake. This makes it easy to learn . . ." Windows95/NT and Solaris 2.5 binaries are available now [980320]. The author solicitsfeedback from testers for this alpha version of XED. An update notice for thealpha version 0.2.1.4 was posted on April 02, 1998.
Ælfred XML Parser[CR: 19990324] [May 05, 1998]Announcement from David Megginson for the public release of an updated (1.2) version of Microstar's Ælfred XML parser. Ælfred is "a small, fast, DTD-aware Java-based XML parser, especially suitable for use in Java applets, free for both commercial and non-commercial use." User-visible changes in the parser since version 1.1 include: 1) The [March 09, 1998]Announcement from David Megginson (Microstar Software Ltd.) for the version 1.1 release of Microstar's free Java-based XML parser, Ælfred. From the announcement: "Ælfred is a very small, very fast XML parser optimised for use with applets, where Java 1.0.2 compatibility and download time are major requirements. Ælfred is forgiving with some errors, but otherwise supports the entire feature set of the XML 1.0 recommendation including Unicode, defaulted attribute values, external DTD subsets, external entities, and flagging of ignorable whitespace. The distribution also contains a native SAX (Simple API for XML) driver so that you can interchange Ælfred with other SAX-supported parsers without rewriting your code. Version 1.1 introduces a smaller, cleaner interface, together with some important new functionality: 1) the ability to read an XML document from an input stream as well as a URI; 2) a new, optional SAX driver; 3) a new, optional base class for deriving event handlers; 4) a new, optional exception class for reporting parsing errors; 5) use of the HTTP content-encoding parameter, if available; 6) better position-reporting for errors." Seethe Microstar news document for more detailed information on Ælfred 1.1 changes, On December 09, 1997, an announcement was posted byDavid Megginson ofMicrostar Software Ltd. for the availability of a free Java-based XML parser, theAElfred XML Parser. According to the announcement, Microstar has released "Ælfred (AElfred), a small, fast, DTD-aware Java-based XML parser, especially suitable for use in Java applets. 
Ælfred has been designed for Java programmers who want to add XML support to their applets and applications without doubling their size: Ælfred consists of only two class files, with a total size of approximately 24K, and requires very little memory to run. Ælfred also implements Java's [December 11, 1997] 1.0beta3 release: The new version is still interface-compatible with the first two public betas, but it adds the ability to query for content models and enumerated attribute types (both returned as normalised strings, with whitespace removed and parameter entities resolved). With the new query routines, Ælfred is now capable of producing a normalised version of an XML document's DTD; in fact, the distribution now includes a new demonstration class, DtdDemo.java, that does exactly that." Links:
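The point of shipping a SAX driver, that application code written against the SAX interfaces does not change when the underlying parser is swapped, can be sketched as follows. The example uses the SAX binding in Python's standard library rather than the Java SAX 1.0 API or Ælfred itself; ElementCounter and count_elements are hypothetical names.

```python
# Application code that depends only on the SAX handler interface.
# Uses Python's xml.sax as a stand-in; this is not the Ælfred API.
import io
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Counts start tags per element name."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(xml_bytes, reader=None):
    """Count elements using any SAX-conformant reader.

    If no reader is supplied, the default one from make_parser() is used;
    the handler code is identical either way.
    """
    reader = reader or xml.sax.make_parser()
    handler = ElementCounter()
    reader.setContentHandler(handler)
    reader.parse(io.BytesIO(xml_bytes))
    return handler.counts
```

Passing a different reader object as the second argument changes the parser without touching ElementCounter, which is the interchangeability the announcement describes.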
DataChannel XML Development Environment (DXDE)[CR: 19980211] "DXDE, with its first complete roll-out in 1998, will be a collection of XML tools including parsers, viewers, and APIs. We will also supply documentation and tutorials. Primary contributors to DXDE include Norbert Mikula and John Tigue, both XML pioneers and DataChannel's XML experts, as well as other leading XML researchers and developers." As of December 10, 1997, components available included: NXP - Norbert's XML Parser, A demo of the XML viewer, Deployment kit for the XML viewer, and Source code to an XML parser (Pax Syntactica). An addition in 1998 will be an XML Server - "a platform-independent server that supports a database schema for managing and distributing meta-data." [February 11, 1998] See theannouncement fromNorbert Mikula of DataChannel for the availability of a beta version of DXP - DataChannel's XML Parser. According to thepress release, "the DataChannel XML Parser is a Java-based XML parser designed for server side-based XML parsing and integration. It is a redesigned version of NXP (Norbert Mikula's XML Parser), one of the first XML parsers. DXP allows application developers to make their applications XML-aware by providing them with the ability to import XML data into their own data structure. Data can come from a database, the Web, a file, or from a local application -- whatever a URL can address. . . The DataChannel XML Parser is part of the DataChannel XML Developer Toolkit (DXDE), which will be available Q1Y98." Links:
Tcl XML Parsing Package[CR: 19990524] From the Australian National University, a Tcl package has been created for parsing XML documents and DTDs. This package requires Tcl 8.0b1 or a later version. The parser has been tested with a simple DTD and several small document instances; it is said to use "the XML namespace." Links:
XML Editing Mode in PSGML[CR: 19980223] [December 09, 1997] Announcement from David Megginson (Microstar Software Ltd.) for a new public version of the XML patches for Lennart Staflin's PSGML (an SGML mode for Emacs). Availablefrom the author's home page. "These patches allow you to use PSGML in Emacs as a non-validating XML editor: all names will be case-sensitive, many (but not all) forbidden constructions will generate errors, all attribute values will be quoted, and PSGML will use the variant XML delimiters. There are also two changes that are useful for full SGML as well as XML: 1) these patches add support for multiple ATTLIST declarations for the same associated element type; 2) the variable sgml-namecase-general allows you to make element type names, attribute names, and keywords case-sensitive in full SGML as well." [August 09, 1997] Public posting of an announcement from David Megginson (Microstar Software Ltd.) for initial enhancements of PSGML to enable an XML editing mode: ". . . I patched PSGML to add an XML mode that enables XML-specific delimiters, parsing, and error-reporting -- in other words, it's a real, native XML DTD-driven editor." The new code for XML support has not yet been incorporated intothe main psgml distribution, but Megginson is requesting assistance from qualified alpha testers to help debug the code. Please help! The announcement contains a list of currently supported and unsupported XML features. Links:
XSLJ - Jade-compatible XSL-to-DSSSL translator[CR: 19980112] [January 12, 1998] Announcement from Henry S. Thompson (Human Communication Research Centre, University of Edinburgh) for the "final" beta release of XSLJ. XSLJ is an XSL to DSSSL Translator. Specifically, it translates from "the XML style language proposed in 'A Proposal for XSL' to the augmented version of DSSSL which is supported by the test release of JADE. Thus, xslj "translates valid XSL style sheets into valid extended DSSSL style sheets, which can then be used to render XML documents using Jade." The current release from Thompson includes bug fixes and an additional increase in conformance to the W3C proposal (e.g., mixed content is now allowed in style sheet 'actions'). How does xslj compare to Microsoft's new XSL support in MSXSL? According to the xslj documentation, Microsoft's MSXSL "does not support flow-object macros or named styles and supports only the HTML flow-objects, but can therefore be integrated more closely with a browser." Announcement from Henry S. Thompson for the release of an alpha version of xslj, a Jade-compatible XSL-to-DSSSL translator. "XSLJ is a virtually complete implementation of XSL by way of translation into extended DSSSL, as supported by the latest test release of James Clark's DSSSL engine Jade. XSLJ translates valid XSL style sheets into valid extended DSSSL style sheets, which can then be used to render XML documents using Jade. Virtually all of XSL as described in the W3C document 'A Proposal for XSL' is supported, although some minor modifications have been necessitated by the exigencies of implementation, all of which are described in detail in material contained in the release." XSLJ development was supported by the UK Economic and Social Research Council via their support for HCRC and by a grant from Microsoft. See the University of Edinburgh Web site for details: http://www.ltg.ed.ac.uk/~ht/xslj.html. 
"Major XSL features which are [now 971121] supported include: 1) template-based style rules using XML itself as the notation; 2) The pattern language: how to identify elements in style rules; 3) The rendering language: how to describe the desired appearance; 4) The expression language (based on JavaScript): when computation is required; 5) Flow-object macros; 6) Style rules (cascading). [...] XSL specifies two sets of flow objects for expressing the style of desired output: one based on DSSSL and one based on HTML/CSS. Both are supported by xslj. Using the DSSSL flow objects, output using any of the Jade backends is supported, including RTF, TeX and SGML. Using the HTML/CSS flow objects, output is to HTML using the Jade SGML backend." [November 25, 1997] Announcement by Henry S. Thompson (Human Communication Research Centre, University of Edinburgh) for an updated version of the XSL-to-DSSSL translatorxslj. Version 0.3 "includes a number of bug fixes (thanks for reports) and much improved HTML output when the CSS/HTML flow objects are used." docproc - an XML + XSL document processor[CR: 19980318] Under development by Sean Russell (Department of Physics, University of Oregon), "docproc is a software package that provides processing and layout of XML documents based on XSL scripts. docproc is written in pure java, and can be used as a server-side preparser for serving XML documents on the web. . .docproc can be used in two different ways. The first, and ideal, method is to use docproc as a servlet; the other way to use docproc is to call it by hand on documents that you want to reformat." Links:
DTDGenerator - XML DTD Generator[CR: 20000105] "DTDGenerator is a program that takes an XML document as input and produces a Document Type Definition (DTD) as output. The aim of the program is to give you a quick start in writing a DTD. The DTD is one of the many possible DTDs to which the input document conforms. Typically you will want to examine the DTD and edit it to describe your intended documents more precisely. In a few cases you will have to edit the DTD before you can use it. DTDGenerator was written by Michael Kay of ICL. DTDGenerator is now issued as part of the SAXON XSL product. It can be used either by installing SAXON on your own machine, or as a web-based service provided by Paul Tchistopolskii at http://www.pault.com/Xmltube/dtdgen.html. If you use this service, ensure that the XML file you upload contains no references to other local files such as a DTD or an external entity." References:
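The idea of deriving one possible DTD from an instance document can be sketched in a few lines. This is a deliberately naive illustration in Python, not Michael Kay's algorithm and not the DTDGenerator code: generate_dtd is a hypothetical function that records, for each element name, which child elements, text, and attributes were observed, and then emits the loosest declarations that accept them. Namespaces, attribute types, and child ordering are all ignored.

```python
# Naive DTD inference from one instance document.
# Hypothetical sketch; not the DTDGenerator implementation.
import xml.etree.ElementTree as ET


def generate_dtd(xml_text):
    root = ET.fromstring(xml_text)
    children = {}    # element name -> child names, in first-seen order
    has_text = {}    # element name -> True if non-blank text was seen
    attributes = {}  # element name -> attribute names, in first-seen order
    order = []       # element names, in first-seen order

    for elem in root.iter():
        name = elem.tag
        if name not in children:
            children[name] = []
            has_text[name] = False
            attributes[name] = []
            order.append(name)
        # Text directly inside elem: its own text plus the tails of its children.
        texts = [elem.text] + [child.tail for child in elem]
        if any(t and t.strip() for t in texts):
            has_text[name] = True
        for child in elem:
            if child.tag not in children[name]:
                children[name].append(child.tag)
        for attr in elem.attrib:
            if attr not in attributes[name]:
                attributes[name].append(attr)

    lines = []
    for name in order:
        kids = children[name]
        if has_text[name] and kids:
            model = "(" + "|".join(["#PCDATA"] + kids) + ")*"  # mixed content
        elif has_text[name]:
            model = "(#PCDATA)"
        elif kids:
            model = "(" + "|".join(kids) + ")*"
        else:
            model = "EMPTY"
        lines.append(f"<!ELEMENT {name} {model}>")
        for attr in attributes[name]:
            lines.append(f"<!ATTLIST {name} {attr} CDATA #IMPLIED>")
    return "\n".join(lines)
```

As the quoted description says of the real tool, output of this kind is only a starting point: the repeatable-choice content models are far more permissive than a hand-written DTD would normally be, so it would need editing to describe the intended documents precisely.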
Near & Far Designer - DTD Design Tool[CR: 19980511] Near & Far Designer is a visual DTD design tool, especially useful for those who are new to structured information and DTD design. "DTDs can be created and modified graphically without prior knowledge of XML/SGML language syntax. With the intuitive tree representation, a DTD can be created from scratch or imported, reworked and exported as a revised DTD. Structures can be explored to any level of detail. The drag and drop interface makes working with DTDs easy." [adapted 980511] Links:
The Ace Scripting Language[CR: 19980513] Ace is a high-performance, strongly typed language with comprehensive support for the SGML and XML document standards. It features an extensive library of SGML and XML manipulation functions. It is part of the Structured Information Manager (SIM) product range. It is free for use in a non-commercial or commercial application, as long as you do not sell it or include it in a product for sale. "SIM includes a high performance SGML/XML database server for multi-gigabyte databases and a high performance web server. The Ace scripting language is used throughout SIM providing a high degree of configurability." Links:
HXA/HXP - Hubick's XML Analyzer, Parser[CR: 19980723] On July 22, 1998, Chris Hubick posted an announcement for the availability of the beta version of an 'Online XML Analysis Tool'. "HXA - Hubick's XML Analyzer is a [grammar] production based online XML parser/analysis tool. . . it is a pure Java tool built upon a low level XML parser (HXP) which breaks an XML file down into its constituent productions for analysis. HXA allows one to examine the production hierarchy for any character in an XML document or document fragment. For easy reference, HXA also provides links from each production in the analysis to its corresponding section in the XML specification." The XML parser used with HXA is said to be 'not yet' a proper XML parser. Links:
Microsoft XML Notepad[CR: 19980723] On July 22, 1998, Microsoft Corporation released the Beta 1 version of a "Microsoft XML Notepad." The online description says: "Microsoft XML Notepad is a simple prototyping application for HTML authors and developers that enables the rapid building and editing of small sets of XML-based data. With XML Notepad, developers can quickly create XML prototypes in an iterative fashion, using familiar metaphors. XML Notepad offers an intuitive and simple user interface that graphically represents the tree structure of XML data. . . XML Notepad's user interface is simple and intuitive. The XML source is represented graphically. The topmost element is the root element. Every XML file can have only one root element. Elements are represented by either folder icons, if they have dependent structures (for example, attributes or other elements), or by leaf icons if they have no substructures. Attributes are represented by 3-D blocks while text and comments are represented by text icons and exclamation mark icons, respectively. The structure of the data is represented in the left column while the values of the nodes are displayed in the right column."
Interesting features: 1) search and replace of text can be restricted to one or more of 'content, element type names, attribute names, attribute values, and comments'; 2) files for editing can be nominated by system (filename) or URL; 3) drag-and-drop nodes. Links:
xmlproc: A Python XML parser[CR: 19980724] Lars Marius Garshol is developing xmlproc as part of a larger project, "Tools for parsing XML with Python," itself "a part of the ongoing effort to make Python the language of choice for XML processing." "xmlproc is an XML parser written in Python. It is a fairly complete validating parser, but does not do everything required of a validating parser, or even a well-formedness parser. The average user should not run into any omissions, though. Later releases will be more complete. xmlproc can be used both as a command-line parser and as a parser API you can use to write XML applications. xmlproc supports both SGML Open Catalogs and XCatalog 0.1." [Version 0.50, July 18, 1998] Links:
xmlarch.py: An XML architectural forms processor[CR: 19990323] [March 23, 1999] In connection with the release of tmproc, note that Geir Grønmo has released a new version of xmlarch [0.25] - An XML Architectural Forms Processor. "The xmlarch module contains an XML architectural forms processor written in Python. It allows you to process XML architectural forms using any parser that uses the SAX interfaces. The module allow you to process several architectures in one parse-pass. Architectural document events for an architecture can even be broadcasted to multiple DocumentHandlers." [July 24, 1998] Geir Ove Grønmo (STEP Infotek) has announced the 'very early release' of an XML architectural forms processor in Python. xmlarch.py is a module which contains "an XML architectural forms processor written in Python. It allows you to process XML architectural forms using any parser that uses the SAX interfaces. The module allow you to process several architectures in one parse pass. Architectural document events for an architecture can even be broadcasted to multiple DocumentHandlers. (e.g. you can have 2 handlers for the RDF architecture, 3 for the XLink architecture and perhaps one for the HyTime architecture.) The architecture processor uses the SAX DocumentHandler interface which means that you can register the architecture handler (ArchDocHandler) with any SAX 1.0 compliant parser." The online documentation contains two complete examples and links for architectural forms processing in SGML/XML. The author solicits feedback on his software. Links:
DB2XML[CR: 19990301] On March 01, 1999, Volker Turau (Fachhochschule Wiesbaden, Fachbereich Informatik) announced the public release of DB2XML, available now for download. "DB2XML is a tool for transforming relational databases into XML (Extensible Markup Language) documents. It is written in Java. DB2XML provides two main functions: 1) Transforming the results of database queries into XML documents; 2) Providing attributes describing the characteristics of the data. DB2XML comes with an easy to use graphical user interface and accesses databases using JDBC drivers. It requires JDK 1.1 (or higher) and a database with a JDBC driver (or a ODBC driver using the JDBC-ODBC bridge). DB2XML is well documented and can be used freely."
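The two functions attributed to DB2XML above (query results rendered as XML, plus attributes describing the data) can be illustrated with a short sketch. This is not DB2XML's code or output format: DB2XML is a Java tool using JDBC, whereas the sketch below uses Python's standard `sqlite3` module as a stand-in. The function name `query_to_xml`, the element names `result` and `row`, and the `type` and `isNull` attributes are all assumptions made for illustration. It also assumes the column names are valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_to_xml(conn, sql, root_tag="result", row_tag="row"):
    """Run `sql` on `conn` and return the result set as an XML string."""
    cur = conn.execute(sql)
    # Column names come from the cursor metadata (first item of each entry).
    columns = [d[0] for d in cur.description]
    root = ET.Element(root_tag, {"query": sql})
    for record in cur:
        row = ET.SubElement(root, row_tag)
        for name, value in zip(columns, record):
            field = ET.SubElement(row, name)
            # Attributes describe the data: NULL status or the Python type.
            if value is None:
                field.set("isNull", "true")
            else:
                field.set("type", type(value).__name__)
                field.text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE item (name TEXT, price REAL, note TEXT)")
    demo.execute("INSERT INTO item VALUES ('pen', 1.5, NULL)")
    print(query_to_xml(demo, "SELECT name, price, note FROM item"))
```

Keeping the metadata in attributes and the values in element content lets a consumer read the data without a schema while still distinguishing a NULL from an empty string.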
Document URI: http://xml.coverpages.org/publicSW.html - Legal stuff
Robin Cover, Editor: robin@oasis-open.org