Files correlati : cg0.exe cg0700a.msk cg0700b.msk cg3.exe cg4.exe Bug : Commento: Merge 1.0 libraries
662 lines
28 KiB
XML
662 lines
28 KiB
XML
<?xml version="1.0" encoding="iso-8859-2"?>
|
||
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
|
||
"http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd">
|
||
|
||
<article id="libxslt">
|
||
<title>libxslt: An Extended Tutorial</title>
|
||
|
||
<articleinfo>
|
||
<author><firstname>Panos</firstname><surname>Louridas</surname></author>
|
||
<copyright>
|
||
<year>2004</year>
|
||
<holder>Panagiotis Louridas</holder>
|
||
</copyright>
|
||
<legalnotice>
|
||
<para>Permission is hereby granted, free of charge, to
|
||
any person obtaining a copy of this software and associated
|
||
documentation files (the "Software"), to deal in the Software
|
||
without restriction, including without limitation the rights to use,
|
||
copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||
copies of the Software, and to permit persons to whom the Software
|
||
is furnished to do so, subject to the following conditions:
|
||
</para>
|
||
|
||
<para>The above copyright notice and this permission notice shall be
|
||
included in all copies or substantial portions of the Software.
|
||
</para>
|
||
|
||
<para>THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
||
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
|
||
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
|
||
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
||
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
|
||
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.</para>
|
||
|
||
</legalnotice>
|
||
</articleinfo>
|
||
|
||
<sect1><title>Introduction</title>
|
||
|
||
<para>The Extensible Stylesheet Language Transformations (XSLT)
|
||
specification defines an XML template language for transforming XML
|
||
documents. An XSLT engine reads an XSLT file and an XML document and
|
||
transforms the document accordingly.</para>
|
||
|
||
<para>We want to perform a series of XSLT transformations to a series
|
||
of documents. An obvious solution is to use the operating system's
|
||
pipe mechanism and start a series of transformation processes, each
|
||
one taking as input the output of the previous transformation. It
|
||
would be interesting, though, and perhaps more efficient if we could
|
||
do our job within a single process.</para>
|
||
|
||
<para>libxslt is a library for doing XSLT transformations. It is built
|
||
on libxml, which is a library for handling XML documents. libxml and
|
||
libxslt are used by the GNOME project. Although developed in the
|
||
*NIX world, both libxml and libxslt have been
|
||
ported to the MS-Windows platform. In principle an application using
|
||
libxslt should be easily portable between the two systems. In
|
||
practice, however, there arise various wrinkles. These do not have
|
||
anything to do with libxml or libxslt per se, but rather with the
|
||
different compilation and linking procedures of each system.</para>
|
||
|
||
<para>The presented solution is an extension of <ulink
|
||
url="https://gnome.pages.gitlab.gnome.org/libxslt/tutorial/libxslttutorial.html">John
|
||
Fleck's libxslt tutorial</ulink>, but the present tutorial tries to be
|
||
self-contained. It develops a minimal libxslt application
|
||
(libxslt_pipes) that can perform a series of transformations to a
|
||
series of files in a pipe-like manner. An invocation might be:</para>
|
||
|
||
<para>
|
||
<userinput>
|
||
libxslt_pipes --out results.xml foo.xsl bar.xsl doc1.xml doc2.xml
|
||
</userinput>
|
||
</para>
|
||
|
||
<para>The <filename>foo.xsl</filename> stylesheet will be applied to
|
||
<filename> doc1.xml</filename> and the <filename>bar.xsl</filename>
|
||
stylesheet will be applied to the resulting document; then the two
|
||
stylesheets will be applied in the same sequence to
|
||
<filename>bar.xsl</filename>. The results are sent to
|
||
<filename>results.xml</filename> (if no output is specified they are
|
||
sent to standard output).</para>
|
||
|
||
<para>The application is compiled in both *NIX
|
||
systems and MS-Windows, where by *NIX systems we
|
||
mean Linux, BSD, and other members of the
|
||
family. The gcc suite is used in the *NIX platform
|
||
and the Microsoft compiler and linker are used in the
|
||
MS-Windows platform.</para>
|
||
|
||
</sect1>
|
||
|
||
<sect1><title>Setting the Scene</title>
|
||
|
||
<para>
|
||
We need to include the necessary libraries:
|
||
|
||
<programlisting>
|
||
<![CDATA[
|
||
#include <stdio.h>
|
||
#include <string.h>
|
||
#include <stdlib.h>
|
||
|
||
#include <libxslt/transform.h>
|
||
#include <libxslt/xsltutils.h>
|
||
]]>
|
||
</programlisting>
|
||
</para>
|
||
|
||
<para>The first group of include directives includes general C
|
||
libraries. The libraries we need to make libxslt work are in the
|
||
second group. The <filename>transform.h</filename> header file
|
||
declares the API that does the bulk of the actual processing. The
|
||
<filename>xsltutils.h</filename> header file declares the API for some
|
||
generic utility functions of the XSLT engine; among other things,
|
||
saving to a file, which is what we need it for.</para>
|
||
|
||
<para>
|
||
If our input files contain entities through external subsets, we need
|
||
to tell libxslt to load them. The global variable
|
||
<function>xmlLoadExtDtdDefaultValue</function>, defined in
|
||
<filename>libxml/globals.h</filename>, is responsible for that. As the
|
||
variable is defined outside our program we must specify external
|
||
linkage:
|
||
<programlisting>
|
||
extern int xmlLoadExtDtdDefaultValue;
|
||
</programlisting>
|
||
</para>
|
||
|
||
<para>
|
||
The program is called from the command line. We anticipate that the
|
||
user may not call it the right way, so we define a function for
|
||
describing its usage:
|
||
<programlisting>
|
||
static void usage(const char *name) {
|
||
printf("Usage: %s [options] stylesheet [stylesheet ...] file [file ...]\n",
|
||
name);
|
||
printf(" --out file: send output to file\n");
|
||
printf(" --param name value: pass a (parameter,value) pair\n");
|
||
}
|
||
</programlisting>
|
||
</para>
|
||
</sect1>
|
||
|
||
<sect1><title>Program Start</title>
|
||
|
||
<para>We need to define a few variables that are used throughout the
|
||
program:
|
||
<programlisting>
|
||
int main(int argc, char **argv) {
|
||
int arg_indx;
|
||
const char *params[16 + 1];
|
||
int params_indx = 0;
|
||
int stylesheet_indx = 0;
|
||
int file_indx = 0;
|
||
int i, j, k;
|
||
FILE *output_file = stdout;
|
||
xsltStylesheetPtr *stylesheets =
|
||
(xsltStylesheetPtr *) calloc(argc, sizeof(xsltStylesheetPtr));
|
||
xmlDocPtr *files = (xmlDocPtr *) calloc(argc, sizeof(xmlDocPtr));
|
||
int return_value = 0;
|
||
</programlisting>
|
||
</para>
|
||
|
||
<para>The <varname>arg_indx</varname> integer is an index used to
|
||
iterate over the program arguments. The <varname>params</varname>
|
||
string array is used to collect the XSLT parameters. In XSLT,
|
||
additional information may be passed to the processor via
|
||
parameters. The user of the program specifies these in key-value pairs
|
||
in the command line following the <userinput>--param</userinput>
|
||
command line argument. We accept up to 8 such key-value pairs, which
|
||
we track with the <varname>params_indx</varname> integer. libxslt
|
||
expects the parameters array to be null-terminated, so we have to
|
||
allocate one extra place (16 + 1) for it. The
|
||
<varname>file_indx</varname> is an index to iterate over the files to
|
||
be processed. The <varname>i</varname>, <varname>j</varname>,
|
||
<varname>k</varname> integers are additional indices for iteration
|
||
purposes, and <varname>return_value</varname> is the value the program
|
||
returns to the operating system. We expect the result of the
|
||
transformation to be the standard output in most cases, but the user
|
||
may wish otherwise via the <option>--out</option> command line
|
||
option, so we need to keep track of the situation with the
|
||
<varname>output_file</varname> file pointer.</para>
|
||
|
||
<para>In libxslt, XSLT stylesheets are internally stored in
|
||
<structname>xsltStylesheet</structname> structures; similarly, in
|
||
libxml XML documents are stored in <structname>xmlDoc</structname>
|
||
structures. <type>xsltStylesheetPtr</type> and <type>xmlDocPtr</type>
|
||
are simply typedefs of pointers to them. The user may specify any
|
||
number of stylesheets that will be applied to the documents one after
|
||
the other. To save time we parse the stylesheets and the documents as
|
||
we read them from the command line and keep the parsed representation
|
||
of them. The parsed results are kept in arrays. These are dynamically
|
||
allocated and sized to the number of arguments; this wastes some
|
||
space, but not much (the size of <type>xmlStyleSheetPtr</type> and
|
||
<type>xmlDocPtr</type> is the size of a pointer) and simplifies code
|
||
later on. The array memory is allocated with
|
||
<function>calloc</function> to ensure contents are initialised to
|
||
zero.
|
||
</para>
|
||
|
||
</sect1>
|
||
|
||
<sect1><title>Arguments Collection</title>
|
||
|
||
<para>If the program gets no arguments at all, we print the usage
|
||
description, set the program return value to 1 and exit. Instead of
|
||
returning directly we go to (literally) to the end of the program text
|
||
where some housekeeping takes place.</para>
|
||
|
||
<para>
|
||
<programlisting>
|
||
<![CDATA[
|
||
if (argc <= 1) {
|
||
usage(argv[0]);
|
||
return_value = 1;
|
||
goto finish;
|
||
}
|
||
|
||
/* Collect arguments */
|
||
for (arg_indx = 1; arg_indx < argc; arg_indx++) {
|
||
if (argv[arg_indx][0] != '-')
|
||
break;
|
||
if ((!strcmp(argv[arg_indx], "-param"))
|
||
|| (!strcmp(argv[arg_indx], "--param"))) {
|
||
arg_indx++;
|
||
params[params_indx++] = argv[arg_indx++];
|
||
params[params_indx++] = argv[arg_indx];
|
||
if (params_indx >= 16) {
|
||
fprintf(stderr, "too many params\n");
|
||
return_value = 1;
|
||
goto finish;
|
||
}
|
||
} else if ((!strcmp(argv[arg_indx], "-o"))
|
||
|| (!strcmp(argv[arg_indx], "--out"))) {
|
||
arg_indx++;
|
||
output_file = fopen(argv[arg_indx], "w");
|
||
} else {
|
||
fprintf(stderr, "Unknown option %s\n", argv[arg_indx]);
|
||
usage(argv[0]);
|
||
return_value = 1;
|
||
goto finish;
|
||
}
|
||
}
|
||
params[params_indx] = 0;
|
||
]]>
|
||
</programlisting>
|
||
</para>
|
||
|
||
<para>If the user passes arguments we have to collect them. This is a
|
||
matter of iterating over the program argument list while we encounter
|
||
arguments starting with a dash. The XSLT parameters are put into the
|
||
<varname>params</varname> array and the <varname>output_file</varname>
|
||
is set to the user request, if any. After processing all the parameter
|
||
key-value pairs we set the last element of the <varname>params</varname>
|
||
array to null.
|
||
</para>
|
||
</sect1>
|
||
|
||
<sect1><title>Parsing</title>
|
||
|
||
<para>The rest of the argument list is taken to be stylesheets and
|
||
files to be transformed. Stylesheets are identified by their suffix,
|
||
which is expected to be xsl (case sensitive). All other files are
|
||
assumed to be XML documents, regardless of suffix.</para>
|
||
|
||
<para>
|
||
<programlisting>
|
||
<![CDATA[
|
||
/* Collect and parse stylesheets and files to be transformed */
|
||
for (; arg_indx < argc; arg_indx++) {
|
||
char *argument =
|
||
(char *) malloc(sizeof(char) * (strlen(argv[arg_indx]) + 1));
|
||
strcpy(argument, argv[arg_indx]);
|
||
if (strtok(argument, ".")) {
|
||
char *suffix = strtok(0, ".");
|
||
if (suffix && !strcmp(suffix, "xsl")) {
|
||
stylesheets[stylesheet_indx++] =
|
||
xsltParseStylesheetFile((const xmlChar *)argv[arg_indx]);;
|
||
} else {
|
||
files[file_indx++] = xmlParseFile(argv[arg_indx]);
|
||
}
|
||
} else {
|
||
files[file_indx++] = xmlParseFile(argv[arg_indx]);
|
||
}
|
||
free(argument);
|
||
}
|
||
]]>
|
||
</programlisting>
|
||
</para>
|
||
|
||
<para>Stylesheets are parsed using the
|
||
<function>xsltParseStylesheetFile</function>
|
||
function. <function>xsltParseStylesheetFile</function> takes as
|
||
argument a pointer to an <type>xmlChar</type>, a typedef of an
|
||
unsigned char; in effect, the filename of the stylesheet. The
|
||
resulting <type>xsltStylesheetPtr</type> is placed in the
|
||
<varname>stylesheets</varname> array. In the same vein, XML files are
|
||
parsed using the <function>xmlParseFile</function> function that takes
|
||
as argument the file's name; the resulting <type>xmlDocPtr</type> is
|
||
placed in the <varname>files</varname> array.
|
||
</para>
|
||
|
||
</sect1>
|
||
|
||
<sect1><title>File Processing</title>
|
||
|
||
<para>All stylesheets are applied to each file one after the
|
||
other. Stylesheets are applied with the
|
||
<function>xsltApplyStylesheet</function> function that takes as
|
||
argument the stylesheet to be applied, the file to be transformed and
|
||
any parameters we have collected. The in-memory representation of an
|
||
XML document takes space, which we free using the
|
||
<function>xmlFreeDoc</function> function. The file is then saved to the
|
||
specified output.</para>
|
||
|
||
<para>
|
||
<programlisting>
|
||
<![CDATA[
|
||
/* Process files */
|
||
for (i = 0; files[i]; i++) {
|
||
doc = files[i];
|
||
res = doc;
|
||
for (j = 0; stylesheets[j]; j++) {
|
||
res = xsltApplyStylesheet(stylesheets[j], doc, params);
|
||
xmlFreeDoc(doc);
|
||
doc = res;
|
||
}
|
||
|
||
if (stylesheets[0]) {
|
||
xsltSaveResultToFile(output_file, res, stylesheets[j-1]);
|
||
} else {
|
||
xmlDocDump(output_file, res);
|
||
}
|
||
xmlFreeDoc(res);
|
||
}
|
||
|
||
fclose(output_file);
|
||
|
||
for (k = 0; stylesheets[k]; k++) {
|
||
xsltFreeStylesheet(stylesheets[k]);
|
||
}
|
||
|
||
xsltCleanupGlobals();
|
||
xmlCleanupParser();
|
||
|
||
finish:
|
||
free(stylesheets);
|
||
free(files);
|
||
return(return_value);
|
||
]]>
|
||
</programlisting>
|
||
</para>
|
||
|
||
<para>To output an XML document we have in memory we use the
|
||
<function>xlstSaveResultToFile</function> function, where we specify
|
||
the destination, the document and the stylesheet that has been applied
|
||
to it. The stylesheet is required so that output-related information
|
||
contained in the stylesheet, such as the encoding to be used, is used
|
||
in output. If no transformation has taken place, which will happen
|
||
when the user specifies no stylesheets at all in the command line, we
|
||
use the <function>xmlDocDump</function> libxml function that saves the
|
||
source document to the file without further ado.</para>
|
||
|
||
<para>As parsed stylesheets take up space in memory, we take care to
|
||
free that memory after use with a call to
|
||
<function>xmlFreeStyleSheet</function>. When all work is done, we
|
||
clean up all global variables used by the XSLT library using
|
||
<function>xsltCleanupGlobals</function>. Likewise, all global memory
|
||
allocated for the XML parser is reclaimed by a call to
|
||
<function>xmlCleanupParser</function>. Before returning we deallocate
|
||
the memory allocated for the holding the pointers to the XML documents
|
||
and stylesheets.</para>
|
||
|
||
</sect1>
|
||
|
||
<sect1><title>*NIX Compiling and Linking</title>
|
||
|
||
<para>Compiling and linking in a *NIX environment
|
||
is easy, as the required libraries are almost certain to be already in
|
||
place (remember that libxml and libxslt are used by the GNOME project,
|
||
so they are present in most installations). The program can be
|
||
dynamically linked so that its footprint is minimized, or statically
|
||
linked, so that it stands by itself, carrying all required code.</para>
|
||
|
||
<para>For dynamic linking the following one liner will do:</para>
|
||
|
||
<para>
|
||
<userinput>gcc -o libxslt_pipes -Wall -I/usr/include/libxml2 -lxslt
|
||
-lxml2 -L/usr/lib libxslt_pipes.c</userinput>
|
||
</para>
|
||
|
||
<para>We assume that the necessary header files are in <filename
|
||
class="directory">/usr/include/libxml2</filename> and that the
|
||
required libraries (<filename>libxslt.so</filename>,
|
||
<filename>libxml2.so</filename>) are in <filename
|
||
class="directory">/usr/lib</filename>.</para>
|
||
|
||
<para>In general, a program may need to link to additional libraries,
|
||
depending on the processing it actually performs. A good way to start
|
||
is to use the <command>xslt-config</command> script. The
|
||
<option>--help</option> option displays usage
|
||
information. Running</para>
|
||
|
||
<para>
|
||
<userinput>
|
||
xslt-config --cflags
|
||
</userinput>
|
||
</para>
|
||
|
||
<para>we get compile flags, while running</para>
|
||
|
||
<para>
|
||
<userinput>
|
||
xslt-config --libs
|
||
</userinput>
|
||
</para>
|
||
|
||
<para>we get the library settings for the linker.</para>
|
||
|
||
<para>For static linking we must list more libraries than we did for
|
||
dynamic linking, as the libraries on which the libxsl and libxslt
|
||
libraries depend are also needed. Using <command>xslt-config</command>
|
||
on a particular installation we create the following one-liner:</para>
|
||
|
||
<para>
|
||
<userinput>
|
||
gcc -o libxslt_pipes -Wall -I/usr/include/libxml2 libxslt_pipes.c
|
||
-static -L/usr/lib -lxslt -lxml2 -lz -lpthread -lm
|
||
</userinput>
|
||
</para>
|
||
|
||
<para>If we get warnings to the effect that some function in
|
||
statically linked applications requires at runtime the shared
|
||
libraries used from the glibc version used for linking, that means
|
||
that the binary is not completely static. Although we statically
|
||
linked against the GNU C runtime library glibc, glibc uses external
|
||
libraries to perform some of its functions. Same version libraries
|
||
must be present on the system we want the application to run. One way
|
||
to avoid this it to use an alternative C runtime, for example <ulink
|
||
url="http://www.uclibc.org">uClibc</ulink>, which requires obtaining
|
||
and building a uClibc toolchain first (if the reason for trying to get
|
||
a statically linked version of the program is to embed it somewhere,
|
||
using uClibc might be a good idea anyway).
|
||
</para>
|
||
|
||
</sect1>
|
||
|
||
<sect1 id="windows-build"><title>MS-Windows Compiling and
|
||
Linking</title>
|
||
|
||
<para>Compiling and linking in MS-Windows requires
|
||
some attention. First, the MS-Windows ports must be
|
||
downloaded and installed in the programming workstation. The ports are
|
||
available in <ulink url="http://www.zlatkovic.com/libxml.en.html">Igor
|
||
Zlatkovi<EFBFBD>'s site</ulink>. We need the ports for iconv, zlib, libxml,
|
||
and libxslt. In contrast to *NIX environments, we
|
||
cannot assume that the libraries needed will be present in other
|
||
computers where the program will be used. One solution is to
|
||
distribute the program along with the necessary dynamic
|
||
libraries. Another solution is to statically link the program so that
|
||
only a single executable file will have to be distributed.</para>
|
||
|
||
<para>We assume that we have decompressed the downloaded ports and
|
||
have placed the required contents of their <filename
|
||
class="directory">include</filename> directories in an <filename
|
||
class="directory">include</filename> directory in our file system. The
|
||
required contents include everything apart from the <filename
|
||
class="directory">libexslt</filename> directory of the libxslt port,
|
||
as we are not using EXLST (an initiative to provide extensions to
|
||
XSLT) in this project. In order to compile the program we have to make
|
||
sure that all necessary header files are included. When using the
|
||
Microsoft compiler this translates to adding the required
|
||
<option>/I</option> switches in the command line. If using a Visual
|
||
Studio product the same effect is attained by specifying additional
|
||
include directories in the compilation options. In the end, if the
|
||
headers have been copied in <filename
|
||
class="directory">C:\include</filename> the command line must contain
|
||
<option>/I"C:\include" /I"C:\include\libslt"
|
||
/I"C:\include\libxml"</option>.</para>
|
||
|
||
<para>This being a C program, it needs to be compiled against an
|
||
implementation of the C libraries. Microsoft provides various
|
||
implementations. The ports, however, have been compiled against the
|
||
<filename>msvcrt.dll</filename> implementation, so it is wise to use
|
||
the same runtime in our project, lest we wish to come against
|
||
unexpected runtime crashes. The <filename>msvcrt.dll</filename> is a
|
||
multi-threaded implementation and is specified by giving
|
||
<option>/MD</option> as a compiler option. Unfortunately, the
|
||
correspondence between the <option>/MD</option> switch and
|
||
<filename>msvcrt.dll</filename> breaks after version 6 of the
|
||
Microsoft compiler. In version 7 and later (i.e., Visual Studio .NET),
|
||
<option>/MD</option> links against a different DLL; in version 7.1
|
||
this is <filename>msvcrt71.dll</filename>. The end result of this bit
|
||
of esoterica is that if you try to dynamically link your application
|
||
with a compiler whose version is greater than 6, your program is
|
||
likely to crash unexpectedly. Alternatively, you may wish to compile
|
||
all iconv, zlib, libxml and libxslt yourself, using the new runtime
|
||
library. This is not a tall order, and some details are given
|
||
<link linkend="windows-ports-build">below</link>.</para>
|
||
|
||
<para>There are three kinds of libraries in MS-Windows. Dynamically
|
||
Linked Libraries (DLLs), like <filename>msvcrt.dll</filename> we met
|
||
above, are used for dynamic linking; an application links to them at
|
||
runtime, so the application does not include the code contained in
|
||
them. Static libraries are used for static linking; an application
|
||
adds the libraries' code to its own code at link time. Import
|
||
libraries are used when building an application that uses DLLs. For
|
||
the application to be built, the linker must somehow find the
|
||
definitions of the functions that will be provided in runtime by the
|
||
DLLs, otherwise it will complain about unresolved references. Import
|
||
libraries contain function stubs that, for each DLL function we want
|
||
to call, know where to look for it in the DLL. In essence, in order to
|
||
use a DLL we must link against its corresponding import library. DLLs
|
||
have a <filename>.dll</filename> suffix; static and import libraries
|
||
both have a <filename>.lib</filename> suffix. In the MS-Windows ports
|
||
of libxml and libxslt static libraries are distinguished by their name
|
||
ending in <filename>_a.lib</filename>, while in the zlib port the
|
||
import library is <filename>zdll.lib</filename> and the static library
|
||
is <filename>zlib.lib</filename>. In what follows we assume we have a
|
||
<filename class="directory">lib</filename> directory in our filesystem
|
||
where we place the libraries we need for linking.</para>
|
||
|
||
<para>If we want to link dynamically we must make sure the <filename
|
||
class="directory">lib</filename> directory contains
|
||
<filename>iconv.lib</filename>, <filename>libxslt.lib</filename>,
|
||
<filename>libxml2.lib</filename>, and
|
||
<filename>zdll.lib</filename>. When using the Microsoft linker this
|
||
translates to adding the required <option>/LIBPATH</option>
|
||
switch and the necessary libraries in the command line. In Visual
|
||
Studio we must specify an additional library directory for <filename
|
||
class="directory">lib</filename> and put the necessary libraries in
|
||
the additional dependencies. In the end, the command line must include
|
||
<option>/LIBPATH:"C:\lib" "lib\iconv.lib" "lib\libxslt.lib"
|
||
"lib\libxml2.lib" "lib\zdll.lib"</option>, provided the libraries'
|
||
directory is <filename class="directory">C:\lib</filename>. In order
|
||
for the resulting executable to run, the ports DLLs must be present;
|
||
one way is to place all DLLs contained in the ports in the home
|
||
directory of our application, and make sure they are distributed
|
||
together.</para>
|
||
|
||
<para>If we want to link statically we must make sure the <filename
|
||
class="directory">lib</filename> directory contains
|
||
<filename>iconv_a.lib</filename>, <filename>libxslt_a.lib</filename>,
|
||
<filename>libxml2_a.lib</filename>, and
|
||
<filename>zlib.lib</filename>. Adding <filename
|
||
class="directory">lib</filename> as a library directory and putting
|
||
the necessary libraries in the additional dependencies, we get a
|
||
command line that should include <option>/LIBPATH:"C:\lib"
|
||
"lib\iconv_a.lib" "lib\libxslt_a.lib" "lib\libxml2_a.lib"
|
||
"lib\zlib.lib"</option>. The resulting executable is much bigger
|
||
than if we linked dynamically; it is, however, self-contained and can
|
||
be distributed more easily, in theory at least. In practice, however,
|
||
the executable is not completely static. We saw that the ports are
|
||
compiled against <filename>msvcrt.dll</filename>, so the program does
|
||
require that DLL at runtime. Moreover, since when using a version of
|
||
Microsoft developer tools with a version number greater than 6, we are
|
||
no longer using <filename>msvcrt.dll</filename>, but another runtime
|
||
like <filename>msvcrt71.dll</filename>, and we then need that DLL. In
|
||
contrast to <filename>msvcrt.dll</filename> it may not be present on
|
||
the target computer, so we may have to copy it along.</para>
|
||
|
||
<sect2 id="windows-ports-build"><title>Building the Ports in
|
||
MS-Windows</title>
|
||
|
||
<para>The source code of the ports is readily available on the web,
|
||
one has to check the ports sites. Each port can be built without
|
||
problems in an MS-Windows environment using Microsoft development
|
||
tools. The necessary command line tools (compiler, linker,
|
||
<command>nmake</command>) must be available. This means running a
|
||
batch file called <command>vcvars32.bat</command> that comes with
|
||
Visual Studio (its exact location in the directory tree may vary
|
||
depending on the version of Visual Studio, but a file search will find
|
||
it anyway). Makefiles for the Microsoft tools are found in all
|
||
ports. They are distinguished by their suffix, e.g.,
|
||
<filename>Makefile.msvc</filename> or
|
||
<filename>Makefile.msc</filename>. To build zlib it suffices to run
|
||
<command>nmake</command> against <filename>Makefile.msc</filename>
|
||
(i.e., with the <option>/F</option> option); similarly, to build
|
||
<filename>iconv</filename> it suffices to run <command>nmake</command>
|
||
against <filename>Makefile.msvc</filename>. Building libxml and
|
||
libxslt requires an extra configuration step; we must run the
|
||
<filename>configure.js</filename> configuration script with the
|
||
<command>cscript</command> command. <filename>configure.js</filename>
|
||
is found in the <filename class="directory">win32</filename> directory
|
||
in the distributions. It is written in JScript, Microsoft's
|
||
implementation of the ECMA 262 language specification (ECMAScript
|
||
Edition 3), a JavaScript offspring. The configuration string takes a
|
||
number of parameters detailing our environment and needs;
|
||
<userinput>cscript configure.js help</userinput> documents
|
||
them.</para>
|
||
|
||
<para>It is wise to read all documentation files in the source
|
||
distributions before starting; moreover, pay attention to the
|
||
dependencies between the ports. If we configure libxml and libxslt to
|
||
use iconv and zlib we must build these two first and make sure their
|
||
headers and libraries can be found by the compiler and the
|
||
linker when building libxml and libxslt.</para>
|
||
|
||
</sect2>
|
||
|
||
</sect1>
|
||
|
||
<sect1><title>zlib, iconv and All That</title>
|
||
|
||
<para>We saw that libxml and libxslt depend on various other
|
||
libraries, for instance zlib, iconv, and so forth. Taking a look into
|
||
them gives us clues on the capabilities of libxml and libxslt.</para>
|
||
|
||
<para><ulink url="http://www.zlib.org">zlib</ulink> is a free general
|
||
purpose lossless data compression library. It is a venerable
|
||
workhorse; more than <ulink
|
||
url="http://www.gzip.org/zlib/apps.html">500 applications</ulink>
|
||
(both commercial and open source) seem to use the library. libxml uses
|
||
zlib so that it can read from or write to compressed files
|
||
directly. The <function>xmlParseFile</function> function can
|
||
transparently parse a compressed document to produce an
|
||
<structname>xmlDoc</structname>. If we want to create a compressed
|
||
document with libxml we can use an
|
||
<structname>xmlTextWriterPtr</structname> (obtained through
|
||
<function>xmlNewTextWriterDoc</function>), or another related
|
||
structure from <filename>libxml/xmlwriter.h</filename>, with
|
||
compression enabled.</para>
|
||
|
||
<para>XML allows documents to use a variety of different character
|
||
encodings. <ulink
|
||
url="http://www.gnu.org/software/libiconv">iconv</ulink> is a free
|
||
library for converting between different character encodings. libxml
|
||
provides a set of default converters for some encodings: UTF-8, UTF-16
|
||
(little endian and big endian), ISO-8859-1, ASCII, and HTML (a
|
||
specific handler for the conversion of UTF-8 to ASCII with HTML
|
||
predefined entities like &copy; for the copyright sign). However,
|
||
when compiled with iconv support, libxml and libxslt can handle the
|
||
full range of encodings provided by iconv; these should cover most
|
||
needs.</para>
|
||
|
||
<para>libxml and libxslt can be used in multi-threaded
|
||
applications. In MS-Windows they are linked against
|
||
<filename>MSVCRT.DLL</filename> (or one of its descendants, as we saw
|
||
<link linkend="windows-build">above</link>). In *NIX the pthreads
|
||
(POSIX threads) library is used.</para>
|
||
|
||
</sect1>
|
||
|
||
<sect1><title>The Complete Program</title>
|
||
|
||
<para>
|
||
The complete program listing is given below. The program is also
|
||
<ulink url="libxslt_pipes.c">available online</ulink>.
|
||
</para>
|
||
|
||
<para>
|
||
<programlisting>
|
||
<xi:include href="libxslt_pipes.c" parse="text"
|
||
xmlns:xi="http://www.w3.org/2003/XInclude"/>
|
||
</programlisting>
|
||
</para>
|
||
|
||
</sect1>
|
||
|
||
</article>
|