Diffstat (limited to 'doc/xsd-epilogue.xhtml')
| -rw-r--r-- | doc/xsd-epilogue.xhtml | 429 | 
1 files changed, 429 insertions, 0 deletions
| diff --git a/doc/xsd-epilogue.xhtml b/doc/xsd-epilogue.xhtml new file mode 100644 index 0000000..178cf8b --- /dev/null +++ b/doc/xsd-epilogue.xhtml @@ -0,0 +1,429 @@ +  <h1>NAMING CONVENTION</h1> + +  <p>The compiler can be instructed to use a particular naming +     convention in the generated code. A number of widely-used +     conventions can be selected using the <code><b>--type-naming</b></code> +     and <code><b>--function-naming</b></code> options. A custom +     naming convention can be achieved using the +     <code><b>--type-regex</b></code>, +     <code><b>--accessor-regex</b></code>, +     <code><b>--one-accessor-regex</b></code>, +     <code><b>--opt-accessor-regex</b></code>, +     <code><b>--seq-accessor-regex</b></code>, +     <code><b>--modifier-regex</b></code>, +     <code><b>--one-modifier-regex</b></code>, +     <code><b>--opt-modifier-regex</b></code>, +     <code><b>--seq-modifier-regex</b></code>, +     <code><b>--parser-regex</b></code>, +     <code><b>--serializer-regex</b></code>, +     <code><b>--const-regex</b></code>, +     <code><b>--enumerator-regex</b></code>, and +     <code><b>--element-type-regex</b></code> options. +  </p> + +  <p>The <code><b>--type-naming</b></code> option specifies the +     convention that should be used for naming C++ types. Possible +     values for this option are <code><b>knr</b></code> (default), +     <code><b>ucc</b></code>, and <code><b>java</b></code>. The +     <code><b>knr</b></code> value (stands for K&R) signifies +     the standard, lower-case naming convention with the underscore +     used as a word delimiter, for example: <code>foo</code>, +     <code>foo_bar</code>. 
The <code><b>ucc</b></code> (stands +     for upper-camel-case) and +     <code><b>java</b></code> values are synonyms for the same +     naming convention where the first letter of each word in the +     name is capitalized, for example: <code>Foo</code>, +     <code>FooBar</code>.</p> + +  <p>Similarly, the <code><b>--function-naming</b></code> option +     specifies the convention that should be used for naming C++ +     functions. Possible values for this option are <code><b>knr</b></code> +     (default), <code><b>lcc</b></code>, <code><b>ucc</b></code>, and +     <code><b>java</b></code>. The <code><b>knr</b></code> value (stands +     for K&R) signifies the standard, lower-case naming convention +     with the underscore used as a word delimiter, for example: +     <code>foo()</code>, <code>foo_bar()</code>. The <code><b>lcc</b></code> +     value (stands for lower-camel-case) signifies a naming convention +     where the first letter of each word except the first is capitalized, +     for example: <code>foo()</code>, <code>fooBar()</code>. The +     <code><b>ucc</b></code> value (stands for upper-camel-case) signifies +     a naming convention where the first letter of each word is capitalized, +     for example: <code>Foo()</code>, <code>FooBar()</code>. 
+     The <code><b>java</b></code> naming convention is similar to +     the lower-camel-case one except that accessor functions are prefixed +     with <code>get</code>, modifier functions are prefixed +     with <code>set</code>, parsing functions are prefixed +     with <code>parse</code>, and serialization functions are +     prefixed with <code>serialize</code>, for example: +     <code>getFoo()</code>, <code>setFooBar()</code>, +     <code>parseRoot()</code>, <code>serializeRoot()</code>.</p> + +  <p>Note that the naming conventions specified with the +     <code><b>--type-naming</b></code> and +     <code><b>--function-naming</b></code> options perform only limited +     transformations on the names that come from the schema in the +     form of type, attribute, and element names. In other words, to +     get consistent results, your schemas should follow a similar +     naming convention as the one you would like to have in the +     generated code. Alternatively, you can use the +     <code><b>--*-regex</b></code> options (discussed below) +     to perform further transformations on the names that come from +     the schema.</p> + +  <p>The +     <code><b>--type-regex</b></code>, +     <code><b>--accessor-regex</b></code>, +     <code><b>--one-accessor-regex</b></code>, +     <code><b>--opt-accessor-regex</b></code>, +     <code><b>--seq-accessor-regex</b></code>, +     <code><b>--modifier-regex</b></code>, +     <code><b>--one-modifier-regex</b></code>, +     <code><b>--opt-modifier-regex</b></code>, +     <code><b>--seq-modifier-regex</b></code>, +     <code><b>--parser-regex</b></code>, +     <code><b>--serializer-regex</b></code>, +     <code><b>--const-regex</b></code>, +     <code><b>--enumerator-regex</b></code>, and +     <code><b>--element-type-regex</b></code> options allow you to +     specify extra regular expressions for each name category in +     addition to the predefined set that is added depending on +     the 
<code><b>--type-naming</b></code> and +     <code><b>--function-naming</b></code> options. Expressions +     that are provided with the <code><b>--*-regex</b></code> +     options are evaluated prior to any predefined expressions. +     This allows you to selectively override some or all of the +     predefined transformations. When debugging your own expressions, +     it is often useful to see which expressions match which names. +     The <code><b>--name-regex-trace</b></code> option allows you +     to trace the process of applying regular expressions to +     names.</p> + +  <p>The value for the <code><b>--*-regex</b></code> options should be +     a perl-like regular expression in the form +     <code><b>/</b><i>pattern</i><b>/</b><i>replacement</i><b>/</b></code>. +     Any character can be used as a delimiter instead of <code><b>/</b></code>. +     Escaping of the delimiter character in <code><i>pattern</i></code> or +     <code><i>replacement</i></code> is not supported. +     All the regular expressions for each category are pushed into a +     category-specific stack with the last specified expression +     considered first. The first match that succeeds is used. For the +     <code><b>--one-accessor-regex</b></code> (accessors with cardinality one), +     <code><b>--opt-accessor-regex</b></code> (accessors with cardinality optional), and +     <code><b>--seq-accessor-regex</b></code> (accessors with cardinality sequence) +     categories the  <code><b>--accessor-regex</b></code> expressions are +     used as a fallback. For the +     <code><b>--one-modifier-regex</b></code>, +     <code><b>--opt-modifier-regex</b></code>, and +     <code><b>--seq-modifier-regex</b></code> +     categories the  <code><b>--modifier-regex</b></code> expressions are +     used as a fallback. 
For the <code><b>--element-type-regex</b></code> +     category the <code><b>--type-regex</b></code> expressions are +     used as a fallback.</p> + +  <p>The type name expressions (<code><b>--type-regex</b></code>) +     are evaluated on the name string that has the following +     format:</p> + +  <p><code>[<i>namespace</i> ]<i>name</i>[,<i>name</i>][,<i>name</i>][,<i>name</i>]</code></p> + +  <p>The element type name expressions +     (<code><b>--element-type-regex</b></code>), effective only when +     the <code><b>--generate-element-type</b></code> option is specified, +     are evaluated on the name string that has the following +     format:</p> + +  <p><code><i>namespace</i> <i>name</i></code></p> + +  <p>In the type name format the <code><i>namespace</i></code> part +     followed by a space is only present for global type names. For +     global types and elements defined in schemas without a target +     namespace, the <code><i>namespace</i></code> part is empty but +     the space is still present. In the type name format after the +     initial <code><i>name</i></code> component, up to three additional +     <code><i>name</i></code> components can be present, separated +     by commas. 
For example:</p> + +  <p><code><b>http://example.com/hello type</b></code></p> +  <p><code><b>foo</b></code></p> +  <p><code><b>foo,iterator</b></code></p> +  <p><code><b>foo,const,iterator</b></code></p> + +  <p>The following set of predefined regular expressions is used to +     transform type names when the upper-camel-case naming convention +     is selected:</p> + +  <p><code><b>/(?:[^ ]* )?([^,]+)/\u$1/</b></code></p> +  <p><code><b>/(?:[^ ]* )?([^,]+),([^,]+)/\u$1\u$2/</b></code></p> +  <p><code><b>/(?:[^ ]* )?([^,]+),([^,]+),([^,]+)/\u$1\u$2\u$3/</b></code></p> +  <p><code><b>/(?:[^ ]* )?([^,]+),([^,]+),([^,]+),([^,]+)/\u$1\u$2\u$3\u$4/</b></code></p> + +  <p>The accessor and modifier expressions +     (<code><b>--*accessor-regex</b></code> and +     <code><b>--*modifier-regex</b></code>) are evaluated on the name string +     that has the following format:</p> + +  <p><code><i>name</i>[,<i>name</i>][,<i>name</i>]</code></p> + +  <p>After the initial <code><i>name</i></code> component, up to two +     additional <code><i>name</i></code> components can be present, +     separated by commas. For example:</p> + +  <p><code><b>foo</b></code></p> +  <p><code><b>dom,document</b></code></p> +  <p><code><b>foo,default,value</b></code></p> + +  <p>The following set of predefined regular expressions is used to +     transform accessor names when the <code><b>java</b></code> naming +     convention is selected:</p> + +  <p><code><b>/([^,]+)/get\u$1/</b></code></p> +  <p><code><b>/([^,]+),([^,]+)/get\u$1\u$2/</b></code></p> +  <p><code><b>/([^,]+),([^,]+),([^,]+)/get\u$1\u$2\u$3/</b></code></p> + +  <p>For the parser, serializer, and enumerator categories, the +     corresponding regular expressions are evaluated on local names of +     elements and on enumeration values, respectively. 
For example, the +     following predefined regular expression is used to transform parsing +     function names when the <code><b>java</b></code> naming convention +     is selected:</p> + +  <p><code><b>/(.+)/parse\u$1/</b></code></p> + +  <p>The const category is used to create C++ constant names for the +     element/wildcard/text content ids in ordered types.</p> + +  <p>See also the REGEX AND SHELL QUOTING section below.</p> + +  <h1>TYPE MAP</h1> + +  <p>Type map files are used in C++/Parser to define a mapping between +     XML Schema and C++ types. The compiler uses this information +     to determine the return types of <code><b>post_*</b></code> +     functions in parser skeletons corresponding to XML Schema +     types as well as argument types for callbacks corresponding +     to elements and attributes of these types.</p> + +  <p>The compiler has a set of predefined mapping rules that map +     built-in XML Schema types to suitable C++ types (discussed +     below) and all other types to <code><b>void</b></code>. +     By providing your own type maps you can override these predefined +     rules. The format of the type map file is presented below: +  </p> + +  <pre> +namespace <schema-namespace> [<cxx-namespace>] +{ +  (include <file-name>;)* +  ([type] <schema-type> <cxx-ret-type> [<cxx-arg-type>];)* +} +  </pre> + +  <p>Both <code><i><schema-namespace></i></code> and +     <code><i><schema-type></i></code> are regex patterns while +     <code><i><cxx-namespace></i></code>, +     <code><i><cxx-ret-type></i></code>, and +     <code><i><cxx-arg-type></i></code> are regex pattern +     substitutions. All names can be optionally enclosed in +     <code><b>" "</b></code>, for example, to include white-spaces.</p> + +  <p><code><i><schema-namespace></i></code> determines XML +     Schema namespace. Optional <code><i><cxx-namespace></i></code> +     is prefixed to every C++ type name in this namespace declaration. 
+     <code><i><cxx-ret-type></i></code> is a C++ type name that is +     used as a return type for the <code><b>post_*</b></code> functions. +     Optional <code><i><cxx-arg-type></i></code> is an argument +     type for callback functions corresponding to elements and attributes +     of this type. If +     <code><i><cxx-arg-type></i></code> is not specified, it defaults +     to <code><i><cxx-ret-type></i></code> if <code><i><cxx-ret-type></i></code> +     ends with <code><b>*</b></code> or <code><b>&</b></code> (that is, +     it is a pointer or a reference) and +     <code><b>const</b> <i><cxx-ret-type></i><b>&</b></code> +     otherwise. +     <code><i><file-name></i></code> is a file name either in the +     <code><b>" "</b></code> or <code><b>< ></b></code> format +     and is added with the <code><b>#include</b></code> directive to +     the generated code.</p> + +  <p>The <code><b>#</b></code> character starts a comment that ends +     with a new line or end of file. To specify a name that contains +     <code><b>#</b></code> enclose it in <code><b>" "</b></code>. +     For example:</p> + +  <pre> +namespace http://www.example.com/xmlns/my my +{ +  include "my.hxx"; + +  # Pass apples by value. +  # +  apple apple; + +  # Pass oranges as pointers. +  # +  orange orange_t*; +} +  </pre> + +  <p>In the example above, for the +     <code><b>http://www.example.com/xmlns/my#orange</b></code> +     XML Schema type, the <code><b>my::orange_t*</b></code> C++ type will +     be used as both return and argument types.</p> + +  <p>Several namespace declarations can be specified in a single +     file. The namespace declaration can also be completely +     omitted to map types in a schema without a namespace. 
For +     instance:</p> + +  <pre> +include "my.hxx"; +apple apple; + +namespace http://www.example.com/xmlns/my +{ +  orange "const orange_t*"; +} +  </pre> + +  <p>The compiler has a number of predefined mapping rules that can be +     presented as the following map files. The string-based XML Schema +     built-in types are mapped to either <code><b>std::string</b></code> +     or <code><b>std::wstring</b></code> depending on the character type +     selected with the <code><b>--char-type</b></code> option +     (<code><b>char</b></code> by default). The binary XML Schema types are +     mapped to either <code>std::unique_ptr<xml_schema::buffer></code> +     or <code>std::auto_ptr<xml_schema::buffer></code> depending on the C++ +     standard selected with the <code><b>--std</b></code> option +     (<code><b>c++11</b></code> by default).</p> + +  <pre> +namespace http://www.w3.org/2001/XMLSchema +{ +  boolean bool bool; + +  byte "signed char" "signed char"; +  unsignedByte "unsigned char" "unsigned char"; + +  short short short; +  unsignedShort "unsigned short" "unsigned short"; + +  int int int; +  unsignedInt "unsigned int" "unsigned int"; + +  long "long long" "long long"; +  unsignedLong "unsigned long long" "unsigned long long"; + +  integer "long long" "long long"; + +  negativeInteger "long long" "long long"; +  nonPositiveInteger "long long" "long long"; + +  positiveInteger "unsigned long long" "unsigned long long"; +  nonNegativeInteger "unsigned long long" "unsigned long long"; + +  float float float; +  double double double; +  decimal double double; + +  string std::string; +  normalizedString std::string; +  token std::string; +  Name std::string; +  NMTOKEN std::string; +  NCName std::string; +  ID std::string; +  IDREF std::string; +  language std::string; +  anyURI std::string; + +  NMTOKENS xml_schema::string_sequence; +  IDREFS xml_schema::string_sequence; + +  QName xml_schema::qname; + +  base64Binary 
std::[unique|auto]_ptr<xml_schema::buffer> +               std::[unique|auto]_ptr<xml_schema::buffer>; +  hexBinary std::[unique|auto]_ptr<xml_schema::buffer> +            std::[unique|auto]_ptr<xml_schema::buffer>; + +  date xml_schema::date; +  dateTime xml_schema::date_time; +  duration xml_schema::duration; +  gDay xml_schema::gday; +  gMonth xml_schema::gmonth; +  gMonthDay xml_schema::gmonth_day; +  gYear xml_schema::gyear; +  gYearMonth xml_schema::gyear_month; +  time xml_schema::time; +} +  </pre> + +  <p>The last predefined rule maps anything that wasn't mapped by +     previous rules to <code><b>void</b></code>:</p> + +  <pre> +namespace .* +{ +  .* void void; +} +  </pre> + + +  <p>When you provide your own type maps with the +     <code><b>--type-map</b></code> option, they are evaluated first. +     This allows you to selectively override predefined rules.</p> + +  <h1>REGEX AND SHELL QUOTING</h1> + +  <p>When entering a regular expression argument in the shell +     command line it is often necessary to use quoting (enclosing +     the argument in <code><b>" "</b></code> or +     <code><b>' '</b></code>) in order to prevent the shell +     from interpreting certain characters, for example, spaces as +     argument separators and <code><b>$</b></code> as variable +     expansions.</p> + +  <p>Unfortunately it is hard to achieve this in a manner that is +     portable across POSIX shells, such as those found on +     GNU/Linux and UNIX, and Windows shell. For example, if you +     use <code><b>" "</b></code> for quoting you will get a +     wrong result with POSIX shells if your expression contains +     <code><b>$</b></code>. The standard way of dealing with this +     on POSIX systems is to use <code><b>' '</b></code> instead. +     Unfortunately, Windows shell does not remove <code><b>' '</b></code> +     from arguments when they are passed to applications. 
As a result you +     may have to use <code><b>' '</b></code> for POSIX and +     <code><b>" "</b></code> for Windows (<code><b>$</b></code> is +     not treated as a special character on Windows).</p> + +  <p>Alternatively, you can save regular expression options into +     a file, one option per line, and use this file with the +     <code><b>--options-file</b></code> option. With this approach +     you don't need to worry about shell quoting.</p> + +  <h1>DIAGNOSTICS</h1> + +  <p>If the input file is not a valid W3C XML Schema definition, +    <code><b>xsd</b></code> will issue diagnostic messages to STDERR +    and exit with non-zero exit code.</p> + +  <h1>BUGS</h1> + +  <p>Send bug reports to the +     <a href="mailto:xsd-users@codesynthesis.com">xsd-users@codesynthesis.com</a> mailing list.</p> + +  </div> +  <div id="footer"> +    Copyright © $copyright$. + +    <div id="terms"> +      Permission is granted to copy, distribute and/or modify this +      document under the terms of the +      <a href="https://www.codesynthesis.com/licenses/fdl-1.2.txt">GNU Free +      Documentation License, version 1.2</a>; with no Invariant Sections, +      no Front-Cover Texts and no Back-Cover Texts. +    </div> +  </div> +</div> +</body> +</html> | 
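Note on the NAMING CONVENTION text added by this diff: the per-category regex stack it describes (expressions in the form /pattern/replacement/, any delimiter character allowed, last specified expression considered first, first match wins) can be illustrated with a small Python sketch. This is not the xsd implementation. The helper names `parse_expr`, `expand`, and `transform` are hypothetical, and the sketch assumes that a pattern must match the whole name string and that `\u` followed by `$N` (upper-case the first character of group N) is the only escape that needs handling. The expressions themselves are the predefined ones quoted in the documentation.

```python
import re


def parse_expr(expr):
    """Split '/pattern/replacement/' using the first character as the delimiter.

    Escaping of the delimiter is not supported, so a plain split is enough.
    """
    delim = expr[0]
    parts = expr.split(delim)
    # A well-formed expression splits into: '', pattern, replacement, ''
    if len(parts) != 4 or parts[0] != "" or parts[3] != "":
        raise ValueError(f"malformed expression: {expr!r}")
    return re.compile(parts[1]), parts[2]


def expand(replacement, match):
    """Expand $N and \\u$N (assumed meaning: capitalize group N) in replacement."""
    def sub(m):
        text = match.group(int(m.group(2))) or ""
        if m.group(1):  # the optional \u prefix was present
            text = text[:1].upper() + text[1:]
        return text

    return re.sub(r"(\\u)?\$(\d)", sub, replacement)


def transform(name, stack):
    """Try expressions from the last pushed to the first; use the first match."""
    for expr in reversed(stack):
        pattern, replacement = parse_expr(expr)
        m = pattern.fullmatch(name)  # assumption: whole-string match
        if m:
            return expand(replacement, m)
    return name  # nothing matched: leave the name unchanged


# Predefined accessor expressions for the java convention, as quoted in the text.
java_accessors = [
    r"/([^,]+)/get\u$1/",
    r"/([^,]+),([^,]+)/get\u$1\u$2/",
    r"/([^,]+),([^,]+),([^,]+)/get\u$1\u$2\u$3/",
]

# First predefined type expression for the upper-camel-case convention.
ucc_types = [r"/(?:[^ ]* )?([^,]+)/\u$1/"]

if __name__ == "__main__":
    print(transform("foo", java_accessors))                # getFoo
    print(transform("dom,document", java_accessors))       # getDomDocument
    print(transform("foo,default,value", java_accessors))  # getFooDefaultValue
    print(transform("http://example.com/hello type", ucc_types))  # Type
```

In this model a user-supplied expression would be appended to the end of the list so that it is tried before the predefined ones, which mirrors the statement that --*-regex expressions are evaluated prior to any predefined expressions.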
