diff options
author | Jörg Frings-Fürst <debian@jff.email> | 2025-03-19 15:41:36 +0100 |
---|---|---|
committer | Jörg Frings-Fürst <debian@jff.email> | 2025-03-19 15:41:36 +0100 |
commit | 018e1ba581ec6f01f069a45ec4cf89f152b44d5f (patch) | |
tree | 0e7dda4bb693a6714066fbe5efcd2f24ff7c1a65 /doc/xsd-epilogue.xhtml | |
parent | 1c188393cd2e271ed2581471b601fb5960777fd8 (diff) |
remerge
Diffstat (limited to 'doc/xsd-epilogue.xhtml')
-rw-r--r-- | doc/xsd-epilogue.xhtml | 429 |
1 files changed, 0 insertions, 429 deletions
diff --git a/doc/xsd-epilogue.xhtml b/doc/xsd-epilogue.xhtml deleted file mode 100644 index 178cf8b..0000000 --- a/doc/xsd-epilogue.xhtml +++ /dev/null @@ -1,429 +0,0 @@ - <h1>NAMING CONVENTION</h1> - - <p>The compiler can be instructed to use a particular naming - convention in the generated code. A number of widely-used - conventions can be selected using the <code><b>--type-naming</b></code> - and <code><b>--function-naming</b></code> options. A custom - naming convention can be achieved using the - <code><b>--type-regex</b></code>, - <code><b>--accessor-regex</b></code>, - <code><b>--one-accessor-regex</b></code>, - <code><b>--opt-accessor-regex</b></code>, - <code><b>--seq-accessor-regex</b></code>, - <code><b>--modifier-regex</b></code>, - <code><b>--one-modifier-regex</b></code>, - <code><b>--opt-modifier-regex</b></code>, - <code><b>--seq-modifier-regex</b></code>, - <code><b>--parser-regex</b></code>, - <code><b>--serializer-regex</b></code>, - <code><b>--const-regex</b></code>, - <code><b>--enumerator-regex</b></code>, and - <code><b>--element-type-regex</b></code> options. - </p> - - <p>The <code><b>--type-naming</b></code> option specifies the - convention that should be used for naming C++ types. Possible - values for this option are <code><b>knr</b></code> (default), - <code><b>ucc</b></code>, and <code><b>java</b></code>. The - <code><b>knr</b></code> value (stands for K&R) signifies - the standard, lower-case naming convention with the underscore - used as a word delimiter, for example: <code>foo</code>, - <code>foo_bar</code>. The <code><b>ucc</b></code> (stands - for upper-camel-case) and - <code><b>java</b></code> values a synonyms for the same - naming convention where the first letter of each word in the - name is capitalized, for example: <code>Foo</code>, - <code>FooBar</code>.</p> - - <p>Similarly, the <code><b>--function-naming</b></code> option - specifies the convention that should be used for naming C++ - functions. Possible values for this option are <code><b>knr</b></code> - (default), <code><b>lcc</b></code>, <code><b>ucc</b></code>, and - <code><b>java</b></code>. The <code><b>knr</b></code> value (stands - for K&R) signifies the standard, lower-case naming convention - with the underscore used as a word delimiter, for example: - <code>foo()</code>, <code>foo_bar()</code>. The <code><b>lcc</b></code> - value (stands for lower-camel-case) signifies a naming convention - where the first letter of each word except the first is capitalized, - for example: <code>foo()</code>, <code>fooBar()</code>. The - <code><b>ucc</b></code> value (stands for upper-camel-case) signifies - a naming convention where the first letter of each word is capitalized, - for example: <code>Foo()</code>, <code>FooBar()</code>. - The <code><b>java</b></code> naming convention is similar to - the lower-camel-case one except that accessor functions are prefixed - with <code>get</code>, modifier functions are prefixed - with <code>set</code>, parsing functions are prefixed - with <code>parse</code>, and serialization functions are - prefixed with <code>serialize</code>, for example: - <code>getFoo()</code>, <code>setFooBar()</code>, - <code>parseRoot()</code>, <code>serializeRoot()</code>.</p> - - <p>Note that the naming conventions specified with the - <code><b>--type-naming</b></code> and - <code><b>--function-naming</b></code> options perform only limited - transformations on the names that come from the schema in the - form of type, attribute, and element names. In other words, to - get consistent results, your schemas should follow a similar - naming convention as the one you would like to have in the - generated code. Alternatively, you can use the - <code><b>--*-regex</b></code> options (discussed below) - to perform further transformations on the names that come from - the schema.</p> - - <p>The - <code><b>--type-regex</b></code>, - <code><b>--accessor-regex</b></code>, - <code><b>--one-accessor-regex</b></code>, - <code><b>--opt-accessor-regex</b></code>, - <code><b>--seq-accessor-regex</b></code>, - <code><b>--modifier-regex</b></code>, - <code><b>--one-modifier-regex</b></code>, - <code><b>--opt-modifier-regex</b></code>, - <code><b>--seq-modifier-regex</b></code>, - <code><b>--parser-regex</b></code>, - <code><b>--serializer-regex</b></code>, - <code><b>--const-regex</b></code>, - <code><b>--enumerator-regex</b></code>, and - <code><b>--element-type-regex</b></code> options allow you to - specify extra regular expressions for each name category in - addition to the predefined set that is added depending on - the <code><b>--type-naming</b></code> and - <code><b>--function-naming</b></code> options. Expressions - that are provided with the <code><b>--*-regex</b></code> - options are evaluated prior to any predefined expressions. - This allows you to selectively override some or all of the - predefined transformations. When debugging your own expressions, - it is often useful to see which expressions match which names. - The <code><b>--name-regex-trace</b></code> option allows you - to trace the process of applying regular expressions to - names.</p> - - <p>The value for the <code><b>--*-regex</b></code> options should be - a perl-like regular expression in the form - <code><b>/</b><i>pattern</i><b>/</b><i>replacement</i><b>/</b></code>. - Any character can be used as a delimiter instead of <code><b>/</b></code>. - Escaping of the delimiter character in <code><i>pattern</i></code> or - <code><i>replacement</i></code> is not supported. - All the regular expressions for each category are pushed into a - category-specific stack with the last specified expression - considered first. The first match that succeeds is used. For the - <code><b>--one-accessor-regex</b></code> (accessors with cardinality one), - <code><b>--opt-accessor-regex</b></code> (accessors with cardinality optional), and - <code><b>--seq-accessor-regex</b></code> (accessors with cardinality sequence) - categories the <code><b>--accessor-regex</b></code> expressions are - used as a fallback. For the - <code><b>--one-modifier-regex</b></code>, - <code><b>--opt-modifier-regex</b></code>, and - <code><b>--seq-modifier-regex</b></code> - categories the <code><b>--modifier-regex</b></code> expressions are - used as a fallback. For the <code><b>--element-type-regex</b></code> - category the <code><b>--type-regex</b></code> expressions are - used as a fallback.</p> - - <p>The type name expressions (<code><b>--type-regex</b></code>) - are evaluated on the name string that has the following - format:</p> - - <p><code>[<i>namespace</i> ]<i>name</i>[,<i>name</i>][,<i>name</i>][,<i>name</i>]</code></p> - - <p>The element type name expressions - (<code><b>--element-type-regex</b></code>), effective only when - the <code><b>--generate-element-type</b></code> option is specified, - are evaluated on the name string that has the following - format:</p> - - <p><code><i>namespace</i> <i>name</i></code></p> - - <p>In the type name format the <code><i>namespace</i></code> part - followed by a space is only present for global type names. For - global types and elements defined in schemas without a target - namespace, the <code><i>namespace</i></code> part is empty but - the space is still present. In the type name format after the - initial <code><i>name</i></code> component, up to three additional - <code><i>name</i></code> components can be present, separated - by commas. For example:</p> - - <p><code><b>http://example.com/hello type</b></code></p> - <p><code><b>foo</b></code></p> - <p><code><b>foo,iterator</b></code></p> - <p><code><b>foo,const,iterator</b></code></p> - - <p>The following set of predefined regular expressions is used to - transform type names when the upper-camel-case naming convention - is selected:</p> - - <p><code><b>/(?:[^ ]* )?([^,]+)/\u$1/</b></code></p> - <p><code><b>/(?:[^ ]* )?([^,]+),([^,]+)/\u$1\u$2/</b></code></p> - <p><code><b>/(?:[^ ]* )?([^,]+),([^,]+),([^,]+)/\u$1\u$2\u$3/</b></code></p> - <p><code><b>/(?:[^ ]* )?([^,]+),([^,]+),([^,]+),([^,]+)/\u$1\u$2\u$3\u$4/</b></code></p> - - <p>The accessor and modifier expressions - (<code><b>--*accessor-regex</b></code> and - <code><b>--*modifier-regex</b></code>) are evaluated on the name string - that has the following format:</p> - - <p><code><i>name</i>[,<i>name</i>][,<i>name</i>]</code></p> - - <p>After the initial <code><i>name</i></code> component, up to two - additional <code><i>name</i></code> components can be present, - separated by commas. For example:</p> - - <p><code><b>foo</b></code></p> - <p><code><b>dom,document</b></code></p> - <p><code><b>foo,default,value</b></code></p> - - <p>The following set of predefined regular expressions is used to - transform accessor names when the <code><b>java</b></code> naming - convention is selected:</p> - - <p><code><b>/([^,]+)/get\u$1/</b></code></p> - <p><code><b>/([^,]+),([^,]+)/get\u$1\u$2/</b></code></p> - <p><code><b>/([^,]+),([^,]+),([^,]+)/get\u$1\u$2\u$3/</b></code></p> - - <p>For the parser, serializer, and enumerator categories, the - corresponding regular expressions are evaluated on local names of - elements and on enumeration values, respectively. For example, the - following predefined regular expression is used to transform parsing - function names when the <code><b>java</b></code> naming convention - is selected:</p> - - <p><code><b>/(.+)/parse\u$1/</b></code></p> - - <p>The const category is used to create C++ constant names for the - element/wildcard/text content ids in ordered types.</p> - - <p>See also the REGEX AND SHELL QUOTING section below.</p> - - <h1>TYPE MAP</h1> - - <p>Type map files are used in C++/Parser to define a mapping between - XML Schema and C++ types. The compiler uses this information - to determine the return types of <code><b>post_*</b></code> - functions in parser skeletons corresponding to XML Schema - types as well as argument types for callbacks corresponding - to elements and attributes of these types.</p> - - <p>The compiler has a set of predefined mapping rules that map - built-in XML Schema types to suitable C++ types (discussed - below) and all other types to <code><b>void</b></code>. - By providing your own type maps you can override these predefined - rules. The format of the type map file is presented below: - </p> - - <pre> -namespace <schema-namespace> [<cxx-namespace>] -{ - (include <file-name>;)* - ([type] <schema-type> <cxx-ret-type> [<cxx-arg-type>];)* -} - </pre> - - <p>Both <code><i><schema-namespace></i></code> and - <code><i><schema-type></i></code> are regex patterns while - <code><i><cxx-namespace></i></code>, - <code><i><cxx-ret-type></i></code>, and - <code><i><cxx-arg-type></i></code> are regex pattern - substitutions. All names can be optionally enclosed in - <code><b>" "</b></code>, for example, to include white-spaces.</p> - - <p><code><i><schema-namespace></i></code> determines XML - Schema namespace. Optional <code><i><cxx-namespace></i></code> - is prefixed to every C++ type name in this namespace declaration. - <code><i><cxx-ret-type></i></code> is a C++ type name that is - used as a return type for the <code><b>post_*</b></code> functions. - Optional <code><i><cxx-arg-type></i></code> is an argument - type for callback functions corresponding to elements and attributes - of this type. If - <code><i><cxx-arg-type></i></code> is not specified, it defaults - to <code><i><cxx-ret-type></i></code> if <code><i><cxx-ret-type></i></code> - ends with <code><b>*</b></code> or <code><b>&</b></code> (that is, - it is a pointer or a reference) and - <code><b>const</b> <i><cxx-ret-type></i><b>&</b></code> - otherwise. - <code><i><file-name></i></code> is a file name either in the - <code><b>" "</b></code> or <code><b>< ></b></code> format - and is added with the <code><b>#include</b></code> directive to - the generated code.</p> - - <p>The <code><b>#</b></code> character starts a comment that ends - with a new line or end of file. To specify a name that contains - <code><b>#</b></code> enclose it in <code><b>" "</b></code>. - For example:</p> - - <pre> -namespace http://www.example.com/xmlns/my my -{ - include "my.hxx"; - - # Pass apples by value. - # - apple apple; - - # Pass oranges as pointers. - # - orange orange_t*; -} - </pre> - - <p>In the example above, for the - <code><b>http://www.example.com/xmlns/my#orange</b></code> - XML Schema type, the <code><b>my::orange_t*</b></code> C++ type will - be used as both return and argument types.</p> - - <p>Several namespace declarations can be specified in a single - file. The namespace declaration can also be completely - omitted to map types in a schema without a namespace. For - instance:</p> - - <pre> -include "my.hxx"; -apple apple; - -namespace http://www.example.com/xmlns/my -{ - orange "const orange_t*"; -} - </pre> - - <p>The compiler has a number of predefined mapping rules that can be - presented as the following map files. The string-based XML Schema - built-in types are mapped to either <code><b>std::string</b></code> - or <code><b>std::wstring</b></code> depending on the character type - selected with the <code><b>--char-type</b></code> option - (<code><b>char</b></code> by default). The binary XML Schema types are - mapped to either <code>std::unique_ptr<xml_schema::buffer></code> - or <code>std::auto_ptr<xml_schema::buffer></code> depending on the C++ - standard selected with the <code><b>--std</b></code> option - (<code><b>c++11</b></code> by default).</p> - - <pre> -namespace http://www.w3.org/2001/XMLSchema -{ - boolean bool bool; - - byte "signed char" "signed char"; - unsignedByte "unsigned char" "unsigned char"; - - short short short; - unsignedShort "unsigned short" "unsigned short"; - - int int int; - unsignedInt "unsigned int" "unsigned int"; - - long "long long" "long long"; - unsignedLong "unsigned long long" "unsigned long long"; - - integer "long long" "long long"; - - negativeInteger "long long" "long long"; - nonPositiveInteger "long long" "long long"; - - positiveInteger "unsigned long long" "unsigned long long"; - nonNegativeInteger "unsigned long long" "unsigned long long"; - - float float float; - double double double; - decimal double double; - - string std::string; - normalizedString std::string; - token std::string; - Name std::string; - NMTOKEN std::string; - NCName std::string; - ID std::string; - IDREF std::string; - language std::string; - anyURI std::string; - - NMTOKENS xml_schema::string_sequence; - IDREFS xml_schema::string_sequence; - - QName xml_schema::qname; - - base64Binary std::[unique|auto]_ptr<xml_schema::buffer> - std::[unique|auto]_ptr<xml_schema::buffer>; - hexBinary std::[unique|auto]_ptr<xml_schema::buffer> - std::[unique|auto]_ptr<xml_schema::buffer>; - - date xml_schema::date; - dateTime xml_schema::date_time; - duration xml_schema::duration; - gDay xml_schema::gday; - gMonth xml_schema::gmonth; - gMonthDay xml_schema::gmonth_day; - gYear xml_schema::gyear; - gYearMonth xml_schema::gyear_month; - time xml_schema::time; -} - </pre> - - <p>The last predefined rule maps anything that wasn't mapped by - previous rules to <code><b>void</b></code>:</p> - - <pre> -namespace .* -{ - .* void void; -} - </pre> - - - <p>When you provide your own type maps with the - <code><b>--type-map</b></code> option, they are evaluated first. - This allows you to selectively override predefined rules.</p> - - <h1>REGEX AND SHELL QUOTING</h1> - - <p>When entering a regular expression argument in the shell - command line it is often necessary to use quoting (enclosing - the argument in <code><b>" "</b></code> or - <code><b>' '</b></code>) in order to prevent the shell - from interpreting certain characters, for example, spaces as - argument separators and <code><b>$</b></code> as variable - expansions.</p> - - <p>Unfortunately it is hard to achieve this in a manner that is - portable across POSIX shells, such as those found on - GNU/Linux and UNIX, and Windows shell. For example, if you - use <code><b>" "</b></code> for quoting you will get a - wrong result with POSIX shells if your expression contains - <code><b>$</b></code>. The standard way of dealing with this - on POSIX systems is to use <code><b>' '</b></code> instead. - Unfortunately, Windows shell does not remove <code><b>' '</b></code> - from arguments when they are passed to applications. As a result you - may have to use <code><b>' '</b></code> for POSIX and - <code><b>" "</b></code> for Windows (<code><b>$</b></code> is - not treated as a special character on Windows).</p> - - <p>Alternatively, you can save regular expression options into - a file, one option per line, and use this file with the - <code><b>--options-file</b></code> option. With this approach - you don't need to worry about shell quoting.</p> - - <h1>DIAGNOSTICS</h1> - - <p>If the input file is not a valid W3C XML Schema definition, - <code><b>xsd</b></code> will issue diagnostic messages to STDERR - and exit with non-zero exit code.</p> - - <h1>BUGS</h1> - - <p>Send bug reports to the - <a href="mailto:xsd-users@codesynthesis.com">xsd-users@codesynthesis.com</a> mailing list.</p> - - </div> - <div id="footer"> - Copyright © $copyright$. - - <div id="terms"> - Permission is granted to copy, distribute and/or modify this - document under the terms of the - <a href="https://www.codesynthesis.com/licenses/fdl-1.2.txt">GNU Free - Documentation License, version 1.2</a>; with no Invariant Sections, - no Front-Cover Texts and no Back-Cover Texts. - </div> - </div> -</div> -</body> -</html> |