
RuleBasedCollator Class Reference

The RuleBasedCollator class provides a simple implementation of Collator, using data-driven tables.

#include <tblcoll.h>

Inheritance diagram for RuleBasedCollator:

Collator

Public Methods

 RuleBasedCollator (const UnicodeString &rules, UErrorCode &status)
 RuleBasedCollator constructor.

 RuleBasedCollator (const UnicodeString &rules, ECollationStrength collationStrength, UErrorCode &status)
 RuleBasedCollator constructor.

 RuleBasedCollator (const UnicodeString &rules, UColAttributeValue decompositionMode, UErrorCode &status)
 RuleBasedCollator constructor.

 RuleBasedCollator (const UnicodeString &rules, ECollationStrength collationStrength, UColAttributeValue decompositionMode, UErrorCode &status)
 RuleBasedCollator constructor.

 RuleBasedCollator (const RuleBasedCollator &other)
 Copy constructor.

virtual ~RuleBasedCollator ()
 Destructor.

RuleBasedCollator & operator= (const RuleBasedCollator &other)
 Assignment operator.

virtual UBool operator== (const Collator &other) const
 Returns true if argument is the same as this object.

virtual UBool operator!= (const Collator &other) const
 Returns true if argument is not the same as this object.

virtual Collator * clone (void) const
 Makes a deep copy of the object.

virtual CollationElementIterator * createCollationElementIterator (const UnicodeString &source) const
 Creates a collation element iterator for the source string.

virtual CollationElementIterator * createCollationElementIterator (const CharacterIterator &source) const
 Creates a collation element iterator for the source.

virtual EComparisonResult compare (const UnicodeString &source, const UnicodeString &target) const
 Compares a range of character data stored in two different strings based on the collation rules.

virtual EComparisonResult compare (const UnicodeString &source, const UnicodeString &target, int32_t length) const
 Compares a range of character data stored in two different strings based on the collation rules up to the specified length.

virtual EComparisonResult compare (const UChar *source, int32_t sourceLength, const UChar *target, int32_t targetLength) const
 The comparison function compares the character data stored in two different string arrays.

virtual CollationKey & getCollationKey (const UnicodeString &source, CollationKey &key, UErrorCode &status) const
 Transforms a specified region of the string into a series of characters that can be compared with CollationKey.compare.

virtual CollationKey & getCollationKey (const UChar *source, int32_t sourceLength, CollationKey &key, UErrorCode &status) const
 Transforms a specified region of the string into a series of characters that can be compared with CollationKey.compare.

virtual int32_t hashCode (void) const
 Generates the hash code for the rule-based collation object.

virtual const Locale getLocale (UErrorCode &status) const
 Gets the locale of the Collator.

const UnicodeString & getRules (void) const
 Gets the table-based rules for the collation object.

virtual void getVersion (UVersionInfo info) const
 Gets the version information for a Collator.

int32_t getMaxExpansion (int32_t order) const
 Return the maximum length of any expansion sequences that end with the specified comparison order.

virtual UClassID getDynamicClassID (void) const
 Returns a unique class ID POLYMORPHICALLY.

uint8_t * cloneRuleData (int32_t &length, UErrorCode &status)
 Returns the binary format of the class's rules.

void getRules (UColRuleOption delta, UnicodeString &buffer)
 Returns current rules.

virtual void setAttribute (UColAttribute attr, UColAttributeValue value, UErrorCode &status)
 Universal attribute setter.

virtual UColAttributeValue getAttribute (UColAttribute attr, UErrorCode &status)
 Universal attribute getter.

virtual uint32_t setVariableTop (const UChar *varTop, int32_t len, UErrorCode &status)
 Sets the variable top to a collation element value of a string supplied.

virtual uint32_t setVariableTop (const UnicodeString varTop, UErrorCode &status)
 Sets the variable top to a collation element value of a string supplied.

virtual void setVariableTop (const uint32_t varTop, UErrorCode &status)
 Sets the variable top to a collation element value supplied.

virtual uint32_t getVariableTop (UErrorCode &status) const
 Gets the variable top value of a Collator.

virtual Collator * safeClone (void)
 Thread safe cloning operation.

virtual int32_t getSortKey (const UnicodeString &source, uint8_t *result, int32_t resultLength) const
 Get the sort key as an array of bytes from a UnicodeString.

virtual int32_t getSortKey (const UChar *source, int32_t sourceLength, uint8_t *result, int32_t resultLength) const
 Get the sort key as an array of bytes from a UChar buffer.

virtual ECollationStrength getStrength (void) const
 Determines the minimum strength that will be used in comparison or transformation.

virtual void setStrength (ECollationStrength newStrength)
 Sets the minimum strength to be used in comparison or transformation.

 RuleBasedCollator (const UnicodeString &rules, Normalizer::EMode decompositionMode, UErrorCode &status)
 RuleBasedCollator constructor.

 RuleBasedCollator (const UnicodeString &rules, ECollationStrength collationStrength, Normalizer::EMode decompositionMode, UErrorCode &status)
 RuleBasedCollator constructor.

virtual void setDecomposition (Normalizer::EMode mode)
 Set the decomposition mode of the Collator object.

virtual Normalizer::EMode getDecomposition (void) const
 Get the decomposition mode of the Collator object.


Static Public Methods

UClassID getStaticClassID (void)
 Returns the class ID for this class. More...


Friends

class CollationElementIterator
 Used to iterate over collation elements in a character source.

class Collator
 Collator ONLY needs access to RuleBasedCollator(const Locale&, UErrorCode&).

class StringSearch
 Searching over collation elements in a character source.


Detailed Description

The RuleBasedCollator class provides a simple implementation of Collator, using data-driven tables.

The user can create a customized table-based collation.

Important: The ICU collation service has been reimplemented in order to achieve better performance and UCA compliance. For details, see the collation design document.

RuleBasedCollator is a thin C++ wrapper over the C implementation.

For more information about the collation service see the users guide.

The collation service provides correct sort orders for most locales supported in ICU. If specific data for a locale is not available, the order eventually falls back to the UCA sort order.

Sort ordering may be customized by providing your own set of rules. For more on this subject see the Collation customization section of the users guide.

See also:
Collator
Version:
2.0 11/15/2001


Constructor & Destructor Documentation

RuleBasedCollator::RuleBasedCollator (const UnicodeString &rules, UErrorCode &status)
 

RuleBasedCollator constructor.

This takes the table rules and builds a collation table out of them. Please see RuleBasedCollator class description for more details on the collation rule syntax.

Parameters:
rules  the collation rules to build the collation table from.
status  reporting a success or an error.
See also:
Locale @stable

RuleBasedCollator::RuleBasedCollator (const UnicodeString &rules, ECollationStrength collationStrength, UErrorCode &status)
 

RuleBasedCollator constructor.

This takes the table rules and builds a collation table out of them. Please see RuleBasedCollator class description for more details on the collation rule syntax.

Parameters:
rules  the collation rules to build the collation table from.
collationStrength  default strength for comparison
status  reporting a success or an error.
See also:
Locale @stable

RuleBasedCollator::RuleBasedCollator (const UnicodeString &rules, UColAttributeValue decompositionMode, UErrorCode &status)
 

RuleBasedCollator constructor.

This takes the table rules and builds a collation table out of them. Please see RuleBasedCollator class description for more details on the collation rule syntax.

Parameters:
rules  the collation rules to build the collation table from.
decompositionMode  the normalisation mode
status  reporting a success or an error.
See also:
Locale @stable

RuleBasedCollator::RuleBasedCollator (const UnicodeString &rules, ECollationStrength collationStrength, UColAttributeValue decompositionMode, UErrorCode &status)
 

RuleBasedCollator constructor.

This takes the table rules and builds a collation table out of them. Please see RuleBasedCollator class description for more details on the collation rule syntax.

Parameters:
rules  the collation rules to build the collation table from.
collationStrength  default strength for comparison
decompositionMode  the normalisation mode
status  reporting a success or an error.
See also:
Locale @stable

RuleBasedCollator::RuleBasedCollator (const RuleBasedCollator &other)
 

Copy constructor.

Parameters:
other  the RuleBasedCollator object to be copied
See also:
Locale @stable

virtual RuleBasedCollator::~RuleBasedCollator () [virtual]
 

Destructor.

@stable

RuleBasedCollator::RuleBasedCollator (const UnicodeString &rules, Normalizer::EMode decompositionMode, UErrorCode &status)
 

RuleBasedCollator constructor.

This takes the table rules and builds a collation table out of them. Please see RuleBasedCollator class description for more details on the collation rule syntax.

Parameters:
rules  the collation rules to build the collation table from.
decompositionMode  the normalisation mode
status  reporting a success or an error.
See also:
Locale
Deprecated:
To be removed after 2002-sep-30, specify the decomposition mode with a UColAttributeValue.

RuleBasedCollator::RuleBasedCollator (const UnicodeString &rules, ECollationStrength collationStrength, Normalizer::EMode decompositionMode, UErrorCode &status)
 

RuleBasedCollator constructor.

This takes the table rules and builds a collation table out of them. Please see RuleBasedCollator class description for more details on the collation rule syntax.

Parameters:
rules  the collation rules to build the collation table from.
collationStrength  default strength for comparison
decompositionMode  the normalisation mode
status  reporting a success or an error.
See also:
Locale
Deprecated:
To be removed after 2002-sep-30, specify the decomposition mode with a UColAttributeValue.


Member Function Documentation

virtual Collator* RuleBasedCollator::clone (void) const [virtual]
 

Makes a deep copy of the object.

The caller owns the returned object.

Returns:
the cloned object. @stable

Implements Collator.
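
The ownership rule above (the caller owns the object returned by clone) can be illustrated with a minimal, self-contained sketch. ToyCollator and cloneOwned are hypothetical names used only for illustration; they are not part of ICU, and the rule string stored here is an arbitrary placeholder. Wrapping the returned pointer in a smart pointer right away is one way to make sure it is deleted exactly once.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a Collator-like class; NOT the ICU class.
class ToyCollator {
public:
    explicit ToyCollator(std::string rules) : rules_(std::move(rules)) {}
    virtual ~ToyCollator() = default;

    // Returns a heap-allocated deep copy. The caller owns the result.
    virtual ToyCollator* clone() const { return new ToyCollator(*this); }

    const std::string& rules() const { return rules_; }

private:
    std::string rules_;
};

// Take ownership of the clone immediately so it cannot leak.
inline std::unique_ptr<ToyCollator> cloneOwned(const ToyCollator& source) {
    return std::unique_ptr<ToyCollator>(source.clone());
}
```

With a raw pointer, the caller would instead have to call delete on the clone once it is no longer needed.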

uint8_t* RuleBasedCollator::cloneRuleData (int32_t &length, UErrorCode &status)
 

Returns the binary format of the class's rules.

The format is that of .col files.

Parameters:
length  Returns the length of the data, in bytes
status  the error code status.
Returns:
memory, owned by the caller, of size 'length' bytes. @draft ICU 1.8

virtual EComparisonResult RuleBasedCollator::compare (const UChar *source, int32_t sourceLength, const UChar *target, int32_t targetLength) const [virtual]
 

The comparison function compares the character data stored in two different string arrays.

Returns information about whether a string array is less than, greater than or equal to another string array.

Example of use:

 UErrorCode status = U_ZERO_ERROR;
 Collator *myCollation = Collator::createInstance(Locale::US, status);
 if (U_FAILURE(status)) return;
 // "abc" and "ABC" as UChar arrays (L"..." is wchar_t, which need not match UChar)
 const UChar abc[] = { 0x61, 0x62, 0x63 };
 const UChar ABC[] = { 0x41, 0x42, 0x43 };
 myCollation->setStrength(Collator::PRIMARY);
 // result would be Collator::EQUAL ("abc" == "ABC")
 // (no primary difference between "abc" and "ABC")
 Collator::EComparisonResult result = myCollation->compare(abc, 3, ABC, 3);
 myCollation->setStrength(Collator::TERTIARY);
 // result would be Collator::LESS ("abc" <<< "ABC")
 // (with tertiary difference between "abc" and "ABC")
 result = myCollation->compare(abc, 3, ABC, 3);
 // the caller owns the collator returned by createInstance
 delete myCollation;
 
Parameters:
source  the source string array to be compared with.
sourceLength  the length of the source string array. If this value is equal to -1, the string array is null-terminated.
target  the string that is to be compared with the source string.
targetLength  the length of the target string array. If this value is equal to -1, the string array is null-terminated.
Returns:
Returns a byte value. GREATER if source is greater than target; EQUAL if source is equal to target; LESS if source is less than target @draft ICU 1.8

Implements Collator.

virtual EComparisonResult RuleBasedCollator::compare (const UnicodeString &source, const UnicodeString &target, int32_t length) const [virtual]
 

Compares a range of character data stored in two different strings based on the collation rules up to the specified length.

Returns information about whether a string is less than, greater than or equal to another string in a language. This can be overridden in a subclass.

Parameters:
source  the source string.
target  the target string to be compared with the source string.
length  compares up to the specified length
Returns:
the comparison result. GREATER if the source string is greater than the target string, LESS if the source is less than the target. Otherwise, returns EQUAL. @stable

Implements Collator.

virtual EComparisonResult RuleBasedCollator::compare (const UnicodeString &source, const UnicodeString &target) const [virtual]
 

Compares a range of character data stored in two different strings based on the collation rules.

Returns information about whether a string is less than, greater than or equal to another string in a language. This can be overridden in a subclass.

Parameters:
source  the source string.
target  the target string to be compared with the source string.
Returns:
the comparison result. GREATER if the source string is greater than the target string, LESS if the source is less than the target. Otherwise, returns EQUAL. @stable

Implements Collator.

virtual CollationElementIterator* RuleBasedCollator::createCollationElementIterator (const CharacterIterator &source) const [virtual]
 

Creates a collation element iterator for the source.

The caller of this method is responsible for the memory management of the returned pointer.

Parameters:
source  the CharacterIterator which produces the characters over which the CollationElementIterator will iterate.
Returns:
the collation element iterator of the source using this as the base Collator. @draft ICU 1.8

virtual CollationElementIterator* RuleBasedCollator::createCollationElementIterator (const UnicodeString &source) const [virtual]
 

Creates a collation element iterator for the source string.

The caller of this method is responsible for the memory management of the return pointer.

Parameters:
source  the string over which the CollationElementIterator will iterate.
Returns:
the collation element iterator of the source string using this as the base Collator. @draft ICU 1.8

virtual UColAttributeValue RuleBasedCollator::getAttribute (UColAttribute attr, UErrorCode &status) [virtual]
 

Universal attribute getter.

Parameters:
attr  attribute type
status  to indicate whether the operation went on smoothly or there were errors
Returns:
attribute value @draft ICU 1.8

Implements Collator.

virtual CollationKey& RuleBasedCollator::getCollationKey (const UChar *source, int32_t sourceLength, CollationKey &key, UErrorCode &status) const [virtual]
 

Transforms a specified region of the string into a series of characters that can be compared with CollationKey.compare.

Use a CollationKey when you need to do repeated comparisons on the same string. For a single comparison the compare method will be faster.

Parameters:
source  the source string.
key  the transformed key of the source string.
status  the error code status.
Returns:
the transformed key.
See also:
CollationKey @stable

Implements Collator.

virtual CollationKey& RuleBasedCollator::getCollationKey (const UnicodeString &source, CollationKey &key, UErrorCode &status) const [virtual]
 

Transforms a specified region of the string into a series of characters that can be compared with CollationKey.compare.

Use a CollationKey when you need to do repeated comparisons on the same string. For a single comparison the compare method will be faster.

Parameters:
source  the source string.
key  the transformed key of the source string.
status  the error code status.
Returns:
the transformed key.
See also:
CollationKey @stable

Implements Collator.
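
The trade-off described above (compute a key once per string, then compare keys many times) can be sketched without ICU. In this toy model the "key" is simply a lower-cased ASCII copy of the string; toyKey and sortWithKeys are hypothetical names, and the key format has nothing to do with the real CollationKey contents. The point is only that each key is built once, while the sort compares keys repeatedly.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Toy key: lower-cased ASCII copy of the input (NOT a real collation key).
inline std::string toyKey(const std::string& s) {
    std::string key;
    key.reserve(s.size());
    for (unsigned char c : s) {
        key.push_back(static_cast<char>(std::tolower(c)));
    }
    return key;
}

typedef std::pair<std::string, std::string> KeyedString;  // (key, original)

// Build each key exactly once, then sort by comparing the keys only.
inline std::vector<std::string> sortWithKeys(const std::vector<std::string>& input) {
    std::vector<KeyedString> keyed;
    keyed.reserve(input.size());
    for (const std::string& s : input) {
        keyed.push_back(KeyedString(toyKey(s), s));
    }
    std::stable_sort(keyed.begin(), keyed.end(),
                     [](const KeyedString& a, const KeyedString& b) {
                         return a.first < b.first;
                     });
    std::vector<std::string> sorted;
    sorted.reserve(keyed.size());
    for (const KeyedString& k : keyed) {
        sorted.push_back(k.second);
    }
    return sorted;
}
```

For a one-off comparison of two strings, building two keys first would be wasted work, which is why the text recommends compare in that case.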

virtual Normalizer::EMode RuleBasedCollator::getDecomposition (void) const [virtual]
 

Get the decomposition mode of the Collator object.

Returns:
the decomposition mode
See also:
Collator::setDecomposition
Deprecated:
To be removed after 2002-sep-30; use getAttribute().

Implements Collator.

virtual UClassID RuleBasedCollator::getDynamicClassID (void) const [inline, virtual]
 

Returns a unique class ID POLYMORPHICALLY.

Pure virtual override. This method is to implement a simple version of RTTI, since not all C++ compilers support genuine RTTI. Polymorphic operator==() and clone() methods call this method.

Returns:
The class ID for this object. All objects of a given class have the same class ID. Objects of other classes have different class IDs. @stable

Implements Collator.

virtual const Locale RuleBasedCollator::getLocale (UErrorCode &status) const [virtual]
 

Gets the locale of the Collator.

Returns:
locale where the collation data lives. If the collator was instantiated from rules, locale is empty. @draft ICU 2.1

Implements Collator.

int32_t RuleBasedCollator::getMaxExpansion (int32_t order) const
 

Return the maximum length of any expansion sequences that end with the specified comparison order.

Parameters:
order  a collation order returned by previous or next.
Returns:
maximum size of the expansion sequences ending with the collation element or 1 if collation element does not occur at the end of any expansion sequence
See also:
CollationElementIterator::getMaxExpansion @stable

void RuleBasedCollator::getRules (UColRuleOption delta, UnicodeString &buffer)
 

Returns current rules.

Delta defines whether full rules are returned or just the tailoring.

Parameters:
delta  one of UCOL_TAILORING_ONLY, UCOL_FULL_RULES.
buffer  UnicodeString to store the result rules @draft ICU 1.8

const UnicodeString& RuleBasedCollator::getRules (void) const
 

Gets the table-based rules for the collation object.

Returns:
returns the collation rules that the table collation object was created from. @stable

virtual int32_t RuleBasedCollator::getSortKey (const UChar *source, int32_t sourceLength, uint8_t *result, int32_t resultLength) const [virtual]
 

Get the sort key as an array of bytes from a UChar buffer.

Parameters:
source  string to be processed.
sourceLength  length of string to be processed. If -1, the string is 0 terminated and length will be decided by the function.
result  buffer to store result in. If NULL, number of bytes needed will be returned.
resultLength  length of the result buffer. If it is not large enough, the buffer will be filled to capacity.
Returns:
Number of bytes needed for storing the sort key @draft ICU 1.8

Implements Collator.

virtual int32_t RuleBasedCollator::getSortKey (const UnicodeString &source, uint8_t *result, int32_t resultLength) const [virtual]
 

Get the sort key as an array of bytes from a UnicodeString.

Parameters:
source  string to be processed.
result  buffer to store result in. If NULL, number of bytes needed will be returned.
resultLength  length of the result buffer. If it is not large enough, the buffer will be filled to capacity.
Returns:
Number of bytes needed for storing the sort key

Implements Collator.
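
The buffer contract described for getSortKey (a NULL result returns the number of bytes needed; a short buffer is filled to capacity) suggests a two-call "preflight" pattern. The sketch below shows that pattern against a hypothetical stand-in, toyGetSortKey, which only mimics the documented contract; its key bytes (the input bytes plus a 0 terminator) are an assumption for illustration and are not the ICU sort key format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in that mimics the documented buffer contract only.
// Toy key = the bytes of the string followed by a single 0 byte.
inline int32_t toyGetSortKey(const std::string& source,
                             uint8_t* result, int32_t resultLength) {
    const int32_t needed = static_cast<int32_t>(source.size()) + 1;
    if (result != nullptr) {
        // Fill as much as fits; a short buffer is filled to capacity.
        for (int32_t i = 0; i < needed && i < resultLength; ++i) {
            result[i] = (i < needed - 1)
                            ? static_cast<uint8_t>(source[static_cast<std::size_t>(i)])
                            : static_cast<uint8_t>(0);
        }
    }
    return needed;  // always the full size required
}

// Preflight: ask for the size first, then allocate and fetch the key.
inline std::vector<uint8_t> sortKeyOf(const std::string& source) {
    const int32_t needed = toyGetSortKey(source, nullptr, 0);
    std::vector<uint8_t> key(static_cast<std::size_t>(needed));
    toyGetSortKey(source, key.data(), needed);
    return key;
}
```

Comparing the returned size with the buffer length is how a caller can tell that a fixed-size buffer was too small and the key was truncated.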

UClassID RuleBasedCollator::getStaticClassID (void) [inline, static]
 

Returns the class ID for this class.

This is useful only for comparing to a return value from getDynamicClassID(). For example:

 Base* polymorphic_pointer = createPolymorphicObject();
 if (polymorphic_pointer->getDynamicClassID() ==
                                          Derived::getStaticClassID()) ...
 
Returns:
The class ID for all objects of this class. @stable
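
The fragment above can be expanded into a self-contained sketch of the class-ID technique. The ClassID typedef and the Base, Derived and Other classes below are hypothetical stand-ins (ClassID merely plays the role of UClassID); the sketch assumes one common way to obtain a unique ID, namely the address of a function-local static, which may differ from how ICU implements it.

```cpp
#include <cassert>

// Hypothetical stand-in for UClassID: any unique address will do.
typedef const void* ClassID;

class Base {
public:
    virtual ~Base() {}
    // Each concrete class reports its own ID through the virtual call.
    virtual ClassID getDynamicClassID() const = 0;
};

class Derived : public Base {
public:
    // The address of a function-local static is unique to this class.
    static ClassID getStaticClassID() { static const char tag = 0; return &tag; }
    virtual ClassID getDynamicClassID() const { return getStaticClassID(); }
};

class Other : public Base {
public:
    static ClassID getStaticClassID() { static const char tag = 0; return &tag; }
    virtual ClassID getDynamicClassID() const { return getStaticClassID(); }
};

// The comparison from the fragment above, wrapped in a helper.
inline bool isDerived(const Base& object) {
    return object.getDynamicClassID() == Derived::getStaticClassID();
}
```

Once the check succeeds, a static_cast to Derived is safe, which is what makes the scheme usable where genuine RTTI is unavailable.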

virtual ECollationStrength RuleBasedCollator::getStrength (void) const [virtual]
 

Determines the minimum strength that will be used in comparison or transformation.

E.g. with strength == SECONDARY, the tertiary difference is ignored.

E.g. with strength == PRIMARY, the secondary and tertiary differences are ignored.

Returns:
the current comparison level.
See also:
RuleBasedCollator::setStrength @stable

Implements Collator.
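
How a strength setting causes lower-level differences to be ignored can be shown with a deliberately small toy model. The sketch below is not ICU's algorithm: it assumes ASCII letters only, treats the letter regardless of case as the primary weight and case as the tertiary weight (lower case before upper case), and omits the secondary (accent) level entirely. ToyStrength, ToyResult and toyCompare are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

enum ToyStrength { TOY_PRIMARY, TOY_TERTIARY };
enum ToyResult { TOY_LESS = -1, TOY_EQUAL = 0, TOY_GREATER = 1 };

// Toy multi-level comparison for ASCII letters (NOT the ICU algorithm).
inline ToyResult toyCompare(const std::string& a, const std::string& b,
                            ToyStrength strength) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();

    // Level 1 (primary): compare letters with case folded away.
    for (std::size_t i = 0; i < n; ++i) {
        const int pa = std::tolower(static_cast<unsigned char>(a[i]));
        const int pb = std::tolower(static_cast<unsigned char>(b[i]));
        if (pa != pb) return pa < pb ? TOY_LESS : TOY_GREATER;
    }
    if (a.size() != b.size()) return a.size() < b.size() ? TOY_LESS : TOY_GREATER;

    // At PRIMARY strength, case differences are ignored.
    if (strength == TOY_PRIMARY) return TOY_EQUAL;

    // Level 3 (tertiary): only consulted when all primaries are equal.
    for (std::size_t i = 0; i < n; ++i) {
        const bool ua = std::isupper(static_cast<unsigned char>(a[i])) != 0;
        const bool ub = std::isupper(static_cast<unsigned char>(b[i])) != 0;
        if (ua != ub) return ua ? TOY_GREATER : TOY_LESS;
    }
    return TOY_EQUAL;
}
```

Note that a primary difference anywhere in the strings outranks a tertiary difference that appears earlier, which is why the levels are checked in separate passes.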

virtual uint32_t RuleBasedCollator::getVariableTop (UErrorCode &status) const [virtual]
 

Gets the variable top value of a Collator.

Lower 16 bits are undefined and should be ignored.

Parameters:
status  error code (not changed by function). If error code is set, the return value is undefined. @draft ICU 2.0

Implements Collator.
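
Because the lower 16 bits of the returned value are undefined, two variable top values should be compared only on their upper 16 bits. A minimal sketch of that masking follows; significantVariableTop and sameVariableTop are hypothetical helper names, not ICU functions.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the upper 16 bits, which are the defined part of the value.
inline uint32_t significantVariableTop(uint32_t rawVariableTop) {
    return rawVariableTop & 0xFFFF0000u;
}

// Two raw values denote the same variable top if their upper halves match.
inline bool sameVariableTop(uint32_t a, uint32_t b) {
    return significantVariableTop(a) == significantVariableTop(b);
}
```

Comparing the raw 32-bit values directly could report a spurious difference whenever the undefined lower bits happen to differ.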

virtual void RuleBasedCollator::getVersion (UVersionInfo info) const [virtual]
 

Gets the version information for a Collator.

Parameters:
info  the version # information, the result will be filled in @stable

Implements Collator.

virtual int32_t RuleBasedCollator::hashCode (void) const [virtual]
 

Generates the hash code for the rule-based collation object.

Returns:
the hash code. @stable

Implements Collator.

UBool RuleBasedCollator::operator!= (const Collator &other) const [inline, virtual]
 

Returns true if argument is not the same as this object.

Parameters:
other  Collator object to be compared
Returns:
returns true if argument is not the same as this object. @stable

Reimplemented from Collator.

RuleBasedCollator& RuleBasedCollator::operator= (const RuleBasedCollator &other)
 

Assignment operator.

Parameters:
other  the RuleBasedCollator object to assign from. @stable

virtual UBool RuleBasedCollator::operator== (const Collator &other) const [virtual]
 

Returns true if argument is the same as this object.

Parameters:
other  Collator object to be compared.
Returns:
true if argument is the same as this object. @stable

Reimplemented from Collator.

virtual Collator* RuleBasedCollator::safeClone (void) [virtual]
 

Thread safe cloning operation.

Returns:
pointer to the new clone; the caller is responsible for deleting it. @draft ICU 1.8

Implements Collator.

virtual void RuleBasedCollator::setAttribute (UColAttribute attr, UColAttributeValue value, UErrorCode &status) [virtual]
 

Universal attribute setter.

Parameters:
attr  attribute type
value  attribute value
status  to indicate whether the operation went on smoothly or there were errors @draft ICU 1.8

Implements Collator.

virtual void RuleBasedCollator::setDecomposition (Normalizer::EMode mode) [virtual]
 

Set the decomposition mode of the Collator object.

success is equal to U_ILLEGAL_ARGUMENT_ERROR if error occurs.

Parameters:
mode  the new decomposition mode
See also:
Collator::getDecomposition
Deprecated:
To be removed after 2002-sep-30; use setAttribute().

Implements Collator.

virtual void RuleBasedCollator::setStrength (ECollationStrength newStrength) [virtual]
 

Sets the minimum strength to be used in comparison or transformation.

See also:
RuleBasedCollator::getStrength
Parameters:
newStrength  the new comparison level. @stable

Implements Collator.

virtual void RuleBasedCollator::setVariableTop (const uint32_t varTop, UErrorCode &status) [virtual]
 

Sets the variable top to a collation element value supplied.

Variable top is set to the upper 16 bits. Lower 16 bits are ignored.

Parameters:
varTop  CE value, as returned by setVariableTop or ucol_getVariableTop
status  error code (not changed by function) @draft ICU 2.0

Implements Collator.

virtual uint32_t RuleBasedCollator::setVariableTop (const UnicodeString varTop, UErrorCode &status) [virtual]
 

Sets the variable top to a collation element value of a string supplied.

Parameters:
varTop  a UnicodeString of one or more UChars (more than one only for a contraction) to which the variable top should be set
status  error code. If error code is set, the return value is undefined. Errors set by this function are:
U_CE_NOT_FOUND_ERROR if more than one character was passed and there is no such contraction
U_PRIMARY_TOO_LONG_ERROR if the primary for the variable top has more than two bytes
Returns:
a 32 bit value containing the value of the variable top in upper 16 bits. Lower 16 bits are undefined @draft ICU 2.0

Implements Collator.

virtual uint32_t RuleBasedCollator::setVariableTop (const UChar *varTop, int32_t len, UErrorCode &status) [virtual]
 

Sets the variable top to a collation element value of a string supplied.

Parameters:
varTop  one or more (if contraction) UChars to which the variable top should be set
len  length of variable top string. If -1 it is considered to be zero terminated.
status  error code. If error code is set, the return value is undefined. Errors set by this function are:
U_CE_NOT_FOUND_ERROR if more than one character was passed and there is no such contraction
U_PRIMARY_TOO_LONG_ERROR if the primary for the variable top has more than two bytes
Returns:
a 32 bit value containing the value of the variable top in upper 16 bits. Lower 16 bits are undefined @draft ICU 2.0

Implements Collator.


Friends And Related Function Documentation

friend class StringSearch [friend]
 

Searching over collation elements in a character source.

Created by: Helena Shih

Modification History:

Date      Name     Description
2/5/97    aliu     Added streamIn and streamOut methods. Added constructor which reads RuleBasedCollator object from a binary file. Added writeToFile method which streams RuleBasedCollator out to a binary file. The streamIn and streamOut methods use istream and ostream objects in binary mode.
2/12/97   aliu     Modified to use TableCollationData sub-object to hold invariant data.
2/13/97   aliu     Moved several methods into this class from Collation. Added a private RuleBasedCollator(Locale&) constructor, to be used by Collator::createDefault(). General clean up.
2/20/97   helena   Added clone, operator==, operator!=, operator=, and copy constructor and getDynamicClassID.
3/5/97    aliu     Modified constructFromFile() to add parameter specifying whether or not binary loading is to be attempted. This is required for dynamic rule loading.
05/07/97  helena   Added memory allocation error detection.
6/17/97   helena   Added IDENTICAL strength for compare, changed getRules to use MergeCollation::getPattern.
6/20/97   helena   Java class name change.
8/18/97   helena   Added internal API documentation.
09/03/97  helena   Added createCollationKeyValues().
02/10/98  damiba   Added compare with "length" parameter
08/05/98  erm      Synched with 1.2 version of RuleBasedCollator.java
04/23/99  stephen  Removed EDecompositionMode, merged with Normalizer::EMode
06/14/99  stephen  Removed kResourceBundleSuffix
11/02/99  helena   Collator performance enhancements. Eliminates the UnicodeString construction and special case for NO_OP.
11/23/99  srl      More performance enhancements. Updates to NormalizerIterator internal state management.
12/15/99  aliu     Update to support Thai collation. Move NormalizerIterator to implementation file.
01/29/01  synwee   Modified into a C++ wrapper which calls C API (ucol.h)


The documentation for this class was generated from the following file:
Generated on Mon Mar 4 20:09:30 2002 for ICU 2.0 by doxygen1.2.14 written by Dimitri van Heesch, © 1997-2002