
CharacterIterator Class Reference

Abstract class that defines an API for iteration on text objects.

#include <chariter.h>

Inheritance diagram for CharacterIterator:

Base class: ForwardCharacterIterator. Derived classes: UCharCharacterIterator, StringCharacterIterator.

Public Types

enum  EOrigin { kStart, kCurrent, kEnd }
 Origin enumeration for the move() and move32() functions.


Public Methods

virtual CharacterIterator * clone (void) const=0
 Returns a pointer to a new CharacterIterator of the same concrete class as this one, and referring to the same character in the same text-storage object as this one.

virtual UChar first (void)=0
 Sets the iterator to refer to the first code unit in its iteration range, and returns that code unit.

virtual UChar firstPostInc (void)
 Sets the iterator to refer to the first code unit in its iteration range, returns that code unit, and moves the position to the second code unit.

virtual UChar32 first32 (void)=0
 Sets the iterator to refer to the first code point in its iteration range, and returns that code point. This can be used to begin an iteration with next32().

virtual UChar32 first32PostInc (void)
 Sets the iterator to refer to the first code point in its iteration range, returns that code point, and moves the position to the second code point.

UTextOffset setToStart ()
 Sets the iterator to refer to the first code unit or code point in its iteration range.

virtual UChar last (void)=0
 Sets the iterator to refer to the last code unit in its iteration range, and returns that code unit.

virtual UChar32 last32 (void)=0
 Sets the iterator to refer to the last code point in its iteration range, and returns that code point.

UTextOffset setToEnd ()
 Sets the iterator to the end of its iteration range, just past the last code unit or code point.

virtual UChar setIndex (UTextOffset position)=0
 Sets the iterator to refer to the "position"-th code unit in the text-storage object the iterator refers to, and returns that code unit.

virtual UChar32 setIndex32 (UTextOffset position)=0
 Sets the iterator to refer to the beginning of the code point that contains the "position"-th code unit in the text-storage object the iterator refers to, and returns that code point.

virtual UChar current (void) const=0
 Returns the code unit the iterator currently refers to.

virtual UChar32 current32 (void) const=0
 Returns the code point the iterator currently refers to.

virtual UChar next (void)=0
 Advances to the next code unit in the iteration range (toward endIndex()), and returns that code unit.

virtual UChar32 next32 (void)=0
 Advances to the next code point in the iteration range (toward endIndex()), and returns that code point.

virtual UChar previous (void)=0
 Advances to the previous code unit in the iteration range (toward startIndex()), and returns that code unit.

virtual UChar32 previous32 (void)=0
 Advances to the previous code point in the iteration range (toward startIndex()), and returns that code point.

virtual UBool hasPrevious ()=0
 Returns FALSE if there are no more code units or code points before the current position in the iteration range.

UTextOffset startIndex (void) const
 Returns the numeric index in the underlying text-storage object of the character returned by first().

UTextOffset endIndex (void) const
 Returns the numeric index in the underlying text-storage object of the position immediately BEYOND the character returned by last().

UTextOffset getIndex (void) const
 Returns the numeric index in the underlying text-storage object of the character the iterator currently refers to (i.e., the character returned by current()).

int32_t getLength () const
 Returns the length of the entire text in the underlying text-storage object.

virtual UTextOffset move (int32_t delta, EOrigin origin)=0
 Moves the current position relative to the start or end of the iteration range, or relative to the current position itself.

virtual UTextOffset move32 (int32_t delta, EOrigin origin)=0
 Moves the current position relative to the start or end of the iteration range, or relative to the current position itself.

virtual void getText (UnicodeString &result)=0
 Copies the text under iteration into the UnicodeString referred to by "result".


Protected Methods

 CharacterIterator ()
 CharacterIterator (int32_t length)
 CharacterIterator (int32_t length, UTextOffset position)
 CharacterIterator (int32_t length, UTextOffset textBegin, UTextOffset textEnd, UTextOffset position)
 CharacterIterator (const CharacterIterator &that)
CharacterIterator & operator= (const CharacterIterator &that)

Protected Attributes

int32_t textLength
UTextOffset pos
UTextOffset begin
UTextOffset end

Detailed Description

Abstract class that defines an API for iteration on text objects.

This is an interface for forward and backward iteration and random access into a text object.

The API is backward compatible with the Java and older ICU CharacterIterator classes but extends them significantly:

  1. CharacterIterator is now a subclass of ForwardCharacterIterator.
  2. While the old API functions provided forward iteration with "pre-increment" semantics, the new one also provides functions with "post-increment" semantics. They are more efficient and should be the preferred iterator functions for new implementations. The backward iteration always had "pre-decrement" semantics, which are efficient.
  3. Just like ForwardCharacterIterator, it provides access to both code units and code points. Code point access versions are available for the old and the new iteration semantics.
  4. There are new functions for setting and moving the current position without returning a character, for efficiency.

      See ForwardCharacterIterator for examples of using the new forward iteration functions. For backward iteration, there is also a hasPrevious() function that can be used analogously to hasNext(). The old functions work as before and are shown below.
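To make the difference between the two iteration styles concrete, here is a minimal sketch that does not use ICU at all. `MiniIterator` is a hypothetical stand-in over a `std::u16string`; its method names mirror the ones documented here, and the `DONE` value of 0xFFFF is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for illustration only; this is not the ICU class.
class MiniIterator {
public:
    enum { DONE = 0xFFFF };  // assumed end-of-iteration sentinel

    explicit MiniIterator(const std::u16string &s) : text(s), pos(0) {}

    // "Pre-increment" style: move first, then return the unit at the new position.
    char16_t first() { pos = 0; return current(); }
    char16_t next() {
        if (pos < text.size()) { ++pos; }
        return current();
    }

    // "Post-increment" style: return the current unit, then step past it.
    void setToStart() { pos = 0; }
    bool hasNext() const { return pos < text.size(); }
    char16_t nextPostInc() {
        assert(hasNext());  // callers are expected to check hasNext() first
        return text[pos++];
    }

private:
    char16_t current() const {
        if (pos < text.size()) { return text[pos]; }
        return static_cast<char16_t>(DONE);
    }
    std::u16string text;
    std::size_t pos;
};

// Old style: the loop ends when a returned unit equals DONE.
std::u16string collectPreInc(MiniIterator &it) {
    std::u16string out;
    for (char16_t c = it.first(); c != MiniIterator::DONE; c = it.next()) {
        out.push_back(c);
    }
    return out;
}

// New style: hasNext() decides when to stop, independent of the unit values.
std::u16string collectPostInc(MiniIterator &it) {
    std::u16string out;
    for (it.setToStart(); it.hasNext();) {
        out.push_back(it.nextPostInc());
    }
    return out;
}
```

In this sketch both loops visit the same units for ordinary text. They differ only when the text itself contains the sentinel value: the sentinel-based loop stops there, while the hasNext()-based loop continues to the end.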

      Examples for some of the new functions:

      Forward iteration with hasNext():

        void forward1(CharacterIterator &it) {
            UChar32 c;
            for(it.setToStart(); it.hasNext();) {
                c=it.next32PostInc();
                // use c
            }
        }

      Forward iteration more similar to loops with the old forward iteration, showing a way to convert simple for() loops:

        void forward2(CharacterIterator &it) {
            UChar c;
            for(c=it.firstPostInc(); c!=CharacterIterator::DONE; c=it.nextPostInc()) {
                // use c
            }
        }

      Backward iteration with setToEnd() and hasPrevious():

        void backward1(CharacterIterator &it) {
            UChar32 c;
            for(it.setToEnd(); it.hasPrevious();) {
                c=it.previous32();
                // use c
            }
        }

      Backward iteration with a more traditional for() loop:

        void backward2(CharacterIterator &it) {
            UChar c;
            for(c=it.last(); c!=CharacterIterator::DONE; c=it.previous()) {
                // use c
            }
        }
      

      Example for random access:

        void random(CharacterIterator &it) {
            // set to the third code point from the beginning
            it.move32(3, CharacterIterator::kStart);
            // get a code point from here without moving the position
            UChar32 c=it.current32();
            // get the position
            int32_t pos=it.getIndex();
            // get the previous code unit
            UChar u=it.previous();
            // move back one more code unit
            it.move(-1, CharacterIterator::kCurrent);
            // set the position back to where it was
            // and read the same code point c and move beyond it
            it.setIndex(pos);
            if(c!=it.next32PostInc()) {
                exit(1); // CharacterIterator inconsistent
            }
        }
      

      Examples, especially for the old API:

      Function processing characters, in this example simple output

       
        void processChar( UChar c )
        {
            cout << " " << c;
        }
      
      Traverse the text from start to finish

        void traverseForward(CharacterIterator& iter)
        {
            for(UChar c = iter.first(); c != CharacterIterator::DONE; c = iter.next()) {
                processChar(c);
            }
        }

      Traverse the text backwards, from end to start

        void traverseBackward(CharacterIterator& iter)
        {
            for(UChar c = iter.last(); c != CharacterIterator::DONE; c = iter.previous()) {
                processChar(c);
            }
        }

      Traverse both forward and backward from a given position in the text. The letter-or-digit test in this example stands in for whatever additional stopping criteria an application needs.

        void traverseOut(CharacterIterator& iter, UTextOffset pos)
        {
            UChar c;
            // scan forward from pos while the code units are letters or digits
            for (c = iter.setIndex(pos);
                c != CharacterIterator::DONE && (Unicode::isLetter(c) || Unicode::isDigit(c));
                c = iter.next()) {}
            UTextOffset end = iter.getIndex();
            // scan backward from pos under the same condition
            for (c = iter.setIndex(pos);
                c != CharacterIterator::DONE && (Unicode::isLetter(c) || Unicode::isDigit(c));
                c = iter.previous()) {}
            UTextOffset start = iter.getIndex() + 1;

            cout << "start: " << start << " end: " << end << endl;
            for (c = iter.setIndex(start); iter.getIndex() < end; c = iter.next() ) {
                processChar(c);
            }
        }
      
      Creating a StringCharacterIterator and calling the test functions

        void CharacterIterator_Example( void )
        {
            cout << endl << "===== CharacterIterator_Example: =====" << endl;
            UnicodeString text("Ein kleiner Satz.");
            StringCharacterIterator iterator(text);
            cout << "----- traverseForward: -----------" << endl;
            traverseForward( iterator );
            cout << endl << endl << "----- traverseBackward: ----------" << endl;
            traverseBackward( iterator );
            cout << endl << endl << "----- traverseOut: ---------------" << endl;
            traverseOut( iterator, 7 );
            cout << endl << endl << "-----" << endl;
        }
      

      @stable


      Member Enumeration Documentation

      enum CharacterIterator::EOrigin
       

      Origin enumeration for the move() and move32() functions.

      @stable


      Member Function Documentation

      virtual CharacterIterator* CharacterIterator::clone (void) const [pure virtual]
       

      Returns a pointer to a new CharacterIterator of the same concrete class as this one, and referring to the same character in the same text-storage object as this one.

      The caller is responsible for deleting the new clone. @stable

      Implemented in StringCharacterIterator.
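Because the caller owns the object that clone() returns, one common way to avoid leaks is to hand the raw pointer to a smart pointer immediately. The sketch below illustrates that ownership pattern with hypothetical stand-in classes (`MiniIter`, `MiniStringIter`) and a hypothetical helper `cloneOwned()`; none of these are ICU names. For simplicity each stand-in holds its own copy of the text, which differs from the shared text-storage object described above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration; these are not the ICU classes.
class MiniIter {
public:
    virtual ~MiniIter() {}
    // Same contract as documented for clone(): the caller owns the result.
    virtual MiniIter *clone() const = 0;
    virtual char16_t current() const = 0;
    virtual void advance() = 0;
};

class MiniStringIter : public MiniIter {
public:
    explicit MiniStringIter(const std::u16string &s) : text(s), pos(0) {}

    // Copy-constructs a new object of the same concrete class at the same position.
    MiniIter *clone() const override { return new MiniStringIter(*this); }

    char16_t current() const override {
        assert(pos < text.size());
        return text[pos];
    }

    // Steps forward by one code unit, staying on the last unit at the end.
    void advance() override {
        if (pos + 1 < text.size()) { ++pos; }
    }

private:
    std::u16string text;
    std::size_t pos;
};

// Takes ownership of the raw pointer right away so it is deleted automatically.
std::unique_ptr<MiniIter> cloneOwned(const MiniIter &it) {
    return std::unique_ptr<MiniIter>(it.clone());
}
```

The clone starts at the same position as the original, and moving one of them afterwards does not move the other.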

      virtual UChar CharacterIterator::current (void) const [pure virtual]
       

      Returns the code unit the iterator currently refers to.

      @stable

      Implemented in UCharCharacterIterator.

      virtual UChar32 CharacterIterator::current32 (void) const [pure virtual]
       

      Returns the code point the iterator currently refers to.

      @stable

      Implemented in UCharCharacterIterator.

      UTextOffset CharacterIterator::endIndex (void) const [inline]
       

      Returns the numeric index in the underlying text-storage object of the position immediately BEYOND the character returned by last().

      @stable

      virtual UChar CharacterIterator::first (void) [pure virtual]
       

      Sets the iterator to refer to the first code unit in its iteration range, and returns that code unit.

      This can be used to begin an iteration with next(). @stable

      Implemented in UCharCharacterIterator.

      virtual UChar32 CharacterIterator::first32 (void) [pure virtual]

      Sets the iterator to refer to the first code point in its iteration range, and returns that code point. This can be used to begin an iteration with next32().

      Note that an iteration with next32PostInc(), beginning with, e.g., setToStart() or firstPostInc(), is more efficient. @stable

      Implemented in UCharCharacterIterator.

      virtual UChar32 CharacterIterator::first32PostInc (void) [virtual]
       

      Sets the iterator to refer to the first code point in its iteration range, returns that code point, and moves the position to the second code point.

      This is an alternative to setToStart() for forward iteration with next32PostInc(). @stable

      Reimplemented in UCharCharacterIterator.

      virtual UChar CharacterIterator::firstPostInc (void) [virtual]
       

      Sets the iterator to refer to the first code unit in its iteration range, returns that code unit, and moves the position to the second code unit.

      This is an alternative to setToStart() for forward iteration with nextPostInc(). @stable

      Reimplemented in UCharCharacterIterator.

      UTextOffset CharacterIterator::getIndex (void) const [inline]
       

      Returns the numeric index in the underlying text-storage object of the character the iterator currently refers to (i.e., the character returned by current()).

      @stable

      int32_t CharacterIterator::getLength () const [inline]
       

      Returns the length of the entire text in the underlying text-storage object.

      @stable

      virtual void CharacterIterator::getText (UnicodeString &result) [pure virtual]
       

      Copies the text under iteration into the UnicodeString referred to by "result".

      Parameters:
      result  Receives a copy of the text under iteration. @stable

      Implemented in StringCharacterIterator.

      virtual UBool CharacterIterator::hasPrevious () [pure virtual]
       

      Returns FALSE if there are no more code units or code points before the current position in the iteration range.

      This is used with previous() or previous32() in backward iteration. @stable

      Implemented in UCharCharacterIterator.
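Backward iteration by code point has to recognize a surrogate pair from its second half. The sketch below shows one way that "pre-decrement" step could work on plain UTF-16 text. It does not use ICU; `previousCodePoint()`, `reverseCodePoints()`, and the two helper predicates are hypothetical names, and the `pos > 0` test plays the role that hasPrevious() plays above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration; these are not ICU functions.
inline bool isLeadUnit(char16_t u)  { return u >= 0xD800 && u <= 0xDBFF; }
inline bool isTrailUnit(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Pre-decrement step: moves pos back over one code point and returns it.
char32_t previousCodePoint(const std::u16string &s, std::size_t &pos) {
    assert(pos > 0 && pos <= s.size());
    char16_t unit = s[--pos];
    // A trail surrogate preceded by a lead surrogate forms one code point.
    if (isTrailUnit(unit) && pos > 0 && isLeadUnit(s[pos - 1])) {
        char16_t lead = s[--pos];
        return 0x10000
            + ((static_cast<char32_t>(lead) - 0xD800) << 10)
            + (static_cast<char32_t>(unit) - 0xDC00);
    }
    return unit;
}

// Collects all code points from the end of the text to its start.
std::u32string reverseCodePoints(const std::u16string &s) {
    std::u32string out;
    std::size_t pos = s.size();   // analogous to setToEnd()
    while (pos > 0) {             // analogous to hasPrevious()
        out.push_back(previousCodePoint(s, pos));
    }
    return out;
}
```

Starting at the end position, just past the last code unit, means the first step back yields the last code point, which is why the loop needs no special case for its first iteration.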

      virtual UChar CharacterIterator::last (void) [pure virtual]
       

      Sets the iterator to refer to the last code unit in its iteration range, and returns that code unit.

      This can be used to begin an iteration with previous(). @stable

      Implemented in UCharCharacterIterator.

      virtual UChar32 CharacterIterator::last32 (void) [pure virtual]

      Sets the iterator to refer to the last code point in its iteration range, and returns that code point.

      This can be used to begin an iteration with previous32(). @stable

      Implemented in UCharCharacterIterator.

      virtual UTextOffset CharacterIterator::move (int32_t delta, EOrigin origin) [pure virtual]
       

      Moves the current position relative to the start or end of the iteration range, or relative to the current position itself.

      The movement is expressed in numbers of code units forward or backward by specifying a positive or negative delta.

      Returns:
      the new position @stable

      Implemented in UCharCharacterIterator.
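The relationship between delta, origin, and the resulting position can be sketched as plain index arithmetic. The code below is an illustration only: `Range` and `moveUnits()` are hypothetical names, the enum is redeclared locally so the block stands alone, and clamping the result to the iteration range is an assumption of the sketch that the text above does not state.

```cpp
#include <cassert>
#include <cstdint>

// Redeclared here so the sketch is self-contained; mirrors the documented enum.
enum EOrigin { kStart, kCurrent, kEnd };

// Hypothetical stand-in for the iterator's range and position.
struct Range {
    int32_t begin;  // index of the first code unit in the iteration range
    int32_t end;    // index just past the last code unit
    int32_t pos;    // current position
};

// Moves r.pos by delta code units relative to the chosen origin and
// returns the new position. Clamping to [begin, end] is an assumption.
int32_t moveUnits(Range &r, int32_t delta, EOrigin origin) {
    assert(r.begin <= r.end);
    int32_t base = r.pos;                    // kCurrent
    if (origin == kStart) { base = r.begin; }
    else if (origin == kEnd) { base = r.end; }

    // Use a wider type so that base + delta cannot overflow.
    int64_t target = static_cast<int64_t>(base) + delta;
    if (target < r.begin) { target = r.begin; }
    if (target > r.end) { target = r.end; }

    r.pos = static_cast<int32_t>(target);
    return r.pos;
}
```

With a range of [2, 10), for example, `moveUnits(r, 3, kStart)` yields 5 and `moveUnits(r, -2, kEnd)` yields 8, whatever the previous position was.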

      virtual UTextOffset CharacterIterator::move32 (int32_t delta, EOrigin origin) [pure virtual]
       

      Moves the current position relative to the start or end of the iteration range, or relative to the current position itself.

      The movement is expressed in numbers of code points forward or backward by specifying a positive or negative delta.

      Returns:
      the new position @stable

      Implemented in UCharCharacterIterator.
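Counting in code points rather than code units means that a surrogate pair advances the index by two units while consuming only one step of the delta. A minimal sketch of the forward case on plain UTF-16 text follows; `forwardCodePoints()` and the helper predicates are hypothetical names, not ICU functions, and stopping at the end of the text is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration; these are not ICU functions.
inline bool isLead(char16_t u)  { return u >= 0xD800 && u <= 0xDBFF; }
inline bool isTrail(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Moves forward from code-unit index pos by up to count code points and
// returns the resulting code-unit index. Stops early at the end of the text.
std::size_t forwardCodePoints(const std::u16string &s, std::size_t pos,
                              std::size_t count) {
    assert(pos <= s.size());
    while (count > 0 && pos < s.size()) {
        // A lead surrogate followed by a trail surrogate is one code point
        // stored in two code units; everything else occupies one unit.
        if (isLead(s[pos]) && pos + 1 < s.size() && isTrail(s[pos + 1])) {
            pos += 2;
        } else {
            pos += 1;
        }
        --count;
    }
    return pos;
}
```

For the text "a", U+1F600, "b" (four code units), moving two code points from index 0 lands on index 3, whereas moving two code units would land on index 2, in the middle of the surrogate pair.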

      virtual UChar CharacterIterator::next (void) [pure virtual]
       

      Advances to the next code unit in the iteration range (toward endIndex()), and returns that code unit.

      If there are no more code units to return, returns DONE. @stable

      Implemented in UCharCharacterIterator.

      virtual UChar32 CharacterIterator::next32 (void) [pure virtual]
       

      Advances to the next code point in the iteration range (toward endIndex()), and returns that code point.

      If there are no more code points to return, returns DONE. Note that iteration with "pre-increment" semantics is less efficient than iteration with "post-increment" semantics that is provided by next32PostInc(). @stable

      Implemented in UCharCharacterIterator.

      virtual UChar CharacterIterator::previous (void) [pure virtual]

      Advances to the previous code unit in the iteration range (toward startIndex()), and returns that code unit.

      If there are no more code units to return, returns DONE. @stable

      Implemented in UCharCharacterIterator.

      virtual UChar32 CharacterIterator::previous32 (void) [pure virtual]

      Advances to the previous code point in the iteration range (toward startIndex()), and returns that code point.

      If there are no more code points to return, returns DONE. @stable

      Implemented in UCharCharacterIterator.

      virtual UChar CharacterIterator::setIndex (UTextOffset position) [pure virtual]
       

      Sets the iterator to refer to the "position"-th code unit in the text-storage object the iterator refers to, and returns that code unit.

      @stable

      Implemented in UCharCharacterIterator.

      virtual UChar32 CharacterIterator::setIndex32 (UTextOffset position) [pure virtual]
       

      Sets the iterator to refer to the beginning of the code point that contains the "position"-th code unit in the text-storage object the iterator refers to, and returns that code point.

      The current position is adjusted to the beginning of the code point (its first code unit). @stable

      Implemented in UCharCharacterIterator.
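The adjustment described for setIndex32() can be pictured as two small steps on UTF-16 text: back up by one unit if the requested index falls on the second half of a surrogate pair, then decode from there. The sketch below illustrates this without ICU; `codePointStart()`, `codePointAt()`, and the helper predicates are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration; these are not ICU functions.
inline bool isLeadSurrogate(char16_t u)  { return u >= 0xD800 && u <= 0xDBFF; }
inline bool isTrailSurrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Returns the index of the first code unit of the code point containing s[pos].
std::size_t codePointStart(const std::u16string &s, std::size_t pos) {
    assert(pos < s.size());
    if (pos > 0 && isTrailSurrogate(s[pos]) && isLeadSurrogate(s[pos - 1])) {
        return pos - 1;  // pos was in the middle of a surrogate pair
    }
    return pos;
}

// Decodes the code point whose first code unit is at index pos.
char32_t codePointAt(const std::u16string &s, std::size_t pos) {
    assert(pos < s.size());
    char16_t unit = s[pos];
    if (isLeadSurrogate(unit) && pos + 1 < s.size() &&
        isTrailSurrogate(s[pos + 1])) {
        return 0x10000
            + ((static_cast<char32_t>(unit) - 0xD800) << 10)
            + (static_cast<char32_t>(s[pos + 1]) - 0xDC00);
    }
    return unit;  // a single-unit code point (or an unpaired surrogate)
}
```

For the text "a", U+1F600, "b", asking for index 2 (the trail surrogate) yields start index 1, and decoding at index 1 yields U+1F600.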

      UTextOffset CharacterIterator::setToEnd () [inline]

      Sets the iterator to the end of its iteration range, just past the last code unit or code point.

      This can be used to begin a backward iteration with previous() or previous32().

      Returns:
      the end position of the iteration range @stable

      UTextOffset CharacterIterator::setToStart () [inline]
       

      Sets the iterator to refer to the first code unit or code point in its iteration range.

      This can be used to begin a forward iteration with nextPostInc() or next32PostInc().

      Returns:
      the start position of the iteration range @stable

      UTextOffset CharacterIterator::startIndex (void) const [inline]
       

      Returns the numeric index in the underlying text-storage object of the character returned by first().

      Since it's possible to create an iterator that iterates across only part of a text-storage object, this number isn't necessarily 0. @stable
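A small sketch may help show why the start index need not be 0. `SubRangeIter` below is a hypothetical stand-in, not the ICU class; its field names follow the protected attributes listed earlier (begin, end, pos), and its constructor signature is an assumption. All indices are positions in the underlying string, not offsets from the start of the range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for illustration only; this is not the ICU class.
struct SubRangeIter {
    const std::u16string *text;  // underlying storage (not owned)
    int32_t begin;               // first index of the iteration range
    int32_t end;                 // index just past the last code unit
    int32_t pos;                 // current position

    SubRangeIter(const std::u16string &s, int32_t b, int32_t e)
        : text(&s), begin(b), end(e), pos(b) {
        assert(0 <= b && b <= e && e <= static_cast<int32_t>(s.size()));
    }

    int32_t startIndex() const { return begin; }
    int32_t endIndex() const { return end; }
    int32_t getIndex() const { return pos; }
    // Length of the whole underlying text, not of the range.
    int32_t getLength() const { return static_cast<int32_t>(text->size()); }

    bool hasNext() const { return pos < end; }
    char16_t nextPostInc() {
        assert(hasNext());
        return (*text)[static_cast<std::size_t>(pos++)];
    }
};

// Collects the code units from the start of the range to its end.
std::u16string collectRange(SubRangeIter &it) {
    std::u16string out;
    for (it.pos = it.startIndex(); it.hasNext();) {
        out.push_back(it.nextPostInc());
    }
    return out;
}
```

Iterating over the range [4, 11) of "Ein kleiner Satz." visits only "kleiner"; the start index is 4 while the length still reports the full 17 code units.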


      The documentation for this class was generated from the following file:
        chariter.h

      Generated on Mon Mar 4 23:18:56 2002 for ICU 2.0 by doxygen 1.2.14 written by Dimitri van Heesch, © 1997-2002