      std::regex_token_iterator

      From cppreference.com
      Defined in header <regex>
      template<
          class BidirIt,
          class CharT = typename std::iterator_traits<BidirIt>::value_type,
          class Traits = std::regex_traits<CharT>
      > class regex_token_iterator;
      (since C++11)

      std::regex_token_iterator is a read-only LegacyForwardIterator that accesses the individual sub-matches of every match of a regular expression within the underlying character sequence. It can also be used to access the parts of the sequence that were not matched by the given regular expression (e.g. as a tokenizer).

      On construction, it constructs a std::regex_iterator, and on every increment it steps through the requested sub-matches from the current match_results, incrementing the underlying std::regex_iterator when incrementing away from the last submatch.
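      The stepping order described above can be sketched as follows. The helper name key_value_tokens and the pattern are made up for illustration; the only library feature assumed is the constructor overload taking a std::initializer_list<int> of submatch indices. For each match, the iterator yields group 1, then group 2, and only then moves the underlying std::regex_iterator to the next match.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of the library): collects capture groups
// 1 and 2 of every "key=value" match, in the order the iterator yields them.
std::vector<std::string> key_value_tokens(const std::string& text)
{
    // The regex is a named local so that it outlives the iterators below.
    const std::regex pair_re(R"((\w+)=(\w+))");

    std::vector<std::string> out;
    const std::sregex_token_iterator end; // end-of-sequence iterator
    // {1, 2} requests the first and second sub-match of each match.
    for (std::sregex_token_iterator it(text.begin(), text.end(), pair_re, {1, 2});
         it != end; ++it)
    {
        out.push_back(it->str());
    }
    return out;
}
```

      Under these assumptions, key_value_tokens("a=1 b=2") yields the four tokens a, 1, b, 2 in that order.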

      The default-constructed std::regex_token_iterator is the end-of-sequence iterator. When a valid std::regex_token_iterator is incremented after reaching the last submatch of the last match, it becomes equal to the end-of-sequence iterator. Dereferencing or incrementing it further invokes undefined behavior.
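      A minimal sketch of using the default-constructed iterator as the end of a range. The function name count_numbers and the pattern are illustrative only. When the regex never matches and only submatch 0 is requested, the freshly constructed iterator already compares equal to the end-of-sequence iterator, so the range is empty.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: counts the runs of digits in text by measuring the
// distance between a token iterator and the end-of-sequence iterator.
std::size_t count_numbers(const std::string& text)
{
    const std::regex num_re(R"(\d+)");
    // Submatch index 0 selects the whole match.
    const std::sregex_token_iterator first(text.begin(), text.end(), num_re, 0);
    const std::sregex_token_iterator last; // default-constructed: end-of-sequence
    return static_cast<std::size_t>(std::distance(first, last));
}
```

      The iterators are never dereferenced or incremented once they equal the end-of-sequence iterator, which keeps the sketch clear of the undefined behavior mentioned above.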

      Just before becoming the end-of-sequence iterator, a std::regex_token_iterator may become a suffix iterator, if the index -1 (non-matched fragment) appears in the list of the requested submatch indices. Such an iterator, if dereferenced, returns a std::sub_match corresponding to the sequence of characters between the last match and the end of the sequence.
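      The suffix-iterator behavior is what makes index -1 usable for splitting. The sketch below assumes a hypothetical helper named split; the caller owns the separator regex. The final token comes from the suffix iterator, and if the separator never matches, the whole input is returned as the single non-matched fragment.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the fragments of text that lie between
// matches of sep. The caller must keep both text and sep alive for the call.
std::vector<std::string> split(const std::string& text, const std::regex& sep)
{
    std::vector<std::string> out;
    const std::sregex_token_iterator end;
    // Index -1 selects the non-matched fragments instead of the matches.
    for (std::sregex_token_iterator it(text.begin(), text.end(), sep, -1);
         it != end; ++it)
    {
        out.push_back(it->str()); // the last iteration visits the suffix
    }
    return out;
}
```

      For example, splitting "a, b,c" on the pattern ,\s* produces a, b and c, where c is the suffix after the second and last match.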

      A typical implementation of std::regex_token_iterator holds the underlying std::regex_iterator, a container (e.g. std::vector<int>) of the requested submatch indices, the internal counter equal to the index of the submatch, a pointer to std::sub_match, pointing at the current submatch of the current match, and a std::match_results object containing the last non-matched character sequence (used in tokenizer mode).

      Type requirements

      - BidirIt must meet the requirements of LegacyBidirectionalIterator.

      Specializations

      Several specializations for common character sequence types are defined:

      Defined in header <regex>
      Type                          Definition
      std::cregex_token_iterator    std::regex_token_iterator<const char*>
      std::wcregex_token_iterator   std::regex_token_iterator<const wchar_t*>
      std::sregex_token_iterator    std::regex_token_iterator<std::string::const_iterator>
      std::wsregex_token_iterator   std::regex_token_iterator<std::wstring::const_iterator>
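      The specialization to use follows from the iterator type of the character sequence. As a sketch, a null-terminated C string is traversed with std::cregex_token_iterator; the helper name words and the pattern are assumptions made for this example.

```cpp
#include <cstring>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: extracts runs of ASCII letters from a C string.
std::vector<std::string> words(const char* text)
{
    const std::regex word_re("[A-Za-z]+");
    const char* const last = text + std::strlen(text);

    std::vector<std::string> out;
    // cregex_token_iterator iterates over a [const char*, const char*) range;
    // the submatch index defaults to 0, i.e. the whole match.
    for (std::cregex_token_iterator it(text, last, word_re), end; it != end; ++it)
        out.push_back(it->str());
    return out;
}
```

      The same loop written against a std::string would use std::sregex_token_iterator with text.begin() and text.end().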

      Member types

      Member type                Definition
      value_type                 std::sub_match<BidirIt>
      difference_type            std::ptrdiff_t
      pointer                    const value_type*
      reference                  const value_type&
      iterator_category          std::forward_iterator_tag
      iterator_concept (C++20)   std::input_iterator_tag
      regex_type                 std::basic_regex<CharT, Traits>
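      The table can be checked at compile time. The following sketch does so for std::sregex_token_iterator; the alias TokenIt is introduced only for this example, and iterator_concept is left out because it exists only since C++20.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>
#include <type_traits>

// Alias introduced for this example only.
using TokenIt = std::sregex_token_iterator;

// value_type is the sub_match over the underlying iterator type.
static_assert(std::is_same<TokenIt::value_type,
                           std::sub_match<std::string::const_iterator>>::value, "");
static_assert(std::is_same<TokenIt::difference_type, std::ptrdiff_t>::value, "");
// Dereferencing gives read-only access to the current sub_match.
static_assert(std::is_same<TokenIt::pointer, const TokenIt::value_type*>::value, "");
static_assert(std::is_same<TokenIt::reference, const TokenIt::value_type&>::value, "");
static_assert(std::is_same<TokenIt::iterator_category,
                           std::forward_iterator_tag>::value, "");
// For char sequences with the default traits, regex_type is std::regex.
static_assert(std::is_same<TokenIt::regex_type, std::regex>::value, "");
```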

      Member functions

      (constructor)                               constructs a new regex_token_iterator (public member function)
      (destructor) (implicitly declared)          destructs a regex_token_iterator, including the cached value (public member function)
      operator=                                   assigns contents (public member function)
      operator==, operator!= (removed in C++20)   compares two regex_token_iterators (public member function)
      operator*, operator->                       accesses current submatch (public member function)
      operator++, operator++(int)                 advances the iterator to the next submatch (public member function)

      Notes

      It is the programmer's responsibility to ensure that the std::basic_regex object passed to the iterator's constructor outlives the iterator. Because the iterator stores a std::regex_iterator which stores a pointer to the regex, incrementing the iterator after the regex was destroyed results in undefined behavior.
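      One way to respect this lifetime rule is to let a single object own the regex and hand out iterators that refer to it. The class below is a hypothetical sketch, not a library facility; the name Tokenizer and its interface are assumptions. The iterators it returns stay valid only while both the Tokenizer and the text passed to begin are alive.

```cpp
#include <regex>
#include <string>

// Hypothetical wrapper: owns the separator regex so that it cannot be
// destroyed while the wrapper is still in use.
class Tokenizer
{
public:
    explicit Tokenizer(const std::string& pattern) : re_(pattern) {}

    // Returns an iterator over the non-matched fragments of text (index -1).
    // The caller must keep text alive while the iterator is in use;
    // passing a temporary string here would leave the iterator dangling.
    std::sregex_token_iterator begin(const std::string& text) const
    {
        return std::sregex_token_iterator(text.begin(), text.end(), re_, -1);
    }

    // The default-constructed iterator is the end-of-sequence iterator.
    static std::sregex_token_iterator end() { return {}; }

private:
    std::regex re_; // stored by value; the iterators only point at it
};
```

      A loop such as for (auto it = tok.begin(text); it != Tokenizer::end(); ++it) is then safe for as long as tok and text remain in scope.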

      Example

      #include <algorithm>
      #include <fstream>
      #include <iostream>
      #include <iterator>
      #include <regex>
      #include <string>

      int main()
      {
          // Tokenization (non-matched fragments)
          // Note that regex is matched only two times; when the third value is obtained
          // the iterator is a suffix iterator.
          const std::string text = "Quick brown fox.";
          const std::regex ws_re("\\s+"); // whitespace
          std::copy(std::sregex_token_iterator(text.begin(), text.end(), ws_re, -1),
                    std::sregex_token_iterator(),
                    std::ostream_iterator<std::string>(std::cout, "\n"));

          std::cout << '\n';

          // Iterating the first submatches
          const std::string html = R"(<p><a href="http://google.com">google</a> )"
                                   R"(< a HREF ="http://cppreference.com">cppreference</a>\n</p>)";
          const std::regex url_re(R"!!(<\s*A\s+[^>]*href\s*=\s*"([^"]*)")!!", std::regex::icase);
          std::copy(std::sregex_token_iterator(html.begin(), html.end(), url_re, 1),
                    std::sregex_token_iterator(),
                    std::ostream_iterator<std::string>(std::cout, "\n"));
      }

      Output:

      Quick
      brown
      fox.

      http://google.com
      http://cppreference.com

      Defect reports

      The following behavior-changing defect reports were applied retroactively to previously published C++ standards.

      DR                   Applied to   Behavior as published                         Correct behavior
      LWG 3698 (P2770R0)   C++20        regex_token_iterator was a forward_iterator   made input_iterator[1]
                                        while being a stashing iterator

      1. iterator_category was unchanged by the resolution, because changing it to std::input_iterator_tag might break too much existing code.
      Retrieved from "https://en.cppreference.com/mwiki/index.php?title=cpp/regex/regex_token_iterator&oldid=170722"
