class StringScanner

Class StringScanner supports processing a stored string as a stream; this code creates a new StringScanner object with string 'foobarbaz':

require 'strscan'
scanner = StringScanner.new('foobarbaz')

About the Examples

All examples here assume that StringScanner has been required:

require 'strscan'

Some examples here assume that these constants are defined:

MULTILINE_TEXT = <<~EOT
  Go placidly amid the noise and haste,
  and remember what peace there may be in silence.
EOT
HIRAGANA_TEXT = 'こんにちは'
ENGLISH_TEXT = 'Hello'

Some examples here assume that certain helper methods are defined:

See examples at helper methods.
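The helper definitions are not reproduced in this section. As a rough guide, a helper such as put_situation might look like the sketch below; its body is an assumption inferred from the "Situation:" output printed in the examples, not the documented implementation.

```ruby
require 'strscan'

# Hypothetical sketch of the put_situation helper; the layout is inferred
# from the "Situation:" output shown in the examples in this document.
def put_situation(scanner)
  puts '# Situation:'
  puts "#   pos:       #{scanner.pos}"
  puts "#   charpos:   #{scanner.charpos}"
  puts "#   rest:      #{scanner.rest.inspect}"
  puts "#   rest_size: #{scanner.rest_size}"
end

put_situation(StringScanner.new('foobarbaz'))
```

The sketch only reads from the scanner, so calling it between steps should not disturb the positions or match values.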

The StringScanner Object

This code creates a StringScanner object (we’ll call it simply a scanner), and shows some of its basic properties:

scanner = StringScanner.new('foobarbaz')
scanner.string # => "foobarbaz"
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "foobarbaz"
#   rest_size: 9

The scanner has:

Stored String

The stored string is the string stored in the StringScanner object.

Each of these methods sets, modifies, or returns the stored string:

| Method | Effect |
|---|---|
| ::new(string) | Creates a new scanner for the given string. |
| string=(new_string) | Replaces the existing stored string. |
| concat(more_string) | Appends a string to the existing stored string. |
| string | Returns the stored string. |
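As a quick illustration of the methods in the table, the following sketch appends to and then replaces the stored string. The variable names are only for illustration; the reset of the byte position after string= reflects the library's behavior as I understand it, and is not stated in the table itself.

```ruby
require 'strscan'

scanner = StringScanner.new('foo')   # ::new stores the given string.
scanner.concat('bar')                # Appends to the stored string.
after_concat = scanner.string        # => "foobar"

scanner.scan(/foo/)                  # Advance so the position is non-zero.
pos_before = scanner.pos             # => 3

scanner.string = 'baz'               # Replaces the stored string ...
after_replace = scanner.string       # => "baz"
pos_after = scanner.pos              # => 0 (... and the position starts over)
```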

Positions

A StringScanner object maintains a zero-based byte position and a zero-based character position.

Each of these methods explicitly sets positions:

| Method | Effect |
|---|---|
| reset | Sets both positions to zero (beginning of stored string). |
| terminate | Sets both positions to the end of the stored string. |
| pos=(new_byte_position) | Sets byte position; adjusts character position. |

Byte Position (Position)

The byte position (or simply position) is a zero-based index into the bytes in the scanner’s stored string; for a new StringScanner object, the byte position is zero.

When the byte position is:

To get or set the byte position:

Many methods use the byte position as the basis for finding matches; many others set, increment, or decrement the byte position:

scanner = StringScanner.new('foobar')
scanner.pos         # => 0
scanner.scan(/foo/) # => "foo" # Match found.
scanner.pos         # => 3     # Byte position incremented.
scanner.scan(/foo/) # => nil   # Match not found.
scanner.pos         # => 3     # Byte position not changed.

Some methods implicitly modify the byte position; see:

The values of these methods are derived directly from the values of pos and string:

Character Position

The character position is a zero-based index into the characters in the stored string; for a new StringScanner object, the character position is zero.

Method charpos returns the character position; its value may not be reset explicitly.

Some methods change (increment or reset) the character position; see:

Example (string includes multi-byte characters):

scanner = StringScanner.new(ENGLISH_TEXT) # Five 1-byte characters.
scanner.concat(HIRAGANA_TEXT)             # Five 3-byte characters.
scanner.string # => "Helloこんにちは"       # Twenty bytes in all.
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "Helloこんにちは"
#   rest_size: 20
scanner.scan(/Hello/) # => "Hello" # Five 1-byte characters.
put_situation(scanner)
# Situation:
#   pos:       5
#   charpos:   5
#   rest:      "こんにちは"
#   rest_size: 15
scanner.getch # => "こ"    # One 3-byte character.
put_situation(scanner)
# Situation:
#   pos:       8
#   charpos:   6
#   rest:      "んにちは"
#   rest_size: 12

Target Substring

The target substring is the part of the stored string that extends from the current byte position to the end of the stored string; it is always either:

The target substring is returned by method rest, and its size is returned by method rest_size.

Examples:

scanner = StringScanner.new('foobarbaz')
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "foobarbaz"
#   rest_size: 9
scanner.pos = 3
put_situation(scanner)
# Situation:
#   pos:       3
#   charpos:   3
#   rest:      "barbaz"
#   rest_size: 6
scanner.pos = 9
put_situation(scanner)
# Situation:
#   pos:       9
#   charpos:   9
#   rest:      ""
#   rest_size: 0

Setting the Target Substring

The target substring is set whenever:

Querying the Target Substring

This table summarizes (details and examples at the links):

| Method | Returns |
|---|---|
| rest | Target substring. |
| rest_size | Size (bytes) of target substring. |

Searching the Target Substring

A search method examines the target substring, but does not advance the positions or (by implication) shorten the target substring.

This table summarizes (details and examples at the links):

| Method | Returns | Sets Match Values? |
|---|---|---|
| check(pattern) | Matched leading substring or nil. | Yes. |
| check_until(pattern) | Matched substring (anywhere) or nil. | Yes. |
| exist?(pattern) | Matched substring (anywhere) end index. | Yes. |
| match?(pattern) | Size of matched leading substring or nil. | Yes. |
| peek(size) | Leading substring of given length (bytes). | No. |
| peek_byte | Integer leading byte or nil. | No. |
| rest | Target substring (from byte position to end). | No. |
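A short sketch of the search methods in the table, run one after another on the same scanner; the variable names are illustrative. The point is that the byte position is still zero afterwards. (peek_byte is left out because it may not be available in older versions of the library.)

```ruby
require 'strscan'

scanner = StringScanner.new('foobarbaz')

checked       = scanner.check(/foo/)        # => "foo"    (leading match)
checked_until = scanner.check_until(/bar/)  # => "foobar" (up to end of match)
end_index     = scanner.exist?(/bar/)       # => 6        (end of match, in bytes)
match_size    = scanner.match?(/foo/)       # => 3        (size of leading match)
peeked        = scanner.peek(4)             # => "foob"

pos_after_searches = scanner.pos            # => 0 (no search advanced the position)
```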

Traversing the Target Substring

A traversal method examines the target substring, and, if successful:

This table summarizes (details and examples at links):

| Method | Returns | Sets Match Values? |
|---|---|---|
| get_byte | Leading byte or nil. | No. |
| getch | Leading character or nil. | No. |
| scan(pattern) | Matched leading substring or nil. | Yes. |
| scan_byte | Integer leading byte or nil. | No. |
| scan_until(pattern) | Matched substring (anywhere) or nil. | Yes. |
| skip(pattern) | Matched leading substring size or nil. | Yes. |
| skip_until(pattern) | Position delta to end-of-matched-substring or nil. | Yes. |
| unscan | self. | No. |
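In contrast with the search methods, the traversal methods move the byte position on success. The sketch below walks a scanner through several of them; the variable names are illustrative, and scan_byte is omitted because it may not exist in older versions of the library.

```ruby
require 'strscan'

scanner = StringScanner.new('foobarbaz')

scanned = scanner.scan(/foo/)          # => "foo";    pos is now 3
skipped = scanner.skip(/bar/)          # => 3;        pos is now 6
scanner.unscan                         # Undo the last match.
pos_after_unscan = scanner.pos         # => 3
until_baz = scanner.scan_until(/baz/)  # => "barbaz"; pos is now 9
at_end = scanner.getch                 # => nil (nothing left to traverse)

scanner.reset
delta = scanner.skip_until(/bar/)      # => 6;        pos is now 6
char = scanner.getch                   # => "b";      pos is now 7
byte = scanner.get_byte                # => "a";      pos is now 8
final_pos = scanner.pos                # => 8
```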

Querying the Scanner

Each of these methods queries the scanner object without modifying it (details and examples at links):

| Method | Returns |
|---|---|
| beginning_of_line? | true or false. |
| charpos | Character position. |
| eos? | true or false. |
| fixed_anchor? | true or false. |
| inspect | String representation of self. |
| pos | Byte position. |
| rest | Target substring. |
| rest_size | Size of target substring. |
| string | Stored string. |

Matching

StringScanner implements pattern matching via Ruby class Regexp, and its matching behaviors are the same as Ruby’s except for the fixed-anchor property.

Matcher Methods

Each matcher method takes a single argument pattern, and attempts to find a matching substring in the target substring.

| Method | Pattern Type | Matches Target Substring | Success Return | May Update Positions? |
|---|---|---|---|---|
| check | Regexp or String. | At beginning. | Matched substring. | No. |
| check_until | Regexp or String. | Anywhere. | Substring. | No. |
| match? | Regexp or String. | At beginning. | Match size. | No. |
| exist? | Regexp or String. | Anywhere. | Substring size. | No. |
| scan | Regexp or String. | At beginning. | Matched substring. | Yes. |
| scan_until | Regexp or String. | Anywhere. | Substring. | Yes. |
| skip | Regexp or String. | At beginning. | Match size. | Yes. |
| skip_until | Regexp or String. | Anywhere. | Substring size. | Yes. |


Which matcher you choose will depend on:

Match Values

The match values in a StringScanner object generally contain the results of the most recent attempted match.

Each match value may be thought of as:

Each of these methods clears match values:

Each of these methods attempts a match based on a pattern, and either sets match values (if successful) or clears them (if not);

Basic Match Values

Basic match values are those not related to captures.

Each of these methods returns a basic match value:

| Method | Return After Match | Return After No Match |
|---|---|---|
| matched? | true. | false. |
| matched_size | Size of matched substring. | nil. |
| matched | Matched substring. | nil. |
| pre_match | Substring preceding matched substring. | nil. |
| post_match | Substring following matched substring. | nil. |


See examples below.

Captured Match Values

Captured match values are those related to captures.

Each of these methods returns a captured match value:

| Method | Return After Match | Return After No Match |
|---|---|---|
| size | Count of captured substrings. | nil. |
| [](n) | nth captured substring. | nil. |
| captures | Array of all captured substrings. | nil. |
| values_at(*n) | Array of specified captured substrings. | nil. |
| named_captures | Hash of named captures. | {}. |


See examples below.

Match Values Examples

Successful basic match attempt (no captures):

scanner = StringScanner.new('foobarbaz')
scanner.exist?(/bar/)
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "foo"
#   matched  :      "bar"
#   post_match:     "baz"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["bar", nil]
#   []:
#     [0]:          "bar"
#     [1]:          nil

Failed basic match attempt (no captures):

scanner = StringScanner.new('foobarbaz')
scanner.exist?(/nope/)
match_values_cleared?(scanner) # => true

Successful unnamed capture match attempt:

scanner = StringScanner.new('foobarbazbatbam')
scanner.exist?(/(foo)bar(baz)bat(bam)/)
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   15
#   pre_match:      ""
#   matched  :      "foobarbazbatbam"
#   post_match:     ""
# Captured match values:
#   size:           4
#   captures:       ["foo", "baz", "bam"]
#   named_captures: {}
#   values_at:      ["foobarbazbatbam", "foo", "baz", "bam", nil]
#   []:
#     [0]:          "foobarbazbatbam"
#     [1]:          "foo"
#     [2]:          "baz"
#     [3]:          "bam"
#     [4]:          nil

Successful named capture match attempt; same as unnamed above, except for named_captures:

scanner = StringScanner.new('foobarbazbatbam')
scanner.exist?(/(?<x>foo)bar(?<y>baz)bat(?<z>bam)/)
scanner.named_captures # => {"x"=>"foo", "y"=>"baz", "z"=>"bam"}

Failed unnamed capture match attempt:

scanner = StringScanner.new('somestring')
scanner.exist?(/(foo)bar(baz)bat(bam)/)
match_values_cleared?(scanner) # => true

Failed named capture match attempt; same as unnamed above, except for named_captures:

scanner = StringScanner.new('somestring')
scanner.exist?(/(?<x>foo)bar(?<y>baz)bat(?<z>bam)/)
match_values_cleared?(scanner) # => false
scanner.named_captures # => {"x"=>nil, "y"=>nil, "z"=>nil}

Fixed-Anchor Property

Pattern matching in StringScanner is the same as in Ruby, except for its fixed-anchor property, which determines the meaning of '\A':

The fixed-anchor property is set when the StringScanner object is created, and may not be modified (see StringScanner.new); method fixed_anchor? returns the setting.
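The following sketch contrasts the two settings. It reflects my understanding of the behavior rather than text from this section: with the default (false), '\A' matches at the current byte position; with fixed_anchor: true, '\A' matches only at the beginning of the stored string. The variable names are illustrative.

```ruby
require 'strscan'

# Default: fixed-anchor property is false.
floating = StringScanner.new('foobar')
floating_flag   = floating.fixed_anchor?  # => false
floating_first  = floating.scan(/\A./)    # => "f"
floating_second = floating.scan(/\A./)    # => "o" (\A follows the position)

# Fixed-anchor property set at creation; it cannot be changed later.
fixed = StringScanner.new('foobar', fixed_anchor: true)
fixed_flag   = fixed.fixed_anchor?        # => true
fixed_first  = fixed.scan(/\A./)          # => "f"
fixed_second = fixed.scan(/\A./)          # => nil (position is no longer zero)
fixed.reset
fixed_again  = fixed.scan(/\A./)          # => "f"
```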

Public Class Methods

Source
static VALUE
strscan_initialize(int argc, VALUE *argv, VALUE self)
{
    struct strscanner *p;
    VALUE str, options;

    p = check_strscan(self);
    rb_scan_args(argc, argv, "11", &str, &options);
    options = rb_check_hash_type(options);
    if (!NIL_P(options)) {
        VALUE fixed_anchor;
        ID keyword_ids[1];
        keyword_ids[0] = rb_intern("fixed_anchor");
        rb_get_kwargs(options, keyword_ids, 0, 1, &fixed_anchor);
        if (fixed_anchor == Qundef) {
            p->fixed_anchor_p = false;
        }
        else {
            p->fixed_anchor_p = RTEST(fixed_anchor);
        }
    }
    else {
        p->fixed_anchor_p = false;
    }
    StringValue(str);
    RB_OBJ_WRITE(self, &p->str, str);

    return self;
}

Returns a new StringScanner object whose stored string is the given string; sets the fixed-anchor property:

scanner = StringScanner.new('foobarbaz')
scanner.string # => "foobarbaz"
scanner.fixed_anchor? # => false
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "foobarbaz"
#   rest_size: 9

Public Instance Methods

Alias for: concat
Source
static VALUEstrscan_aref(VALUE self, VALUE idx){    const char *name;    struct strscanner *p;    long i;    GET_SCANNER(self, p);    if (! MATCHED_P(p))        return Qnil;    switch (TYPE(idx)) {        case T_SYMBOL:            idx = rb_sym2str(idx);            /* fall through */        case T_STRING:            RSTRING_GETMEM(idx, name, i);            i = name_to_backref_number(&(p->regs), p->regex, name, name + i, rb_enc_get(idx));            break;        default:            i = NUM2LONG(idx);    }    if (i < 0)        i += p->regs.num_regs;    if (i < 0)                 return Qnil;    if (i >= p->regs.num_regs) return Qnil;    if (p->regs.beg[i] == -1)  return Qnil;    return extract_range(p,                         adjust_register_position(p, p->regs.beg[i]),                         adjust_register_position(p, p->regs.end[i]));}

Returns a captured substring or nil; see Captured Match Values.

When there are captures:

scanner = StringScanner.new('Fri Dec 12 1975 14:39')
scanner.scan(/(?<wday>\w+) (?<month>\w+) (?<day>\d+) /)

  • Specifier zero: returns the entire matched substring:

    scanner[0]         # => "Fri Dec 12 "
    scanner.pre_match  # => ""
    scanner.post_match # => "1975 14:39"

  • Specifier positive integer: returns the nth capture, or nil if out of range:

    scanner[1] # => "Fri"
    scanner[2] # => "Dec"
    scanner[3] # => "12"
    scanner[4] # => nil

  • Specifier negative integer: counts backward from the last subgroup:

    scanner[-1] # => "12"
    scanner[-4] # => "Fri Dec 12 "
    scanner[-5] # => nil

  • Specifier symbol or string: returns the named subgroup, or nil if there is none:

    scanner[:wday]  # => "Fri"
    scanner['wday'] # => "Fri"
    scanner[:month] # => "Dec"
    scanner[:day]   # => "12"
    scanner[:nope]  # => nil

When there are no captures, only [0] returns non-nil:

scanner = StringScanner.new('foobarbaz')
scanner.exist?(/bar/)
scanner[0] # => "bar"
scanner[1] # => nil

For a failed match, even [0] returns nil:

scanner.scan(/nope/) # => nil
scanner[0] # => nil
scanner[1] # => nil
Source
static VALUEstrscan_bol_p(VALUE self){    struct strscanner *p;    GET_SCANNER(self, p);    if (CURPTR(p) > S_PEND(p)) return Qnil;    if (p->curr == 0) return Qtrue;    return (*(CURPTR(p) - 1) == '\n') ? Qtrue : Qfalse;}

Returns whether the position is at the beginning of a line; that is, at the beginning of the stored string or immediately after a newline:

scanner = StringScanner.new(MULTILINE_TEXT)
scanner.string # => "Go placidly amid the noise and haste,\nand remember what peace there may be in silence.\n"
scanner.pos # => 0
scanner.beginning_of_line? # => true
scanner.scan_until(/,/) # => "Go placidly amid the noise and haste,"
scanner.beginning_of_line? # => false
scanner.scan(/\n/) # => "\n"
scanner.beginning_of_line? # => true
scanner.terminate
scanner.beginning_of_line? # => true
scanner.concat('x')
scanner.terminate
scanner.beginning_of_line? # => false

StringScanner#bol? is an alias for StringScanner#beginning_of_line?.

Source
static VALUEstrscan_captures(VALUE self){    struct strscanner *p;    int   i, num_regs;    VALUE new_ary;    GET_SCANNER(self, p);    if (! MATCHED_P(p))        return Qnil;    num_regs = p->regs.num_regs;    new_ary  = rb_ary_new2(num_regs);    for (i = 1; i < num_regs; i++) {        VALUE str;        if (p->regs.beg[i] == -1)            str = Qnil;        else            str = extract_range(p,                                adjust_register_position(p, p->regs.beg[i]),                                adjust_register_position(p, p->regs.end[i]));        rb_ary_push(new_ary, str);    }    return new_ary;}

Returns the array of captured match values at indexes (1..) if the most recent match attempt succeeded, or nil otherwise:

scanner = StringScanner.new('Fri Dec 12 1975 14:39')
scanner.captures # => nil
scanner.exist?(/(?<wday>\w+) (?<month>\w+) (?<day>\d+) /)
scanner.captures # => ["Fri", "Dec", "12"]
scanner.values_at(*0..4) # => ["Fri Dec 12 ", "Fri", "Dec", "12", nil]
scanner.exist?(/Fri/)
scanner.captures # => []
scanner.scan(/nope/)
scanner.captures # => nil
Source
static VALUEstrscan_get_charpos(VALUE self){    struct strscanner *p;    GET_SCANNER(self, p);    return LONG2NUM(rb_enc_strlen(S_PBEG(p), CURPTR(p), rb_enc_get(p->str)));}

call-seq: charpos -> character_position

Returns the character position (initially zero), which may be different from the byte position given by method pos:

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string # => "こんにちは"
scanner.getch # => "こ" # 3-byte character.
scanner.getch # => "ん" # 3-byte character.
put_situation(scanner)
# Situation:
#   pos:       6
#   charpos:   2
#   rest:      "にちは"
#   rest_size: 9
Source
static VALUE
strscan_check(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 0, 1, 1);
}

Attempts to match the given pattern at the beginning of the target substring; does not modify the positions.

If the match succeeds:

scanner = StringScanner.new('foobarbaz')
scanner.pos = 3
scanner.check('bar') # => "bar"
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "foo"
#   matched  :      "bar"
#   post_match:     "baz"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["bar", nil]
#   []:
#     [0]:          "bar"
#     [1]:          nil
# => 0..1
put_situation(scanner)
# Situation:
#   pos:       3
#   charpos:   3
#   rest:      "barbaz"
#   rest_size: 6

If the match fails:

scanner.check(/nope/) # => nil
match_values_cleared?(scanner) # => true
Source
static VALUE
strscan_check_until(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 0, 1, 0);
}

Attempts to match the given pattern anywhere (at any position) in the target substring; does not modify the positions.

If the match succeeds:

  • Sets all match values.

  • Returns the matched substring, which extends from the current position to the end of the matched substring.

scanner = StringScanner.new('foobarbazbatbam')
scanner.pos = 6
scanner.check_until(/bat/) # => "bazbat"
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "foobarbaz"
#   matched  :      "bat"
#   post_match:     "bam"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["bat", nil]
#   []:
#     [0]:          "bat"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       6
#   charpos:   6
#   rest:      "bazbatbam"
#   rest_size: 9

If the match fails:

scanner.check_until(/nope/) # => nil
match_values_cleared?(scanner) # => true
Source
static VALUEstrscan_concat(VALUE self, VALUE str){    struct strscanner *p;    GET_SCANNER(self, p);    StringValue(str);    rb_str_append(p->str, str);    return self;}
scanner = StringScanner.new('foo')
scanner.string # => "foo"
scanner.terminate
scanner.concat('barbaz') # => #<StringScanner 3/9 "foo" @ "barba...">
scanner.string # => "foobarbaz"
put_situation(scanner)
# Situation:
#   pos:       3
#   charpos:   3
#   rest:      "barbaz"
#   rest_size: 6
Also aliased as: <<
Source
static VALUEstrscan_eos_p(VALUE self){    struct strscanner *p;    GET_SCANNER(self, p);    return EOS_P(p) ? Qtrue : Qfalse;}

Returns whether the position is at the end of the stored string:

scanner = StringScanner.new('foobarbaz')
scanner.eos? # => false
scanner.pos = 3
scanner.eos? # => false
scanner.terminate
scanner.eos? # => true
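A common use of eos? is as the loop condition when consuming a whole string. The sketch below is a toy tokenizer for a hypothetical arithmetic input; the token names and the tokens array are illustrative only and are not part of StringScanner.

```ruby
require 'strscan'

scanner = StringScanner.new('3 + 42')
tokens = []

until scanner.eos?
  if scanner.skip(/\s+/)
    next                                  # Whitespace: consumed and ignored.
  elsif (number = scanner.scan(/\d+/))
    tokens << [:int, number]
  elsif (operator = scanner.scan(/[+\-*\/]/))
    tokens << [:op, operator]
  else
    # Nothing matched; stop rather than loop forever at the same position.
    raise "unexpected input at byte #{scanner.pos}"
  end
end

tokens # => [[:int, "3"], [:op, "+"], [:int, "42"]]
```

Every branch either advances the position or raises, so the loop is guaranteed to end.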
Source
static VALUE
strscan_exist_p(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 0, 0, 0);
}

Attempts to match the given pattern anywhere (at any position) in the target substring; does not modify the positions.

If the match succeeds:

  • Returns a byte offset: the distance in bytes between the current position and the end of the matched substring.

  • Sets all match values.

scanner = StringScanner.new('foobarbazbatbam')
scanner.pos = 6
scanner.exist?(/bat/) # => 6
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "foobarbaz"
#   matched  :      "bat"
#   post_match:     "bam"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["bat", nil]
#   []:
#     [0]:          "bat"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       6
#   charpos:   6
#   rest:      "bazbatbam"
#   rest_size: 9

If the match fails:

scanner.exist?(/nope/) # => nil
match_values_cleared?(scanner) # => true
Source
static VALUEstrscan_fixed_anchor_p(VALUE self){    struct strscanner *p;    p = check_strscan(self);    return p->fixed_anchor_p ? Qtrue : Qfalse;}

Returns whether the fixed-anchor property is set.

Source
static VALUEstrscan_get_byte(VALUE self){    struct strscanner *p;    GET_SCANNER(self, p);    CLEAR_MATCH_STATUS(p);    if (EOS_P(p))        return Qnil;    p->prev = p->curr;    p->curr++;    MATCHED(p);    adjust_registers_to_matched(p);    return extract_range(p,                         adjust_register_position(p, p->regs.beg[0]),                         adjust_register_position(p, p->regs.end[0]));}

call-seq: get_byte -> byte_as_character or nil

Returns the next byte, if available:

  • If the position is not at the end of the stored string:

    scanner = StringScanner.new(HIRAGANA_TEXT) # => #<StringScanner 0/15 @ "\xE3\x81\x93\xE3\x82...">
    scanner.string # => "こんにちは"
    [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\xE3", 1, 1]
    [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x81", 2, 2]
    [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x93", 3, 1]
    [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\xE3", 4, 2]
    [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x82", 5, 3]
    [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x93", 6, 2]

  • Otherwise, returns nil, and does not change the positions.

    scanner.terminate
    [scanner.get_byte, scanner.pos, scanner.charpos] # => [nil, 15, 5]
Source
static VALUEstrscan_getch(VALUE self){    struct strscanner *p;    long len;    GET_SCANNER(self, p);    CLEAR_MATCH_STATUS(p);    if (EOS_P(p))        return Qnil;    len = rb_enc_mbclen(CURPTR(p), S_PEND(p), rb_enc_get(p->str));    len = minl(len, S_RESTLEN(p));    p->prev = p->curr;    p->curr += len;    MATCHED(p);    adjust_registers_to_matched(p);    return extract_range(p,                         adjust_register_position(p, p->regs.beg[0]),                         adjust_register_position(p, p->regs.end[0]));}

call-seq: getch -> character or nil

Returns the next (possibly multibyte) character, if available:

  • If the position is at the beginning of a character:

    scanner = StringScanner.new(HIRAGANA_TEXT)
    scanner.string # => "こんにちは"
    [scanner.getch, scanner.pos, scanner.charpos] # => ["こ", 3, 1]
    [scanner.getch, scanner.pos, scanner.charpos] # => ["ん", 6, 2]
    [scanner.getch, scanner.pos, scanner.charpos] # => ["に", 9, 3]
    [scanner.getch, scanner.pos, scanner.charpos] # => ["ち", 12, 4]
    [scanner.getch, scanner.pos, scanner.charpos] # => ["は", 15, 5]
    [scanner.getch, scanner.pos, scanner.charpos] # => [nil, 15, 5]

  • If the position is within a multi-byte character (that is, not at its beginning), behaves like get_byte (returns a 1-byte character):

    scanner.pos = 1
    [scanner.getch, scanner.pos, scanner.charpos] # => ["\x81", 2, 2]
    [scanner.getch, scanner.pos, scanner.charpos] # => ["\x93", 3, 1]
    [scanner.getch, scanner.pos, scanner.charpos] # => ["ん", 6, 2]

  • If the position is at the end of the stored string, returns nil and does not modify the positions:

    scanner.terminate
    [scanner.getch, scanner.pos, scanner.charpos] # => [nil, 15, 5]
Source
static VALUEstrscan_inspect(VALUE self){    struct strscanner *p;    VALUE a, b;    p = check_strscan(self);    if (NIL_P(p->str)) {        a = rb_sprintf("#<%"PRIsVALUE" (uninitialized)>", rb_obj_class(self));        return a;    }    if (EOS_P(p)) {        a = rb_sprintf("#<%"PRIsVALUE" fin>", rb_obj_class(self));        return a;    }    if (p->curr == 0) {        b = inspect2(p);        a = rb_sprintf("#<%"PRIsVALUE" %ld/%ld @ %"PRIsVALUE">",                       rb_obj_class(self),                       p->curr, S_LEN(p),                       b);        return a;    }    a = inspect1(p);    b = inspect2(p);    a = rb_sprintf("#<%"PRIsVALUE" %ld/%ld %"PRIsVALUE" @ %"PRIsVALUE">",                   rb_obj_class(self),                   p->curr, S_LEN(p),                   a, b);    return a;}

Returns a string representation of self that may show:

  1. The current position.

  2. The size (in bytes) of the stored string.

  3. The substring preceding the current position.

  4. The substring following the current position (which is also the target substring).

scanner = StringScanner.new("Fri Dec 12 1975 14:39")
scanner.pos = 11
scanner.inspect # => "#<StringScanner 11/21 \"...c 12 \" @ \"1975 ...\">"

If at beginning-of-string, item 3 above (preceding substring) is omitted:

scanner.reset
scanner.inspect # => "#<StringScanner 0/21 @ \"Fri D...\">"

If at end-of-string, all items above are omitted:

scanner.terminate
scanner.inspect # => "#<StringScanner fin>"
Source
static VALUE
strscan_match_p(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 0, 0, 1);
}

Attempts to match the given pattern at the beginning of the target substring; does not modify the positions.

If the match succeeds:

  • Sets match values.

  • Returns the size in bytes of the matched substring.

scanner = StringScanner.new('foobarbaz')
scanner.pos = 3
scanner.match?(/bar/) # => 3
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "foo"
#   matched  :      "bar"
#   post_match:     "baz"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["bar", nil]
#   []:
#     [0]:          "bar"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       3
#   charpos:   3
#   rest:      "barbaz"
#   rest_size: 6

If the match fails:

  • Clears match values.

  • Returns nil.

  • Does not increment positions.

scanner.match?(/nope/) # => nil
match_values_cleared?(scanner) # => true
Source
static VALUEstrscan_matched(VALUE self){    struct strscanner *p;    GET_SCANNER(self, p);    if (! MATCHED_P(p)) return Qnil;    return extract_range(p,                         adjust_register_position(p, p->regs.beg[0]),                         adjust_register_position(p, p->regs.end[0]));}

Returns the matched substring from the most recent match attempt if it was successful, or nil otherwise; see Basic Match Values:

scanner = StringScanner.new('foobarbaz')
scanner.matched # => nil
scanner.pos = 3
scanner.match?(/bar/) # => 3
scanner.matched # => "bar"
scanner.match?(/nope/) # => nil
scanner.matched # => nil
Source
static VALUEstrscan_matched_p(VALUE self){    struct strscanner *p;    GET_SCANNER(self, p);    return MATCHED_P(p) ? Qtrue : Qfalse;}

Returns true if the most recent match attempt was successful, false otherwise; see Basic Match Values:

scanner = StringScanner.new('foobarbaz')
scanner.matched? # => false
scanner.pos = 3
scanner.exist?(/baz/) # => 6
scanner.matched? # => true
scanner.exist?(/nope/) # => nil
scanner.matched? # => false
Source
static VALUEstrscan_matched_size(VALUE self){    struct strscanner *p;    GET_SCANNER(self, p);    if (! MATCHED_P(p)) return Qnil;    return LONG2NUM(p->regs.end[0] - p->regs.beg[0]);}

Returns the size (in bytes) of the matched substring from the most recent match attempt if it was successful, or nil otherwise; see Basic Match Values:

scanner = StringScanner.new('foobarbaz')
scanner.matched_size # => nil
scanner.pos = 3
scanner.exist?(/baz/) # => 6
scanner.matched_size # => 3
scanner.exist?(/nope/) # => nil
scanner.matched_size # => nil
Source
static VALUEstrscan_named_captures(VALUE self){    struct strscanner *p;    named_captures_data data;    GET_SCANNER(self, p);    data.self = self;    data.captures = rb_hash_new();    if (!RB_NIL_P(p->regex)) {        onig_foreach_name(RREGEXP_PTR(p->regex), named_captures_iter, &data);    }    return data.captures;}

Returns a hash of the named captures from the most recent match attempt, keyed by capture name; each value is the captured substring, or nil if the match failed. Returns an empty hash if no match has been attempted or the pattern has no named captures; see Captured Match Values:

scanner = StringScanner.new('Fri Dec 12 1975 14:39')
scanner.named_captures # => {}
pattern = /(?<wday>\w+) (?<month>\w+) (?<day>\d+) /
scanner.match?(pattern)
scanner.named_captures # => {"wday"=>"Fri", "month"=>"Dec", "day"=>"12"}
scanner.string = 'nope'
scanner.match?(pattern)
scanner.named_captures # => {"wday"=>nil, "month"=>nil, "day"=>nil}
scanner.match?(/nosuch/)
scanner.named_captures # => {}
Source
static VALUE
strscan_peek(VALUE self, VALUE vlen)
{
    struct strscanner *p;
    long len;

    GET_SCANNER(self, p);

    len = NUM2LONG(vlen);
    if (EOS_P(p))
        return str_new(p, "", 0);

    len = minl(len, S_RESTLEN(p));
    return extract_beg_len(p, p->curr, len);
}

Returns the substring string[pos, length]; does not update match values or positions:

scanner = StringScanner.new('foobarbaz')
scanner.pos = 3
scanner.peek(3) # => "bar"
scanner.terminate
scanner.peek(3) # => ""
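The C source shown for this method clamps the requested length to the number of bytes remaining, so asking for more than is left simply returns what remains. A brief sketch of that case (variable names are illustrative):

```ruby
require 'strscan'

scanner = StringScanner.new('foobarbaz')
scanner.pos = 6

clamped = scanner.peek(100)       # => "baz" (only three bytes remain)
pos_after_peek = scanner.pos      # => 6     (peek does not move the position)
matched_after_peek = scanner.matched?  # => false (no match values were set)
```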
Source
static VALUEstrscan_peek_byte(VALUE self){    struct strscanner *p;    GET_SCANNER(self, p);    if (EOS_P(p))        return Qnil;    return INT2FIX((unsigned char)*CURPTR(p));}

Peeks at the current byte and returns it as an integer.

s = StringScanner.new('ab')
s.peek_byte # => 97

call-seq: pos -> byte_position

Returns the integer byte position, which may be different from the character position:

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string # => "こんにちは"
scanner.pos # => 0
scanner.getch # => "こ" # 3-byte character.
scanner.charpos # => 1
scanner.pos # => 3
Alias for: pos

call-seq: pos = n -> n pointer = n -> n

Sets the byte position and the character position; returns n.

Does not affect match values.

For non-negative n, sets the position to n:

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string # => "こんにちは"
scanner.pos = 3 # => 3
scanner.rest # => "んにちは"
scanner.charpos # => 1

For negative n, counts from the end of the stored string:

scanner.pos = -9 # => -9
scanner.pos # => 6
scanner.rest # => "にちは"
scanner.charpos # => 2
Alias for: pos=
Source
static VALUEstrscan_get_pos(VALUE self){    struct strscanner *p;    GET_SCANNER(self, p);    return LONG2NUM(p->curr);}

call-seq: pos -> byte_position

Returns the integer byte position, which may be different from the character position:

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string # => "こんにちは"
scanner.pos # => 0
scanner.getch # => "こ" # 3-byte character.
scanner.charpos # => 1
scanner.pos # => 3
Also aliased as: pointer
Source
static VALUE
strscan_set_pos(VALUE self, VALUE v)
{
    struct strscanner *p;
    long i;

    GET_SCANNER(self, p);
    i = NUM2LONG(v);
    if (i < 0) i += S_LEN(p);
    if (i < 0) rb_raise(rb_eRangeError, "index out of range");
    if (i > S_LEN(p)) rb_raise(rb_eRangeError, "index out of range");
    p->curr = i;
    return LONG2NUM(i);
}

call-seq: pos = n -> n pointer = n -> n

Sets the byte position and the character position; returns n.

Does not affect match values.

For non-negative n, sets the position to n:

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string # => "こんにちは"
scanner.pos = 3 # => 3
scanner.rest # => "んにちは"
scanner.charpos # => 1

For negative n, counts from the end of the stored string:

scanner.pos = -9 # => -9
scanner.pos # => 6
scanner.rest # => "にちは"
scanner.charpos # => 2
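The C source shown for this method raises RangeError when the resulting index falls outside the stored string. The sketch below exercises both limits; the variable names are illustrative.

```ruby
require 'strscan'

scanner = StringScanner.new('foobar')  # Six bytes.

scanner.pos = 6                # The end of the string is a valid position.
at_end = scanner.eos?          # => true

scanner.pos = -6               # Counts back from the end ...
from_negative = scanner.pos    # => 0

# One step past either limit is rejected, and the position is left alone.
errors = [7, -7].map do |n|
  begin
    scanner.pos = n
    nil
  rescue RangeError => e
    e.message
  end
end
# errors => ["index out of range", "index out of range"]
pos_after_errors = scanner.pos # => 0
```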
Also aliased as: pointer=
Source
static VALUEstrscan_post_match(VALUE self){    struct strscanner *p;    GET_SCANNER(self, p);    if (! MATCHED_P(p)) return Qnil;    return extract_range(p,                         adjust_register_position(p, p->regs.end[0]),                         S_LEN(p));}

Returns the substring that follows the matched substring from the most recent match attempt if it was successful, or nil otherwise; see Basic Match Values:

scanner = StringScanner.new('foobarbaz')
scanner.post_match # => nil
scanner.pos = 3
scanner.match?(/bar/) # => 3
scanner.post_match # => "baz"
scanner.match?(/nope/) # => nil
scanner.post_match # => nil
Source
static VALUEstrscan_pre_match(VALUE self){    struct strscanner *p;    GET_SCANNER(self, p);    if (! MATCHED_P(p)) return Qnil;    return extract_range(p,                         0,                         adjust_register_position(p, p->regs.beg[0]));}

Returns the substring that precedes the matched substring from the most recent match attempt if it was successful, or nil otherwise; see Basic Match Values:

scanner = StringScanner.new('foobarbaz')
scanner.pre_match # => nil
scanner.pos = 3
scanner.exist?(/baz/) # => 6
scanner.pre_match # => "foobar" # Substring of entire string, not just target string.
scanner.exist?(/nope/) # => nil
scanner.pre_match # => nil
Source
static VALUEstrscan_reset(VALUE self){    struct strscanner *p;    GET_SCANNER(self, p);    p->curr = 0;    CLEAR_MATCH_STATUS(p);    return self;}

Sets both byte position and character position to zero, and clears match values; returns self:

scanner = StringScanner.new('foobarbaz')
scanner.exist?(/bar/) # => 6
scanner.reset # => #<StringScanner 0/9 @ "fooba...">
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "foobarbaz"
#   rest_size: 9
# => nil
match_values_cleared?(scanner) # => true
Source
static VALUE
strscan_rest(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    if (EOS_P(p)) {
        return str_new(p, "", 0);
    }
    return extract_range(p, p->curr, S_LEN(p));
}

Returns the ‘rest’ of the stored string (all after the current position), which is the target substring:

scanner = StringScanner.new('foobarbaz')
scanner.rest # => "foobarbaz"
scanner.pos = 3
scanner.rest # => "barbaz"
scanner.terminate
scanner.rest # => ""
Source
static VALUE
strscan_rest_size(VALUE self)
{
    struct strscanner *p;
    long i;

    GET_SCANNER(self, p);
    if (EOS_P(p)) {
        return INT2FIX(0);
    }
    i = S_RESTLEN(p);
    return INT2FIX(i);
}

Returns the size (in bytes) of the rest of the stored string:

scanner = StringScanner.new('foobarbaz')
scanner.rest      # => "foobarbaz"
scanner.rest_size # => 9
scanner.pos = 3
scanner.rest      # => "barbaz"
scanner.rest_size # => 6
scanner.terminate
scanner.rest      # => ""
scanner.rest_size # => 0
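Because rest_size counts bytes, it can differ from the character count of rest when the stored string contains multibyte characters. The sketch below shows the difference for a UTF-8 string; the variable names are illustrative only.

```ruby
require 'strscan'

scanner = StringScanner.new('こんにちは') # five characters, three bytes each in UTF-8
scanner.pos = 6                            # byte position just past "こん"

rest       = scanner.rest      # => "にちは"
byte_count = scanner.rest_size # => 9 (bytes)
char_count = rest.size         # => 3 (characters)
```

In this sketch rest_size agrees with rest.bytesize rather than rest.size.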
Source
static VALUE
strscan_scan(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 1, 1, 1);
}

call-seq: scan(pattern) -> substring or nil

Attempts to match the given pattern at the beginning of the target substring.

If the match succeeds:

  • Returns the matched substring.

  • Increments the byte position by the byte size of the matched substring, and may increment the character position.

  • Sets match values.

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string     # => "こんにちは"
scanner.pos = 6
scanner.scan(/に/) # => "に"
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "こん"
#   matched  :      "に"
#   post_match:     "ちは"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["に", nil]
#   []:
#     [0]:          "に"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       9
#   charpos:   3
#   rest:      "ちは"
#   rest_size: 6

If the match fails:

  • Returns nil.

  • Does not increment byte and character positions.

  • Clears match values.

scanner.scan(/nope/)           # => nil
match_values_cleared?(scanner) # => true
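Since scan only matches at the current position and returns nil without moving on failure, it lends itself to trying several patterns in turn. The following is a minimal tokenizer sketch under that assumption; the tokenize method and its token format are hypothetical and not part of StringScanner.

```ruby
require 'strscan'

# Hypothetical helper: splits a simple arithmetic expression into tokens.
def tokenize(input)
  scanner = StringScanner.new(input)
  tokens = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      next # whitespace separates tokens and is otherwise ignored
    elsif (number = scanner.scan(/\d+/))
      tokens << [:number, number.to_i]
    elsif (operator = scanner.scan(%r{[-+*/]}))
      tokens << [:operator, operator]
    else
      raise ArgumentError, "unexpected input at byte #{scanner.pos}"
    end
  end
  tokens
end

tokens = tokenize('3 + 42 * 7')
# tokens => [[:number, 3], [:operator, "+"], [:number, 42], [:operator, "*"], [:number, 7]]
```

Each failed scan leaves the position where it was, so the next branch tries its pattern at the same place.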
Source
static VALUE
strscan_scan_byte(VALUE self)
{
    struct strscanner *p;
    VALUE byte;

    GET_SCANNER(self, p);
    CLEAR_MATCH_STATUS(p);
    if (EOS_P(p))
        return Qnil;

    byte = INT2FIX((unsigned char)*CURPTR(p));
    p->prev = p->curr;
    p->curr++;
    MATCHED(p);
    adjust_registers_to_matched(p);
    return byte;
}

Scans one byte and returns it as an integer. This method is not multibyte character sensitive. See also: getch.
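To illustrate what "not multibyte character sensitive" means, the sketch below reads a single three-byte character one byte at a time. It assumes that scan_byte may be unavailable in an older installed strscan, so the hypothetical helper next_byte falls back to get_byte in that case.

```ruby
require 'strscan'

# Hypothetical helper: returns the next byte as an Integer, or nil at end of string.
# Uses scan_byte where available; otherwise falls back to get_byte.
def next_byte(scanner)
  if scanner.respond_to?(:scan_byte)
    scanner.scan_byte
  else
    byte = scanner.get_byte
    byte && byte.getbyte(0)
  end
end

scanner = StringScanner.new('こ') # one character, three bytes in UTF-8
bytes = []
while (byte = next_byte(scanner))
  bytes << byte
end

# bytes now holds the three individual UTF-8 bytes of the character,
# and the scanner is at end of string.
at_end = scanner.eos?
```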

Source
static VALUE
strscan_scan_until(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 1, 1, 0);
}

call-seq: scan_until(pattern) -> substring or nil

Attempts to match the given pattern anywhere (at any position) in the target substring.

If the match attempt succeeds:

  • Sets match values.

  • Sets the byte position to the end of the matched substring; may adjust the character position.

  • Returns the substring from the starting position through the end of the match.

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string           # => "こんにちは"
scanner.pos = 6
scanner.scan_until(/ち/) # => "にち"
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "こんに"
#   matched  :      "ち"
#   post_match:     "は"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["ち", nil]
#   []:
#     [0]:          "ち"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       12
#   charpos:   4
#   rest:      "は"
#   rest_size: 3

If the match attempt fails:

  • Clears match data.

  • Returns nil.

  • Does not update positions.

scanner.scan_until(/nope/)     # => nil
match_values_cleared?(scanner) # => true
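Because scan_until returns everything from the starting position through the match, one possible use is reading delimiter-terminated fields. The sketch below splits a comma-separated string this way; the input and variable names are illustrative only.

```ruby
require 'strscan'

scanner = StringScanner.new('alpha,beta,gamma')
fields = []

# Each successful call returns one field plus its trailing comma.
while (chunk = scanner.scan_until(/,/))
  fields << chunk.chomp(',')
end

# The failed final attempt left the position unchanged,
# so the last field is still available as the rest of the string.
fields << scanner.rest
# fields => ["alpha", "beta", "gamma"]
```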
Source
static VALUE
strscan_size(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    if (! MATCHED_P(p))
        return Qnil;
    return INT2FIX(p->regs.num_regs);
}

Returns the number of match groups if the most recent match attempt succeeded, nil otherwise; the count includes the entire match (group 0), so it is one more than the number of captures; see Captures Match Values:

scanner = StringScanner.new('Fri Dec 12 1975 14:39')
scanner.size # => nil
pattern = /(?<wday>\w+) (?<month>\w+) (?<day>\d+) /
scanner.match?(pattern)
scanner.values_at(*0..scanner.size) # => ["Fri Dec 12 ", "Fri", "Dec", "12", nil]
scanner.size           # => 4
scanner.match?(/nope/) # => nil
scanner.size           # => nil
Source
static VALUE
strscan_skip(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 1, 0, 1);
}

call-seq: skip(pattern) -> match_size or nil

Attempts to match the given pattern at the beginning of the target substring.

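If the match succeeds:

  • Increments the byte position by the byte size of the matched substring, and may increment the character position.

  • Sets match values.

  • Returns the size (in bytes) of the matched substring.

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string     # => "こんにちは"
scanner.pos = 6
scanner.skip(/に/) # => 3
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "こん"
#   matched  :      "に"
#   post_match:     "ちは"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["に", nil]
#   []:
#     [0]:          "に"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       9
#   charpos:   3
#   rest:      "ちは"
#   rest_size: 6

If the match fails:

  • Returns nil.

  • Does not increment byte and character positions.

  • Clears match values.

scanner.skip(/nope/)           # => nil
match_values_cleared?(scanner) # => true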
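A common use of skip is discarding text whose content does not matter, such as leading whitespace. The sketch below shows the byte count returned on success and the unchanged position on failure; the input string is illustrative only.

```ruby
require 'strscan'

scanner = StringScanner.new("  \t value")

skipped = scanner.skip(/\s+/) # => 4 (two spaces, a tab, and a space)
remaining = scanner.rest      # => "value"

missed = scanner.skip(/\d+/)  # => nil (no digits at the current position)
position = scanner.pos        # => 4 (unchanged by the failed attempt)
```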
Source
static VALUE
strscan_skip_until(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 1, 0, 0);
}

call-seq: skip_until(pattern) -> matched_substring_size or nil

Attempts to match the given pattern anywhere (at any position) in the target substring.

If the match attempt succeeds:

  • Sets match values.

  • Sets the byte position to the end of the matched substring; may adjust the character position.

  • Returns the size (in bytes) of the skipped text, measured from the starting position through the end of the match.

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string           # => "こんにちは"
scanner.pos = 6
scanner.skip_until(/ち/) # => 6
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "こんに"
#   matched  :      "ち"
#   post_match:     "は"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["ち", nil]
#   []:
#     [0]:          "ち"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       12
#   charpos:   4
#   rest:      "は"
#   rest_size: 3

If the match attempt fails:

  • Clears match values.

  • Returns nil.

scanner.skip_until(/nope/)     # => nil
match_values_cleared?(scanner) # => true
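The value returned by skip_until is the number of bytes the position moved, not the size of the match itself. The sketch below checks this relationship; the variable names are illustrative only.

```ruby
require 'strscan'

scanner = StringScanner.new('こんにちは')
scanner.pos = 6

before   = scanner.pos
advanced = scanner.skip_until(/ち/) # => 6 (skips "に" and "ち", three bytes each)
after    = scanner.pos              # => 12

match_bytes = scanner.matched_size  # => 3 (only "ち" was matched)
```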
Source
static VALUE
strscan_get_string(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    return p->str;
}

Returns the stored string:

scanner = StringScanner.new('foobar')
scanner.string # => "foobar"
scanner.concat('baz')
scanner.string # => "foobarbaz"
Source
static VALUE
strscan_set_string(VALUE self, VALUE str)
{
    struct strscanner *p = check_strscan(self);

    StringValue(str);
    RB_OBJ_WRITE(self, &p->str, str);
    p->curr = 0;
    CLEAR_MATCH_STATUS(p);
    return str;
}

Replaces the stored string with the given other_string, sets both positions to zero, and clears match values; returns other_string:

scanner = StringScanner.new('foobar')
scanner.scan(/foo/)
put_situation(scanner)
# Situation:
#   pos:       3
#   charpos:   3
#   rest:      "bar"
#   rest_size: 3
match_values_cleared?(scanner) # => false
scanner.string = 'baz'         # => "baz"
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "baz"
#   rest_size: 3
match_values_cleared?(scanner) # => true
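Since assigning a new string also resets the position, a single scanner can be reused for a series of inputs. The sketch below extracts the first word of each line this way; the input lines are illustrative only.

```ruby
require 'strscan'

scanner = StringScanner.new('')
lines = ['foo bar', 'baz qux']

first_words = lines.map do |line|
  scanner.string = line # replaces the stored string and restarts at position 0
  scanner.scan(/\w+/)
end
# first_words => ["foo", "baz"]
```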
Source
static VALUE
strscan_terminate(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    p->curr = S_LEN(p);
    CLEAR_MATCH_STATUS(p);
    return self;
}

call-seq: terminate -> self

Sets the scanner to end-of-string and clears match values; returns self:

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string # => "こんにちは"
scanner.scan_until(/に/)
put_situation(scanner)
# Situation:
#   pos:       9
#   charpos:   3
#   rest:      "ちは"
#   rest_size: 6
match_values_cleared?(scanner) # => false
scanner.terminate              # => #<StringScanner fin>
put_situation(scanner)
# Situation:
#   pos:       15
#   charpos:   5
#   rest:      ""
#   rest_size: 0
match_values_cleared?(scanner) # => true
Source
static VALUE
strscan_unscan(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    if (! MATCHED_P(p))
        rb_raise(ScanError, "unscan failed: previous match record not exist");
    p->curr = p->prev;
    CLEAR_MATCH_STATUS(p);
    return self;
}

Sets the position to the value it had before the most recent successful match attempt, and clears match values; returns self:

scanner = StringScanner.new('foobarbaz')
scanner.scan(/foo/)
put_situation(scanner)
# Situation:
#   pos:       3
#   charpos:   3
#   rest:      "barbaz"
#   rest_size: 6
scanner.unscan # => #<StringScanner 0/9 @ "fooba...">
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "foobarbaz"
#   rest_size: 9

Raises an exception if match values are clear:

scanner.scan(/nope/)           # => nil
match_values_cleared?(scanner) # => true
scanner.unscan                 # Raises StringScanner::Error.
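One way to use unscan is to look at an upcoming token and then put it back. The sketch below does that with a hypothetical helper, peek_token, and also shows that a second unscan raises, because the first one clears the match values.

```ruby
require 'strscan'

# Hypothetical helper: returns the token matching pattern at the current
# position without consuming it, or nil if there is no match.
def peek_token(scanner, pattern)
  token = scanner.scan(pattern)
  scanner.unscan if token
  token
end

scanner = StringScanner.new('foobarbaz')
token = peek_token(scanner, /foo/) # => "foo"
position = scanner.pos             # => 0 (the scan was undone)

# The match values were cleared by the unscan inside peek_token,
# so calling unscan again has nothing to undo.
error = begin
  scanner.unscan
  nil
rescue StringScanner::Error => e
  e
end
```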
Source
static VALUE
strscan_values_at(int argc, VALUE *argv, VALUE self)
{
    struct strscanner *p;
    long i;
    VALUE new_ary;

    GET_SCANNER(self, p);
    if (! MATCHED_P(p))
        return Qnil;

    new_ary = rb_ary_new2(argc);
    for (i = 0; i<argc; i++) {
        rb_ary_push(new_ary, strscan_aref(self, argv[i]));
    }

    return new_ary;
}

Returns an array of captured substrings, or nil if the most recent match attempt failed.

For each specifier, the returned substring is [specifier]; see [].

scanner = StringScanner.new('Fri Dec 12 1975 14:39')
pattern = /(?<wday>\w+) (?<month>\w+) (?<day>\d+) /
scanner.match?(pattern)
scanner.values_at(*0..3)               # => ["Fri Dec 12 ", "Fri", "Dec", "12"]
scanner.values_at(*%i[wday month day]) # => ["Fri", "Dec", "12"]
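As a possible follow-on, the array returned by values_at can be paired with the specifiers that produced it. The sketch below builds a hash from named captures and shows the nil result after a failed match; the variable names are illustrative only.

```ruby
require 'strscan'

scanner = StringScanner.new('Fri Dec 12 1975 14:39')
scanner.match?(/(?<wday>\w+) (?<month>\w+) (?<day>\d+) /)

names = %i[wday month day]
date_parts = names.zip(scanner.values_at(*names)).to_h
# date_parts => {wday: "Fri", month: "Dec", day: "12"}

scanner.match?(/nope/)                     # a failed attempt clears match values
after_failure = scanner.values_at(*names)  # => nil
```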

Private Instance Methods

Source
static VALUE
strscan_init_copy(VALUE vself, VALUE vorig)
{
    struct strscanner *self, *orig;

    self = check_strscan(vself);
    orig = check_strscan(vorig);
    if (self != orig) {
        self->flags = orig->flags;
        RB_OBJ_WRITE(vself, &self->str, orig->str);
        self->prev = orig->prev;
        self->curr = orig->curr;
        if (rb_reg_region_copy(&self->regs, &orig->regs))
            rb_memerror();
        RB_GC_GUARD(vorig);
    }

    return vself;
}

Returns a shallow copy of self; the stored string in the copy is the same string as in self.
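This private method is what runs when a scanner is duplicated with dup or clone. The sketch below shows that the copy starts at the same position and then advances independently, while both scanners refer to the same String object; the variable names are illustrative only.

```ruby
require 'strscan'

scanner = StringScanner.new('foobarbaz')
scanner.scan(/foo/)

copy = scanner.dup
copy.scan(/bar/) # advances only the copy

original_pos = scanner.pos # => 3
copy_pos     = copy.pos    # => 6

# Shallow copy: the stored string is shared, not duplicated.
shared = copy.string.equal?(scanner.string) # => true
```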