class StringScanner
Class StringScanner supports processing a stored string as a stream; this code creates a new StringScanner object with string 'foobarbaz':
    require 'strscan'
    scanner = StringScanner.new('foobarbaz')
About the Examples
All examples here assume that StringScanner has been required:
    require 'strscan'
Some examples here assume that these constants are defined:
    MULTILINE_TEXT = <<~EOT
      Go placidly amid the noise and haste,
      and remember what peace there may be in silence.
    EOT
    HIRAGANA_TEXT = 'こんにちは'
    ENGLISH_TEXT = 'Hello'
Some examples here assume that certain helper methods are defined:
- put_situation(scanner): Displays the values of the scanner’s methods pos, charpos, rest, and rest_size.
- put_match_values(scanner): Displays the scanner’s match values.
- match_values_cleared?(scanner): Returns whether the scanner’s match values are cleared.
See examples at helper methods.
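The helper definitions themselves are not reproduced in this section. As a rough sketch only, a helper in the spirit of put_situation could look like the following; the method situation_lines and the exact output layout are assumptions made for illustration, and the real helper may differ.

```ruby
require 'strscan'

# Hypothetical stand-in for the put_situation helper; the real helper
# may format its output differently.
def situation_lines(scanner)
  [
    '# Situation:',
    "#   pos:       #{scanner.pos}",
    "#   charpos:   #{scanner.charpos}",
    "#   rest:      #{scanner.rest.inspect}",
    "#   rest_size: #{scanner.rest_size}"
  ]
end

# Prints the lines built above, one per line.
def put_situation(scanner)
  puts situation_lines(scanner)
end

scanner = StringScanner.new('foobarbaz')
scanner.pos = 3
put_situation(scanner)
```

Building the lines separately from printing them keeps the sketch easy to check without capturing standard output.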
The StringScanner Object
This code creates a StringScanner object (we’ll call it simply a scanner), and shows some of its basic properties:
    scanner = StringScanner.new('foobarbaz')
    scanner.string # => "foobarbaz"
    put_situation(scanner)
    # Situation:
    #   pos:       0
    #   charpos:   0
    #   rest:      "foobarbaz"
    #   rest_size: 9
The scanner has:
- A stored string, which is:
  - Initially set by StringScanner.new(string) to the given string ('foobarbaz' in the example above).
  - Modifiable by methods string=(new_string) and concat(more_string).
  - Returned by method string.
  More at Stored String below.
- A position; a zero-based index into the bytes of the stored string (not into its characters):
  - Initially set by StringScanner.new to 0.
  - Returned by method pos.
  - Modifiable explicitly by methods reset, terminate, and pos=(new_pos).
  - Modifiable implicitly (various traversing methods, among others).
  More at Byte Position below.
- A target substring, which is a trailing substring of the stored string; it extends from the current position to the end of the stored string:
  - Initially set by StringScanner.new(string) to the given string ('foobarbaz' in the example above).
  - Returned by method rest.
  - Modified by any modification to either the stored string or the position.
  Most importantly: the searching and traversing methods operate on the target substring, which may be (and often is) less than the entire stored string.
  More at Target Substring below.
Stored String
The stored string is the string stored in the StringScanner object.
Each of these methods sets, modifies, or returns the stored string:
| Method | Effect |
|---|---|
| ::new(string) | Creates a new scanner for the given string. |
| string=(new_string) | Replaces the existing stored string. |
| concat(more_string) | Appends a string to the existing stored string. |
| string | Returns the stored string. |
Positions
A StringScanner object maintains a zero-based byte position and a zero-based character position.
Each of these methods explicitly sets positions:
| Method | Effect |
|---|---|
| reset | Sets both positions to zero (beginning of stored string). |
| terminate | Sets both positions to the end of the stored string. |
| pos=(new_byte_position) | Sets byte position; adjusts character position. |
Byte Position (Position)
The byte position (or simply position) is a zero-based index into the bytes in the scanner’s stored string; for a new StringScanner object, the byte position is zero.
When the byte position is:
- Zero (at the beginning), the target substring is the entire stored string.
- Equal to the size of the stored string (at the end), the target substring is the empty string ''.
To get or set the byte position:
- pos: returns the byte position.
- pos=(new_pos): sets the byte position.
Many methods use the byte position as the basis for finding matches; many others set, increment, or decrement the byte position:
    scanner = StringScanner.new('foobar')
    scanner.pos # => 0
    scanner.scan(/foo/) # => "foo" # Match found.
    scanner.pos # => 3 # Byte position incremented.
    scanner.scan(/foo/) # => nil # Match not found.
    scanner.pos # => 3 # Byte position not changed.
Some methods implicitly modify the byte position; see:
The values of these methods are derived directly from the values of pos and string:
- rest: the target substring.
- rest_size: rest.bytesize (the size of the target substring in bytes, not characters).
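To make the derivation concrete, the sketch below recomputes the target substring and its size from pos and string alone and compares them with what the scanner reports. The variable names are illustrative only.

```ruby
require 'strscan'

scanner = StringScanner.new('Helloこんにちは')
scanner.scan(/Hello/) # Advance the byte position to 5.

stored = scanner.string
# Slice the stored string by bytes, starting at the byte position.
derived_rest = stored.byteslice(scanner.pos, stored.bytesize - scanner.pos)
derived_rest_size = stored.bytesize - scanner.pos

same_rest = (derived_rest == scanner.rest)
same_size = (derived_rest_size == scanner.rest_size)
```

Here the remaining text has five characters but fifteen bytes, and rest_size reports the byte count.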
Character Position
The character position is a zero-based index into the characters in the stored string; for a new StringScanner object, the character position is zero.
Method charpos returns the character position; its value may not be reset explicitly.
Some methods change (increment or reset) the character position; see:
Example (string includes multi-byte characters):
    scanner = StringScanner.new(ENGLISH_TEXT) # Five 1-byte characters.
    scanner.concat(HIRAGANA_TEXT) # Five 3-byte characters.
    scanner.string # => "Helloこんにちは" # Twenty bytes in all.
    put_situation(scanner)
    # Situation:
    #   pos:       0
    #   charpos:   0
    #   rest:      "Helloこんにちは"
    #   rest_size: 20
    scanner.scan(/Hello/) # => "Hello" # Five 1-byte characters.
    put_situation(scanner)
    # Situation:
    #   pos:       5
    #   charpos:   5
    #   rest:      "こんにちは"
    #   rest_size: 15
    scanner.getch # => "こ" # One 3-byte character.
    put_situation(scanner)
    # Situation:
    #   pos:       8
    #   charpos:   6
    #   rest:      "んにちは"
    #   rest_size: 12
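One way to picture the relationship between the two positions, offered as an illustration rather than a description of the implementation: the character position equals the number of characters in the bytes already consumed. The variable names below are hypothetical.

```ruby
require 'strscan'

scanner = StringScanner.new('Helloこんにちは')
scanner.scan(/Hello/) # Five 1-byte characters.
scanner.getch         # One 3-byte character.

# The consumed prefix, sliced by bytes up to the byte position.
consumed = scanner.string.byteslice(0, scanner.pos)
byte_count = consumed.bytesize # Matches scanner.pos.
char_count = consumed.length   # Matches scanner.charpos.
```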
Target Substring
The target substring is the part of the stored string that extends from the current byte position to the end of the stored string; it is always either:
- The entire stored string (byte position is zero).
- A trailing substring of the stored string (byte position positive).
The target substring is returned by method rest, and its size is returned by method rest_size.
Examples:
    scanner = StringScanner.new('foobarbaz')
    put_situation(scanner)
    # Situation:
    #   pos:       0
    #   charpos:   0
    #   rest:      "foobarbaz"
    #   rest_size: 9
    scanner.pos = 3
    put_situation(scanner)
    # Situation:
    #   pos:       3
    #   charpos:   3
    #   rest:      "barbaz"
    #   rest_size: 6
    scanner.pos = 9
    put_situation(scanner)
    # Situation:
    #   pos:       9
    #   charpos:   9
    #   rest:      ""
    #   rest_size: 0
Setting the Target Substring
The target substring is set whenever:
- The stored string is set (position reset to zero; target substring set to stored string).
- The byte position is set (target substring adjusted accordingly).
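A short sketch of both cases, using illustrative variable names: setting the byte position shortens the target substring, and replacing the stored string resets the position to zero.

```ruby
require 'strscan'

scanner = StringScanner.new('foobarbaz')

# Setting the byte position adjusts the target substring.
scanner.pos = 3
rest_after_pos = scanner.rest

# Replacing the stored string resets the position to zero.
scanner.string = 'quux'
pos_after_string = scanner.pos
rest_after_string = scanner.rest
```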
Querying the Target Substring
This table summarizes (details and examples at the links):
| Method | Returns |
|---|---|
| rest | Target substring. |
| rest_size | Size (bytes) of target substring. |
Searching the Target Substring
A search method examines the target substring, but does not advance the positions or (by implication) shorten the target substring.
This table summarizes (details and examples at the links):
| Method | Returns | Sets Match Values? |
|---|---|---|
| check(pattern) | Matched leading substring or nil. | Yes. |
| check_until(pattern) | Matched substring (anywhere) or nil. | Yes. |
| exist?(pattern) | End index of matched substring (anywhere) or nil. | Yes. |
| match?(pattern) | Size of matched leading substring or nil. | Yes. |
| peek(size) | Leading substring of given length (bytes). | No. |
| peek_byte | Integer leading byte or nil. | No. |
| rest | Target substring (from byte position to end). | No. |
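As a quick illustration of the table, the sketch below runs several search methods from the same position and records that the position has not moved. The results hash is only a convenience for this example.

```ruby
require 'strscan'

scanner = StringScanner.new('foobarbaz')
scanner.pos = 3 # Target substring is now "barbaz".

results = {
  'check'       => scanner.check(/bar/),       # Matched leading substring.
  'check_until' => scanner.check_until(/baz/), # Everything up to the end of the match.
  'exist?'      => scanner.exist?(/baz/),      # Offset of the end of the match.
  'match?'      => scanner.match?(/bar/),      # Size of the matched leading substring.
  'peek'        => scanner.peek(2)             # Leading two bytes.
}
position_after = scanner.pos # Still 3: search methods do not traverse.
```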
Traversing the Target Substring
A traversal method examines the target substring, and, if successful:
- Advances the positions.
- Shortens the target substring.
This table summarizes (details and examples at links):
| Method | Returns | Sets Match Values? |
|---|---|---|
| get_byte | Leading byte or nil. | No. |
| getch | Leading character or nil. | No. |
| scan(pattern) | Matched leading substring or nil. | Yes. |
| scan_byte | Integer leading byte or nil. | No. |
| scan_until(pattern) | Matched substring (anywhere) or nil. | Yes. |
| skip(pattern) | Matched leading substring size or nil. | Yes. |
| skip_until(pattern) | Position delta to end-of-matched-substring or nil. | Yes. |
| unscan | self. | No. |
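The following sketch walks one scanner through several traversal methods and records the byte position after each step; the variable names are illustrative.

```ruby
require 'strscan'

scanner = StringScanner.new('foobarbaz')

first = scanner.scan(/foo/)       # Returns the matched substring.
pos_after_scan = scanner.pos

skipped = scanner.skip(/bar/)     # Returns the match size instead.
pos_after_skip = scanner.pos

scanner.unscan                    # Moves back to where the last match began.
pos_after_unscan = scanner.pos

tail = scanner.scan_until(/baz/)  # Returns everything up to the end of the match.
finished = scanner.eos?
```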
Querying the Scanner
Each of these methods queries the scanner object without modifying it (details and examples at links):
| Method | Returns |
|---|---|
| beginning_of_line? | true or false. |
| charpos | Character position. |
| eos? | true or false. |
| fixed_anchor? | true or false. |
| inspect | String representation of self. |
| pos | Byte position. |
| rest | Target substring. |
| rest_size | Size of target substring. |
| string | Stored string. |
Matching
StringScanner implements pattern matching via Ruby class Regexp, and its matching behaviors are the same as Ruby’s except for the fixed-anchor property.
Matcher Methods
Each matcher method takes a single argument pattern, and attempts to find a matching substring in the target substring.
| Method | Pattern Type | Matches Target Substring | Success Return | May Update Positions? |
|---|---|---|---|---|
| check | Regexp or String. | At beginning. | Matched substring. | No. |
| check_until | Regexp or String. | Anywhere. | Substring. | No. |
| match? | Regexp or String. | At beginning. | Match size. | No. |
| exist? | Regexp or String. | Anywhere. | Substring size. | No. |
| scan | Regexp or String. | At beginning. | Matched substring. | Yes. |
| scan_until | Regexp or String. | Anywhere. | Substring. | Yes. |
| skip | Regexp or String. | At beginning. | Match size. | Yes. |
| skip_until | Regexp or String. | Anywhere. | Substring size. | Yes. |
Which matcher you choose will depend on:
- Where you want to find a match:
  - Only at the beginning of the target substring: check, match?, scan, skip.
  - Anywhere in the target substring: check_until, exist?, scan_until, skip_until.
- Whether you want to:
  - Traverse, by advancing the positions: scan, scan_until, skip, skip_until.
  - Keep the positions unchanged: check, check_until, match?, exist?.
- What you want for the return value:
  - The substring: check_until, scan_until.
  - The substring size: exist?, skip_until.
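As an illustration of how these choices combine in practice, here is a minimal tokenizer sketch. The method tokenize and its token names are hypothetical and not part of StringScanner; it uses skip where the matched text is not needed and scan where it is.

```ruby
require 'strscan'

# Hypothetical example: splits simple arithmetic source text into tokens.
def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    # Whitespace is discarded, so skip (which returns only a size) is enough.
    next if scanner.skip(/\s+/)

    # Token text is needed, so scan (which returns the substring) is used.
    if (number = scanner.scan(/\d+/))
      tokens << [:int, number]
    elsif (name = scanner.scan(/[a-z_]\w*/i))
      tokens << [:ident, name]
    elsif (operator = scanner.scan(%r{[-+*/=]}))
      tokens << [:op, operator]
    else
      raise ArgumentError, "unexpected input at byte #{scanner.pos}"
    end
  end
  tokens
end

tokens = tokenize('x = 42 + y')
```

Every pattern here is tried only at the beginning of the target substring, which is why none of the _until variants appear.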
Match Values
The match values in a StringScanner object generally contain the results of the most recent attempted match.
Each match value may be thought of as:
- Clear: Initially, or after an unsuccessful match attempt: usually, false, nil, or {}.
- Set: After a successful match attempt: true, string, array, or hash.
Each of these methods clears match values:
Each of these methods attempts a match based on a pattern, and either sets match values (if successful) or clears them (if not);
Basic Match Values
Basic match values are those not related to captures.
Each of these methods returns a basic match value:
| Method | Return After Match | Return After No Match |
|---|---|---|
| matched? | true. | false. |
| matched_size | Size of matched substring. | nil. |
| matched | Matched substring. | nil. |
| pre_match | Substring preceding matched substring. | nil. |
| post_match | Substring following matched substring. | nil. |
See examples below.
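The basic match values can also be collected into a hash for inspection. The helper basic_match_values below is a hypothetical convenience written for this illustration; it is not part of StringScanner and is not one of the helper methods mentioned earlier.

```ruby
require 'strscan'

# Hypothetical helper: gathers the five basic match values into a hash.
def basic_match_values(scanner)
  {
    'matched?'     => scanner.matched?,
    'matched_size' => scanner.matched_size,
    'matched'      => scanner.matched,
    'pre_match'    => scanner.pre_match,
    'post_match'   => scanner.post_match
  }
end

scanner = StringScanner.new('foobarbaz')

scanner.exist?(/bar/)  # Successful attempt: values are set.
after_success = basic_match_values(scanner)

scanner.exist?(/nope/) # Failed attempt: values are cleared.
after_failure = basic_match_values(scanner)
```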
Captured Match Values
Captured match values are those related to captures.
Each of these methods returns a captured match value:
| Method | Return After Match | Return After No Match |
|---|---|---|
| size | Count of captured substrings. | nil. |
| [](n) | nth captured substring. | nil. |
| captures | Array of all captured substrings. | nil. |
| values_at(*n) | Array of specified captured substrings. | nil. |
| named_captures | Hash of named captures. | {}. |
See examples below.
Match Values Examples
Successful basic match attempt (no captures):
    scanner = StringScanner.new('foobarbaz')
    scanner.exist?(/bar/)
    put_match_values(scanner)
    # Basic match values:
    #   matched?:       true
    #   matched_size:   3
    #   pre_match:      "foo"
    #   matched  :      "bar"
    #   post_match:     "baz"
    # Captured match values:
    #   size:           1
    #   captures:       []
    #   named_captures: {}
    #   values_at:      ["bar", nil]
    #   []:
    #     [0]:          "bar"
    #     [1]:          nil
Failed basic match attempt (no captures):
    scanner = StringScanner.new('foobarbaz')
    scanner.exist?(/nope/)
    match_values_cleared?(scanner) # => true
Successful unnamed capture match attempt:
    scanner = StringScanner.new('foobarbazbatbam')
    scanner.exist?(/(foo)bar(baz)bat(bam)/)
    put_match_values(scanner)
    # Basic match values:
    #   matched?:       true
    #   matched_size:   15
    #   pre_match:      ""
    #   matched  :      "foobarbazbatbam"
    #   post_match:     ""
    # Captured match values:
    #   size:           4
    #   captures:       ["foo", "baz", "bam"]
    #   named_captures: {}
    #   values_at:      ["foobarbazbatbam", "foo", "baz", "bam", nil]
    #   []:
    #     [0]:          "foobarbazbatbam"
    #     [1]:          "foo"
    #     [2]:          "baz"
    #     [3]:          "bam"
    #     [4]:          nil
Successful named capture match attempt; same as unnamed above, except for named_captures:
    scanner = StringScanner.new('foobarbazbatbam')
    scanner.exist?(/(?<x>foo)bar(?<y>baz)bat(?<z>bam)/)
    scanner.named_captures # => {"x"=>"foo", "y"=>"baz", "z"=>"bam"}
Failed unnamed capture match attempt:
    scanner = StringScanner.new('somestring')
    scanner.exist?(/(foo)bar(baz)bat(bam)/)
    match_values_cleared?(scanner) # => true
Failed named capture match attempt; same as unnamed above, except for named_captures:
    scanner = StringScanner.new('somestring')
    scanner.exist?(/(?<x>foo)bar(?<y>baz)bat(?<z>bam)/)
    match_values_cleared?(scanner) # => false
    scanner.named_captures # => {"x"=>nil, "y"=>nil, "z"=>nil}
Fixed-Anchor Property
Pattern matching in StringScanner is the same as in Ruby, except for its fixed-anchor property, which determines the meaning of '\A':
- false (the default): matches the current byte position.
      scanner = StringScanner.new('foobar')
      scanner.scan(/\A./) # => "f"
      scanner.scan(/\A./) # => "o"
      scanner.scan(/\A./) # => "o"
      scanner.scan(/\A./) # => "b"
- true: matches the beginning of the stored string; never matches unless the byte position is zero:
      scanner = StringScanner.new('foobar', fixed_anchor: true)
      scanner.scan(/\A./) # => "f"
      scanner.scan(/\A./) # => nil
      scanner.reset
      scanner.scan(/\A./) # => "f"
The fixed-anchor property is set when the StringScanner object is created, and may not be modified (see StringScanner.new); method fixed_anchor? returns the setting.
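The difference is easiest to see when the byte position is set directly rather than reached by scanning. The sketch below assumes a version of strscan that accepts the fixed_anchor keyword; the variable names are illustrative.

```ruby
require 'strscan'

# Default: \A is anchored at the current byte position.
floating = StringScanner.new('foobar')
floating.pos = 3
floating_result = floating.scan(/\Abar/)

# Fixed anchor: \A cannot match once the position is past zero.
fixed = StringScanner.new('foobar', fixed_anchor: true)
fixed.pos = 3
fixed_anchored_result = fixed.scan(/\Abar/)

# Without \A, the same scanner still matches at the current position.
fixed_plain_result = fixed.scan(/bar/)
```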
Public Class Methods
Source
    static VALUE
    strscan_initialize(int argc, VALUE *argv, VALUE self)
    {
        struct strscanner *p;
        VALUE str, options;
        p = check_strscan(self);
        rb_scan_args(argc, argv, "11", &str, &options);
        options = rb_check_hash_type(options);
        if (!NIL_P(options)) {
            VALUE fixed_anchor;
            ID keyword_ids[1];
            keyword_ids[0] = rb_intern("fixed_anchor");
            rb_get_kwargs(options, keyword_ids, 0, 1, &fixed_anchor);
            if (fixed_anchor == Qundef) {
                p->fixed_anchor_p = false;
            } else {
                p->fixed_anchor_p = RTEST(fixed_anchor);
            }
        } else {
            p->fixed_anchor_p = false;
        }
        StringValue(str);
        RB_OBJ_WRITE(self, &p->str, str);
        return self;
    }
Returns a new StringScanner object whose stored string is the given string; sets the fixed-anchor property:
    scanner = StringScanner.new('foobarbaz')
    scanner.string # => "foobarbaz"
    scanner.fixed_anchor? # => false
    put_situation(scanner)
    # Situation:
    #   pos:       0
    #   charpos:   0
    #   rest:      "foobarbaz"
    #   rest_size: 9
Public Instance Methods
Source
    static VALUE
    strscan_aref(VALUE self, VALUE idx)
    {
        const char *name;
        struct strscanner *p;
        long i;
        GET_SCANNER(self, p);
        if (! MATCHED_P(p)) return Qnil;
        switch (TYPE(idx)) {
            case T_SYMBOL:
                idx = rb_sym2str(idx);
                /* fall through */
            case T_STRING:
                RSTRING_GETMEM(idx, name, i);
                i = name_to_backref_number(&(p->regs), p->regex, name, name + i, rb_enc_get(idx));
                break;
            default:
                i = NUM2LONG(idx);
        }
        if (i < 0)
            i += p->regs.num_regs;
        if (i < 0) return Qnil;
        if (i >= p->regs.num_regs) return Qnil;
        if (p->regs.beg[i] == -1) return Qnil;
        return extract_range(p,
                             adjust_register_position(p, p->regs.beg[i]),
                             adjust_register_position(p, p->regs.end[i]));
    }
Returns a captured substring or nil; see Captured Match Values.
When there are captures:
    scanner = StringScanner.new('Fri Dec 12 1975 14:39')
    scanner.scan(/(?<wday>\w+) (?<month>\w+) (?<day>\d+) /)
- specifier zero: returns the entire matched substring:
      scanner[0] # => "Fri Dec 12 "
      scanner.pre_match # => ""
      scanner.post_match # => "1975 14:39"
- specifier positive integer: returns the nth capture, or nil if out of range:
      scanner[1] # => "Fri"
      scanner[2] # => "Dec"
      scanner[3] # => "12"
      scanner[4] # => nil
- specifier negative integer: counts backward from the last subgroup:
      scanner[-1] # => "12"
      scanner[-4] # => "Fri Dec 12 "
      scanner[-5] # => nil
- specifier symbol or string: returns the named subgroup, or nil if no such:
      scanner[:wday] # => "Fri"
      scanner['wday'] # => "Fri"
      scanner[:month] # => "Dec"
      scanner[:day] # => "12"
      scanner[:nope] # => nil
When there are no captures, only [0] returns non-nil:
    scanner = StringScanner.new('foobarbaz')
    scanner.exist?(/bar/)
    scanner[0] # => "bar"
    scanner[1] # => nil
For a failed match, even [0] returns nil:
    scanner.scan(/nope/) # => nil
    scanner[0] # => nil
    scanner[1] # => nil
Source
    static VALUE
    strscan_bol_p(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        if (CURPTR(p) > S_PEND(p)) return Qnil;
        if (p->curr == 0) return Qtrue;
        return (*(CURPTR(p) - 1) == '\n') ? Qtrue : Qfalse;
    }
Returns whether the position is at the beginning of a line; that is, at the beginning of the stored string or immediately after a newline:
    scanner = StringScanner.new(MULTILINE_TEXT)
    scanner.string
    # => "Go placidly amid the noise and haste,\nand remember what peace there may be in silence.\n"
    scanner.pos # => 0
    scanner.beginning_of_line? # => true
    scanner.scan_until(/,/) # => "Go placidly amid the noise and haste,"
    scanner.beginning_of_line? # => false
    scanner.scan(/\n/) # => "\n"
    scanner.beginning_of_line? # => true
    scanner.terminate
    scanner.beginning_of_line? # => true
    scanner.concat('x')
    scanner.terminate
    scanner.beginning_of_line? # => false
StringScanner#bol? is an alias for StringScanner#beginning_of_line?.
Source
    static VALUE
    strscan_captures(VALUE self)
    {
        struct strscanner *p;
        int i, num_regs;
        VALUE new_ary;
        GET_SCANNER(self, p);
        if (! MATCHED_P(p)) return Qnil;
        num_regs = p->regs.num_regs;
        new_ary = rb_ary_new2(num_regs);
        for (i = 1; i < num_regs; i++) {
            VALUE str;
            if (p->regs.beg[i] == -1)
                str = Qnil;
            else
                str = extract_range(p,
                                    adjust_register_position(p, p->regs.beg[i]),
                                    adjust_register_position(p, p->regs.end[i]));
            rb_ary_push(new_ary, str);
        }
        return new_ary;
    }
Returns the array of captured match values at indexes (1..) if the most recent match attempt succeeded, or nil otherwise:
    scanner = StringScanner.new('Fri Dec 12 1975 14:39')
    scanner.captures # => nil
    scanner.exist?(/(?<wday>\w+) (?<month>\w+) (?<day>\d+) /)
    scanner.captures # => ["Fri", "Dec", "12"]
    scanner.values_at(*0..4) # => ["Fri Dec 12 ", "Fri", "Dec", "12", nil]
    scanner.exist?(/Fri/)
    scanner.captures # => []
    scanner.scan(/nope/)
    scanner.captures # => nil
Source
    static VALUE
    strscan_get_charpos(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        return LONG2NUM(rb_enc_strlen(S_PBEG(p), CURPTR(p), rb_enc_get(p->str)));
    }
call-seq: charpos -> character_position
Returns the character position (initially zero), which may be different from the byte position given by method pos:
    scanner = StringScanner.new(HIRAGANA_TEXT)
    scanner.string # => "こんにちは"
    scanner.getch # => "こ" # 3-byte character.
    scanner.getch # => "ん" # 3-byte character.
    put_situation(scanner)
    # Situation:
    #   pos:       6
    #   charpos:   2
    #   rest:      "にちは"
    #   rest_size: 9
Source
    static VALUE
    strscan_check(VALUE self, VALUE re)
    {
        return strscan_do_scan(self, re, 0, 1, 1);
    }
Attempts to match the given pattern at the beginning of the target substring; does not modify the positions.
If the match succeeds:
- Returns the matched substring.
- Sets all match values.
    scanner = StringScanner.new('foobarbaz')
    scanner.pos = 3
    scanner.check('bar') # => "bar"
    put_match_values(scanner)
    # Basic match values:
    #   matched?:       true
    #   matched_size:   3
    #   pre_match:      "foo"
    #   matched  :      "bar"
    #   post_match:     "baz"
    # Captured match values:
    #   size:           1
    #   captures:       []
    #   named_captures: {}
    #   values_at:      ["bar", nil]
    #   []:
    #     [0]:          "bar"
    #     [1]:          nil
    put_situation(scanner)
    # Situation:
    #   pos:       3
    #   charpos:   3
    #   rest:      "barbaz"
    #   rest_size: 6
If the match fails:
- Returns nil.
- Clears all match values.
    scanner.check(/nope/) # => nil
    match_values_cleared?(scanner) # => true
Source
    static VALUE
    strscan_check_until(VALUE self, VALUE re)
    {
        return strscan_do_scan(self, re, 0, 1, 0);
    }
Attempts to match the given pattern anywhere (at any position) in the target substring; does not modify the positions.
If the match succeeds:
- Sets all match values.
- Returns the matched substring, which extends from the current position to the end of the matched substring.
    scanner = StringScanner.new('foobarbazbatbam')
    scanner.pos = 6
    scanner.check_until(/bat/) # => "bazbat"
    put_match_values(scanner)
    # Basic match values:
    #   matched?:       true
    #   matched_size:   3
    #   pre_match:      "foobarbaz"
    #   matched  :      "bat"
    #   post_match:     "bam"
    # Captured match values:
    #   size:           1
    #   captures:       []
    #   named_captures: {}
    #   values_at:      ["bat", nil]
    #   []:
    #     [0]:          "bat"
    #     [1]:          nil
    put_situation(scanner)
    # Situation:
    #   pos:       6
    #   charpos:   6
    #   rest:      "bazbatbam"
    #   rest_size: 9
If the match fails:
- Clears all match values.
- Returns nil.
    scanner.check_until(/nope/) # => nil
    match_values_cleared?(scanner) # => true
Source
    static VALUE
    strscan_concat(VALUE self, VALUE str)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        StringValue(str);
        rb_str_append(p->str, str);
        return self;
    }
- Appends the given more_string to the stored string.
- Returns self.
- Does not affect the positions or match values.
    scanner = StringScanner.new('foo')
    scanner.string # => "foo"
    scanner.terminate
    scanner.concat('barbaz') # => #<StringScanner 3/9 "foo" @ "barba...">
    scanner.string # => "foobarbaz"
    put_situation(scanner)
    # Situation:
    #   pos:       3
    #   charpos:   3
    #   rest:      "barbaz"
    #   rest_size: 6
Source
    static VALUE
    strscan_eos_p(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        return EOS_P(p) ? Qtrue : Qfalse;
    }
Returns whether the position is at the end of the stored string:
    scanner = StringScanner.new('foobarbaz')
    scanner.eos? # => false
    scanner.pos = 3
    scanner.eos? # => false
    scanner.terminate
    scanner.eos? # => true
Source
    static VALUE
    strscan_exist_p(VALUE self, VALUE re)
    {
        return strscan_do_scan(self, re, 0, 0, 0);
    }
Attempts to match the given pattern anywhere (at any position) in the target substring; does not modify the positions.
If the match succeeds:
- Returns a byte offset: the distance in bytes between the current position and the end of the matched substring.
- Sets all match values.
    scanner = StringScanner.new('foobarbazbatbam')
    scanner.pos = 6
    scanner.exist?(/bat/) # => 6
    put_match_values(scanner)
    # Basic match values:
    #   matched?:       true
    #   matched_size:   3
    #   pre_match:      "foobarbaz"
    #   matched  :      "bat"
    #   post_match:     "bam"
    # Captured match values:
    #   size:           1
    #   captures:       []
    #   named_captures: {}
    #   values_at:      ["bat", nil]
    #   []:
    #     [0]:          "bat"
    #     [1]:          nil
    put_situation(scanner)
    # Situation:
    #   pos:       6
    #   charpos:   6
    #   rest:      "bazbatbam"
    #   rest_size: 9
If the match fails:
- Returns nil.
- Clears all match values.
    scanner.exist?(/nope/) # => nil
    match_values_cleared?(scanner) # => true
Source
    static VALUE
    strscan_fixed_anchor_p(VALUE self)
    {
        struct strscanner *p;
        p = check_strscan(self);
        return p->fixed_anchor_p ? Qtrue : Qfalse;
    }
Returns whether the fixed-anchor property is set.
Source
    static VALUE
    strscan_get_byte(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        CLEAR_MATCH_STATUS(p);
        if (EOS_P(p))
            return Qnil;
        p->prev = p->curr;
        p->curr++;
        MATCHED(p);
        adjust_registers_to_matched(p);
        return extract_range(p,
                             adjust_register_position(p, p->regs.beg[0]),
                             adjust_register_position(p, p->regs.end[0]));
    }
call-seq: get_byte -> byte_as_character or nil
Returns the next byte, if available:
- If the position is not at the end of the stored string:
  - Returns the next byte.
  - Increments the byte position.
  - Adjusts the character position.
      scanner = StringScanner.new(HIRAGANA_TEXT)
      # => #<StringScanner 0/15 @ "\xE3\x81\x93\xE3\x82...">
      scanner.string # => "こんにちは"
      [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\xE3", 1, 1]
      [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x81", 2, 2]
      [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x93", 3, 1]
      [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\xE3", 4, 2]
      [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x82", 5, 3]
      [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x93", 6, 2]
- Otherwise, returns nil, and does not change the positions.
      scanner.terminate
      [scanner.get_byte, scanner.pos, scanner.charpos] # => [nil, 15, 5]
Source
    static VALUE
    strscan_getch(VALUE self)
    {
        struct strscanner *p;
        long len;
        GET_SCANNER(self, p);
        CLEAR_MATCH_STATUS(p);
        if (EOS_P(p))
            return Qnil;
        len = rb_enc_mbclen(CURPTR(p), S_PEND(p), rb_enc_get(p->str));
        len = minl(len, S_RESTLEN(p));
        p->prev = p->curr;
        p->curr += len;
        MATCHED(p);
        adjust_registers_to_matched(p);
        return extract_range(p,
                             adjust_register_position(p, p->regs.beg[0]),
                             adjust_register_position(p, p->regs.end[0]));
    }
call-seq: getch -> character or nil
Returns the next (possibly multibyte) character, if available:
- If the position is at the beginning of a character:
  - Returns the character.
  - Increments the character position by 1.
  - Increments the byte position by the size (in bytes) of the character.
      scanner = StringScanner.new(HIRAGANA_TEXT)
      scanner.string # => "こんにちは"
      [scanner.getch, scanner.pos, scanner.charpos] # => ["こ", 3, 1]
      [scanner.getch, scanner.pos, scanner.charpos] # => ["ん", 6, 2]
      [scanner.getch, scanner.pos, scanner.charpos] # => ["に", 9, 3]
      [scanner.getch, scanner.pos, scanner.charpos] # => ["ち", 12, 4]
      [scanner.getch, scanner.pos, scanner.charpos] # => ["は", 15, 5]
      [scanner.getch, scanner.pos, scanner.charpos] # => [nil, 15, 5]
- If the position is within a multi-byte character (that is, not at its beginning), behaves like get_byte (returns a 1-byte character):
      scanner.pos = 1
      [scanner.getch, scanner.pos, scanner.charpos] # => ["\x81", 2, 2]
      [scanner.getch, scanner.pos, scanner.charpos] # => ["\x93", 3, 1]
      [scanner.getch, scanner.pos, scanner.charpos] # => ["ん", 6, 2]
- If the position is at the end of the stored string, returns nil and does not modify the positions:
      scanner.terminate
      [scanner.getch, scanner.pos, scanner.charpos] # => [nil, 15, 5]
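Because getch returns nil at the end of the stored string, it can drive a simple loop over characters. The sketch below is one illustrative way to do that; for a scanner that starts at position zero on valid text, the result matches String#chars.

```ruby
require 'strscan'

scanner = StringScanner.new('こんにちは')
characters = []

# getch returns nil once the position reaches the end of the stored string.
while (character = scanner.getch)
  characters << character
end

final_pos = scanner.pos         # Byte position after the last character.
final_charpos = scanner.charpos # Character position after the last character.
```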
Source
    static VALUE
    strscan_inspect(VALUE self)
    {
        struct strscanner *p;
        VALUE a, b;
        p = check_strscan(self);
        if (NIL_P(p->str)) {
            a = rb_sprintf("#<%"PRIsVALUE" (uninitialized)>", rb_obj_class(self));
            return a;
        }
        if (EOS_P(p)) {
            a = rb_sprintf("#<%"PRIsVALUE" fin>", rb_obj_class(self));
            return a;
        }
        if (p->curr == 0) {
            b = inspect2(p);
            a = rb_sprintf("#<%"PRIsVALUE" %ld/%ld @ %"PRIsVALUE">",
                           rb_obj_class(self),
                           p->curr, S_LEN(p),
                           b);
            return a;
        }
        a = inspect1(p);
        b = inspect2(p);
        a = rb_sprintf("#<%"PRIsVALUE" %ld/%ld %"PRIsVALUE" @ %"PRIsVALUE">",
                       rb_obj_class(self),
                       p->curr, S_LEN(p),
                       a, b);
        return a;
    }
Returns a string representation of self that may show:
1. The current position.
2. The size (in bytes) of the stored string.
3. The substring preceding the current position.
4. The substring following the current position (which is also the target substring).
    scanner = StringScanner.new("Fri Dec 12 1975 14:39")
    scanner.pos = 11
    scanner.inspect # => "#<StringScanner 11/21 \"...c 12 \" @ \"1975 ...\">"
If at beginning-of-string, item 3 above (preceding substring) is omitted:
    scanner.reset
    scanner.inspect # => "#<StringScanner 0/21 @ \"Fri D...\">"
If at end-of-string, all items above are omitted:
    scanner.terminate
    scanner.inspect # => "#<StringScanner fin>"
Source
    static VALUE
    strscan_match_p(VALUE self, VALUE re)
    {
        return strscan_do_scan(self, re, 0, 0, 1);
    }
Attempts to match the given pattern at the beginning of the target substring; does not modify the positions.
If the match succeeds:
- Sets match values.
- Returns the size in bytes of the matched substring.
    scanner = StringScanner.new('foobarbaz')
    scanner.pos = 3
    scanner.match?(/bar/) # => 3
    put_match_values(scanner)
    # Basic match values:
    #   matched?:       true
    #   matched_size:   3
    #   pre_match:      "foo"
    #   matched  :      "bar"
    #   post_match:     "baz"
    # Captured match values:
    #   size:           1
    #   captures:       []
    #   named_captures: {}
    #   values_at:      ["bar", nil]
    #   []:
    #     [0]:          "bar"
    #     [1]:          nil
    put_situation(scanner)
    # Situation:
    #   pos:       3
    #   charpos:   3
    #   rest:      "barbaz"
    #   rest_size: 6
If the match fails:
- Clears match values.
- Returns nil.
- Does not increment positions.
    scanner.match?(/nope/) # => nil
    match_values_cleared?(scanner) # => true
Source
    static VALUE
    strscan_matched(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        if (! MATCHED_P(p)) return Qnil;
        return extract_range(p,
                             adjust_register_position(p, p->regs.beg[0]),
                             adjust_register_position(p, p->regs.end[0]));
    }
Returns the matched substring from the most recent match attempt if it was successful, or nil otherwise; see Basic Match Values:
    scanner = StringScanner.new('foobarbaz')
    scanner.matched # => nil
    scanner.pos = 3
    scanner.match?(/bar/) # => 3
    scanner.matched # => "bar"
    scanner.match?(/nope/) # => nil
    scanner.matched # => nil
Source
    static VALUE
    strscan_matched_p(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        return MATCHED_P(p) ? Qtrue : Qfalse;
    }
Returns true if the most recent match attempt was successful, false otherwise; see Basic Match Values:
    scanner = StringScanner.new('foobarbaz')
    scanner.matched? # => false
    scanner.pos = 3
    scanner.exist?(/baz/) # => 6
    scanner.matched? # => true
    scanner.exist?(/nope/) # => nil
    scanner.matched? # => false
Source
    static VALUE
    strscan_matched_size(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        if (! MATCHED_P(p)) return Qnil;
        return LONG2NUM(p->regs.end[0] - p->regs.beg[0]);
    }
Returns the size (in bytes) of the matched substring from the most recent match attempt if it was successful, or nil otherwise; see Basic Match Values:
    scanner = StringScanner.new('foobarbaz')
    scanner.matched_size # => nil
    scanner.pos = 3
    scanner.exist?(/baz/) # => 6
    scanner.matched_size # => 3
    scanner.exist?(/nope/) # => nil
    scanner.matched_size # => nil
Source
    static VALUE
    strscan_named_captures(VALUE self)
    {
        struct strscanner *p;
        named_captures_data data;
        GET_SCANNER(self, p);
        data.self = self;
        data.captures = rb_hash_new();
        if (!RB_NIL_P(p->regex)) {
            onig_foreach_name(RREGEXP_PTR(p->regex), named_captures_iter, &data);
        }
        return data.captures;
    }
Returns a hash of the named captures for the pattern of the most recent match attempt; the hash is empty if no match has been attempted or the pattern has no named groups, and its values are nil if the attempt failed; see Captured Match Values:
    scanner = StringScanner.new('Fri Dec 12 1975 14:39')
    scanner.named_captures # => {}
    pattern = /(?<wday>\w+) (?<month>\w+) (?<day>\d+) /
    scanner.match?(pattern)
    scanner.named_captures # => {"wday"=>"Fri", "month"=>"Dec", "day"=>"12"}
    scanner.string = 'nope'
    scanner.match?(pattern)
    scanner.named_captures # => {"wday"=>nil, "month"=>nil, "day"=>nil}
    scanner.match?(/nosuch/)
    scanner.named_captures # => {}
Source
    static VALUE
    strscan_peek(VALUE self, VALUE vlen)
    {
        struct strscanner *p;
        long len;
        GET_SCANNER(self, p);
        len = NUM2LONG(vlen);
        if (EOS_P(p))
            return str_new(p, "", 0);
        len = minl(len, S_RESTLEN(p));
        return extract_beg_len(p, p->curr, len);
    }
Returns the substring string[pos, length]; does not update match values or positions:
    scanner = StringScanner.new('foobarbaz')
    scanner.pos = 3
    scanner.peek(3) # => "bar"
    scanner.terminate
    scanner.peek(3) # => ""
Source
    static VALUE
    strscan_peek_byte(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        if (EOS_P(p))
            return Qnil;
        return INT2FIX((unsigned char)*CURPTR(p));
    }
Peeks at the current byte and returns it as an integer.
    s = StringScanner.new('ab')
    s.peek_byte # => 97
Source
    static VALUE
    strscan_get_pos(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        return LONG2NUM(p->curr);
    }
call-seq: pos -> byte_position
Returns the integer byte position, which may be different from the character position:
    scanner = StringScanner.new(HIRAGANA_TEXT)
    scanner.string # => "こんにちは"
    scanner.pos # => 0
    scanner.getch # => "こ" # 3-byte character.
    scanner.charpos # => 1
    scanner.pos # => 3
Source
    static VALUE
    strscan_set_pos(VALUE self, VALUE v)
    {
        struct strscanner *p;
        long i;
        GET_SCANNER(self, p);
        i = NUM2LONG(v);
        if (i < 0) i += S_LEN(p);
        if (i < 0) rb_raise(rb_eRangeError, "index out of range");
        if (i > S_LEN(p)) rb_raise(rb_eRangeError, "index out of range");
        p->curr = i;
        return LONG2NUM(i);
    }
call-seq: pos = n -> n pointer = n -> n
Sets the byte position and the character position; returns n.
Does not affect match values.
For non-negative n, sets the position to n:
    scanner = StringScanner.new(HIRAGANA_TEXT)
    scanner.string # => "こんにちは"
    scanner.pos = 3 # => 3
    scanner.rest # => "んにちは"
    scanner.charpos # => 1
For negative n, counts from the end of the stored string:
    scanner.pos = -9 # => -9
    scanner.pos # => 6
    scanner.rest # => "にちは"
    scanner.charpos # => 2
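The source shown for this method also raises RangeError when the requested position falls outside the stored string. The sketch below exercises both the negative-index case and the out-of-range case; the variable names are illustrative.

```ruby
require 'strscan'

scanner = StringScanner.new('foobar')

# A negative value counts back from the end: -6 on a 6-byte string is 0.
scanner.pos = -6
pos_from_negative = scanner.pos

# A value past the end of the stored string is rejected.
error_message =
  begin
    scanner.pos = 7
    nil
  rescue RangeError => e
    e.message
  end

pos_after_error = scanner.pos # The failed assignment leaves the position alone.
```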
Source
    static VALUE
    strscan_post_match(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        if (! MATCHED_P(p)) return Qnil;
        return extract_range(p,
                             adjust_register_position(p, p->regs.end[0]),
                             S_LEN(p));
    }
Returns the substring that follows the matched substring from the most recent match attempt if it was successful, or nil otherwise; see Basic Match Values:
    scanner = StringScanner.new('foobarbaz')
    scanner.post_match # => nil
    scanner.pos = 3
    scanner.match?(/bar/) # => 3
    scanner.post_match # => "baz"
    scanner.match?(/nope/) # => nil
    scanner.post_match # => nil
Source
    static VALUE
    strscan_pre_match(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        if (! MATCHED_P(p)) return Qnil;
        return extract_range(p, 0, adjust_register_position(p, p->regs.beg[0]));
    }
Returns the substring that precedes the matched substring from the most recent match attempt if it was successful, or nil otherwise; see Basic Match Values:
    scanner = StringScanner.new('foobarbaz')
    scanner.pre_match      # => nil
    scanner.pos = 3
    scanner.exist?(/baz/)  # => 6
    scanner.pre_match      # => "foobar" # Substring of entire string, not just target string.
    scanner.exist?(/nope/) # => nil
    scanner.pre_match      # => nil
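Taken together, pre_match, matched, and post_match split the stored string into three pieces. The sketch below illustrates that relationship under the behavior described above; it is an illustration, not a statement about every strscan release.

```ruby
require 'strscan'

scanner = StringScanner.new('foobarbaz')
scanner.pos = 3
scanner.match?(/bar/) # => 3

# The three basic match values partition the stored string.
parts = [scanner.pre_match, scanner.matched, scanner.post_match]
parts                        # => ["foo", "bar", "baz"]
parts.join == scanner.string # => true

# After a failed attempt all three are nil.
scanner.match?(/nope/)       # => nil
cleared = [scanner.pre_match, scanner.matched, scanner.post_match]
cleared                      # => [nil, nil, nil]
```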
Source
    static VALUE
    strscan_reset(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        p->curr = 0;
        CLEAR_MATCH_STATUS(p);
        return self;
    }
Sets both byte position and character position to zero, and clears match values; returns self:
    scanner = StringScanner.new('foobarbaz')
    scanner.exist?(/bar/) # => 6
    scanner.reset         # => #<StringScanner 0/9 @ "fooba...">
    put_situation(scanner)
    # Situation:
    #   pos: 0
    #   charpos: 0
    #   rest: "foobarbaz"
    #   rest_size: 9
    # => nil
    match_values_cleared?(scanner) # => true
Source
    static VALUE
    strscan_rest(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        if (EOS_P(p)) {
            return str_new(p, "", 0);
        }
        return extract_range(p, p->curr, S_LEN(p));
    }
Returns the ‘rest’ of the stored string (all after the current position), which is the target substring:
    scanner = StringScanner.new('foobarbaz')
    scanner.rest # => "foobarbaz"
    scanner.pos = 3
    scanner.rest # => "barbaz"
    scanner.terminate
    scanner.rest # => ""
Source
    static VALUE
    strscan_rest_size(VALUE self)
    {
        struct strscanner *p;
        long i;
        GET_SCANNER(self, p);
        if (EOS_P(p)) {
            return INT2FIX(0);
        }
        i = S_RESTLEN(p);
        return INT2FIX(i);
    }
Returns the size (in bytes) of the rest of the stored string:
    scanner = StringScanner.new('foobarbaz')
    scanner.rest      # => "foobarbaz"
    scanner.rest_size # => 9
    scanner.pos = 3
    scanner.rest      # => "barbaz"
    scanner.rest_size # => 6
    scanner.terminate
    scanner.rest      # => ""
    scanner.rest_size # => 0
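Because rest_size counts bytes, it can differ from the character count of rest when the stored string contains multibyte characters. A small sketch, with the literal standing in for HIRAGANA_TEXT:

```ruby
require 'strscan'

scanner = StringScanner.new('こんにちは')
scanner.getch         # => "こ"
scanner.rest          # => "んにちは"
scanner.rest_size     # => 12 (bytes)
scanner.rest.bytesize # => 12
scanner.rest.size     # => 4 (characters)
```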
Source
    static VALUE
    strscan_scan(VALUE self, VALUE re)
    {
        return strscan_do_scan(self, re, 1, 1, 1);
    }
call-seq:
    scan(pattern) -> substring or nil
Attempts to match the given pattern at the beginning of the target substring.
If the match succeeds:
Returns the matched substring.
Increments the byte position by substring.bytesize, and may increment the character position.
Sets match values.
    scanner = StringScanner.new(HIRAGANA_TEXT)
    scanner.string # => "こんにちは"
    scanner.pos = 6
    scanner.scan(/に/) # => "に"
    put_match_values(scanner)
    # Basic match values:
    #   matched?: true
    #   matched_size: 3
    #   pre_match: "こん"
    #   matched : "に"
    #   post_match: "ちは"
    # Captured match values:
    #   size: 1
    #   captures: []
    #   named_captures: {}
    #   values_at: ["に", nil]
    #   []:
    #     [0]: "に"
    #     [1]: nil
    put_situation(scanner)
    # Situation:
    #   pos: 9
    #   charpos: 3
    #   rest: "ちは"
    #   rest_size: 6
If the match fails:
Returns nil.
Does not increment byte and character positions.
Clears match values.
    scanner.scan(/nope/)           # => nil
    match_values_cleared?(scanner) # => true
Source
    static VALUE
    strscan_scan_byte(VALUE self)
    {
        struct strscanner *p;
        VALUE byte;
        GET_SCANNER(self, p);
        CLEAR_MATCH_STATUS(p);
        if (EOS_P(p)) return Qnil;
        byte = INT2FIX((unsigned char)*CURPTR(p));
        p->prev = p->curr;
        p->curr++;
        MATCHED(p);
        adjust_registers_to_matched(p);
        return byte;
    }
Scans one byte, advances the byte position by one, and returns the byte as an integer; returns nil if the scanner is at the end of the stored string. This method is not multibyte character sensitive. See also: getch.
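The entry above has no example, so here is a hedged sketch of reading a multibyte character one byte at a time. The helper drain_bytes is hypothetical (not part of StringScanner), and the guard is there because scan_byte may be absent from older strscan releases.

```ruby
require 'strscan'

# Hypothetical helper (not part of StringScanner): drains the scanner
# one byte at a time and returns the bytes as integers.
def drain_bytes(scanner)
  bytes = []
  while (byte = scanner.scan_byte) # nil at end of string ends the loop.
    bytes << byte
  end
  bytes
end

# scan_byte may be missing from older strscan releases, hence the guard.
if StringScanner.method_defined?(:scan_byte)
  scanner = StringScanner.new('こ') # One character, three bytes in UTF-8.
  drained = drain_bytes(scanner)    # => [227, 129, 147]
  drained == 'こ'.bytes             # => true
  scanner.eos?                      # => true
end
```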
Source
    static VALUE
    strscan_scan_until(VALUE self, VALUE re)
    {
        return strscan_do_scan(self, re, 1, 1, 0);
    }
call-seq:
    scan_until(pattern) -> substring or nil
Attempts to match the given pattern anywhere (at any position) in the target substring.
If the match attempt succeeds:
Sets match values.
Sets the byte position to the end of the matched substring; may adjust the character position.
Returns the substring from the prior position through the end of the matched substring.
    scanner = StringScanner.new(HIRAGANA_TEXT)
    scanner.string # => "こんにちは"
    scanner.pos = 6
    scanner.scan_until(/ち/) # => "にち"
    put_match_values(scanner)
    # Basic match values:
    #   matched?: true
    #   matched_size: 3
    #   pre_match: "こんに"
    #   matched : "ち"
    #   post_match: "は"
    # Captured match values:
    #   size: 1
    #   captures: []
    #   named_captures: {}
    #   values_at: ["ち", nil]
    #   []:
    #     [0]: "ち"
    #     [1]: nil
    put_situation(scanner)
    # Situation:
    #   pos: 12
    #   charpos: 4
    #   rest: "は"
    #   rest_size: 3
If the match attempt fails:
Clears match values.
Returns nil.
Does not update positions.
    scanner.scan_until(/nope/)     # => nil
    match_values_cleared?(scanner) # => true
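The difference between scan (anchored at the current position) and scan_until (searches forward) is easiest to see on the same input. A minimal sketch:

```ruby
require 'strscan'

scanner = StringScanner.new('foobarbaz')

# scan is anchored at the current position, so /bar/ cannot match at 0.
anchored = scanner.scan(/bar/)       # => nil
scanner.pos                          # => 0

# scan_until searches forward and returns everything it passed over,
# while the match values describe only the matched part.
consumed = scanner.scan_until(/bar/) # => "foobar"
scanner.matched                      # => "bar"
scanner.pre_match                    # => "foo"
scanner.pos                          # => 6
```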
Source
    static VALUE
    strscan_size(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        if (! MATCHED_P(p)) return Qnil;
        return INT2FIX(p->regs.num_regs);
    }
Returns the number of match values available through [] (the entire matched substring plus one per capture group) if the most recent match attempt succeeded, nil otherwise; see Captures Match Values:
    scanner = StringScanner.new('Fri Dec 12 1975 14:39')
    scanner.size # => nil
    pattern = /(?<wday>\w+) (?<month>\w+) (?<day>\d+) /
    scanner.match?(pattern)
    scanner.values_at(*0..scanner.size) # => ["Fri Dec 12 ", "Fri", "Dec", "12", nil]
    scanner.size # => 4
    scanner.match?(/nope/) # => nil
    scanner.size # => nil
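As the example suggests, size is one larger than the number of capture groups, because index 0 holds the entire matched substring. A short sketch that makes the relationship explicit:

```ruby
require 'strscan'

scanner = StringScanner.new('Fri Dec 12 1975 14:39')
pattern = /(?<wday>\w+) (?<month>\w+) (?<day>\d+) /
scanner.match?(pattern)

scanner.captures      # => ["Fri", "Dec", "12"]
scanner.captures.size # => 3
scanner.size          # => 4 (entire match plus three groups)
scanner[0]            # => "Fri Dec 12 "
```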
Source
    static VALUE
    strscan_skip(VALUE self, VALUE re)
    {
        return strscan_do_scan(self, re, 1, 0, 1);
    }
call-seq:
    skip(pattern) -> match_size or nil
Attempts to match the given pattern at the beginning of the target substring.
If the match succeeds:
Increments the byte position by substring.bytesize, and may increment the character position.
Sets match values.
Returns the size (bytes) of the matched substring.
    scanner = StringScanner.new(HIRAGANA_TEXT)
    scanner.string # => "こんにちは"
    scanner.pos = 6
    scanner.skip(/に/) # => 3
    put_match_values(scanner)
    # Basic match values:
    #   matched?: true
    #   matched_size: 3
    #   pre_match: "こん"
    #   matched : "に"
    #   post_match: "ちは"
    # Captured match values:
    #   size: 1
    #   captures: []
    #   named_captures: {}
    #   values_at: ["に", nil]
    #   []:
    #     [0]: "に"
    #     [1]: nil
    put_situation(scanner)
    # Situation:
    #   pos: 9
    #   charpos: 3
    #   rest: "ちは"
    #   rest_size: 6
If the match fails, returns nil, leaves the positions unchanged, and clears match values:
    scanner.skip(/nope/)           # => nil
    match_values_cleared?(scanner) # => true
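A common use of skip is to discard separators while scan collects the tokens between them. The following is an illustrative sketch only; split_words is a hypothetical helper, not part of StringScanner.

```ruby
require 'strscan'

# Hypothetical helper: splits text into whitespace-separated words,
# using skip to discard whitespace and scan to collect each word.
def split_words(text)
  scanner = StringScanner.new(text)
  words = []
  until scanner.eos?
    scanner.skip(/\s+/)         # Discard leading whitespace, if any.
    word = scanner.scan(/\S+/)  # nil when only whitespace remained.
    words << word if word
  end
  words
end

split_words("Go placidly  amid\nthe noise")
# => ["Go", "placidly", "amid", "the", "noise"]
```

Each pass through the loop consumes at least one byte, since any non-empty remainder starts with either whitespace or non-whitespace, so the loop terminates.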
Source
    static VALUE
    strscan_skip_until(VALUE self, VALUE re)
    {
        return strscan_do_scan(self, re, 1, 0, 0);
    }
call-seq:
    skip_until(pattern) -> matched_substring_size or nil
Attempts to match the given pattern anywhere (at any position) in the target substring.
If the match attempt succeeds:
Sets match values.
Sets the byte position to the end of the matched substring; may adjust the character position.
Returns the number of bytes from the prior position through the end of the matched substring.
    scanner = StringScanner.new(HIRAGANA_TEXT)
    scanner.string # => "こんにちは"
    scanner.pos = 6
    scanner.skip_until(/ち/) # => 6
    put_match_values(scanner)
    # Basic match values:
    #   matched?: true
    #   matched_size: 3
    #   pre_match: "こんに"
    #   matched : "ち"
    #   post_match: "は"
    # Captured match values:
    #   size: 1
    #   captures: []
    #   named_captures: {}
    #   values_at: ["ち", nil]
    #   []:
    #     [0]: "ち"
    #     [1]: nil
    put_situation(scanner)
    # Situation:
    #   pos: 12
    #   charpos: 4
    #   rest: "は"
    #   rest_size: 3
If the match attempt fails:
Clears match values.
Returns nil.
Does not update positions.
    scanner.skip_until(/nope/)     # => nil
    match_values_cleared?(scanner) # => true
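The return value of skip_until is the distance the position moved, which is generally larger than matched_size. A brief sketch, with the literal standing in for HIRAGANA_TEXT:

```ruby
require 'strscan'

scanner = StringScanner.new('こんにちは')
scanner.pos = 6
before = scanner.pos

advanced = scanner.skip_until(/ち/) # => 6
scanner.matched_size                # => 3 (bytes in "ち" alone)
scanner.pos                         # => 12
scanner.pos - before == advanced    # => true
```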
Source
    static VALUE
    strscan_get_string(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        return p->str;
    }
Returns the stored string:
    scanner = StringScanner.new('foobar')
    scanner.string # => "foobar"
    scanner.concat('baz')
    scanner.string # => "foobarbaz"
Source
    static VALUE
    strscan_set_string(VALUE self, VALUE str)
    {
        struct strscanner *p = check_strscan(self);
        StringValue(str);
        RB_OBJ_WRITE(self, &p->str, str);
        p->curr = 0;
        CLEAR_MATCH_STATUS(p);
        return str;
    }
Replaces the stored string with the given other_string:
Sets both positions to zero.
Clears match values.
Returns other_string.
    scanner = StringScanner.new('foobar')
    scanner.scan(/foo/)
    put_situation(scanner)
    # Situation:
    #   pos: 3
    #   charpos: 3
    #   rest: "bar"
    #   rest_size: 3
    match_values_cleared?(scanner) # => false
    scanner.string = 'baz' # => "baz"
    put_situation(scanner)
    # Situation:
    #   pos: 0
    #   charpos: 0
    #   rest: "baz"
    #   rest_size: 3
    match_values_cleared?(scanner) # => true
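The two ways of modifying the stored string behave differently with respect to the position: concat keeps it, while string= resets it. A minimal sketch contrasting them (the dup is only a precaution in case string literals are frozen):

```ruby
require 'strscan'

scanner = StringScanner.new('foo'.dup)
scanner.scan(/foo/)    # => "foo"

# concat extends the stored string; the position is kept.
scanner.concat('bar')
scanner.pos            # => 3
scanner.rest           # => "bar"

# string= replaces the stored string; position and match values are reset.
scanner.string = 'baz'
scanner.pos            # => 0
scanner.rest           # => "baz"
scanner.matched        # => nil
```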
Source
    static VALUE
    strscan_terminate(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        p->curr = S_LEN(p);
        CLEAR_MATCH_STATUS(p);
        return self;
    }
call-seq:
    terminate -> self
Sets the scanner to end-of-string; returns self:
Sets both positions to end-of-string.
Clears match values.
    scanner = StringScanner.new(HIRAGANA_TEXT)
    scanner.string # => "こんにちは"
    scanner.scan_until(/に/)
    put_situation(scanner)
    # Situation:
    #   pos: 9
    #   charpos: 3
    #   rest: "ちは"
    #   rest_size: 6
    match_values_cleared?(scanner) # => false
    scanner.terminate # => #<StringScanner fin>
    put_situation(scanner)
    # Situation:
    #   pos: 15
    #   charpos: 5
    #   rest: ""
    #   rest_size: 0
    match_values_cleared?(scanner) # => true
Source
    static VALUE
    strscan_unscan(VALUE self)
    {
        struct strscanner *p;
        GET_SCANNER(self, p);
        if (! MATCHED_P(p))
            rb_raise(ScanError, "unscan failed: previous match record not exist");
        p->curr = p->prev;
        CLEAR_MATCH_STATUS(p);
        return self;
    }
Sets the position to the value it had before the most recent successful match attempt, clears match values, and returns self:
    scanner = StringScanner.new('foobarbaz')
    scanner.scan(/foo/)
    put_situation(scanner)
    # Situation:
    #   pos: 3
    #   charpos: 3
    #   rest: "barbaz"
    #   rest_size: 6
    scanner.unscan # => #<StringScanner 0/9 @ "fooba...">
    put_situation(scanner)
    # Situation:
    #   pos: 0
    #   charpos: 0
    #   rest: "foobarbaz"
    #   rest_size: 9
Raises StringScanner::Error if match values are cleared:
    scanner.scan(/nope/)           # => nil
    match_values_cleared?(scanner) # => true
    scanner.unscan                 # Raises StringScanner::Error.
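Since unscan itself clears the match values (see CLEAR_MATCH_STATUS in the source above), it can undo only one step. A sketch of what happens on a second call, assuming the behavior shown in that source:

```ruby
require 'strscan'

scanner = StringScanner.new('foobarbaz')
scanner.scan(/foo/) # => "foo"
scanner.unscan
scanner.pos         # => 0

# The first unscan cleared the match values, so a second call raises.
begin
  scanner.unscan
rescue StringScanner::Error => e
  e.message # => "unscan failed: previous match record not exist"
end
scanner.pos         # => 0
```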
Source
    static VALUE
    strscan_values_at(int argc, VALUE *argv, VALUE self)
    {
        struct strscanner *p;
        long i;
        VALUE new_ary;
        GET_SCANNER(self, p);
        if (! MATCHED_P(p)) return Qnil;
        new_ary = rb_ary_new2(argc);
        for (i = 0; i<argc; i++) {
            rb_ary_push(new_ary, strscan_aref(self, argv[i]));
        }
        return new_ary;
    }
Returns an array of captured substrings, or nil if the most recent match attempt did not succeed.
For each specifier, the returned substring is [specifier]; see [].
    scanner = StringScanner.new('Fri Dec 12 1975 14:39')
    pattern = /(?<wday>\w+) (?<month>\w+) (?<day>\d+) /
    scanner.match?(pattern)
    scanner.values_at(*0..3)               # => ["Fri Dec 12 ", "Fri", "Dec", "12"]
    scanner.values_at(*%i[wday month day]) # => ["Fri", "Dec", "12"]
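Two edge cases follow from the source above: the whole result is nil when there is no successful match, and an individual entry is nil when its integer specifier is out of range. A hedged sketch:

```ruby
require 'strscan'

scanner = StringScanner.new('Fri Dec 12 1975 14:39')

# No match attempt has succeeded yet, so the result is nil, not an array.
before_match = scanner.values_at(0, 1) # => nil

pattern = /(?<wday>\w+) (?<month>\w+) (?<day>\d+) /
scanner.match?(pattern)

# Index 9 is beyond the last capture, so that entry is nil.
mixed = scanner.values_at(1, 9)        # => ["Fri", nil]
```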
Private Instance Methods
Source
    static VALUE
    strscan_init_copy(VALUE vself, VALUE vorig)
    {
        struct strscanner *self, *orig;
        self = check_strscan(vself);
        orig = check_strscan(vorig);
        if (self != orig) {
            self->flags = orig->flags;
            RB_OBJ_WRITE(vself, &self->str, orig->str);
            self->prev = orig->prev;
            self->curr = orig->curr;
            if (rb_reg_region_copy(&self->regs, &orig->regs))
                rb_memerror();
            RB_GC_GUARD(vorig);
        }
        return vself;
    }
Returns a shallow copy of self; the stored string in the copy is the same string as in self.
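In practice this method is reached through dup or clone. The sketch below illustrates the "shallow copy" wording: the position is copied and then evolves independently, while the stored string is shared.

```ruby
require 'strscan'

scanner = StringScanner.new('foobarbaz')
scanner.scan(/foo/) # => "foo"

copy = scanner.dup
copy.scan(/bar/)    # => "bar"

copy.pos            # => 6
scanner.pos         # => 3 (unaffected by scanning the copy)

# Both scanners refer to the same String object.
copy.string.equal?(scanner.string) # => true
```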